Yugabyte Founder and Co-CEO Karthik Ranganathan on why distributed PostgreSQL is the future, what agentic AI demands from data infrastructure, and the 3 a.m. rule every engineer needs.
Most technology conversations today begin and end at the model layer — the LLMs, the GPUs, the agents. Rarely does anyone pause to ask what’s underneath all of it.
Karthik Ranganathan, Founder and Co-CEO of Yugabyte, has spent his career thinking about exactly that. Before co-founding Yugabyte, he helped build and run Cassandra and HBase at Facebook — not just architecting them, but operating them at a scale where failure wasn’t a possibility to plan for, it was a certainty to design around.
Today, as enterprises race to become AI-ready, Ranganathan argues the bottleneck isn’t the model. It’s the foundation. In a wide-ranging conversation, he makes the case for why distributed databases are no longer an engineering preference — they’re a strategic imperative — and why the industry is dangerously underestimating what agentic AI will demand from data infrastructure.
Excerpts from the interview:
You helped build Cassandra and HBase at Facebook before co-founding Yugabyte. What did working at that scale teach you that most database companies get wrong from the start?
Three things stand out. The first is simple: never do anything that wakes you up at 3 a.m. that you’d rather not be doing at 10 a.m. It sounds obvious, but it’s easy to forget.
At Facebook (now Meta), our team was unusual — we weren’t just building Cassandra and HBase, we were running them in production for massive workloads. Typically, those are two different teams, and when they’re separate, you get friction. The building team says, “Here’s what I built, now run it.” The running team says, “I didn’t ask for any of this.” When it’s the same team, you learn very quickly which features are actually hurting you.
The second principle is that you’re always building for failure. At scale, even low-probability events become near-certainties because you’re multiplying that probability by an enormous number. Failure becomes the norm. So you cannot build a system that depends on someone heroically waking up and fixing things. You have to build resilience into the architecture itself.
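To make that arithmetic concrete, here is a minimal sketch. It assumes every machine fails independently with the same small probability on a given day. The 0.1% figure and the fleet sizes are hypothetical and do not come from the interview.

```python
def prob_at_least_one_failure(p: float, n: int) -> float:
    """Chance that at least one of n independent machines fails,
    when each one fails with probability p (an assumed, illustrative model)."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-machine daily failure probability of 0.1%.
p = 0.001
for n in (1, 100, 10_000):
    print(f"{n:>6} machines: {prob_at_least_one_failure(p, n):.4f}")
```

Under these assumptions, a failure that is rare on one machine is close to guaranteed somewhere in a fleet of ten thousand, which is why the design cannot depend on a person stepping in each time.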
The third is that operational complexity must be encoded into APIs and strict guarantees, not left to human judgment, especially at 3 a.m. The most painful example I’ve seen was a team doing a large Cassandra migration over the Christmas break. Sleepless nights, flawless execution, and then they dropped the wrong database. Not malice, not incompetence. Just exhaustion. That’s what happens when you leave critical operations to tired people instead of automating the guardrails.
So: know which features to build and which to leave out. Design for failure from day one. And make your operations API-driven so that humans — especially tired ones — don’t become the weakest link.
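One way to picture an API-driven guardrail is sketched below. It is purely illustrative: the function name, the protected list, and the confirmation rule are assumptions made for this example. They do not describe Yugabyte's, Cassandra's, or any other product's actual interface.

```python
class GuardrailError(Exception):
    """Raised when a destructive request fails a safety check."""


# Hypothetical set of databases that may never be dropped through this API.
PROTECTED_DATABASES = frozenset({"prod_users"})


def request_drop(database: str, confirm_name: str,
                 protected: frozenset = PROTECTED_DATABASES) -> str:
    """Approve a drop only if the caller retypes the exact target name
    and the target is not on the protected list. Nothing is executed here;
    the function only returns an approval message."""
    if database != confirm_name:
        raise GuardrailError("confirmation does not match the target database")
    if database in protected:
        raise GuardrailError(f"{database!r} is protected and cannot be dropped")
    return f"drop approved for {database}"


print(request_drop("staging_tmp", "staging_tmp"))
```

The point of the sketch is that the check lives in the API itself, so it still applies when the operator is exhausted.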
PostgreSQL has become the default database for modern application development, yet it was originally designed for a very different era. Why has distributed PostgreSQL emerged as such a major category only recently?
It helps to trace the arc of database history. In the 1970s and 1980s, you had commercial relational databases such as Oracle and DB2, built to accelerate enterprise application development. Then the internet arrived in the 1990s, and those commercial databases were too expensive for the wave of low-value web applications being built. That’s when open source emerged. MySQL became the backbone of the LAMP stack. PostgreSQL grew alongside it.
By the mid-2000s, data volumes exploded — mobile arrived, JSON became important — and MySQL and PostgreSQL couldn’t keep up with scale. That’s when NoSQL took off. Cassandra, HBase, MongoDB — we built a lot of these because the existing architecture simply wasn’t designed for what was coming.
Then in the mid-2010s, two things converged. Cloud-native architecture became the standard, and that required distributed systems. At the same time, Oracle acquired MySQL, and enterprises started to feel the risk of being locked into a fully commercial structure. PostgreSQL — robust, fully open-source, with decades of features — became the obvious answer. Those two forces, cloud-native architecture demanding distribution and PostgreSQL winning as the open API of choice, are precisely what made distributed PostgreSQL a major category.
As hyperscalers expand their database offerings, where do you see independent database companies still able to create durable differentiation?
Hyperscalers build looking backward. Independent database companies have to look ahead.
Hyperscalers optimize for the mass market. They see what millions of customers are already using, and they build for that. So you get standard PostgreSQL, then slightly cloud-optimized PostgreSQL, then a scalable NoSQL option. Each step follows adoption — it’s always behind the innovation curve.
We made a decision in 2016 that distributed PostgreSQL was the future. At that point, the market hadn’t fully moved to the cloud, and PostgreSQL hadn’t yet become the default API. We were building for where things were going, not where they were. That’s why we exist and why we’re growing today.
There’s also a structural advantage that hyperscalers simply cannot offer: multi-cloud. A cloud provider will always build for their own cloud. But enterprises need to know what happens if an AWS or Azure region goes down. They need portability across private, public, and hybrid environments. Hyperscalers are structurally unable to give them that. Independent database companies can — and that becomes a durable differentiator.
Everyone’s talking about GPUs, LLMs, and vector stores. But what happens to the database when an agentic system needs to read and write at machine speed, across regions, in real time? Is the industry underestimating this?
Absolutely — and it’s one of the most underappreciated risks right now.
Think about a building or a train that looks spectacular from the outside, but inside everything is broken and rusting. That’s what it looks like when you run modern AI workloads on traditional database infrastructure. AI doesn’t pause. It doesn’t wait for an hour while you upgrade. It expects the infrastructure to be always available and always fast. If the foundation isn’t ready, the AI isn’t the bottleneck — the infrastructure is.
The second point is about experimentation speed. There’s a compounding analogy I like: if you improve by 10% every day for a year, that’s 1.1 to the power of 365 — an astronomically large number. But if you’re flat or slightly declining, the result is negligible or worse. In AI development, the difference between 0.9, 1.0, and 1.1 iteration speed is the entire game. And to iterate at that speed, you need infrastructure that is robust, fast, and able to catalog and continuously build on results.
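The compounding analogy can be checked with a few lines of arithmetic. This is a minimal sketch that treats each day as multiplying the previous level by a fixed factor. The function name is hypothetical, and the factors 0.9, 1.0, and 1.1 are the ones mentioned above.

```python
def compound(daily_factor: float, days: int = 365) -> float:
    """Level after `days` days when each day multiplies the previous one
    by `daily_factor`, starting from 1.0."""
    return daily_factor ** days


for factor in (0.9, 1.0, 1.1):
    print(f"{factor:.1f} per day for 365 days -> {compound(factor):.3e}")
```

A factor of 1.1 compounds to a number on the order of 10^15, 1.0 stays at exactly 1, and 0.9 shrinks to effectively zero.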
The third dimension is efficiency. Once you’re past the experimentation phase and running AI at scale, you cannot afford infrastructure costs that exceed what the business is generating. It has to be efficient, not just fast.
Databases are built to solve exactly these problems — repeatability, efficiency, simplicity. The conversation has moved to newer layers, but the foundation has to keep pace.
Several infrastructure companies, including CockroachDB, HashiCorp, and Redis, have moved away from traditional open-source models. What does that say about the economics of infrastructure software, and how do you see Yugabyte navigating that tension?
We made our decision on this in 2019, and it has served us well.
The companies that walked away from open source were trying to solve a real problem: hyperscalers were taking their revenue. They built the software, the cloud providers packaged it, and customers went there. That frustration is legitimate. But the fix they chose — restricting open source — misses the point. Open source drives adoption in the first place.
What’s happened since tells the story. Redis moved away from open source. The cloud providers forked it and built Valkey. Elasticsearch did the same and got OpenSearch. Once you abandon open source, someone takes the last open version and continues from there. You don’t solve the problem — you just create a competitor.
For CockroachDB, the challenge is even more structural. They’re competing in the relational database space where PostgreSQL is the dominant open-source option. Trying to sell against PostgreSQL commercially is an uphill battle by design.
Our view has always been different. If a cloud provider takes YugabyteDB, that’s fine. We’re all operating in a larger market built around PostgreSQL. As long as we keep innovating, there’s a clear place for us. And that’s proven to be true.
The database sits at the center of every enterprise’s data strategy but rarely comes up in the boardroom. How do you make the case that getting it right is a competitive decision, not just a technical one?
It’s coming up more than it used to — and the reason is acceleration.
Ten years ago, there was enough slack in the system. A company could take a year or two to rethink its database strategy. The business would choose to move forward and revisit infrastructure later. That was a reasonable tradeoff when the cost of delay was manageable.
That calculus no longer holds. When everything is becoming agentic and digital, you’re not looking at two times the infrastructure spend to fix a mistake — you’re looking at ten times. You’re also seeing innovation held back, and aggressive pricing from legacy vendors like Oracle compounding the problem. When all of that converges, it becomes a board-level conversation because staying competitive and staying cost-effective now depend on the same decision.
The broader shift is that technology is moving into the business itself. Smaller, faster companies will disrupt established players if those players don’t modernize their foundations. The database is at the center of that. It’s not a technical preference anymore — it’s a strategic one.
Many enterprises will never operate at Facebook or Google scale. How should technology leaders decide whether distributed databases are a strategic necessity or simply engineering overkill?
The comparison to Facebook or Google scale is a bit misleading. The right way to think about it is this: at a company like Google or Facebook, you have a small number of applications running at petabyte scale. In a large enterprise, you have a far larger number of applications, hundreds of them, running at terabyte scale. Net-net, the total data footprint is comparable. It just looks different.
Take India’s financial sector as an example. NPCI, the National Payments Corporation of India, uses YugabyteDB for UPI payments. HDFC Bank runs multiple use cases on it. These aren’t Facebook-scale deployments in the traditional sense, but they are extremely high-availability, mission-critical workloads where downtime is not an option.
The other thing technology leaders need to account for is trajectory. You don’t know what tomorrow will bring. If you get everything right and your application takes off — going from ten million users to a hundred million — you cannot pause and tell your users to wait six months while you rebuild the infrastructure. By the time you’re back, they’re gone. Success cannot become the reason for failure.
The goal is an infrastructure that is efficient at a lower scale, capable of handling large-scale operations, and genuinely pay-as-you-grow. That is what scalable actually means.
AI agents are creating a world where software, not humans, becomes the primary user of databases. How does that change the way databases need to be designed, secured, and operated?
Significantly — and it’s part of why we’re building our second open-source data infrastructure product, Meko.
Our hypothesis, based on real usage and our own experience, is that if you put agentic workloads on top of a traditional database, you will end up with something inefficient, expensive, and opaque. Agents designing schema on the fly will not produce optimal data models — not because the agents are poorly built, but because expressing the full complexity of a system’s requirements in natural language is genuinely hard. If I asked you to describe everything you’ve learned in your career, you could share some of it, but it would never be complete. The same constraint applies to agents building dynamic workflows.
When those imperfect data models lead to poor performance or unexpected behavior, you’ve now added a layer of complexity that’s difficult to debug and expensive to fix.
So our view is that databases are a piece of the puzzle for agentic systems, but not the entire puzzle. There needs to be a set of layers built on top to make agent workflows efficient, auditable, and reliable. At the same time, the database itself has to evolve — handling not just relational data but vector, graph, and other data types that agent workflows will increasingly demand.
The infrastructure has to grow as what’s built on top of it grows. That’s the work in front of us.


