TL;DR: Database sharding is splitting one massive table across multiple database servers so no single server has to handle everything. Think of it like opening a second warehouse when the first one is full — orders get routed to whichever warehouse has the item. It makes queries faster at massive scale, but it adds serious complexity. Most apps under 10 million rows don't need sharding — they need better indexes, caching, or connection pooling. But when you do need it, nothing else will save you.
Why AI Coders Need to Know This
Here's a scenario that's becoming more common every month. You built something with AI — a SaaS app, a marketplace, a social platform. It worked great at launch. But now you've got real users. Tens of thousands. Maybe hundreds of thousands. Your PostgreSQL database has a table with 20 million rows and growing.
Queries that used to take 50 milliseconds now take 3 seconds. Your dashboard page times out. Your users are complaining. You open Claude and type:
"My PostgreSQL database has 20 million rows in the orders table and queries are getting really slow. How do I fix this?"
Claude might mention indexing. It might suggest caching. But at some point in that conversation — especially if you describe high write volume — it's going to say the word sharding.
And if you don't know what that means, you'll either implement it too early (wasting weeks on complexity you didn't need) or too late (your app is already on fire). Neither is great.
You don't need to become a distributed systems engineer. But you need to understand what sharding does, when it's the right answer, and — just as importantly — when it's the wrong one. Because AI will suggest it. And knowing when to push back on that suggestion is a skill that separates builders who ship from builders who sink.
The Real Scenario: Your Multi-Tenant SaaS Is Crushing the Database
Let's make this concrete. You built a project management tool — think a simplified Notion — using Next.js and PostgreSQL. Each company (tenant) gets their own workspace. Users create tasks, upload files, leave comments.
At 500 companies, everything is fine. At 5,000 companies, you notice the tasks table has 15 million rows. The main dashboard query — "show me all tasks for this workspace, sorted by due date" — is taking 4 seconds. Your indexes are set up correctly. You've already added Redis caching for the most-hit endpoints. You've optimized your queries.
But the writes are killing you. Every time a user creates a task, updates a status, or leaves a comment, those writes compete with the reads. The single PostgreSQL server is maxed out at 80% CPU. Adding a read replica helped with reads, but your write volume is still hammering the primary.
This is where sharding enters the conversation. You need to split the load across multiple database servers, not just for reads — but for writes.
Prompt That Triggers This
My multi-tenant SaaS app uses PostgreSQL. The tasks table has
15 million rows across 5,000 company workspaces. Dashboard queries
take 4+ seconds even with indexes. Write volume is high —
thousands of inserts/updates per minute. Read replicas helped
with reads but writes are still bottlenecked on the primary.
How do I scale this? Should I implement sharding?
What AI Generated: A Sharding Implementation
When you give Claude that prompt, it generates something like this — a tenant-based sharding setup where each company's data lives on a specific database shard:
// shard-manager.ts — Routes queries to the correct database shard
import { Pool } from 'pg';
// Define your shard map — which database server handles which tenants
const shardConfig = {
  shard_0: {
    connectionString: process.env.SHARD_0_URL, // Companies 1-2500
    pool: null as Pool | null,
  },
  shard_1: {
    connectionString: process.env.SHARD_1_URL, // Companies 2501-5000
    pool: null as Pool | null,
  },
};
// Initialize connection pools for each shard
export function initShards() {
  for (const [name, config] of Object.entries(shardConfig)) {
    config.pool = new Pool({
      connectionString: config.connectionString,
      max: 20, // 20 connections per shard
      idleTimeoutMillis: 30000,
    });
    console.log(`Initialized shard: ${name}`);
  }
}
// Determine which shard a tenant belongs to
function getShardForTenant(tenantId: number): Pool {
  // Simple range-based sharding
  if (tenantId <= 2500) {
    return shardConfig.shard_0.pool!;
  } else {
    return shardConfig.shard_1.pool!;
  }
}
// Query the correct shard for a given tenant
export async function shardQuery(
  tenantId: number,
  query: string,
  params: any[]
) {
  const pool = getShardForTenant(tenantId);
  const result = await pool.query(query, params);
  return result.rows;
}
// Cross-shard query (when you need data from ALL tenants)
export async function crossShardQuery(query: string, params: any[]) {
  const results = await Promise.all(
    Object.values(shardConfig).map(shard =>
      shard.pool!.query(query, params)
    )
  );
  // Merge results from all shards
  return results.flatMap(r => r.rows);
}
// api/tasks.ts — Using the shard manager in an API route
import { shardQuery } from '../lib/shard-manager';
// GET /api/tasks?workspaceId=1234
export async function GET(request: Request) {
  const url = new URL(request.url);
  const workspaceId = parseInt(url.searchParams.get('workspaceId')!);
  // The shard manager routes this to the correct database
  const tasks = await shardQuery(
    workspaceId,
    `SELECT id, title, status, due_date, assignee_id
     FROM tasks
     WHERE workspace_id = $1
     ORDER BY due_date ASC
     LIMIT 50`,
    [workspaceId]
  );
  return Response.json({ tasks });
}
// POST /api/tasks — Create a new task
export async function POST(request: Request) {
  const body = await request.json();
  const { workspaceId, title, assigneeId, dueDate } = body;
  // Write goes to the correct shard automatically
  const [task] = await shardQuery(
    workspaceId,
    `INSERT INTO tasks (workspace_id, title, assignee_id, due_date, status)
     VALUES ($1, $2, $3, $4, 'todo')
     RETURNING *`,
    [workspaceId, title, assigneeId, dueDate]
  );
  return Response.json({ task }, { status: 201 });
}
-- Schema: each shard has the SAME table structure
-- Shard 0 and Shard 1 both run this migration
CREATE TABLE tasks (
  id SERIAL PRIMARY KEY,
  workspace_id INTEGER NOT NULL,
  title VARCHAR(500) NOT NULL,
  status VARCHAR(50) DEFAULT 'todo',
  assignee_id INTEGER,
  due_date DATE,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  updated_at TIMESTAMPTZ DEFAULT NOW()
);
-- Index on workspace_id — critical for shard-local queries
CREATE INDEX idx_tasks_workspace_id ON tasks(workspace_id);
CREATE INDEX idx_tasks_workspace_due ON tasks(workspace_id, due_date);
-- Each shard only contains rows for its assigned tenants
-- Shard 0: workspace_id 1-2500
-- Shard 1: workspace_id 2501-5000
Understanding Each Part
That's a lot of code. Let's break down what's actually happening and why each piece matters.
The Shard Map: Routing Traffic to the Right Database
The shardConfig object is the brain of the whole system. It maps shard names to database connection strings. Each shard is a completely separate PostgreSQL server (or instance) with its own resources — CPU, RAM, disk, connections.
Think of it like having two filing cabinets instead of one. Cabinet A holds folders for companies 1-2500. Cabinet B holds companies 2501-5000. The shard map is the label on each cabinet telling you what's inside.
The Shard Key: How You Decide Where Data Lives
The getShardForTenant() function implements the shard key logic. The shard key is the column you use to determine which shard a row belongs to. In this case, it's workspace_id (the tenant ID).
This is the single most important decision in any sharding implementation. The shard key determines:
- How evenly data is distributed — If company #1 has 5 million tasks while most others have a few hundred, whichever shard holds company #1 is massively overloaded while the others sit mostly idle. That's called a "hot shard."
- Which queries can stay on one shard — If you always query by workspace_id, every query hits exactly one shard. Fast. But if you need to query across all workspaces (like an admin dashboard showing total tasks), you need a cross-shard query, which is slower and more complex.
- How you handle joins — Data on different shards can't be joined with a simple SQL JOIN. If a task on shard 0 references a user on shard 1, you're in trouble.
Connection Pools Per Shard
Notice that each shard gets its own connection pool. This is essential — you're now managing connections to multiple databases, not just one. If you have 3 shards with 20 connections each, your app is maintaining 60 total database connections. That's more operational overhead.
Cross-Shard Queries: The Expensive Part
The crossShardQuery() function shows why sharding adds complexity. When you need data from all shards — like "how many total tasks exist across all workspaces?" — you have to:
- Send the same query to every shard
- Wait for all of them to respond
- Merge the results in your application code
This is slower than a single query against a single database. And the more shards you have, the slower it gets. Any feature that requires cross-shard data (global search, analytics dashboards, leaderboards) becomes significantly more complex to build and slower to run.
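To make that merge step concrete, here's a minimal sketch of a cross-shard aggregate — counting every task on every shard and summing the partial results in application code. It assumes the shardConfig pools from the example above:

// Sketch: "how many total tasks exist across all workspaces?"
export async function countTasksAcrossShards(): Promise<number> {
  const partialCounts = await Promise.all(
    Object.values(shardConfig).map(async (shard) => {
      const result = await shard.pool!.query(
        'SELECT COUNT(*)::int AS count FROM tasks'
      );
      return result.rows[0].count as number;
    })
  );
  // Each shard only knows its own total — the real answer is the sum,
  // computed here in application code rather than by the database.
  return partialCounts.reduce((total, count) => total + count, 0);
}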
Sharding Strategies: There's More Than One Way to Split
The example above uses range-based sharding — tenants 1-2500 on shard 0, 2501-5000 on shard 1. But that's just one approach. AI might suggest different strategies depending on your use case:
1. Range-Based Sharding
Split by ranges of the shard key. Companies 1-2500 on shard A, 2501-5000 on shard B.
Pros: Simple to understand. Easy to add new shards (just extend the range).
Cons: Uneven distribution. If early tenants are power users, shard A gets hammered. New tenants cluster on the latest shard.
2. Hash-Based Sharding
Run the shard key through a hash function. shard = hash(tenant_id) % number_of_shards.
function getShardForTenant(tenantId: number): Pool {
  // Hash-based: distributes tenants evenly across shards
  // (a numeric tenant_id stands in for hash(tenant_id) in this sketch)
  const shardIndex = tenantId % Object.keys(shardConfig).length;
  return shardConfig[`shard_${shardIndex}`].pool!;
}
Pros: Even distribution regardless of tenant creation order.
Cons: Adding a new shard changes the hash output, meaning you need to re-distribute (rebalance) existing data. This is painful.
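To see why, here's a tiny illustration (hypothetical tenant IDs) of what happens when you grow from 2 shards to 3: the modulo result — and therefore the home shard — changes for roughly two-thirds of existing tenants, and all of their rows have to physically move.

// With 2 shards, tenants map as id % 2; with 3 shards, as id % 3.
const tenantIds = [7, 10, 11, 12];
const before = tenantIds.map((id) => id % 2); // [1, 0, 1, 0]
const after = tenantIds.map((id) => id % 3);  // [1, 1, 2, 0]
// Tenants 10 and 11 now belong on different shards than before —
// their existing data must be migrated, ideally without downtime.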
3. Directory-Based Sharding
Maintain a lookup table that maps each tenant to a specific shard.
// A separate "directory" database maps tenants to shards
// tenant_id → shard_name
// 1234 → shard_0
// 5678 → shard_2
// 9012 → shard_1
async function getShardForTenant(tenantId: number): Promise<Pool> {
  const result = await directoryDb.query(
    'SELECT shard_name FROM tenant_shards WHERE tenant_id = $1',
    [tenantId]
  );
  return shardConfig[result.rows[0].shard_name].pool!;
}
Pros: Maximum flexibility. Move individual tenants between shards without rebalancing everything. Put your biggest customer on their own shard.
Cons: The directory database is a single point of failure. Every query requires a lookup first. More moving parts.
4. Geographic Sharding
Split by region. US customers on US servers, EU customers on EU servers.
Pros: Reduces latency for users. Helps with GDPR compliance (EU data stays in EU).
Cons: What happens when a US company has an EU employee? Cross-region queries are slow.
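A minimal routing sketch for this strategy might look like the following (the Region type and environment variable names are hypothetical):

// Geographic routing: the tenant's region (recorded at signup) picks the shard.
type Region = 'us' | 'eu';

const regionShards: Record<Region, string> = {
  us: process.env.US_SHARD_URL!, // e.g. a cluster running in a US region
  eu: process.env.EU_SHARD_URL!, // e.g. a cluster running in an EU region
};

function getConnectionStringForTenant(region: Region): string {
  return regionShards[region];
}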
For most AI-built SaaS apps: If you do need sharding, tenant-based sharding with a directory lookup is the most practical approach. It lets you move noisy tenants to dedicated shards and scale incrementally. Ask your AI to implement this pattern specifically.
What AI Gets Wrong About Database Sharding
AI is remarkably good at generating sharding code. It's remarkably bad at telling you whether you should use it. Here's what to watch for:
1. AI Suggests Sharding Way Too Early
This is the biggest issue. You tell Claude your database is slow, and it jumps to sharding. But in 90% of cases, the real problem is:
- Missing indexes. A single composite index on your most-queried columns can turn a 4-second query into a 10-millisecond query. AI-generated code often creates tables without proper indexes.
- N+1 queries. Your code fetches a list of items, then fetches related data for each item individually. That's 101 queries instead of 2.
- No caching. Data that barely changes (user profiles, workspace settings) gets fetched from the database on every request. Redis caching would eliminate most of these queries entirely.
- No connection pooling. Your app is wasting time opening and closing connections instead of reusing them. Connection pooling is a 5-minute fix.
The rule of thumb: If you haven't maxed out indexes, caching, connection pooling, and read replicas — you don't need sharding. Period.
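To make the N+1 point concrete, here's a before/after sketch against a single, ordinary pg Pool (the pool variable and users table are hypothetical, not part of the sharded example):

// N+1 anti-pattern: 1 query for the list, then 1 query per row — 101 total.
const { rows: tasks } = await pool.query(
  'SELECT id, title, assignee_id FROM tasks WHERE workspace_id = $1 LIMIT 100',
  [workspaceId]
);
for (const task of tasks) {
  // One extra round trip per task — this is what quietly kills dashboards.
  const { rows } = await pool.query(
    'SELECT id, name FROM users WHERE id = $1',
    [task.assignee_id]
  );
  task.assignee = rows[0];
}

// The fix: fetch every assignee in one second query — 2 queries total.
const assigneeIds = tasks.map((t) => t.assignee_id);
const { rows: assignees } = await pool.query(
  'SELECT id, name FROM users WHERE id = ANY($1)',
  [assigneeIds]
);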
2. AI Ignores the Operational Complexity
The code AI generates looks clean. What it doesn't show you is the operational nightmare:
- Backups are now N times harder. Two shards means two backup schedules, two monitoring dashboards, two sets of alerts.
- Migrations run on every shard. Adding a column? You deploy to every shard. One fails? Now your shards are out of sync.
- Rebalancing. When shard 0 is full and shard 1 is empty, you need to move data between them — without downtime. This is incredibly difficult.
- Debugging. "The user says their tasks are missing." Which shard are they on? Is the routing correct? Did a migration fail on one shard but not the other?
3. AI Generates Application-Level Sharding When You Need Infrastructure-Level
The code example above is application-level sharding — your Node.js code decides which shard to query. This works but means your app code is tightly coupled to your database topology.
In practice, most teams at scale use infrastructure-level sharding — tools that handle sharding transparently:
- Citus (PostgreSQL extension) — Distributes tables across nodes. Your queries look like normal PostgreSQL. Citus handles the routing (see the sketch after this list).
- Vitess (MySQL) — Originally built for YouTube. Handles sharding, rebalancing, and failover.
- CockroachDB — Distributed SQL that auto-shards. You write normal SQL; it handles distribution.
- PlanetScale — Managed Vitess. Sharding as a service.
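To give a sense of what "transparent" means, this is roughly what distributing the tasks table looks like with Citus — a sketch; check the Citus documentation for your version before relying on it:

-- With the Citus extension installed, one function call spreads the table
-- across worker nodes, using workspace_id as the distribution column.
CREATE EXTENSION citus;
SELECT create_distributed_table('tasks', 'workspace_id');

-- Application queries stay plain SQL — Citus routes each one to the
-- node that owns that workspace's rows.
SELECT id, title, status
FROM tasks
WHERE workspace_id = 3000
ORDER BY due_date ASC
LIMIT 50;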
Better Prompt: Ask AI to Evaluate First
My PostgreSQL tasks table has 15 million rows. Queries are
slow even with indexes. Before suggesting sharding, analyze
whether I should try these first:
1. Better composite indexes for my common queries
2. Redis caching for read-heavy endpoints
3. Read replicas for distributing read load
4. Table partitioning (not full sharding)
5. Archiving old data to a separate table
Show me the expected performance improvement from each
option. Only suggest sharding if none of these are sufficient.
4. AI Forgets About Transactions Across Shards
In a single database, you can wrap multiple operations in a transaction — either all of them succeed, or none do. Across shards? You can't use a normal SQL transaction. You need distributed transactions (like two-phase commit), which are slow, complex, and often avoided entirely.
AI will generate code that happily inserts into shard 0 and shard 1 without mentioning that if shard 1 fails, the shard 0 insert already committed. Your data is now inconsistent.
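Here's the shape of the problem as a sketch, reusing the shardQuery helper from earlier (the moveTask scenario itself is hypothetical):

// Sketch: "move" a task from a workspace on shard 0 to one on shard 1.
// Two separate databases — there is NO single transaction covering both.
async function moveTask(taskId: number, fromWorkspace: number, toWorkspace: number) {
  const [task] = await shardQuery(
    fromWorkspace,
    'SELECT title, status FROM tasks WHERE id = $1',
    [taskId]
  );
  // This INSERT commits immediately on the destination shard...
  await shardQuery(
    toWorkspace,
    'INSERT INTO tasks (workspace_id, title, status) VALUES ($1, $2, $3)',
    [toWorkspace, task.title, task.status]
  );
  // ...so if this DELETE fails (network blip, shard down, bad deploy),
  // the task now exists on both shards and nothing rolls the first write back.
  await shardQuery(
    fromWorkspace,
    'DELETE FROM tasks WHERE id = $1',
    [taskId]
  );
}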
5. AI Doesn't Mention Partitioning as an Alternative
PostgreSQL has built-in table partitioning — splitting a table into smaller pieces on the same server. This gives you many of sharding's query-speed benefits without the operational complexity of multiple servers:
-- PostgreSQL native partitioning — same server, split table
CREATE TABLE tasks (
  id SERIAL,
  workspace_id INTEGER NOT NULL,
  title VARCHAR(500) NOT NULL,
  status VARCHAR(50) DEFAULT 'todo',
  created_at TIMESTAMPTZ DEFAULT NOW()
) PARTITION BY RANGE (workspace_id);
-- Create partitions
CREATE TABLE tasks_0_2500 PARTITION OF tasks
  FOR VALUES FROM (0) TO (2501);
CREATE TABLE tasks_2501_5000 PARTITION OF tasks
  FOR VALUES FROM (2501) TO (5001);
CREATE TABLE tasks_5001_7500 PARTITION OF tasks
  FOR VALUES FROM (5001) TO (7501);
-- Queries still use the normal table name
-- PostgreSQL automatically routes to the correct partition
SELECT * FROM tasks WHERE workspace_id = 3000;
-- → Only scans tasks_2501_5000, not the entire table
Partitioning gives you faster queries (PostgreSQL skips partitions that don't match your WHERE clause) without needing multiple servers, cross-shard routing, or distributed transactions. For many apps, this is all you need.
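You can confirm the pruning yourself with EXPLAIN — a quick sketch (exact plan output varies by PostgreSQL version and data):

-- Ask the planner which partitions this query will touch.
EXPLAIN SELECT * FROM tasks WHERE workspace_id = 3000;
-- The plan should mention only tasks_2501_5000; the other partitions are
-- pruned before the query runs.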
How to Debug Sharding Issues with AI
If you've already implemented sharding (or inherited a sharded system), here are the most common issues and how to debug them with AI's help.
Problem: Hot Shards (Uneven Load)
One shard is running at 90% CPU while others idle at 10%.
Debug Prompt
My sharded database has 4 shards. Shard 2 is at 90% CPU
while the others are under 20%. We shard by tenant_id using
hash-based routing. How do I identify which tenants are
causing the hot shard, and what are my options to rebalance
without downtime?
AI will typically suggest:
- Query pg_stat_user_tables on the hot shard to find which tables/tenants have the most activity (see the sketch after this list)
- Move the largest tenants to their own dedicated shard (directory-based routing)
- Split the hot shard into two using a finer-grained hash
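For that first step, a starting point looks something like this (run it on the hot shard; the per-tenant query assumes the tasks table from earlier):

-- Which tables on this shard see the most write and read activity?
SELECT relname, n_live_tup, n_tup_ins, n_tup_upd, seq_scan, idx_scan
FROM pg_stat_user_tables
ORDER BY n_tup_ins + n_tup_upd DESC;

-- And which tenants own the most rows in the busiest table?
SELECT workspace_id, COUNT(*) AS row_count
FROM tasks
GROUP BY workspace_id
ORDER BY row_count DESC
LIMIT 20;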
Problem: Cross-Shard Queries Are Killing Performance
Your admin dashboard runs analytics across all tenants and takes 30 seconds to load.
Debug Prompt
My sharded app needs to show an admin dashboard with:
- Total tasks across all workspaces
- Tasks created in the last 24 hours (all shards)
- Top 10 most active workspaces
Currently I query all 4 shards and merge in the app.
It takes 30+ seconds. How can I make cross-shard analytics
fast without restructuring the entire system?
Common solutions AI will propose:
- Materialized views or summary tables: Each shard maintains running counts that update on writes. The dashboard reads the pre-computed summaries instead of scanning all data (see the sketch after this list).
- Event streaming: Writes on each shard publish events to a central analytics database (like ClickHouse or BigQuery) purpose-built for cross-shard aggregation.
- Caching the dashboard: Compute the dashboard once per minute and cache the result in Redis. Stale by 60 seconds, but fast.
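As a concrete example of the first option, each shard can maintain a pre-computed summary that the dashboard reads instead of the raw tables — a sketch using a materialized view (the view name and refresh cadence are illustrative):

-- Run on each shard: pre-compute per-workspace counts.
CREATE MATERIALIZED VIEW workspace_task_summary AS
SELECT workspace_id,
       COUNT(*) AS total_tasks,
       COUNT(*) FILTER (WHERE created_at > NOW() - INTERVAL '24 hours') AS tasks_last_24h
FROM tasks
GROUP BY workspace_id;

-- Refresh on a schedule (cron, pg_cron, or a worker job).
REFRESH MATERIALIZED VIEW workspace_task_summary;

-- The dashboard now reads a few thousand summary rows per shard
-- instead of scanning millions of task rows.
SELECT SUM(total_tasks) FROM workspace_task_summary;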
Problem: Data Ended Up on the Wrong Shard
A bug in your routing logic sent tenant 3000's data to shard 0 instead of shard 1.
Debug Prompt
Due to a routing bug, some rows for tenant 3000 ended up
on shard_0 instead of shard_1. I need to:
1. Find all misrouted rows on shard_0
2. Copy them to shard_1
3. Delete the originals from shard_0
4. Verify data integrity after the move
Generate a migration script that does this safely with
proper error handling. The tables affected are: tasks,
comments, and file_uploads.
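Before asking AI for the full migration script, you can size the damage yourself — a sketch of step 1, assuming shard_0 should only hold workspaces 1-2500:

-- Run on shard_0: any rows returned here were misrouted.
SELECT workspace_id, COUNT(*) AS misrouted_rows
FROM tasks
WHERE workspace_id > 2500
GROUP BY workspace_id
ORDER BY misrouted_rows DESC;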
The Decision Framework: Do You Actually Need Sharding?
Here's a checklist to work through before you commit to sharding. Be honest with yourself — every "no" means you probably have a simpler fix available.
| Question | If No, Try This Instead |
|---|---|
| Do you have proper indexes on all slow queries? | Add composite indexes |
| Are you caching frequently-read data? | Add Redis caching |
| Are you using connection pooling? | Set up connection pooling |
| Have you added read replicas for read-heavy load? | Add PostgreSQL read replicas |
| Have you tried table partitioning? | Use PostgreSQL native partitioning |
| Have you archived old/unused data? | Move old rows to archive tables |
| Is your write volume too high for a single primary server? | If yes, now consider sharding |
| Does a single server lack disk space for all your data? | If yes, now consider sharding |
If you answered "yes" to all the early questions and you're still bottlenecked — specifically on write volume or data volume — then sharding is the correct solution. Everything else is just throwing complexity at a problem that has a simpler answer.
The honest truth: If your app has fewer than 10 million rows in any single table, you almost certainly don't need sharding. PostgreSQL can handle tens of millions of rows on a single server — with proper indexes and hardware — without breaking a sweat. Instagram ran on a single PostgreSQL server far longer than most people realize. Load test your actual bottleneck before assuming you need to distribute your data.
The Practical Path: What to Actually Do
If you've decided sharding is genuinely what you need, here's the practical path for an AI-enabled builder:
Option 1: Use a Managed Service (Recommended)
Don't build sharding yourself. Use a database that handles it for you:
- CockroachDB — Distributed SQL that auto-shards. Write normal PostgreSQL-compatible queries. It handles distribution, rebalancing, and replication.
- PlanetScale — Managed MySQL with Vitess under the hood. Branching workflow for schema changes. Built-in sharding.
- Citus (on Azure) — PostgreSQL extension that distributes tables. Your existing PostgreSQL code mostly works unchanged.
- Neon — While not sharded per se, Neon's architecture separates storage and compute, letting you scale independently.
Migration Prompt
I'm running a PostgreSQL 16 database with 50 million rows
across 10 tables. I need to migrate to a sharded solution.
Compare the migration path for:
1. Citus extension on my existing PostgreSQL
2. CockroachDB migration from PostgreSQL
3. PlanetScale migration from PostgreSQL
For each, show: data migration steps, code changes required,
expected downtime, and ongoing cost considerations.
Option 2: PostgreSQL Table Partitioning (Good Middle Ground)
If your problem is query speed on large tables but you don't need multi-server distribution, PostgreSQL partitioning is the sweet spot. Same server, same tooling, dramatically faster queries on partitioned data.
Option 3: Application-Level Sharding (Full Control)
Build it yourself, like the code example above. This gives you maximum control but maximum responsibility. Only do this if you have specific requirements that managed services can't handle — or if you want to learn the concepts deeply.
Frequently Asked Questions
What is database sharding in simple terms?
Database sharding is splitting one big database table across multiple smaller databases. Instead of storing all 50 million users in one table on one server, you split them — users A-M go to database server 1, users N-Z go to database server 2. Each piece is called a "shard." Queries only hit the shard that has the data they need, so they stay fast even as your total data grows.
When do I actually need database sharding?
Most apps never need sharding. You likely need it when: your database has hundreds of millions of rows and queries are slow even with proper indexes, you've already tried read replicas and caching but write volume is still too high, or a single database server physically can't store all your data. If you have under 10 million rows, you almost certainly don't need sharding — you need better indexes, query optimization, or caching.
What is a shard key and why does it matter?
A shard key is the column you use to decide which shard a row lives on. For example, if you shard by user_id, all of user 42's data goes to the same shard. Picking the wrong shard key is the #1 mistake in sharding. A bad key creates "hot shards" where most traffic hits one shard while others sit idle, defeating the entire purpose. Good shard keys distribute data evenly and match your most common query patterns.
Can PostgreSQL do sharding natively?
PostgreSQL supports table partitioning natively — splitting a table into smaller pieces on the same server. This handles many of the same problems as sharding. For true multi-server sharding, you need extensions like Citus or managed solutions like CockroachDB, PlanetScale, or Vitess. For most AI-built apps, managed services are the practical choice because they handle the routing, rebalancing, and failover that make DIY sharding so complex.
What should I try before sharding?
Before sharding, try these in order: 1) Add proper indexes to your slow queries — this alone fixes 90% of performance problems. 2) Add caching with Redis for frequently-read data. 3) Set up read replicas to distribute read queries. 4) Use connection pooling to handle more concurrent users. 5) Archive old data you don't query often. 6) Try PostgreSQL table partitioning. Only after exhausting these should you consider full sharding.