What Is DynamoDB? Amazon's Serverless Database Explained for AI Builders

When you tell Claude "build this on AWS," it reaches for DynamoDB almost every time. Here's what it actually is, how it thinks differently than a regular database, and when it's the right call — or the wrong one.

TL;DR

DynamoDB is Amazon's serverless NoSQL database. You don't manage servers; AWS handles scaling, backups, and availability for you. It stores flexible JSON-like items, can charge per request instead of per hour (in on-demand mode), and handles massive traffic without blinking. The catch: it's picky about how you structure your data upfront, because searching by anything other than your chosen key (or an index you planned for) is slow and expensive. Great for apps on AWS at scale. Overkill for most first projects.

Why AI Coders Need to Know About DynamoDB

Here's the scenario: you ask an AI assistant to help you build a backend for your new app on AWS. Within the first few responses, it's already writing DynamoDB code — creating tables, defining partition keys, using the AWS SDK. You follow along and it works. But you have no idea why it chose DynamoDB, what the partition key is for, or whether this is even the right database for your situation.

This matters because DynamoDB is fundamentally different from PostgreSQL or MongoDB. It's not just a different flavor of database — it requires a completely different way of thinking about your data. If AI sets it up without you understanding the basics, you will hit confusing limitations later:

  • You try to search by an email address and it's inexplicably slow or expensive
  • You want to find all users in a given city and DynamoDB can't do it efficiently
  • You need to change your data structure and realize you have to migrate everything
  • Your AWS bill shows DynamoDB read charges far larger than your traffic explains, because a Scan is quietly reading the whole table

Understanding what DynamoDB does — and what it doesn't do — saves you from all of these surprises. And since AI reaches for it constantly on AWS projects, this is essential context for any vibe coder building in the AWS ecosystem.

If you're completely new to databases, read What Is a Database? first. If you want to understand the broader difference between document-style databases and relational ones, check out NoSQL vs SQL.

Real Scenario: Building a Serverless Task App on AWS

You're building a simple productivity app. Users can create tasks, mark them complete, and see their task history. You want to host it on AWS and keep the infrastructure as simple as possible — no servers to manage, automatic scaling, and you only pay when people actually use the app.

Your AI Prompt

"I'm building a serverless task management app on AWS. Users can create tasks, view their tasks, mark tasks complete, and delete them. Each task has a title, description, status, and due date. Set up the DynamoDB table and write the basic CRUD operations using the AWS SDK v3 in Node.js."

This is exactly the kind of prompt where AI will immediately reach for DynamoDB and generate a complete working setup. Let's look at what it builds — and then understand every piece of it.

What AI Generated

Step 1: Create the DynamoDB Table (AWS CDK / CloudFormation)

// infrastructure/dynamo-table.ts
// Using AWS CDK to define the DynamoDB table as code

import * as cdk from 'aws-cdk-lib';
import * as dynamodb from 'aws-cdk-lib/aws-dynamodb';

export class TasksTable extends cdk.Stack {
  public readonly table: dynamodb.Table;

  constructor(scope: cdk.App, id: string) {
    super(scope, id);

    this.table = new dynamodb.Table(this, 'TasksTable', {
      tableName: 'tasks',

      // Partition key: the main identifier
      // All tasks for a user are grouped under their userId
      partitionKey: {
        name: 'userId',
        type: dynamodb.AttributeType.STRING,
      },

      // Sort key: lets you query/sort within a partition
      // taskId makes each item unique within a user's partition
      sortKey: {
        name: 'taskId',
        type: dynamodb.AttributeType.STRING,
      },

      // On-demand: pay per request, scale automatically
      billingMode: dynamodb.BillingMode.PAY_PER_REQUEST,

      // Point-in-time recovery: restore to any second in the last 35 days
      pointInTimeRecovery: true,

      // Auto-delete table when stack is destroyed (dev only — remove for prod)
      removalPolicy: cdk.RemovalPolicy.DESTROY,
    });

    // Global Secondary Index: lets you query by status
    // Without this, finding all "pending" tasks requires a slow table scan
    this.table.addGlobalSecondaryIndex({
      indexName: 'StatusIndex',
      partitionKey: {
        name: 'userId',
        type: dynamodb.AttributeType.STRING,
      },
      sortKey: {
        name: 'status',
        type: dynamodb.AttributeType.STRING,
      },
      projectionType: dynamodb.ProjectionType.ALL,
    });
  }
}

Step 2: Install the AWS SDK and Set Up the Client

# Install AWS SDK v3 (modular: only install what you need)
npm install @aws-sdk/client-dynamodb @aws-sdk/lib-dynamodb

// lib/dynamo-client.ts
// Create a reusable DynamoDB client

import { DynamoDBClient } from '@aws-sdk/client-dynamodb';
import { DynamoDBDocumentClient } from '@aws-sdk/lib-dynamodb';

// DynamoDBClient handles the raw connection
const client = new DynamoDBClient({
  region: process.env.AWS_REGION || 'us-east-1',
});

// DynamoDBDocumentClient is the higher-level wrapper:
// automatically converts JS objects to/from DynamoDB's native format
// (you don't want to deal with the raw format — trust me)
export const docClient = DynamoDBDocumentClient.from(client);

Step 3: Create, Read, Update, Delete Operations

// lib/tasks.ts
// All CRUD operations for the tasks table

import {
  PutCommand,
  GetCommand,
  QueryCommand,
  UpdateCommand,
  DeleteCommand,
} from '@aws-sdk/lib-dynamodb';
import { docClient } from './dynamo-client';
import { randomUUID } from 'crypto';

const TABLE_NAME = process.env.TASKS_TABLE_NAME || 'tasks';

// ---- CREATE ----
export async function createTask(userId: string, task: {
  title: string;
  description?: string;
  dueDate?: string;
}) {
  const taskId = randomUUID();
  const now = new Date().toISOString();

  const item = {
    userId,              // Partition key
    taskId,              // Sort key
    title: task.title,
    description: task.description || '',
    status: 'pending',   // pending | complete
    dueDate: task.dueDate || null,
    createdAt: now,
    updatedAt: now,
  };

  await docClient.send(new PutCommand({
    TableName: TABLE_NAME,
    Item: item,
    // Prevent overwriting if item already exists
    ConditionExpression: 'attribute_not_exists(taskId)',
  }));

  return item;
}

// ---- READ ONE ----
export async function getTask(userId: string, taskId: string) {
  const result = await docClient.send(new GetCommand({
    TableName: TABLE_NAME,
    Key: { userId, taskId },  // Must provide BOTH keys
  }));

  return result.Item || null;
}

// ---- READ ALL (for a user) ----
export async function getUserTasks(userId: string) {
  // Query — efficient because we're using the partition key
  const result = await docClient.send(new QueryCommand({
    TableName: TABLE_NAME,
    KeyConditionExpression: 'userId = :uid',
    ExpressionAttributeValues: { ':uid': userId },
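    // Results come back ordered by the sort key (taskId). Random UUIDs carry no
    // timestamp, so this order is NOT chronological; sort by createdAt in code if needed.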
  }));

  return result.Items || [];
}

// ---- READ BY STATUS (uses GSI) ----
export async function getUserTasksByStatus(userId: string, status: string) {
  const result = await docClient.send(new QueryCommand({
    TableName: TABLE_NAME,
    IndexName: 'StatusIndex',  // Use the Global Secondary Index
    KeyConditionExpression: 'userId = :uid AND #s = :status',
    ExpressionAttributeNames: { '#s': 'status' },  // 'status' is a reserved word
    ExpressionAttributeValues: {
      ':uid': userId,
      ':status': status,
    },
  }));

  return result.Items || [];
}

// ---- UPDATE ----
export async function updateTask(userId: string, taskId: string, updates: {
  title?: string;
  description?: string;
  status?: string;
  dueDate?: string;
}) {
  const now = new Date().toISOString();

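  // Build the update expression dynamically
  const updateExpressions: string[] = ['updatedAt = :updatedAt'];
  const expressionValues: Record<string, unknown> = { ':updatedAt': now };
  // Only declare aliases the expression actually uses: DynamoDB rejects
  // unused entries in ExpressionAttributeNames with a ValidationException.
  const expressionNames: Record<string, string> = {};

  if (updates.title !== undefined) {
    updateExpressions.push('title = :title');
    expressionValues[':title'] = updates.title;
  }
  if (updates.description !== undefined) {
    updateExpressions.push('#d = :description');
    expressionNames['#d'] = 'description';  // aliased defensively
    expressionValues[':description'] = updates.description;
  }
  if (updates.status !== undefined) {
    updateExpressions.push('#s = :status');
    expressionNames['#s'] = 'status';  // 'status' is a reserved word
    expressionValues[':status'] = updates.status;
  }
  if (updates.dueDate !== undefined) {
    updateExpressions.push('dueDate = :dueDate');
    expressionValues[':dueDate'] = updates.dueDate;
  }

  const result = await docClient.send(new UpdateCommand({
    TableName: TABLE_NAME,
    Key: { userId, taskId },
    UpdateExpression: `SET ${updateExpressions.join(', ')}`,
    // An empty names map is also rejected, so omit it when nothing is aliased
    ExpressionAttributeNames:
      Object.keys(expressionNames).length > 0 ? expressionNames : undefined,
    ExpressionAttributeValues: expressionValues,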
    // Return the updated item
    ReturnValues: 'ALL_NEW',
    // Fail if item doesn't exist
    ConditionExpression: 'attribute_exists(taskId)',
  }));

  return result.Attributes;
}

// ---- DELETE ----
export async function deleteTask(userId: string, taskId: string) {
  await docClient.send(new DeleteCommand({
    TableName: TABLE_NAME,
    Key: { userId, taskId },
    ConditionExpression: 'attribute_exists(taskId)',
  }));
}

Understanding Each Part

What DynamoDB Actually Is

DynamoDB is a fully managed, serverless key-value and document database from Amazon Web Services. Breaking that down:

  • Fully managed: AWS handles the servers, storage, replication across data centers, software updates, and backups. You never SSH into a database server or worry about disk space.
  • Serverless: There's no "running" database server you pay for by the hour. In on-demand mode you pay per operation (each read, each write), so when no one uses your app the request charges drop to zero and you pay only for the data you store (plus any backups you enabled).
  • Key-value and document: It's like a giant dictionary. You store items (JSON documents) and retrieve them by key. Unlike a SQL database, there's no rigid schema, so different items in the same table can have different attributes.

It's part of the same serverless ecosystem as AWS Lambda. The combination of Lambda (code that runs on-demand) + DynamoDB (database that scales on-demand) is the standard AWS stack for apps that need to handle unpredictable traffic without managing any infrastructure.

Compared to MongoDB — which is also a document database — DynamoDB is more opinionated about how you access your data. MongoDB lets you query by any field. DynamoDB is optimized for specific access patterns you define in advance.

Tables and Items (Not Rows and Columns)

DynamoDB uses different terminology than SQL databases. Here's the mapping:

| DynamoDB Term | SQL Equivalent | What It Means |
| --- | --- | --- |
| Table | Table | Container for your data, but much more flexible |
| Item | Row | One piece of data (a JSON-like document) |
| Attribute | Column | A field within an item, but items can have different attributes |
| Partition Key | Primary Key | The main identifier used to find and distribute items |
| Sort Key | (composite key) | Secondary identifier that enables range queries within a partition |

The biggest departure from SQL: there's no enforced schema. Your "tasks" table can store items where one task has a dueDate and another doesn't, where some items have extra fields others don't. DynamoDB doesn't care. This flexibility is powerful but means your application code is responsible for data consistency.
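Since DynamoDB won't reject a malformed item, one common approach is to validate items in your own code before trusting them. Here's a minimal sketch of a runtime check. The Task shape below is an assumption based on the attributes used in this article, not something DynamoDB knows about.

```typescript
// Hypothetical item shape for the tasks table; adjust to your own attributes.
interface Task {
  userId: string;
  taskId: string;
  title: string;
  status: 'pending' | 'complete';
  dueDate?: string | null;
}

// Runtime type guard: returns true only if `item` looks like a Task.
function isTask(item: unknown): item is Task {
  if (typeof item !== 'object' || item === null) return false;
  const t = item as Record<string, unknown>;
  const dueDate = t['dueDate'];
  return (
    typeof t['userId'] === 'string' &&
    typeof t['taskId'] === 'string' &&
    typeof t['title'] === 'string' &&
    (t['status'] === 'pending' || t['status'] === 'complete') &&
    (dueDate === undefined || dueDate === null || typeof dueDate === 'string')
  );
}
```

You would typically run a check like this on whatever a Get or Query returns, and decide whether to skip, repair, or log items that fail it.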

Partition Keys: The Core Concept

The partition key is the most important design decision in DynamoDB, and the one that trips up nearly every first-time user.

Here's what's actually happening under the hood: DynamoDB stores your data across many physical servers. When you write an item, DynamoDB hashes the partition key to decide which server gets it. When you read that item back, it hashes the key again to find the right server. This is how DynamoDB achieves single-digit millisecond latency at any scale — it always knows exactly where your data lives.

The consequence: you can only efficiently query by partition key. If you design your table with userId as the partition key:

  • ✅ "Get all tasks for user user-123" — instant, uses the partition key
  • ✅ "Get the specific task task-456 for user user-123" — instant, uses both keys
  • ❌ "Find all tasks due tomorrow across all users" — requires scanning every item in the table
  • ❌ "Find all tasks with status 'pending'" — requires a scan (or a Global Secondary Index)

This is the fundamental trade-off of DynamoDB: you get incredible speed and scale for the access patterns you design for, and you pay a significant performance penalty for access patterns you didn't plan for.
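To make the routing idea concrete, here's a toy sketch of hashing a partition key to pick a storage partition. Everything in it is an illustration: DynamoDB's real hash function and partition layout are internal, and the function names and partition count are made up.

```typescript
// Simple multiplicative string hash, kept within unsigned 32-bit range.
// This is NOT DynamoDB's hash; it only demonstrates the routing idea.
function toyHash(key: string): number {
  let hash = 5381;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Map a partition key to one of `partitionCount` hypothetical partitions.
function partitionFor(partitionKey: string, partitionCount: number): number {
  return toyHash(partitionKey) % partitionCount;
}

// The same key always routes to the same partition, so a lookup by key
// touches exactly one place.
const first = partitionFor('user-123', 8);
const second = partitionFor('user-123', 8);
```

A question like "all tasks due tomorrow" has no partition key to hash, so there is no single place to look. That is why it degrades into a scan of every partition.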

Sort Keys: Range Queries Within a Partition

When you add a sort key (like taskId), each item is identified by the combination of partition key + sort key. But sort keys do more than just make items unique — they enable powerful range queries within a partition.

Imagine you store tasks with a sort key that starts with the date: 2026-03-29#task-uuid. Now you can query:

  • "Get all tasks for user-123 due in March 2026": BETWEEN '2026-03' AND '2026-04'
  • "Get the 10 most recent tasks": query with ScanIndexForward: false (reverse order) and Limit: 10
  • "Get all tasks on or after a certain date": a comparison operator such as >= (begins_with works for prefix matches like a single day or month)

This technique — embedding query-meaningful data into your keys — is called key design, and it's the art form of DynamoDB. Experienced DynamoDB engineers spend significant time on key design because it determines everything you can efficiently query later. Changing it later means migrating all your data.
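Here's a small sketch of that date-prefixed sort key and the month query it enables. It assumes a table whose sort key attribute is named sk and holds dueDate#taskId, which is different from the tasks table defined earlier. The helper names are made up, and the functions only build the plain input object you would pass to QueryCommand.

```typescript
// Compose a sort key like "2026-03-29#a1b2" (assumed format: ISO date, '#', id).
function dueDateSortKey(dueDate: string, taskId: string): string {
  return `${dueDate}#${taskId}`;
}

// Build the Query input for one user's tasks due in a given month.
function tasksDueInMonthInput(userId: string, year: number, month: number) {
  const pad = (n: number) => (n < 10 ? `0${n}` : String(n));
  const start = `${year}-${pad(month)}`;
  // Upper bound is the next month's prefix; December rolls over to January.
  const end = month === 12 ? `${year + 1}-01` : `${year}-${pad(month + 1)}`;
  return {
    TableName: 'tasks',
    KeyConditionExpression: 'userId = :uid AND sk BETWEEN :start AND :end',
    ExpressionAttributeValues: { ':uid': userId, ':start': start, ':end': end },
  };
}

const marchInput = tasksDueInMonthInput('user-123', 2026, 3);
```

BETWEEN is inclusive, but a bare prefix like '2026-04' sorts before every real key in April (for example '2026-04-01#...'), so the upper bound does not pull in April's tasks.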

On-Demand vs. Provisioned Capacity

| Mode | How You Pay | Best For | Gotcha |
| --- | --- | --- | --- |
| On-Demand | Per request (~$0.125 per million reads, ~$0.625 per million writes in us-east-1 since the November 2024 price cut; rates vary by region) | New projects, spiky traffic, MVPs, low/unpredictable usage | More expensive at consistent high volume |
| Provisioned | Per capacity unit reserved (~$0.00013/hr per read unit, ~$0.00065/hr per write unit) | High, consistent traffic where you know your load | Throttled if you exceed provisioned capacity; wasteful if traffic drops |
| Provisioned + Auto-Scaling | Per capacity unit, scales automatically within bounds you set | Production apps with predictable-ish traffic patterns | Takes a few minutes to scale up, so sudden spikes still throttle briefly |

The practical guidance: Start with on-demand. At steady, high volume it can cost several times more than well-sized provisioned capacity, but it's zero-configuration and rarely throttles (it can still throttle if traffic jumps far above its previous peak or hammers a single partition). Once your app has consistent traffic and you can predict your read/write patterns, switch to provisioned to cut costs. AI almost always generates on-demand mode, which is the right call for new projects.
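If you want a rough feel for where the switch pays off, a back-of-the-envelope comparison like the sketch below can help. The rates are passed in as parameters, and the example figures are assumed us-east-1 list prices; treat them as placeholders and substitute the current numbers for your region. It also simplifies heavily: flat traffic, strongly consistent reads of items up to 4 KB, and provisioned capacity sized exactly to the load.

```typescript
interface ReadRates {
  onDemandPerMillionReads: number; // USD per million read request units
  provisionedPerRcuHour: number;   // USD per read capacity unit per hour
}

const HOURS_PER_MONTH = 730;

// Estimated monthly read cost for a steady `readsPerSecond` in each billing mode.
function monthlyReadCost(readsPerSecond: number, rates: ReadRates) {
  const readsPerMonth = readsPerSecond * 3600 * HOURS_PER_MONTH;
  const onDemand = (readsPerMonth / 1e6) * rates.onDemandPerMillionReads;
  const provisioned =
    Math.ceil(readsPerSecond) * HOURS_PER_MONTH * rates.provisionedPerRcuHour;
  return { onDemand, provisioned };
}

// Placeholder rates (assumptions, not a quote of current pricing).
const estimate = monthlyReadCost(100, {
  onDemandPerMillionReads: 0.125,
  provisionedPerRcuHour: 0.00013,
});
```

With these placeholder rates, 100 steady reads per second works out to about $32.85 a month on-demand versus about $9.49 provisioned. Real traffic is rarely flat, and provisioned capacity has to cover your peaks, so the gap is usually smaller in practice.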

Global Secondary Indexes (GSIs): Your Escape Hatch

A Global Secondary Index (GSI) is like creating a second copy of your table organized around a different key. In the code above, the StatusIndex GSI organizes items by userId + status, so you can efficiently query "all pending tasks for user-123" without scanning the whole table.

GSIs cost extra (you pay for the extra storage and write capacity they consume), but they're how you handle query patterns that don't fit your primary key. Think of them like database indexes in SQL — they trade extra storage for faster queries.

Key GSI facts:

  • You can add GSIs after table creation (unlike changing primary keys, which requires migration)
  • A table can have up to 20 GSIs
  • GSIs are eventually consistent — they update in milliseconds after a write, but there's a brief window where the GSI might not reflect the latest write
  • If you forget to add a GSI and write a Scan query instead, AWS will politely charge you for reading every item in your table
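As a sketch of using a GSI for a pattern the primary key can't serve, take "tasks due in the next 7 days for one user". The example assumes a second, hypothetical GSI named DueDateIndex with partition key userId and sort key dueDate stored as an ISO date string. The table defined earlier does not include it. The function only builds the input object for QueryCommand.

```typescript
// Build the Query input for one user's tasks due within `days` of `today`.
// Assumes a hypothetical GSI 'DueDateIndex' (userId + dueDate as "YYYY-MM-DD").
function tasksDueSoonInput(userId: string, today: Date, days: number) {
  const toIsoDate = (d: Date) => d.toISOString().slice(0, 10); // UTC date part
  const until = new Date(today.getTime() + days * 24 * 60 * 60 * 1000);
  return {
    TableName: 'tasks',
    IndexName: 'DueDateIndex',
    KeyConditionExpression: 'userId = :uid AND dueDate BETWEEN :from AND :to',
    ExpressionAttributeValues: {
      ':uid': userId,
      ':from': toIsoDate(today),
      ':to': toIsoDate(until),
    },
  };
}

const dueSoon = tasksDueSoonInput('user-123', new Date('2026-03-29T00:00:00Z'), 7);
```

One design consequence to be aware of: an attribute used as an index key must be a real string on every item that has it. With an index like this, tasks without a due date should omit dueDate entirely rather than store null, and they simply won't appear in the index.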

DynamoDB vs. the Alternatives

| Database | Best For | Query Flexibility | Management |
| --- | --- | --- | --- |
| DynamoDB | AWS apps, massive scale, serverless | Low (designed access patterns only) | Zero (fully managed) |
| PostgreSQL | Relational data, complex queries, JOINs | Very high (query anything, any way) | Medium (managed options available) |
| MongoDB | Document data, flexible schema, cross-cloud | High (query any field) | Low (Atlas managed service) |
| Redis | Caching, sessions, real-time data | Very low (key-based only) | Low (managed options available) |

What AI Gets Wrong About DynamoDB

1. Designing the Schema Without Knowing the Queries

This is the biggest, most consequential mistake. AI will happily create a DynamoDB table with partition key userId and sort key taskId — then later, when you ask for "get all tasks due this week across all users," it will write a Scan operation that reads your entire table.

In PostgreSQL, you can add a WHERE dueDate BETWEEN x AND y clause any time. In DynamoDB, if you didn't plan for that query in your key design, every run pays for a full table scan until you add a GSI for it or migrate your data.

Fix: Before generating DynamoDB code, give AI your complete list of query patterns:

Better Prompt

"Before designing the DynamoDB table, here are ALL the ways I need to query this data: 1) Get all tasks for a user, 2) Get a specific task by ID, 3) Get all pending tasks for a user, 4) Get tasks due in the next 7 days for a user. Design the primary keys and GSIs to support all of these efficiently WITHOUT using Scan."

2. Using Scan Instead of Query

A Scan reads every single item in the table and filters afterward. It's the DynamoDB equivalent of reading an entire book to find one sentence. AI generates Scan operations surprisingly often — especially when you add new query requirements after the initial design.

// ❌ AI sometimes generates this (reads the ENTIRE table, then filters).
// ScanCommand is imported from '@aws-sdk/lib-dynamodb' like the other commands.
const scanned = await docClient.send(new ScanCommand({
  TableName: TABLE_NAME,
  FilterExpression: 'userId = :uid AND #s = :status',
  ExpressionAttributeNames: { '#s': 'status' },
  ExpressionAttributeValues: { ':uid': userId, ':status': 'pending' },
}));

// ✅ Use Query on the StatusIndex GSI (reads only this user's pending tasks):
const queried = await docClient.send(new QueryCommand({
  TableName: TABLE_NAME,
  IndexName: 'StatusIndex',
  KeyConditionExpression: 'userId = :uid AND #s = :status',
  ExpressionAttributeNames: { '#s': 'status' },
  ExpressionAttributeValues: { ':uid': userId, ':status': 'pending' },
}));

Rule: If you see ScanCommand in AI-generated DynamoDB code, ask AI: "Can this be replaced with a Query using the partition key or a GSI?" The answer is almost always yes.

3. Forgetting Reserved Words in Expressions

DynamoDB has over 500 reserved words — including common ones like status, name, type, size, count, month, and year. If you use one as an attribute name in an expression, DynamoDB throws a cryptic validation error. AI forgets to handle this surprisingly often.

// ❌ This throws: "Invalid UpdateExpression: Attribute name is a reserved keyword; reserved keyword: status"
UpdateExpression: 'SET status = :s'

// ✅ Use ExpressionAttributeNames to alias the reserved word:
UpdateExpression: 'SET #s = :s',
ExpressionAttributeNames: { '#s': 'status' },
ExpressionAttributeValues: { ':s': 'complete' }
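One way to stop thinking about the reserved-word list at all is to alias every attribute name automatically. Here's a minimal sketch of a helper that does this. The function name and the #n0 / :v0 naming scheme are made up for illustration. It returns plain fields you could spread into an UpdateCommand input.

```typescript
// Build a SET expression from a plain object, aliasing every attribute name.
// Names are generated only for attributes that are present, so no alias is
// declared without being used in the expression.
function buildSetExpression(updates: Record<string, unknown>) {
  const names: Record<string, string> = {};
  const values: Record<string, unknown> = {};
  const parts: string[] = [];

  Object.keys(updates).forEach((attr, i) => {
    if (updates[attr] === undefined) return; // skip fields the caller left unset
    names[`#n${i}`] = attr;
    values[`:v${i}`] = updates[attr];
    parts.push(`#n${i} = :v${i}`);
  });

  if (parts.length === 0) throw new Error('No attributes to update');

  return {
    UpdateExpression: `SET ${parts.join(', ')}`,
    ExpressionAttributeNames: names,
    ExpressionAttributeValues: values,
  };
}

const built = buildSetExpression({ status: 'complete', title: undefined, name: 'Ship it' });
```

Here built.UpdateExpression is 'SET #n0 = :v0, #n2 = :v2', with #n0 mapped to status and #n2 mapped to name. The skipped title field produces no alias, which matters because DynamoDB also rejects aliases that are declared but never used.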

4. Not Handling Pagination

DynamoDB returns a maximum of 1 MB of data per Query or Scan. If you have more results, it returns a LastEvaluatedKey — a cursor you pass back to get the next page. AI-generated code frequently ignores this, meaning your app silently returns incomplete results once a user has enough data.

// ❌ Incomplete — stops at 1 MB or 1 page of results:
const result = await docClient.send(new QueryCommand({
  TableName: TABLE_NAME,
  KeyConditionExpression: 'userId = :uid',
  ExpressionAttributeValues: { ':uid': userId },
}));
return result.Items;

// ✅ Paginate through all results:
async function getAllUserTasks(userId: string) {
  const allItems: Record<string, unknown>[] = [];
  let lastKey: Record<string, unknown> | undefined;

  do {
    const result = await docClient.send(new QueryCommand({
      TableName: TABLE_NAME,
      KeyConditionExpression: 'userId = :uid',
      ExpressionAttributeValues: { ':uid': userId },
      ExclusiveStartKey: lastKey,
    }));

    allItems.push(...(result.Items || []));
    lastKey = result.LastEvaluatedKey as Record<string, unknown> | undefined;
  } while (lastKey);

  return allItems;
}

5. Building Multiple Tables When One Table Should Do

Coming from SQL, AI's instinct is to create separate tables for separate entities: a users table, a tasks table, a projects table. In DynamoDB, this is often the wrong approach.

Advanced DynamoDB users practice single-table design: storing multiple entity types in one table using a naming convention for keys (e.g., partition key is USER#user-123, sort key is TASK#task-456). This allows fetching a user and all their tasks in a single request, with no JOINs needed.

Single-table design has a steep learning curve, but it's how DynamoDB was designed to be used at scale. For your first project, multiple tables is fine — but know that "best practice DynamoDB" looks very different from what AI generates by default.
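To give a feel for what that key convention looks like in code, here's a small sketch. It assumes a single table named app with generic pk and sk key attributes. Those names, and the PROFILE sort key for the user's own record, are illustrative choices rather than anything DynamoDB requires.

```typescript
// Hypothetical key conventions for a single-table layout.
const userPk = (userId: string) => `USER#${userId}`;
const profileSk = () => 'PROFILE';
const taskSk = (taskId: string) => `TASK#${taskId}`;

// One Query on the user's partition returns the profile item and every task item.
function userWithTasksInput(userId: string) {
  return {
    TableName: 'app',
    KeyConditionExpression: 'pk = :pk',
    ExpressionAttributeValues: { ':pk': userPk(userId) },
  };
}

// Narrow the same partition to task items only by matching the sort key prefix.
function userTasksOnlyInput(userId: string) {
  return {
    TableName: 'app',
    KeyConditionExpression: 'pk = :pk AND begins_with(sk, :prefix)',
    ExpressionAttributeValues: { ':pk': userPk(userId), ':prefix': 'TASK#' },
  };
}

// Example items as they would be written under this convention.
const profileItem = { pk: userPk('user-123'), sk: profileSk(), name: 'Ada' };
const taskItem = { pk: userPk('user-123'), sk: taskSk('task-456'), title: 'Write docs' };
```

Both items share the partition key USER#user-123, which is what lets a single Query stand in for the JOIN you would write in SQL.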

How to Debug DynamoDB Issues

Problem: ValidationException — "Reserved word" Error

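// Error:
// ValidationException: Invalid UpdateExpression: Attribute name is a reserved
// keyword; reserved keyword: status

// Fix: alias attribute names that are (or might be) reserved words, and
// reference the aliases in the expression. Declare only the aliases you
// actually use: an unused entry triggers its own ValidationException.
UpdateExpression: 'SET #s = :s, #n = :n, #t = :t',
ExpressionAttributeNames: {
  '#s': 'status',
  '#n': 'name',
  '#t': 'type',
},

Debug Prompt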

"I'm getting a DynamoDB ValidationException about a reserved word. Here's my UpdateCommand: [paste]. Identify any reserved words in my attribute names and add ExpressionAttributeNames aliases for all of them."

Problem: ProvisionedThroughputExceededException

This means you're sending more reads or writes per second than your table's provisioned capacity allows. The request gets throttled (rejected with an error).

// If you're on provisioned mode and hitting this, either:
// 1. Switch to on-demand:
billingMode: dynamodb.BillingMode.PAY_PER_REQUEST

// 2. Or increase capacity units and enable auto-scaling:
readCapacity: 10,
writeCapacity: 5,
// + configure auto-scaling in CDK/console

// 3. Lean on retries with exponential backoff. AWS SDK v3 already retries
//    throttled requests (3 attempts by default); raise maxAttempts for more headroom:
const client = new DynamoDBClient({
  region: 'us-east-1',
  maxAttempts: 5,  // Total attempts, including the first call
});
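If you ever write your own retry loop (for example around a batch job), the delay calculation usually looks something like the sketch below. This is a generic "exponential backoff with full jitter" illustration, not the SDK's exact algorithm. The function name and default values are assumptions.

```typescript
// Delay before retry number `attempt` (0-based): grows exponentially, is capped,
// and is randomized so many throttled clients don't all retry at the same moment.
// `random` is injectable so the function can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 50,
  capMs: number = 5000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}

// With a fixed "random" value of 0.5, the delays are easy to follow:
const firstDelay = backoffDelayMs(0, 50, 5000, () => 0.5);   // 25
const fourthDelay = backoffDelayMs(3, 50, 5000, () => 0.5);  // 200
const cappedDelay = backoffDelayMs(20, 50, 5000, () => 0.5); // 2500
```

The cap keeps a long outage from producing minute-long waits, and the jitter spreads retries out instead of sending them back in synchronized waves.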

Problem: Items Missing From Query Results

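You write an item, immediately read it back, and it's not there. This is usually one of three things:

  • GSI eventual consistency: You wrote an item and immediately queried a GSI. GSIs take milliseconds to update. If you're querying a GSI right after writing, retry briefly in tests, or accept eventual consistency in your app logic.
  • Eventually consistent reads on the base table: Get and Query use eventually consistent reads by default, which can briefly miss a write that just succeeded. Pass ConsistentRead: true when you need to read your own write (it costs twice the read capacity and isn't available on GSIs).
  • Wrong key values: DynamoDB lookups are case-sensitive and exact. User-123 and user-123 are different partition keys. Check that the key you're querying with exactly matches the key you wrote.

Debug Prompt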

"My DynamoDB Query returns zero results, but I can see the item in the AWS console. Here's my PutCommand that wrote the item: [paste]. Here's my QueryCommand: [paste]. Compare the key values in both — are there any case mismatches, type mismatches (string vs number), or key attribute name differences?"

Problem: Unexpected AWS Costs From DynamoDB

DynamoDB bills can spike unexpectedly, almost always from one of three causes:

  1. Scan operations — A Scan on a large table bills for every item read, even items filtered out. Find Scans in your code and replace with Queries.
  2. Large items — DynamoDB bills per 4 KB for reads and per 1 KB for writes. Storing large JSON blobs (like entire AI responses) adds up. Consider storing large content in S3 and keeping only a reference in DynamoDB.
  3. Hot partitions — If all your traffic goes to items with the same partition key, DynamoDB can throttle that partition even if the overall table has capacity. Distribute your writes across partition keys where possible.
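To see how item size turns into cost, here's a small sketch of the rounding rules for a single-item read or write. The function names are made up. It covers the simple GetItem/PutItem case only; a Query adds up the sizes of all items it reads before rounding.

```typescript
const KB = 1024;

// Read units for reading one item: billed in 4 KB steps. An eventually
// consistent read (the default) costs half of a strongly consistent one.
function readUnits(itemBytes: number, stronglyConsistent: boolean = false): number {
  const steps = Math.max(1, Math.ceil(itemBytes / (4 * KB)));
  return stronglyConsistent ? steps : steps / 2;
}

// Write units for writing one item: billed in 1 KB steps.
function writeUnits(itemBytes: number): number {
  return Math.max(1, Math.ceil(itemBytes / KB));
}

// A 10 KB item: 3 read units (strong), 1.5 (eventual), 10 write units.
const strongRead = readUnits(10 * KB, true);
const eventualRead = readUnits(10 * KB);
const write = writeUnits(10 * KB);
```

A 300 KB blob costs 300 write units every time you save it, versus 1 for a small item that just points at an object in S3. That is the reasoning behind the advice in item 2 above.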

Frequently Asked Questions

What is DynamoDB in simple terms?

DynamoDB is Amazon's fully managed NoSQL database. You don't install it, configure servers, or worry about scaling. You just create a table, put data in, and AWS handles everything else. It stores data as flexible JSON-like documents instead of rows and columns, scales automatically from one user to millions without any code changes, and (in on-demand mode) charges per request rather than for a running server. Think of it as a giant, massively scalable dictionary in the cloud: you look things up by key, and key lookups typically come back in single-digit milliseconds no matter how much traffic you're getting. That latency is a design target, not a contractual guarantee.

When should I use DynamoDB instead of PostgreSQL?

Use DynamoDB when you need massive scale with predictable, low-latency reads and writes; when you're building entirely on AWS and want managed, serverless infrastructure; or when your access patterns are simple and well-defined in advance. Use PostgreSQL when you need complex queries with JOINs across multiple tables, when your data is highly relational, when you don't yet know all your query patterns, or when you're not committed to AWS. For most first projects, PostgreSQL is the better starting point — it's more flexible, better documented for beginners, and doesn't punish you for not knowing your access patterns upfront. Use DynamoDB when scale or the AWS ecosystem specifically calls for it.

What is a partition key in DynamoDB?

A partition key is the primary identifier for an item in DynamoDB — and the only way to efficiently look up data. DynamoDB uses the partition key to decide which physical server stores your data. This is how it achieves single-digit millisecond latency at massive scale: it always knows exactly where your data is. The trade-off is that you can only query efficiently by partition key (or by sort key within a partition, or via a Global Secondary Index). If you try to search by any other attribute — like "find all users in California" — DynamoDB scans every item in the table, which is slow and expensive. This is the biggest design difference from SQL databases, where you can add indexes to any column at any time.

What does 'on-demand' vs 'provisioned' mean in DynamoDB?

On-demand mode means you pay per request — each read or write costs a fraction of a cent, and DynamoDB scales automatically to handle any traffic level. Provisioned mode means you specify upfront how many reads and writes per second you expect (called Read Capacity Units and Write Capacity Units), and you pay for that reservation whether you use it or not. On-demand is simpler and better for projects with unpredictable traffic, MVPs, or low usage — you can't over-provision and waste money. Provisioned is cheaper if you have consistent, high-volume traffic and can accurately predict your load. For vibe-coded projects, always start with on-demand. You can switch to provisioned later if DynamoDB costs become significant and you understand your traffic patterns.

Does AI know how to use DynamoDB correctly?

AI can generate working DynamoDB code, but it frequently makes costly design mistakes. The most dangerous: designing the schema without knowing all your query patterns first — in DynamoDB, you can't efficiently add new query patterns later without migrating data or adding GSIs. AI also tends to generate Scan operations instead of Query (Scan reads your entire table and is expensive at scale), miss reserved word conflicts in expressions, skip pagination handling, and default to a multi-table design when single-table design would be more efficient. The fix: always tell AI your complete list of access patterns before schema design, and always ask "does this use Scan anywhere, and can we replace it with a Query?"