TL;DR: Serverless is great for simple APIs, webhooks, and bursty traffic — zero cost when idle, zero servers to manage. Containers are the right call when you need WebSockets, long-running tasks, or a persistent process. If you're not sure, start serverless and switch to containers when you hit a wall.

The Scenario That Breaks Everything

You're building a side project: an AI writing assistant. Users paste in a draft, click "Improve," and your app calls the OpenAI API to rewrite it. You ask Claude Code to wire up the backend and it scaffolds a clean Express server with a single /api/improve route. Then you ask how to deploy it.

AI gives you two options almost simultaneously. Option A: deploy to Vercel — just push to GitHub and it's live, free tier, zero config. Option B: add a Dockerfile and push to Railway — $5/month, full control.

You pick Vercel because it's free. It works perfectly for a week. Then you add a feature: a long-running document analysis that takes 45 seconds. Users click the button, wait, and get a timeout error. Vercel Functions have a 10-second execution limit on the free tier. The same code that ran fine locally now fails in production every time.

This is the serverless vs containers question in the real world. It's not academic. Pick the wrong one and you'll hit invisible walls. Pick the right one and you won't even think about infrastructure.

The Mental Model: Pop-Up Stalls vs. Restaurants

Think of serverless like a pop-up food stall at a market. When a customer walks up, the vendor spins into action — takes the order, cooks it, hands it over. When there are no customers, nothing is running. There's no rent, no electricity bill ticking over, no staff standing around waiting. If 500 customers show up at once, the market somehow spawns 500 stalls instantly. You pay only when food is being made.

Think of containers like a restaurant with a kitchen that runs 24/7. The lights are on whether customers are there or not. You pay for the building and the staff regardless of traffic. But the kitchen can do things the pop-up stall can't: keep a soup simmering for hours, take reservations, run a persistent tab for regulars, handle complex multi-course meals. The ongoing cost buys you capabilities.

Neither is universally better. A pop-up stall is the wrong choice if you need a reservation system. A restaurant is overkill if you're selling three coffees a day.

What Is Serverless (and What AI Deploys You To)

When AI deploys your app to Vercel, it's using serverless functions. Each API route becomes an independent function. When a request hits /api/users, the cloud provider boots up a tiny environment, runs your handler function, returns the response, and shuts it down. No persistent server. No always-on process.

The main serverless platforms you'll encounter:

  • Vercel Functions — the default for Next.js, Nuxt, and SvelteKit apps. Your api/ folder becomes serverless functions automatically. Free tier with generous limits.
  • AWS Lambda — the original serverless platform, still the industry workhorse for event-driven backends. AI generates Lambda functions for things like Stripe webhooks, image resizing on upload, and scheduled jobs.
  • Cloudflare Workers — runs at the edge (close to your users), starts up in milliseconds with virtually no cold starts. Great for lightweight logic that needs to be fast globally.

What they all share: you write a function, the platform handles everything else. No server to SSH into, no Docker image to build, no capacity to manage. They scale automatically — from zero requests to a million — and you pay per execution.
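The whole programming model fits in one file. Here's a minimal sketch in the handler shape Vercel expects — the route path and response body are illustrative, not from any real app:

```javascript
// api/hello.js — once deployed, this file becomes the /api/hello endpoint
export default function handler(req, res) {
  const name = req.query.name || 'world';
  res.status(200).json({ message: `Hello, ${name}` });
}
```

No server setup, no port, no process manager. You export a function; the platform invokes it once per request and tears the environment down afterward.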

What Are Containers (and When AI Reaches for Docker)

A container is a packaged-up version of your app — code, dependencies, runtime, config — that runs identically anywhere. Docker is the tool that creates containers. Docker doesn't care if you're on a Mac, a Linux server, or a cloud VM. The container runs the same way everywhere.

When AI writes you a Dockerfile, it's creating a blueprint for building that container image. When you push to Railway or Fly.io, those platforms build the image and run it as a persistent process on a real server.

Unlike serverless, a containerized app is always running. Your Express server starts, opens a port, and waits for requests indefinitely. It maintains state between requests. It can open a WebSocket connection and keep it alive. It can run a background worker that processes a queue every 30 seconds.

The tradeoff: you pay for that always-on process even when no one is using your app. A Railway container starts at around $5/month for a small instance. It never goes to zero.
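That always-on behavior is visible in code. A bare node:http sketch of a persistent process — note the module-level counter, which only works because the process never exits between requests (a serverless function could never rely on state like this surviving):

```javascript
// server.js — starts once, keeps listening, keeps state between requests
import { createServer } from 'node:http';

const startedAt = Date.now(); // set once at boot, shared by every request
let requestCount = 0;         // accumulates for the life of the process

const server = createServer((req, res) => {
  requestCount += 1;
  res.writeHead(200, { 'content-type': 'application/json' });
  res.end(JSON.stringify({ requestCount, uptimeMs: Date.now() - startedAt }));
});

server.listen(process.env.PORT || 3000, () => {
  console.log(`Listening on port ${server.address().port}`);
});
```

Every response carries the running total — exactly the kind of in-process state (connections, caches, counters) that the always-on cost buys you.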

Cold Starts Explained Without the Jargon

This is the concept that confuses most people about serverless, and it's actually simple once you see the pop-up stall analogy play out.

Remember the pop-up stall? When the first customer arrives, there's setup time. The vendor has to unpack equipment, fire up the grill, get organized. This setup — before the first burger gets made — is a cold start.

In serverless terms: the cloud provider has to boot up a new environment for your function, load your code and dependencies, and initialize everything before handling the request. This adds latency — anywhere from 200 milliseconds to 2 full seconds depending on the platform and your code size.

After that first request, the environment stays "warm" for a few minutes. Subsequent requests hit the already-running environment and are fast. The cold start only happens again after a period of inactivity.

When cold starts actually matter: real-time APIs where an extra second of latency is noticeable, WebSocket handshakes, and payment webhooks where the provider enforces a short response deadline. Most CRUD APIs and webhook handlers tolerate cold starts just fine — users rarely notice a half-second delay.

Cloudflare Workers have near-zero cold starts because they use a different execution model (V8 isolates instead of full Node.js environments). If cold starts are genuinely a problem for your app, Workers is worth looking at.
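You can watch this happen in your own logs. Module scope runs once per environment boot, so a module-level counter distinguishes cold from warm invocations — the handler shape here is Vercel-style, and the coldStart field is purely illustrative:

```javascript
// Module scope runs once per cold start — expensive setup (clients,
// config parsing) belongs here, not inside the handler
const bootedAt = Date.now();
let invocations = 0;

export default function handler(req, res) {
  invocations += 1;
  res.status(200).json({
    coldStart: invocations === 1,  // true only on the first request after boot
    instanceAgeMs: Date.now() - bootedAt,
  });
}
```

Deploy this, hit it twice in a row, then hit it again after a long idle gap: the first and last responses report coldStart: true because they landed on freshly booted environments.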

What AI Generates for Each Approach

Prompt I Would Type

Build a webhook handler for Stripe payment events — process
successful payments and update the user's subscription in
my database. Deploy-ready for Vercel.

For a webhook handler, AI will reach for serverless immediately. Here's a typical Vercel Function it generates:

Serverless: Vercel Function for a Stripe Webhook

// api/webhooks/stripe.js
import Stripe from 'stripe';

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY);

// Stripe signature verification needs the raw bytes, so turn off
// the automatic JSON body parsing for this route (Next.js-style API route)
export const config = { api: { bodyParser: false } };

async function readRawBody(req) {
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);
  return Buffer.concat(chunks);
}

export default async function handler(req, res) {
  if (req.method !== 'POST') {
    return res.status(405).json({ error: 'Method not allowed' });
  }

  // Verify the webhook signature against the raw request body
  const sig = req.headers['stripe-signature'];
  let event;

  try {
    event = stripe.webhooks.constructEvent(
      await readRawBody(req),
      sig,
      process.env.STRIPE_WEBHOOK_SECRET
    );
  } catch (err) {
    return res.status(400).json({ error: `Webhook signature failed: ${err.message}` });
  }

  // Handle the event
  switch (event.type) {
    case 'checkout.session.completed': {
      const session = event.data.object;
      await updateUserSubscription(session.customer_email, 'active');
      break;
    }

    case 'customer.subscription.deleted': {
      const subscription = event.data.object;
      await updateUserSubscription(subscription.customer, 'cancelled');
      break;
    }

    default:
      console.log(`Unhandled event type: ${event.type}`);
  }

  res.json({ received: true });
}

async function updateUserSubscription(identifier, status) {
  // Database update logic here
}

This is serverless at its best. The function wakes up when Stripe sends a POST, does its work (under 10 seconds easily), and goes back to sleep. Cost: essentially zero. Infrastructure management: zero. This is exactly the right tool for this job.

Prompt I Would Type

Build a real-time collaborative editing app — users should
see each other's changes live. Deploy to Railway.

For WebSockets and real-time features, AI pivots to containers. Here's what it generates:

Container: WebSocket Server on Railway

// server.js
import express from 'express';
import { createServer } from 'http';
import { WebSocketServer } from 'ws';

const app = express();
const server = createServer(app);
const wss = new WebSocketServer({ server });

// Track all connected clients and their document rooms
const rooms = new Map();

wss.on('connection', (ws, req) => {
  const roomId = new URL(req.url, 'ws://base').searchParams.get('room');

  // Add client to room
  if (!rooms.has(roomId)) rooms.set(roomId, new Set());
  rooms.get(roomId).add(ws);

  ws.on('message', (data) => {
    let message;
    try {
      message = JSON.parse(data);
    } catch {
      return; // ignore malformed frames instead of crashing the process
    }

    // Broadcast change to all other clients in the same room
    rooms.get(roomId)?.forEach((client) => {
      if (client !== ws && client.readyState === 1) { // 1 === WebSocket.OPEN
        client.send(JSON.stringify(message));
      }
    });
  });

  ws.on('close', () => {
    rooms.get(roomId)?.delete(ws);
    if (rooms.get(roomId)?.size === 0) rooms.delete(roomId);
  });
});

server.listen(process.env.PORT || 3000, () => {
  console.log('WebSocket server running');
});

# Dockerfile
FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
RUN npm ci --omit=dev
COPY . .
EXPOSE 3000
CMD ["node", "server.js"]

This server needs to stay alive — it maintains persistent WebSocket connections. A serverless function that shuts down after each request would drop every live connection. Containers are not optional here.

When Serverless Breaks (Know These Before You Ship)

1. Long-Running Tasks

Serverless functions have hard execution limits. Vercel free tier: 10 seconds. Vercel Pro: 60 seconds. AWS Lambda: up to 15 minutes (but you pay for every second). Cloudflare Workers: 10ms of CPU time on the free plan, up to 30 seconds on paid plans.

If your AI build calls an LLM, processes a video, generates a PDF report, or does anything that might run longer than your platform's limit — serverless will time out. You'll see an error like: Error: Function execution timeout. The fix is a container, a background job queue, or breaking the work into smaller chunks.

2. WebSockets and Persistent Connections

WebSockets require a persistent, open connection between the client and server. Serverless functions can't maintain this — they terminate after handling a request. If you need live updates, real-time collaboration, or a chat feature, you need a container (or a managed WebSocket service like Pusher or Ably that runs the persistent layer for you).

3. Large Payloads

Vercel Functions have a 4.5MB request body limit. AWS Lambda has a 6MB payload limit for synchronous invocations. If users upload large files directly to your function endpoint, anything over the limit gets rejected before your code ever runs — often as an opaque 413 error. The fix: upload directly to S3 or Cloudflare R2 using presigned URLs, then trigger processing separately.
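The presigned-URL pattern in a nutshell: your function signs a short-lived upload URL, the browser sends the file straight to storage, and your API never touches the bytes. Here's a self-contained sketch of the signing step using a plain HMAC — illustrative only; with S3 or R2 you'd use the SDK's own presigner (e.g. getSignedUrl from @aws-sdk/s3-request-presigner), and the upload host and secret below are made up:

```javascript
import { createHmac } from 'node:crypto';

// Hypothetical signing secret — in a real app this is a storage credential
const SIGNING_SECRET = process.env.SIGNING_SECRET || 'dev-only-secret';

// Returns a URL the client can upload to until `expires` passes
function presignUpload(key, expiresInSeconds = 300) {
  const expires = Math.floor(Date.now() / 1000) + expiresInSeconds;
  const sig = createHmac('sha256', SIGNING_SECRET)
    .update(`${key}:${expires}`)
    .digest('hex');
  return `https://uploads.example.com/${encodeURIComponent(key)}?expires=${expires}&sig=${sig}`;
}
```

The storage service verifies the signature and expiry on its side, so the 100MB video never flows through your 4.5MB-limited function.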

4. Database Connection Pooling

This is the subtle one that bites people at scale. Traditional databases like PostgreSQL allow a limited number of simultaneous connections — usually 100 by default. A container holds a persistent connection pool. A serverless function creates a new connection on every cold start.

Under heavy traffic, hundreds of serverless functions fire simultaneously, each trying to open a new database connection. You hit your connection limit. Queries start failing. The fix: use a connection pooler like PgBouncer, Neon's built-in pooling, or PlanetScale's serverless-native driver that handles this automatically.
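In code, the usual mitigation is a lazily created module-scope pool, so every warm invocation reuses one pool instead of opening fresh connections. A sketch with a stand-in pool — createPool() here fakes what something like new pg.Pool({ max: 5 }) would give you:

```javascript
// Module scope survives between warm invocations — put the pool here,
// never inside the handler.
let pool;

function createPool() {
  // Stand-in for a real client, e.g. new pg.Pool({ max: 5 })
  return { query: async (sql) => `ran: ${sql}` };
}

function getPool() {
  if (!pool) pool = createPool(); // only a cold start pays this cost
  return pool;
}

export default async function handler(req, res) {
  const db = getPool(); // warm invocations all share the same pool
  res.status(200).json({ result: await db.query('SELECT 1') });
}
```

This caps connections per warm instance, but hundreds of simultaneous cold starts still mean hundreds of pools — which is why a pooler like PgBouncer or a serverless-native driver is the real fix at scale.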

5. Background Workers

Serverless functions are request-response: something calls them, they respond, they stop. If you need a process that runs every minute to check a queue, sync data, or send scheduled emails — you need a persistent process. That's either a container with a cron job, or a dedicated service like AWS EventBridge triggering a Lambda on a schedule. Either way, you'll want monitoring and logging set up so you know when those background jobs fail.
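As a sketch, the container version of a background worker is just a loop that outlives any single request. processQueue is a hypothetical stand-in for your real job logic, and the .unref() exists only so this demo can exit cleanly — a real worker omits it:

```javascript
// worker.js — only viable in a container, because the process must stay alive
async function processQueue() {
  // Stand-in: pull pending jobs (emails, syncs) from a queue and handle them
  const jobsHandled = 0;
  return jobsHandled;
}

// Tick every 30 seconds for as long as the container runs
const timer = setInterval(() => {
  processQueue().catch((err) => console.error('worker tick failed:', err));
}, 30_000);

timer.unref(); // demo-only: lets this sketch exit; a real worker keeps running
```

A serverless function has no equivalent of that setInterval — nothing executes between invocations, which is why scheduled triggers (cron, EventBridge) exist as a separate service.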

When Containers Are Overkill

The flip side is real: AI sometimes reaches for Docker when a Vercel deployment would have been free and simpler.

Simple REST APIs and CRUD Apps

If your app is just hitting a database and returning JSON — creating users, reading posts, updating settings — serverless handles this perfectly. You don't need a persistent server. Each request is stateless. Vercel or any serverless platform will work, cost less, and require zero maintenance.

Webhook Handlers

Stripe events, GitHub webhooks, Twilio callbacks — these are textbook serverless workloads. They're short, stateless, and happen at unpredictable intervals. A container sitting idle 23 hours a day waiting for webhooks is wasted money.

Low-Traffic Side Projects

If your app gets 50 requests a day, a $5/month Railway container costs $60/year. The equivalent serverless cost on Vercel: $0, covered by the free tier. For personal projects and MVPs, serverless almost always wins on cost until you have real traffic.

Static Sites with a Thin API Layer

A marketing site, a blog, a portfolio with a contact form — these don't need containers. Vercel, Netlify, or Cloudflare Pages handle the static files from a CDN, and a few serverless functions cover the dynamic bits (form submission, newsletter signup). No server, no Docker, no bill.

Cost Comparison: The Numbers That Actually Matter

Here's what the cost curves look like in practice, not theory:

At Zero to Low Traffic (0–10k requests/month)

  • Vercel Functions (free tier): $0. Includes 100GB-hours of function execution.
  • Cloudflare Workers (free tier): $0. 100,000 requests/day free.
  • AWS Lambda (free tier): $0. 1 million requests/month free, forever.
  • Railway container: ~$5/month minimum. You pay whether you get requests or not.
  • Fly.io (shared CPU): ~$2–5/month for a small persistent instance.

Winner at low traffic: Serverless. It's not close.

At Medium Traffic (100k–1M requests/month)

Serverless starts costing money once you exceed free tiers. Vercel Pro is $20/month and covers most apps comfortably. AWS Lambda at 1 million requests with 512MB memory and 1-second average duration costs around $5–10/month. A Railway container handles this volume easily on a $10–20/month plan.

At this scale, costs are comparable. Pick based on features, not price.

At High Traffic (10M+ requests/month)

This is where containers often win. A single Railway or Fly.io container on a $50/month plan might handle 10M simple requests. The equivalent serverless bill depends heavily on execution duration — a Lambda function averaging 500ms costs around $50/month at 10M requests. But if your functions are memory-hungry or slow, serverless gets expensive fast.

At scale, profile your actual usage before assuming either is cheaper.
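If you want to sanity-check these numbers yourself, Lambda's pricing model is simple enough to put in a function. The rates below are the published us-east-1 list prices at the time of writing — verify current pricing before relying on them:

```javascript
// $0.20 per 1M requests + $0.0000166667 per GB-second of compute
function lambdaMonthlyCost(requests, avgDurationSec, memoryGb) {
  const requestCost = (requests / 1_000_000) * 0.20;
  const computeCost = requests * avgDurationSec * memoryGb * 0.0000166667;
  return requestCost + computeCost;
}

// The 10M-request example above: 500ms average duration at 512MB
console.log(lambdaMonthlyCost(10_000_000, 0.5, 0.5).toFixed(2)); // → "43.67"
```

The free tier (1M requests and 400,000 GB-seconds per month) knocks a few dollars off that; the point is the shape of the curve — cost scales with duration × memory, so slow or memory-hungry functions blow the budget first.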

What AI Gets Wrong About This Decision

1. Always Reaching for Docker

Many AI models default to Docker for every backend because it's the "traditional" answer. They'll write you a Dockerfile for a simple three-route API that absolutely doesn't need one. Before accepting a Docker-based deployment, ask: "Does this app need persistent connections, long-running tasks, or background workers?" If the answer is no, push back and ask for a Vercel or Cloudflare Workers deployment instead.

2. Not Setting the Right Environment Variables

AI generates deployment configs that assume your environment variables are available but doesn't always tell you where to set them. On Vercel, environment variables go in the dashboard under Project Settings. On Railway, they go in the service's Variables tab. A function that works locally but returns 500 in production is almost always a missing environment variable.

3. Ignoring the Execution Timeout

AI will generate serverless functions that call external APIs, do heavy computation, or process large datasets — without checking if the execution will fit within the platform's time limit. Always ask: "How long will this function take to run? Does it stay under the platform's timeout?" If you're calling OpenAI or Claude with a long prompt, account for 10–30 seconds of API response time.

4. Forgetting Raw Body for Webhooks

Stripe signature verification requires the raw request body — the exact bytes Stripe sent, before any JSON parsing. Vercel and Express both parse JSON by default, which breaks signature verification. AI frequently generates webhook handlers that fail with "Webhook signature verification failed" because of this. The fix:

// api/webhooks/stripe.js — Next.js-style API route on Vercel:
// disable the automatic body parsing for this route
export const config = { api: { bodyParser: false } };

// Then read the raw bytes off the request stream yourself and pass
// them to stripe.webhooks.constructEvent() instead of req.body
async function readRawBody(req) {
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);
  return Buffer.concat(chunks);
}

5. Using Serverless for WebSocket Chat Without Warning You

This is the most painful one. AI will sometimes scaffold a WebSocket server and deploy it to Vercel without telling you that Vercel doesn't support persistent WebSocket connections in serverless functions. The code looks right, it deploys, and then WebSocket connections immediately drop. The symptom: client connects, server logs the connection, 30 seconds later the connection closes with no activity. Use Railway, Fly.io, or a managed WebSocket service for anything requiring persistent connections.

The Decision Framework for Vibe Coders

Run through these questions in order. Stop at the first "yes."

Question 1: Do you need WebSockets or Server-Sent Events?
Yes → Container (Railway, Fly.io). Serverless can't maintain persistent connections.

Question 2: Will any single operation take longer than 30 seconds?
Yes → Container, or redesign with a queue + background worker.

Question 3: Do you need to run a background process continuously?
Yes → Container. Serverless doesn't run between requests.

Question 4: Are you processing files larger than 5MB in your API?
Yes → Either a container, or upload directly to object storage (S3/R2) from the client.

Question 5: Do you need more than 100 simultaneous database connections?
Yes → Container with a connection pool, or a serverless-native database (Neon, PlanetScale).

None of the above? → Start with serverless.
Vercel for Next.js/Node apps. Cloudflare Workers for edge speed. AWS Lambda for fine-grained AWS integration.
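If it helps to see the checklist as code, here it is as a function that stops at the first match — the field names are made up for this sketch:

```javascript
// Mirrors the five questions above, in order — first "yes" wins
function chooseDeployment(needs) {
  if (needs.websockets) return 'container';
  if (needs.maxOperationSeconds > 30) return 'container, or queue + background worker';
  if (needs.continuousBackgroundWork) return 'container';
  if (needs.maxUploadMb > 5) return 'container, or presigned upload to S3/R2';
  if (needs.dbConnections > 100) return 'container with a pool, or serverless-native DB';
  return 'serverless';
}

console.log(chooseDeployment({ websockets: true }));       // → "container"
console.log(chooseDeployment({ maxOperationSeconds: 5 })); // → "serverless"
```

The ordering matters: WebSockets and long-running work are hard blockers for serverless, while the later questions have serverless-compatible workarounds.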

How to Debug Deployment Issues with AI Tools

In Cursor

When your serverless function works locally but fails in production, paste the error log and ask: "My Vercel function returns 500 in production but works fine locally. The error log shows [paste error]. Check for environment variables I might be missing, execution timeout issues, and any imports that might not be available in the serverless environment."

Cursor is especially useful for spotting code that accidentally imports Node.js built-ins (like fs or path) in a way that breaks on Cloudflare Workers, which runs a limited JavaScript environment.

In Windsurf

Use Cascade to analyze your entire deployment config at once: "Review my project's deployment setup — check vercel.json, package.json, and my API routes. Flag any functions that might timeout, any WebSocket usage that won't work in serverless, and any missing environment variable references." Windsurf's multi-file context makes it good at catching cross-file issues that a single-file chat session would miss.

In Claude Code

For choosing between serverless and containers, describe what your app needs to do and ask directly: "My app needs to do X, Y, and Z. Should I deploy to Vercel Functions or use a container on Railway? Walk me through the tradeoffs for my specific use case." Claude Code reasons well about architectural decisions when you give it concrete requirements rather than abstract questions.

For debugging cold start issues specifically: "My Lambda function is slow on the first request. Walk me through how to reduce cold start time — check bundle size, unnecessary imports, and whether I should use provisioned concurrency."

The Quick Debugging Checklist

When a deployment is failing and you don't know why, run through this:

# Check 1: Environment variables
# Are all the env vars from your .env file set in the platform dashboard?
# Vercel: Project Settings → Environment Variables
# Railway: Service → Variables

# Check 2: Execution timeout
# Add a timer log to see how long your function actually takes
const start = Date.now();
// ... your logic ...
console.log(`Execution time: ${Date.now() - start}ms`);

# Check 3: Bundle size (serverless cold starts)
# Large dependencies = slower cold starts
# Check your deployment bundle size in Vercel's function inspector

# Check 4: Platform-specific limits
# Vercel: 4.5MB request body max
# Lambda: 6MB payload, 15min timeout max
# Cloudflare Workers: 128MB memory, no Node.js built-ins

Platform Quick Reference

When AI mentions a platform, here's what you're actually getting:

  • Vercel — Serverless functions + CDN for frontend. Best for Next.js. Free tier is generous. Use for: REST APIs, webhooks, SSR, static sites.
  • AWS Lambda — Serverless functions in the AWS ecosystem. Steeper setup, but integrates with everything AWS. Use for: event-driven processing, scheduled jobs, Stripe/payment webhooks.
  • Cloudflare Workers — Serverless at the edge with near-zero cold starts. Limited to Web APIs (no Node.js built-ins). Use for: globally fast APIs, auth middleware, A/B testing.
  • Railway — Containers with a Heroku-like experience. Push code, it runs. Starts at ~$5/month. Use for: any app that needs WebSockets, background workers, or a persistent server.
  • Fly.io — Containers deployed globally close to users. More control than Railway, more setup required. Use for: latency-sensitive apps, multi-region deployments, apps that need to run near specific users.


Frequently Asked Questions

What's the difference between serverless and containers?

Serverless functions (like AWS Lambda or Vercel Functions) run your code on demand and scale to zero when not in use — you pay only for execution time. Containers (like Docker on Railway or Fly.io) run your app continuously on a persistent server you control. Serverless is cheaper at low traffic; containers are more predictable at scale and support long-running processes.

What is a cold start?

A cold start happens when a serverless function hasn't been called recently and the cloud provider needs to spin up a fresh environment before running your code. This adds 200ms–2 seconds of latency to that first request. Subsequent requests in a short window hit a "warm" instance and are fast. Cold starts are rarely a real problem for most apps, but they matter for real-time APIs and WebSocket connections.

When should I use serverless, and when should I switch to containers?

Use serverless for webhook handlers, simple REST APIs, scheduled jobs, form submissions, and anything that handles bursty or unpredictable traffic. Serverless shines when you want zero infrastructure management and low cost at low traffic. Switch to containers when you need WebSockets, long-running background jobs, large file processing, or a database connection pool.

Is Vercel serverless?

Vercel runs your API routes and backend logic as serverless functions (similar to AWS Lambda). Your frontend assets are served from a CDN. Each API route is an independent function that scales automatically and costs nothing when idle. Vercel does not run persistent container processes by default.

What is Railway and when should I use it?

Railway runs your app in containers on persistent servers, similar to Heroku. Use Railway when you need WebSockets, long-running tasks, background workers, a built-in database, or any feature that serverless can't support. Railway starts at around $5/month for a small container and scales up from there.