TL;DR: A background job is any task your app runs outside a normal web request — sending emails, resizing images, generating PDFs, syncing data. You need them when a task takes longer than a web request can live (roughly 10–30 seconds on most platforms). The three patterns are queues (triggered by events), cron jobs (run on a schedule), and webhooks (triggered by external services). The easiest tool to start with is Inngest — it works without extra infrastructure and AI generates its code well.

Why AI Coders Need This

When you first build with AI, every feature feels like a single request. User clicks button → server does thing → user sees result. That model works great for most of your app. Then you hit a feature that breaks it.

You ask AI to add profile picture uploads with automatic resizing. It generates the code. You test it locally — it works. You deploy. Users upload large images and the page just spins until the browser times out. Vercel logs show: Function execution timed out after 10000ms.

Or you add a "send invoice" button. Works fine in dev. In production, the email service takes 3–4 seconds to respond, the PDF generator takes another 8 seconds, and you're already past Vercel's default function limit before anything reaches the user.

Or you want to send a daily digest email to your users every morning. There's no "daily" in a request/response cycle — there's no user action to trigger it.

These are all background job problems. The work is real and needs to happen — it just can't happen inline with the user's click. Understanding background jobs is the mental model shift that unlocks a whole class of app features.

Real-World Scenario: The Image Upload

Let's walk through exactly what goes wrong and why background jobs fix it. This scenario is one of the most common places vibe coders first need this concept.

What the user does

Chuck uploads a profile picture. He's on a decent connection. His image file is 4 MB — a DSLR photo he took at a barbecue. Your app needs to:

  • Accept the upload and store the original
  • Resize it to three sizes: 512×512 for profile display, 128×128 for avatars, and 64×64 for tiny thumbnail contexts
  • Optimize each for web (convert to WebP, compress, strip metadata)
  • Update the database with the new image URLs
  • Optionally: run it through a content moderation API to check for inappropriate images

What happens if you do it inline

The request comes in. Your server starts processing. Image manipulation is CPU-intensive. A Node.js server using Sharp to resize one 4 MB image might take 2–5 seconds. Three sizes, plus a WebP conversion, plus an external API call to a moderation service — you're looking at 10–20 seconds easily. On Vercel's Hobby plan, serverless functions have a 10-second timeout. On Pro, you get 60 seconds. Either way, if this happens inside the web request, you're playing with fire.

Even if you don't time out, holding the HTTP connection open for 15 seconds is terrible UX. Chuck is sitting there watching a spinner, wondering if the upload worked, considering clicking the button again.

What background jobs let you do instead

With a background job pattern:

  1. Chuck's browser uploads the file to cloud storage (S3, Cloudflare R2, Supabase Storage)
  2. Your API route receives a notification that the upload completed — this takes milliseconds
  3. Your API route puts a "process this image" job into a queue and immediately responds to Chuck: "Upload received, your profile picture will be ready shortly."
  4. A worker (a separate process, or a serverless function triggered by the queue) picks up the job and does the slow work: resize, optimize, call the moderation API
  5. When it's done, the worker updates the database and optionally notifies Chuck via a websocket or the next time he refreshes

Chuck got an instant response. The slow work happened in the background. If the moderation API was temporarily down and the job failed, the queue system retries it automatically. Nothing is lost.

What Background Jobs Actually Do

Strip away the complexity and a background job system has three components:

1. The producer — something that creates work

Your API route, a webhook handler, or a scheduled trigger puts a job into the system. The job is typically a small message: "resize image user_123/original.jpg" or "send welcome email to chuck@example.com". It contains just enough information for the worker to do the job — usually IDs and parameters, not the full data.
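To make that concrete, here is a sketch of a producer building a job message. The job shape and names are illustrative, not any particular queue's API:

```typescript
// Hypothetical job message: just IDs and parameters, not the image bytes.
// The worker fetches everything else fresh when it runs.
type Job = {
  name: string;
  data: Record<string, string>;
};

export function makeResizeJob(userId: string, originalKey: string): Job {
  return {
    name: 'image/resize',
    data: { userId, originalKey }, // enough for the worker to look up the rest
  };
}
```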

2. The queue or scheduler — something that holds the work

The job gets stored somewhere until a worker is ready to process it. For queues this is usually Redis or a managed service. For cron jobs, it's a scheduler that knows to fire at a specific time. The queue is the buffer between "work was requested" and "work was done".

3. The worker — something that does the work

A worker is code that reads jobs from the queue and executes them. It can be a long-running Node.js process, a serverless function that spins up on demand, or a cloud function triggered by a message. Workers can run at a different scale, in a different region, or on different hardware than your web server.

The key insight: the worker doesn't have to respond to the user within any time limit. It can take 5 minutes to generate a PDF. Nobody is waiting on the other end of an HTTP connection. This removes the entire class of "function timeout" problems.
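A worker is conceptually just a loop. This toy sketch, with an in-memory array standing in for a real queue, shows the shape:

```typescript
// Toy worker loop: pull the oldest job, dispatch to a handler, repeat.
// Real queue systems add persistence, retries, and concurrency on top.
type Job = { name: string; data: unknown };

export async function drainQueue(
  queue: Job[],
  handlers: Record<string, (data: unknown) => Promise<void>>,
): Promise<void> {
  while (queue.length > 0) {
    const job = queue.shift()!; // FIFO: first in, first out
    const handler = handlers[job.name];
    // No HTTP deadline here: a handler can take as long as it needs
    if (handler) await handler(job.data);
  }
}
```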

The Common Patterns

Background jobs aren't one thing — there are three distinct patterns. Understanding which one applies to your situation is half the battle.

Queues — event-driven async work

A queue is a list of tasks waiting to be processed. Your app adds tasks; workers consume them. Tasks are processed roughly in order (FIFO — first in, first out), though priorities can change this.

When to use it: Anything triggered by a user action that's too slow to do inline. Sending emails, processing uploads, generating documents, calling slow external APIs, syncing data to a third-party service.

The core guarantee: Every job in the queue will be attempted at least once. Good queue systems guarantee at-least-once delivery and retry on failure. This means your job code needs to be idempotent — safe to run twice if it gets retried. (More on what AI gets wrong about this later.)
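The at-least-once behavior can be sketched in a few lines. This is an illustration of the retry loop, not any specific library's implementation:

```typescript
// At-least-once sketch: attempt a job up to `attempts` times.
// Because the job body may run more than once, it must be idempotent.
export async function runWithRetries<T>(
  job: () => Promise<T>,
  attempts: number,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i += 1) {
    try {
      return await job(); // success: stop retrying
    } catch (err) {
      lastError = err;    // failure: try again until attempts run out
    }
  }
  throw lastError;
}
```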

See also: What Are Message Queues?

Cron jobs — scheduled recurring work

Cron is the Unix scheduler; the name comes from chronos, the Greek word for time. A cron job runs a piece of code on a schedule: every minute, every hour, every day at 3 AM, every Monday at 9 AM. There's no user action. It just runs.

When to use it: Recurring maintenance tasks. Purging expired sessions from the database. Generating weekly reports. Sending a daily digest email. Syncing data from an external API on a regular cadence. Checking for overdue invoices every morning.

Cron syntax: Cron schedules use a five-field syntax that looks scary but is learnable:

# minute  hour  day-of-month  month  day-of-week
  *       *     *             *      *

# Every day at 9 AM UTC:
  0       9     *             *      *

# Every Monday at midnight UTC:
  0       0     *             *      1

# Every hour:
  0       *     *             *      *

# Every 15 minutes:
  */15    *     *             *      *

You don't need to memorize this — crontab.guru lets you paste a schedule and get a plain-English translation. AI also generates cron expressions reliably when you describe the schedule in plain English.

Webhooks — jobs triggered by external events

A webhook is an HTTP callback. An external service (Stripe, GitHub, Twilio) calls a URL on your server when something happens on their end: a payment succeeds, a pull request is opened, an SMS is received. Your webhook handler receives the event and kicks off a background job in response.

When to use it: Responding to external events you don't control. When a Stripe payment succeeds, provision the user's subscription. When a GitHub push happens, trigger a deploy. When a Twilio SMS arrives, process the message and reply.

The webhook handler itself should be fast — receive the event, validate it, put a job in the queue, return a 200. The actual work (provisioning, deploying, replying) happens in the background job. Stripe and other services will retry webhook delivery if they don't get a 200 back quickly, so you never want to do slow work inside a webhook handler.
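That "validate, enqueue, return 200" shape can be sketched framework-free. The event fields, verify, and enqueue here are stand-ins for whatever your provider and queue actually give you:

```typescript
// Sketch of a fast webhook handler: validate, enqueue, return 200.
// No slow work happens here; a worker picks up the queued job later.
type WebhookEvent = { id: string; type: string };

export function handleWebhook(
  event: WebhookEvent,
  verify: (event: WebhookEvent) => boolean,
  enqueue: (jobName: string, payload: { eventId: string }) => void,
): { status: number } {
  if (!verify(event)) {
    return { status: 400 }; // reject events that fail signature verification
  }
  enqueue(`webhook/${event.type}`, { eventId: event.id }); // small payload: just the ID
  return { status: 200 };   // respond quickly so the sender doesn't retry delivery
}
```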

See also: What Is a Webhook?

Tool Options: Inngest, BullMQ, node-cron, Vercel Cron

When AI generates background job code, it reaches for one of a handful of tools. Here's what each one is and when it makes sense.

Inngest — the vibe coder's background job system

Inngest is a managed background job platform that runs your job functions inside your existing serverless infrastructure. You write your job code as regular TypeScript functions, decorate them with Inngest's SDK, and deploy them alongside your app. No separate worker processes. No Redis to manage. No new infrastructure.

What makes Inngest special for AI-generated apps:

  • Durable execution — jobs survive server restarts and cold starts. If a serverless function times out mid-job, Inngest resumes from where it left off.
  • Built-in retries — automatic exponential backoff retry logic without you writing it.
  • Sleep and wait — a job can call step.sleep(...) to pause for a day and wake back up tomorrow without holding any resources. Try doing that with a regular setTimeout.
  • Observability — Inngest's dashboard shows every job, its status, retry history, and the exact input/output at each step. Invaluable for debugging.

Best for: Vercel/Next.js apps, apps where you want to avoid infrastructure management, complex workflows with multiple steps. If you're a vibe coder and not sure which tool to use, start here.

See also: What Is Inngest?

Prompt I Would Type

Add a background job using Inngest that processes uploaded images. When
a user uploads a profile picture, the job should resize it to 512x512
and 128x128 using Sharp, save both to Cloudflare R2, and update the
user's profile in the database with the new URLs.

AI will generate something like this Inngest function:

// inngest/functions/process-profile-image.ts

import { inngest } from '@/inngest/client';
import sharp from 'sharp';
import { r2 } from '@/lib/r2';       // your R2 client
import { db } from '@/lib/db';

export const processProfileImage = inngest.createFunction(
  {
    id: 'process-profile-image',
    retries: 3,                        // retry up to 3 times on failure
  },
  { event: 'image/profile.uploaded' }, // triggered by this event name
  async ({ event, step }) => {
    const { userId, originalKey } = event.data;

    // step.run isolates each piece of work: a step that succeeded won't
    // re-run on retry even if a later step fails. Step results are
    // JSON-serialized between runs, so raw Buffers shouldn't cross step
    // boundaries; resize and upload therefore happen inside a single step.
    const urls = await step.run('resize-and-upload', async () => {
      const originalBuffer = await r2.getObject(originalKey);

      const [size512, size128] = await Promise.all([
        sharp(originalBuffer).resize(512, 512).webp().toBuffer(),
        sharp(originalBuffer).resize(128, 128).webp().toBuffer(),
      ]);

      const key512 = `avatars/${userId}/512.webp`;
      const key128 = `avatars/${userId}/128.webp`;

      await Promise.all([
        r2.putObject(key512, size512),
        r2.putObject(key128, size128),
      ]);

      return {
        url512: `https://assets.yourdomain.com/${key512}`,
        url128: `https://assets.yourdomain.com/${key128}`,
      };
    });

    await step.run('update-database', async () => {
      await db.user.update({
        where: { id: userId },
        data: {
          avatarUrl: urls.url512,
          avatarThumbUrl: urls.url128,
        },
      });
    });

    return { userId, urls };
  }
);

And the API route that triggers it:

// app/api/upload/complete/route.ts — called after upload finishes

import { inngest } from '@/inngest/client';
import { NextResponse } from 'next/server';

export async function POST(request: Request) {
  const { userId, originalKey } = await request.json();

  // Send the event to Inngest — this returns immediately
  // The actual image processing happens in the background
  await inngest.send({
    name: 'image/profile.uploaded',
    data: { userId, originalKey },
  });

  return NextResponse.json({ message: 'Processing started' });
}

BullMQ — the battle-tested queue library

BullMQ is a Node.js job queue library built on Redis. It's the successor to Bull (the original) and has been around long enough to be production-proven at serious scale. You add jobs to a queue, run one or more worker processes that consume from the queue, and BullMQ handles the rest: priorities, delays, rate limiting, repeatable jobs, job events.

Requires: A Redis instance (local for dev, managed Redis in production — Upstash Redis works well and has a generous free tier).

Best for: Apps that are already running a persistent Node.js server (not purely serverless), apps that need fine-grained control over queue behavior, high-throughput job processing.

// lib/queues.ts — define the queue
import { Queue, Worker } from 'bullmq';
import { redis } from '@/lib/redis';  // your Redis connection

// The queue — other parts of the app add jobs here
export const imageQueue = new Queue('image-processing', {
  connection: redis,
  defaultJobOptions: {
    attempts: 3,                   // retry up to 3 times
    backoff: { type: 'exponential', delay: 1000 }, // wait 1s, 2s, 4s between retries
  },
});

// The worker — consumes jobs. In production, start this from its own
// entry point as a separate long-running process, not from web request code.
export const imageWorker = new Worker(
  'image-processing',
  async (job) => {
    const { userId, originalKey } = job.data;
    // ... do the image processing here
  },
  { connection: redis }
);
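The exponential backoff configured above follows a simple formula: retry attempt n waits delay * 2^(n - 1) milliseconds, which is where the 1s, 2s, 4s in the comment comes from. A quick sketch of the arithmetic:

```typescript
// Exponential backoff arithmetic: attempt n waits baseDelay * 2^(n - 1) ms.
// With baseDelay = 1000 that's 1s, 2s, then 4s between retries.
export function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}
```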

node-cron — simple scheduled jobs for Node.js

node-cron is a lightweight library that runs cron-style scheduled functions inside your Node.js process. No external service, no extra infrastructure. You define a schedule and a function, and it runs on that schedule as long as your server is running.

Best for: Simple recurring tasks in apps with a long-running Node.js server. Not suitable for serverless environments (the process doesn't stay running between requests).

// lib/cron.ts — schedule recurring jobs
import cron from 'node-cron';
import { purgeExpiredSessions } from '@/lib/auth';
import { sendDailyDigest } from '@/lib/emails';

// Purge expired sessions every hour
cron.schedule('0 * * * *', async () => {
  await purgeExpiredSessions();
});

// Send daily digest at 9 AM UTC every morning
cron.schedule('0 9 * * *', async () => {
  await sendDailyDigest();
});

Vercel Cron — scheduled jobs for Vercel apps

If you're on Vercel, you can configure cron jobs directly in your vercel.json file. Vercel will call one of your API routes on the specified schedule. No separate infrastructure — it's just an HTTP request to your own app on a timer.

// vercel.json — cron configuration. Note: the real file is strict JSON,
// so it can't contain comments like these header lines. Here,
// daily-digest runs every day at 9 AM UTC; purge-sessions runs hourly.
{
  "crons": [
    { "path": "/api/cron/daily-digest", "schedule": "0 9 * * *" },
    { "path": "/api/cron/purge-sessions", "schedule": "0 * * * *" }
  ]
}

// app/api/cron/daily-digest/route.ts — the cron endpoint
import { NextResponse } from 'next/server';
import { sendDailyDigest } from '@/lib/emails';

export async function GET(request: Request) {
  // Verify the request came from Vercel, not a random caller
  const authHeader = request.headers.get('authorization');
  if (authHeader !== `Bearer ${process.env.CRON_SECRET}`) {
    return NextResponse.json({ error: 'Unauthorized' }, { status: 401 });
  }

  await sendDailyDigest();
  return NextResponse.json({ success: true });
}

Best for: Simple scheduled tasks in Vercel-hosted apps. Not suitable for long-running jobs (still subject to function timeouts) or high-frequency crons (Vercel's Hobby plan limits both how many cron jobs you can have and how often they run).

Tool        | Type             | Infrastructure | Best For
------------|------------------|----------------|--------------------------------------------
Inngest     | Queue + workflow | None (managed) | Serverless, complex workflows
BullMQ      | Queue            | Redis required | High throughput, persistent server
node-cron   | Scheduler        | None           | Simple recurring tasks, long-running server
Vercel Cron | Scheduler        | None           | Scheduled tasks on Vercel

What AI Gets Wrong About Background Jobs

It doesn't make jobs idempotent

This is the big one. Background job systems retry failed jobs. If your job crashes halfway through, the system runs it again from the start. If your job sends an email on step one, then fails on step two, and retries — it sends two emails. Idempotent means "safe to run multiple times with the same result." AI-generated jobs often aren't.

The fix: check before doing. Before sending an email, check if it was already sent. Before creating a database record, check if it already exists. Inngest's step.run() model helps here because each named step is only executed once per job run even on retries. If you're using BullMQ, add your own checks.
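Here is a minimal sketch of check-before-doing, using an in-memory Set as the "already done" ledger; a real app would use a database table or unique constraint instead:

```typescript
// Idempotent side effect: record what's been done so a retried job
// is a no-op instead of a duplicate. In-memory Set for illustration only.
const sentEmails = new Set<string>();

export function sendWelcomeEmailOnce(
  userId: string,
  send: (userId: string) => void,
): boolean {
  const key = `welcome:${userId}`;
  if (sentEmails.has(key)) return false; // already sent; a retry does nothing
  send(userId);          // the side effect runs once per user
  sentEmails.add(key);   // recorded only after the send succeeds
  return true;
}
```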

It puts too much data in the job payload

Job queues are designed to carry small messages — IDs and parameters. AI sometimes generates code that serializes entire objects into the job payload. This causes problems: queue systems have message size limits (a Redis value tops out at 512 MB, and queue performance degrades long before that), and large payloads slow down the queue. The pattern: store data in the database, put the ID in the job. The worker looks up the data fresh when it runs.

It ignores job failure entirely

AI-generated worker code often has no error handling and no alerting when jobs fail. Jobs fail silently. You discover the problem when a user complains their profile picture never updated, three days later. At minimum: log failures with enough context to debug them, and consider setting up alerts (most job platforms can ping a webhook or Slack channel when jobs hit their retry limit).

It chooses the wrong tool for the deployment environment

Ask AI to add background jobs to a Vercel app and it might suggest BullMQ. BullMQ needs a worker process running continuously — that doesn't exist on Vercel's serverless infrastructure. The worker would never run. Inngest or Vercel Cron is the correct answer for Vercel. When you get AI-generated job code, verify the tool matches your deployment environment. See also: What Is Serverless?

It doesn't handle job concurrency

Multiple workers can process jobs simultaneously. If two workers pick up overlapping jobs for the same user (two separate image uploads that both try to update the same database row), you can get race conditions or duplicate work. AI rarely adds locking or concurrency guards. For most beginner apps this won't bite you, but be aware the risk exists when the same resource can be modified by multiple concurrent jobs.
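If you do need a guard, one single-process option is a per-key lock that serializes work on the same resource. This sketch is an illustration only; across multiple worker processes you'd need a shared lock in Redis or the database, or your queue's built-in concurrency controls:

```typescript
// Per-key lock: work for the same key runs one at a time, in order.
// Single-process illustration, not a distributed lock.
const chains = new Map<string, Promise<unknown>>();

export function withLock<T>(key: string, fn: () => Promise<T>): Promise<T> {
  const prev = chains.get(key) ?? Promise.resolve();
  const run = prev.then(fn, fn);        // start fn only after earlier work settles
  chains.set(key, run.catch(() => {})); // keep the chain alive even if fn throws
  return run;
}
```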

It generates cron jobs that run in serverless environments

node-cron only works in a process that stays running. AI sometimes generates node-cron code inside an API route handler — that runs once per request and disappears, never scheduling anything. For serverless apps, use Vercel Cron or Inngest's scheduled functions instead.

When You Need Background Jobs vs. When You Don't

Not everything needs background jobs. Overcomplicating a simple app with job queues makes it harder to build and debug. Here's a practical rubric:

You probably need background jobs when:

  • A task takes longer than 5 seconds (to stay comfortably inside any platform's timeout limits)
  • A task calls an external API that might be slow or unreliable (payment processors, email services, third-party data sources)
  • A task needs to be retried if it fails — email sends, payment webhooks, data syncs
  • A task needs to run on a schedule without user interaction — daily reports, cleanup jobs, reminders
  • A task fans out to multiple users — send notification to 5,000 users when a new post is published
  • A task involves CPU-heavy work — image processing, PDF generation, video transcoding

You probably don't need background jobs when:

  • The task is fast (<2 seconds) and unlikely to fail
  • You're just reading from a database and returning data — that's what API routes are for
  • The user needs the result immediately to continue — you can't put that in the background
  • You're prototyping or in early dev — don't add infrastructure complexity until you've validated the feature
  • Your app has very low traffic and failures are acceptable — a direct call is simpler and easier to debug

The rule of thumb: Start with inline code. When you hit timeouts, failures, or scheduling needs, reach for background jobs. Don't architect for problems you haven't had yet. For performance improvements that don't require background jobs, see What Is Caching?

What to Learn Next

Background jobs connect to several other concepts you'll encounter as your app grows:

  • What Are Message Queues? (the mechanics underneath every queue-based job system)
  • What Is a Webhook? (how external services trigger work in your app)
  • What Is Serverless? (why your deployment environment determines which tools work)
  • What Is Caching? (for performance problems that don't need background jobs)

Frequently Asked Questions

What is a background job?

A background job is a task your app runs outside of the normal request/response cycle. When a user clicks a button, your server normally does the work immediately and sends back a response. A background job instead puts the work in a queue or schedules it separately, so the user gets an instant response and the slow work happens afterwards — seconds, minutes, or hours later.

What's the difference between a queue and a cron job?

A queue is triggered by an event — a user action or another piece of code puts a task in the queue, and a worker picks it up and runs it. A cron job runs on a schedule regardless of user actions — every hour, every night at midnight, every Monday. Use queues for things triggered by users (send this email, process this upload). Use cron for recurring maintenance (purge old sessions, generate a weekly report).

Why do web requests time out?

Web servers and hosting platforms set maximum request durations to protect against runaway processes that could freeze the server. Vercel serverless functions time out at 10-60 seconds depending on your plan. Traditional servers often time out HTTP connections around 30 seconds by default. Browsers may also cancel requests they've been waiting on too long. Background jobs sidestep this entirely by running outside the HTTP request lifecycle.

Do I need separate infrastructure to run background jobs?

Not necessarily. Inngest runs your job functions inside your existing serverless functions — no separate infrastructure. BullMQ needs a Redis instance but can run workers as part of your existing Node server. Vercel Cron triggers your own API routes on a schedule. You can also use managed platforms like Railway or Render to run a dedicated worker process without managing servers. The simpler you start, the better.

What happens when a background job fails?

Good job systems retry automatically. Inngest has built-in exponential backoff retry logic. BullMQ lets you configure retry counts and delays. If a job fails after all retries, it usually moves to a dead-letter queue where you can inspect it and manually retry. Without retries, a network hiccup or temporary API outage causes permanent data loss — the task just silently disappears. Always check how your chosen tool handles failure before going to production.