TL;DR: Caching means storing a copy of data somewhere fast so you do not have to fetch it again from somewhere slow. It exists at multiple layers — your browser, a CDN, your server, your database — and each layer has its own rules for how long data stays fresh. The biggest mistake AI makes with caching is setting it too aggressively, so users see stale content after you deploy.

Why AI Coders Need to Know This

When you ask Claude or Cursor to build an app, the first version usually works — but it works slow. Every user request hits the database. Every API call to a third-party service waits for a fresh response. Every image loads from scratch. This is fine when you are the only user testing locally. It falls apart when real people use your app.

Caching is the single highest-leverage performance fix you can make. A page that takes 2 seconds without caching can load in 50 milliseconds with it. But caching also introduces one of the most frustrating bugs in all of web development: you deployed new code, but users still see the old version.

AI tools are pretty good at generating caching code. They are not as good at explaining when to use each type, what the tradeoffs are, or why your site still shows old content after you pushed a fix. That gap is what this guide fills.

You will encounter caching in almost every meaningful backend topic: REST APIs, CDNs, Redis, Nginx, and middleware. Understanding the basics here makes all of those click faster.

The Workbench Analogy

Think about a construction job. You need the same tools dozens of times per day — tape measure, pencil, speed square. You could walk back to the storage room every time you need one. Or you could keep your most-used tools on the workbench right next to you.

That is caching. The storage room is your database or a remote API — accurate and complete, but slow to reach. The workbench is your cache — fast to grab from, but you have to make sure the tools on it are the right ones for today's job.

The whole art of caching is deciding: which tools go on the workbench (what to cache), how long they stay there before you swap them out (TTL — time to live), and when you need to immediately go get a fresh one even if there is already one on the bench (cache invalidation).

Real Scenario

You asked Claude to build a real estate listings site. It fetches properties from a database, renders the results page, and displays property detail pages. It works perfectly in local testing.

Then you deploy it. Users start complaining it is slow. You check the server logs and see the database is getting hammered — the same queries for the same popular properties running hundreds of times per minute. You ask AI for help.

Prompt I Would Type

My Express app hits the database on every request and it's getting slow.
I have a /api/listings endpoint that runs the same database query for
every visitor. The listing data only changes once a day.

Add caching so we don't hit the database on every request.
Show me the simplest approach first, then show me how to use Redis
if I need something more robust. Explain what each part does.

The phrase "explain what each part does" is important. Without it, AI will generate working caching code but skip the part where it tells you that setting a 24-hour cache on user-specific data will show User A's private listings to User B. Always ask for explanations alongside the code.

What AI Generated

Here is the in-memory cache approach — the simplest starting point:

// Simple in-memory cache for Express
// Works great for a single server. Does NOT survive restarts.

const cache = new Map(); // Stores cached data in memory

// Look up a key and enforce expiry — returns null on a miss or
// when the stored entry is past its expiresAt timestamp
function getCached(key) {
  const item = cache.get(key);
  if (!item) return null;
  if (Date.now() > item.expiresAt) {
    cache.delete(key); // Expired — remove it
    return null;
  }
  return item;
}

app.get('/api/listings', async (req, res) => {
  const cacheKey = 'all-listings';
  const cached = getCached(cacheKey); // checks expiry, not just presence

  if (cached) {
    // Cache HIT — return the stored data immediately, no database call
    console.log('Cache hit — serving from memory');
    return res.json(cached.data);
  }

  // Cache MISS — nothing stored yet (or it expired), go fetch the real data
  console.log('Cache miss — querying database');
  const listings = await db.query('SELECT * FROM listings WHERE active = true');

  // Store the result with an expiry timestamp
  cache.set(cacheKey, {
    data: listings,
    expiresAt: Date.now() + (60 * 60 * 1000) // 1 hour from now
  });

  res.json(listings);
});

And here is the Redis version for when you need something more durable:

// Redis caching with ioredis
// npm install ioredis

const Redis = require('ioredis');
const redis = new Redis(); // Connects to Redis on localhost:6379

app.get('/api/listings', async (req, res) => {
  const cacheKey = 'listings:all';

  // Try to get cached data from Redis
  const cached = await redis.get(cacheKey);

  if (cached) {
    // Redis stores strings — parse it back to JSON
    return res.json(JSON.parse(cached));
  }

  // Nothing in cache — query the database
  const listings = await db.query('SELECT * FROM listings WHERE active = true');

  // Store in Redis with a TTL of 3600 seconds (1 hour)
  // EX means "expire after this many seconds"
  await redis.set(cacheKey, JSON.stringify(listings), 'EX', 3600);

  res.json(listings);
});

Both do the same thing conceptually: check if the data is already stored somewhere fast, return it if so, fetch from the slow source if not. The difference is where the data lives and whether it survives a server restart.

Understanding Each Part

The cache key

A cache key is the name you use to look something up. Think of it like a label on a box. 'listings:all' is the label for "all active listings." When you want to retrieve or overwrite that data, you use the same label.

Cache keys matter a lot more than they seem. If you cache /api/listings without including any filters in the key, a user who searches for "3-bedroom houses" will get the same cached result as a user who searches for "studios." Your AI will sometimes generate oversimplified keys like 'listings' when it should be 'listings:city:austin:type:house'. Always sanity-check the cache key logic.
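One way to keep keys honest is to build them from every filter that affects the result. Here is a minimal sketch — the parameter names (`city`, `type`) are examples, not part of the listings app above:

```javascript
// Build a cache key from every query parameter that changes the result.
// Sorting the keys means ?city=austin&type=house and ?type=house&city=austin
// map to the same cache entry instead of two separate ones.
function listingsCacheKey(query) {
  const parts = Object.keys(query)
    .sort()
    .map((k) => `${k}:${String(query[k]).toLowerCase()}`);
  return parts.length ? `listings:${parts.join(':')}` : 'listings:all';
}

console.log(listingsCacheKey({}));
// listings:all
console.log(listingsCacheKey({ city: 'Austin', type: 'house' }));
// listings:city:austin:type:house
console.log(listingsCacheKey({ type: 'house', city: 'Austin' }));
// listings:city:austin:type:house — same key regardless of param order
```

If a filter is missing from the key, two different searches share one cache entry — which is exactly the "studios get the 3-bedroom results" bug described above.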

TTL — Time To Live

TTL is how long cached data is considered fresh before you need to fetch a new copy. It is measured in seconds. A TTL of 3600 means the cached value is used for one hour, then discarded.

Choosing a TTL is a judgment call:

  • Too long: Users see stale data. You deploy a fix and it does not show up for hours.
  • Too short: You get cache misses constantly and the performance gain disappears.
  • Just right: Based on how often your data actually changes. Listing data that updates daily? A 1-hour TTL is reasonable. A live sports score? Maybe 10 seconds.

Cache hit vs. cache miss

A cache hit means the data was found in the cache — you get a fast response with no database call. A cache miss means the data was not there (either never cached, or expired), so you fetch it fresh and then store it for next time. Your logs will show which is happening. If you are seeing constant cache misses, your TTL might be too short or your cache keys are too specific.
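Rather than eyeballing log lines, you can ask AI to instrument the cache with counters. A minimal sketch using a plain Map (the same idea works with Redis behind the `get`/`set` calls):

```javascript
// Wrap a cache with hit/miss counters so you can see the actual
// hit rate instead of guessing from scattered console output.
function instrumentedCache() {
  const store = new Map();
  const stats = { hits: 0, misses: 0 };
  return {
    get(key) {
      if (store.has(key)) { stats.hits++; return store.get(key); }
      stats.misses++;
      return null;
    },
    set(key, value) { store.set(key, value); },
    hitRate() {
      const total = stats.hits + stats.misses;
      return total === 0 ? 0 : stats.hits / total;
    },
  };
}

const cache = instrumentedCache();
cache.get('listings:all');         // miss — nothing stored yet
cache.set('listings:all', [1, 2]);
cache.get('listings:all');         // hit
console.log(cache.hitRate());      // 0.5 — one hit out of two lookups
```

A healthy cache for mostly-static data should sit well above 0.5; if it stays near zero, revisit your TTL and key logic.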

Cache invalidation

Cache invalidation means removing or updating a cached item before its TTL expires, because the underlying data changed. This is the hardest part of caching. Phil Karlton famously said: "There are only two hard things in computer science: cache invalidation and naming things."

Example: A property listing gets marked as sold. Your cache still has it as available for the next 59 minutes. You need to invalidate (delete) that specific cache entry the moment the listing status changes:

// When a listing is updated, delete its cache entry
app.put('/api/listings/:id', async (req, res) => {
  const { id } = req.params;
  await db.query('UPDATE listings SET ...', [...]);

  // Invalidate the cache for this specific listing
  await redis.del(`listing:${id}`);
  // Also invalidate the "all listings" cache since the data changed
  await redis.del('listings:all');

  res.json({ success: true });
});

Every Layer of Caching

Caching does not happen in just one place. It happens at multiple layers between your server and your user, and each layer has different rules. Here is what each one actually does.

Browser cache — why hard refresh exists

When a user visits your site, their browser saves a copy of your HTML, CSS, JavaScript, and images on their device. Next time they visit, instead of downloading everything again, the browser checks: "Do I already have this? Is my copy still fresh?" If yes, it loads from the device — which is almost instantaneous.

This is why you see "I deployed but the site still shows old content" constantly. The user's browser has a cached copy and is not going back to the server for a new one.

The fix for users: hard refresh. On Mac: Cmd+Shift+R. On Windows: Ctrl+Shift+R. This tells the browser to skip the cache and re-download everything fresh.

The fix for you as the developer: control browser caching with proper cache headers (covered below) and use versioned file names so browsers always fetch new files when you deploy.

CDN cache — speed for users everywhere

A CDN (Content Delivery Network) is a network of servers scattered around the world. When you put your app behind a CDN like Cloudflare or AWS CloudFront, your static files — images, CSS, JavaScript — get copied to servers near your users. A user in Sydney gets files from a Sydney server, not from your origin server in Virginia. That is milliseconds instead of hundreds of milliseconds.

The CDN cache is separate from the browser cache. The browser cache lives on the user's device. The CDN cache lives on servers between the user and your origin. Both can serve stale content after a deploy, but you fix them differently:

  • Browser cache: Use versioned file names (main.v2.js) or cache-busting query strings (main.js?v=20260318)
  • CDN cache: Trigger a cache purge from your CDN dashboard or use their API to invalidate specific files after a deploy

Server-side cache — Redis and in-memory

This is what AI is usually talking about when it suggests adding caching to your REST API. Instead of hitting the database on every request, your server stores a copy of the result in memory (or in Redis) and returns that copy for subsequent requests until the TTL expires.

In-memory caching (using a JavaScript Map or a library like node-cache) is simpler to set up but has two limitations:

  • It disappears when the server restarts
  • It is not shared between multiple server instances — if you have three servers, each has its own separate cache

Redis solves both problems. It is a separate server that stores data in memory but can also save it to disk. All your app servers connect to the same Redis instance, so they share one cache. If you are running more than one server instance, Redis is the right tool.

API response caching

When your app calls a third-party API — a weather service, a maps API, a payment processor's pricing endpoint — that call takes time and often costs money per request. Caching the response means you make the external call once and serve the stored result to all subsequent requests until it expires.

// Cache an external API response
async function getWeatherForCity(city) {
  const cacheKey = `weather:${city}`;
  const cached = await redis.get(cacheKey);

  if (cached) return JSON.parse(cached);

  // External API call — slow and costs money
  const response = await fetch(`https://api.weather.com/current?city=${city}`);
  const data = await response.json();

  // Cache for 10 minutes — weather does not change that fast
  await redis.set(cacheKey, JSON.stringify(data), 'EX', 600);

  return data;
}

Cache headers — telling browsers and CDNs what to do

Cache headers are instructions your server sends in HTTP responses that tell browsers and CDNs exactly how to cache the content. You have seen these if you have ever looked at the HTTP response headers in your browser's DevTools Network tab.

The main ones you need to know:

# Cache-Control — the main instruction header
Cache-Control: max-age=3600              # Cache for 1 hour
Cache-Control: no-cache                  # Always check with server before using cached copy
Cache-Control: no-store                  # Do not cache at all — for sensitive data
Cache-Control: public, max-age=86400     # Any cache (browser, CDN) can store this for 1 day
Cache-Control: private, max-age=3600     # Only browser cache — not CDN (user-specific data)

# ETag — a fingerprint of the content
ETag: "abc123"
# Browser sends this on next request: If-None-Match: "abc123"
# Server says 304 Not Modified if content hasn't changed → browser uses its cached copy

The private vs public distinction matters a lot. If you have a page that shows the logged-in user's dashboard, it should never be cached by a CDN with public — or every user would see the same cached dashboard. Use Cache-Control: private for anything user-specific.

In Express, setting these headers looks like this:

// Cache static assets aggressively — they have versioned names
app.use('/static', express.static('public', {
  maxAge: '1y', // 1 year — the file name changes when content changes
  immutable: true // Tell browsers: this file will never change at this URL
}));

// Cache API responses for 5 minutes
app.get('/api/listings', (req, res) => {
  res.set('Cache-Control', 'public, max-age=300');
  // ... fetch and return listings
});

// Never cache user-specific or sensitive data
app.get('/api/account', (req, res) => {
  res.set('Cache-Control', 'no-store');
  // ... fetch and return account data
});

Nginx can also handle cache headers at the web server level, before requests even reach your application code — useful if you want to cache static file responses without touching your app.
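As a rough sketch of what that looks like (the paths here are made up — adjust them to your own deploy):

```nginx
# Serve versioned static files with long-lived cache headers,
# without the request ever reaching your Node app.
location /static/ {
    root /var/www/myapp;                     # assumed path — yours will differ
    expires 1y;
    add_header Cache-Control "public, immutable";
}
```

The dedicated Nginx guide covers proxy caching of dynamic responses, which goes further than this.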

What AI Gets Wrong About Caching

Caching user-specific data as public

This is the most dangerous caching mistake. Your AI will sometimes generate code that caches a response with a generic key like 'user-profile' — meaning the first user's profile gets served to every subsequent user until the cache expires. Always ask: "Is this data the same for all users, or specific to one user?" If it is user-specific, the cache key must include the user ID, and the Cache-Control header must say private.

// ❌ Wrong — caches one user's data for everyone
const cached = await redis.get('user-profile');

// ✅ Right — each user gets their own cache entry
const cached = await redis.get(`user-profile:${req.user.id}`);

Over-caching dynamic data

Your AI will sometimes suggest caching data that changes frequently or that users expect to be live — things like inventory counts, chat messages, or notification badges. Setting a 1-hour TTL on "how many items are in stock" means users could try to buy something that sold out an hour ago. Match your TTL to how often the data actually changes and how much it matters if the displayed value is slightly stale.

Forgetting to invalidate on writes

AI almost always generates the cache-on-read logic correctly and forgets the cache-invalidate-on-write logic. If you cache listings but never clear the cache when a listing is updated, deleted, or added, users will see the stale version until the TTL expires naturally. Every write operation that changes cached data needs to delete the relevant cache entries.

Not understanding TTL units

Redis uses seconds for TTL. JavaScript Date.now() uses milliseconds. AI occasionally mixes these up — generating code that sets a Redis TTL to 3600000 (thinking in milliseconds) when it means 1 hour (3600 seconds). That would cache data for over 41 days. Always check: are you working in seconds or milliseconds?

// ❌ Wrong — this is 41 days, not 1 hour
await redis.set(key, data, 'EX', 3600000);

// ✅ Right — 3600 seconds = 1 hour
await redis.set(key, data, 'EX', 3600);

Caching before knowing what to cache

AI will sometimes eagerly add caching to every route in your app when you ask it to "add caching." Not every endpoint benefits. A query that already returns in a few milliseconds does not need a Redis round-trip stacked on top of it. Cache the things that are actually slow: expensive database aggregations, external API calls, and static content. A quick way to spot over-caching: if adding a cache makes an endpoint slower, the original operation was already faster than the cache lookup itself.

The Golden Rule of Caching

Never cache data you are not prepared to serve stale. If showing an outdated value would confuse or deceive users — pricing, inventory, auth state, personal data — either do not cache it, use a very short TTL, or invalidate it immediately on every write.

Security Considerations

Caching introduces security risks that AI rarely mentions unprompted. Here are the ones that matter most for vibe coders.

Never cache authentication tokens or passwords

Authentication responses should be sent with Cache-Control: no-store so no browser, CDN, or shared cache ever stores them. If an auth token or session response gets cached by a CDN or shared cache, a different user could receive it. Auth middleware should run on every request — never short-circuit it with a cache.

Poisoned cache attacks

If your cache key includes user-controlled input — like a query parameter — and you are not careful, an attacker could craft a request that stores malicious content in the cache under a key that legitimate users will hit. Always sanitize and validate any input that becomes part of a cache key.
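A minimal sketch of that idea: whitelist what a user-supplied value is allowed to look like before it becomes part of a key. The allowed pattern here is an assumption — tighten or loosen it for your own data:

```javascript
// Normalize and validate user input before using it in a cache key.
// Rejecting anything outside a strict whitelist stops attackers from
// smuggling delimiters or junk values into your key space.
function safeCityKey(city) {
  const normalized = String(city).trim().toLowerCase();
  // Allow only letters, digits, spaces, and hyphens (an assumption —
  // adjust the pattern to match your real data).
  if (!/^[a-z0-9 -]{1,50}$/.test(normalized)) {
    throw new Error('Invalid city value for cache key');
  }
  return `weather:${normalized.replace(/ /g, '-')}`;
}

console.log(safeCityKey('New York')); // weather:new-york
// safeCityKey('x:*') throws instead of creating a poisoned key
```

Normalizing (trim, lowercase) also prevents "Austin", "austin", and "Austin " from becoming three separate cache entries.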

Private data in CDN cache

CDNs are shared infrastructure. If you accidentally set Cache-Control: public on a page containing private user data, the CDN will cache that page and may serve it to anyone who requests the same URL. Use Cache-Control: private, no-store for anything tied to a specific user's session.

Sensitive data in Redis

If you cache user records, session data, or anything private in Redis, treat that Redis instance with the same security you give your main database. Do not expose it to the public internet, use authentication, and encrypt connections in production.

How to Debug With AI

"I deployed but the site still shows old content"

This is the #1 caching complaint. Work through the layers:

  1. Browser cache: Hard refresh first (Cmd+Shift+R / Ctrl+Shift+R). If that fixes it, only you were seeing the old version — your users will get the new one.
  2. CDN cache: If a hard refresh does not fix it, your CDN is serving the old version. Log into your CDN dashboard (Cloudflare, CloudFront, etc.) and purge the cache for the affected URL or file.
  3. Server-side cache: If you have Redis or in-memory caching, the cached HTML or API response might be stale. Check your TTL and manually delete the relevant cache key.

Check cache headers in DevTools

Open Chrome DevTools → Network tab → click any request → look at the Response Headers. You will see:

  • Cache-Control — what caching instructions the server sent
  • Age — how many seconds old the cached response is (set by CDNs)
  • ETag — the content fingerprint
  • cf-cache-status: HIT or MISS — Cloudflare's cache status

If you see Age: 3542, that means you are getting a cached response that is almost an hour old. If you see cf-cache-status: HIT when you expected fresh content, you need to purge the CDN.

The debugging prompt

Debug Prompt

My site is showing old content after a deploy. Here is my current
cache setup: [paste relevant server/CDN config]

Walk me through each layer of caching that could be serving stale
content. For each layer, tell me how to check if it is the problem
and how to clear it.

Also check my cache-control headers — am I accidentally caching
something I should not be?

Testing your cache

# Check what cache headers your server is actually sending
curl -I https://yoursite.com/api/listings

# Check for CDN cache status (Cloudflare)
curl -I https://yoursite.com/ | grep -i "cf-cache"

# Check a Redis key and its TTL
redis-cli GET "listings:all"
redis-cli TTL "listings:all"   # Returns seconds remaining, -1 means no expiry, -2 means key doesn't exist

# Delete a specific Redis cache key to force a fresh fetch
redis-cli DEL "listings:all"

A practical caching setup to ask AI for

Prompt I Would Type

Add caching to my Express app with these rules:
- /api/listings — cache for 5 minutes, same for all users, use Redis
- /api/listings/:id — cache for 5 minutes per listing ID
- /api/account — never cache (user-specific)
- /static files — cache aggressively with versioned names

When a listing is updated or deleted, invalidate only that listing's
cache key and the all-listings key. Show me the cache-control headers
for each route too.

What to Learn Next

Caching connects to almost every performance and infrastructure topic in backend development. These are the most useful next reads:

  • What Is Redis? — The dedicated deep-dive on Redis: what it is, how to set it up, and what else it can do beyond caching.
  • What Is a CDN? — How CDNs work, why they make sites faster globally, and how to configure caching rules in Cloudflare or CloudFront.
  • What Is a REST API? — Caching API responses is one of the biggest performance wins you can make. Understanding REST is the prerequisite.
  • What Are HTTP Status Codes? — The 304 Not Modified status code is how ETag-based caching works. Makes much more sense once you understand status codes.
  • What Is Middleware? — Cache logic in Express is often implemented as middleware. Understanding the pattern makes it easier to add and remove caching from your routes.
  • What Is Nginx? — Nginx can serve as a caching layer in front of your application, handling cache headers and serving cached responses before requests ever hit your Node app.

Next Step

The next time your AI generates a slow app, ask it: "What are the three slowest operations in this app and which ones are safe to cache?" That one question will point you to exactly where caching will have the most impact — without blindly caching everything and ending up with stale data headaches.

FAQ

My site still shows old content after a deploy — what do I check first?

Start with a hard refresh (Cmd+Shift+R on Mac, Ctrl+Shift+R on Windows). If that fixes it, your browser had a cached copy — your other users are fine. If it is still stale after a hard refresh, your CDN or server-side cache is serving the old version. Log into your CDN dashboard and purge the cache, or delete the relevant Redis/in-memory cache key on your server.

What is the difference between browser cache and CDN cache?

Browser cache stores files on each visitor's device — it only speeds up repeat visits for that one person. CDN cache stores files on servers around the world, close to your users — it speeds up the first visit for everyone in that region. Both reduce load times and both can serve stale content after a deploy, but you clear them differently: hard refresh for browser cache, CDN purge for CDN cache.

Do I need Redis, or is in-memory caching enough?

Not necessarily. A JavaScript Map or a library like node-cache works fine for small single-server apps. Redis becomes important when you have multiple server instances (they need to share a cache) or when you need the cache to survive server restarts. Start simple. Add Redis when you outgrow in-memory caching.

What does TTL actually mean?

TTL stands for Time To Live — how long a cached item stays valid before the next request has to fetch a fresh copy. Redis and most cache systems measure TTL in seconds. A TTL of 3600 means the cached data is used for one hour, then discarded. Choosing the right TTL comes down to one question: how much does it matter if users see data that is this many seconds old?

How do I clear every cache after a deploy?

It depends on the layer. For browser cache: use versioned file names like main.v2.js — browsers treat a new URL as a new file, so they always download it fresh. For CDN cache: trigger a cache purge from your CDN's dashboard or API after each deploy (Cloudflare, CloudFront, and most CDNs support this). For Redis: delete specific cache keys or flush the entire cache with redis-cli FLUSHDB. For in-memory cache: it clears automatically when the server restarts.