How to Cache API Responses Without Breaking Freshness

The browser cache is one of the most powerful performance tools available to a web application, and one of the most frequently misused. A team will reach for caching because their app feels slow, ship something that returns stale data for the next forty-eight hours, and then disable caching entirely the moment a user complaint lands. The middle path is real, but it requires more deliberate thinking about what "fresh" actually means for each response.

This piece walks through the practical patterns for caching API responses on the client side: where the browser's built-in caching helps, where it does not, and how to reason about cache invalidation so that the work pays off without producing the freshness bugs that almost always show up later.

Why naive caching fails

The naive approach to client-side API caching is to add Cache-Control: max-age=3600 to your API response and call it done. The browser will obediently cache the response for an hour, and the next request within that window will be served from the local cache without hitting the network.

This works perfectly for endpoints that genuinely do not change inside an hour. It fails badly for anything that does. The user updates their profile and the UI keeps showing the old name. An admin invalidates a permission and the user still has access for fifty-nine minutes. A product price changes and a stale page sits in the user's cache through the entire promotional window.

The problem is not the caching. The problem is treating "freshness" as a function of wall-clock time when it is almost always a function of upstream state.

The three caching strategies that actually work

There are three patterns worth knowing. Each addresses a different relationship between cache and source-of-truth.

Stale-while-revalidate. The browser serves the cached response immediately, then fires a background request to update the cache. The user sees fast loads on every request, and the data converges to fresh within one cycle. The trade-off is that the first paint can be as out of date as the stale window allows, so the window should be sized to the staleness each endpoint can tolerate.
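
The same idea can be applied inside application code. The following is a minimal sketch, not a production cache: SwrCache, Fetcher, and the key strings are hypothetical names, and there is no expiry or size limit. It returns the stored value at once and refreshes it in the background.

```typescript
// Hypothetical names: SwrCache and Fetcher are illustrative, not a library API.
type Fetcher<T> = (key: string) => Promise<T>;

class SwrCache<T> {
  private entries = new Map<string, T>();
  private inFlight = new Map<string, Promise<T>>();
  private fetcher: Fetcher<T>;

  constructor(fetcher: Fetcher<T>) {
    this.fetcher = fetcher;
  }

  // Returns the cached value immediately when one exists and refreshes it
  // in the background. On a cold cache the caller waits for the network.
  async get(key: string): Promise<T> {
    const cached = this.entries.get(key);
    const refresh = this.revalidate(key);
    if (cached !== undefined) {
      // A failed background refresh keeps the old value and is ignored here.
      refresh.catch(() => undefined);
      return cached;
    }
    return refresh;
  }

  // Starts a refresh, or joins the one already running for this key.
  private revalidate(key: string): Promise<T> {
    const existing = this.inFlight.get(key);
    if (existing) return existing;
    const request = this.fetcher(key).then(
      (value) => {
        this.entries.set(key, value);
        this.inFlight.delete(key);
        return value;
      },
      (error) => {
        this.inFlight.delete(key);
        throw error;
      },
    );
    this.inFlight.set(key, request);
    return request;
  }
}
```

The second call for a key returns the previous value, and the call after that sees the refreshed one. That is the "converges within one cycle" behavior described above.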

ETag-based conditional requests. The browser stores the response with its ETag header, and whenever it needs to revalidate the stored copy, the request includes If-None-Match: <etag>. If the server's data has not changed, it returns a 304 with no body, which is cheap. If it has changed, the server returns the new response.
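
The browser's HTTP cache performs this exchange on its own. For an application-level cache, the flow might look like the sketch below. HttpResult and Transport are hypothetical stand-ins for fetch() and its Response, injected so the logic can run outside a browser.

```typescript
// Hypothetical shapes standing in for fetch() and Response.
interface HttpResult {
  status: number;
  etag: string | null;
  body: string | null;
}
type Transport = (url: string, headers: Record<string, string>) => Promise<HttpResult>;

// Last known body and validator per URL.
const stored = new Map<string, { etag: string; body: string }>();

async function getWithEtag(url: string, send: Transport): Promise<string> {
  const previous = stored.get(url);
  const headers: Record<string, string> = {};
  if (previous) headers["If-None-Match"] = previous.etag;

  const result = await send(url, headers);

  // 304: the server confirmed our copy is current, so reuse the stored body.
  if (result.status === 304 && previous) return previous.body;

  // 200: a new representation; remember it together with its new ETag.
  if (result.status === 200 && result.body !== null) {
    if (result.etag) stored.set(url, { etag: result.etag, body: result.body });
    return result.body;
  }
  throw new Error(`Unexpected status ${result.status} for ${url}`);
}
```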

Server-pushed invalidation. The cache stays valid indefinitely, but the server proactively notifies clients (via WebSocket, Server-Sent Events, or push notification) when underlying data changes. The client invalidates the cache on receiving the notification. Most complex, most exact.
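
On the client, the receiving side can be small. This sketch assumes a hypothetical message format, {"invalidate": [...keys]}, and a plain Map as the cache. A real application would define its own format and cache structure.

```typescript
// Hypothetical message shape pushed by the server:
//   {"invalidate": ["/api/user/42"]}
function applyInvalidation(rawMessage: string, cache: Map<string, unknown>): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(rawMessage);
  } catch {
    return []; // ignore malformed messages rather than clearing anything
  }
  if (typeof parsed !== "object" || parsed === null) return [];

  const keys = (parsed as { invalidate?: unknown }).invalidate;
  if (!Array.isArray(keys)) return [];

  // Delete each named entry and report which ones were actually present.
  const removed: string[] = [];
  for (const key of keys) {
    if (typeof key === "string" && cache.delete(key)) removed.push(key);
  }
  return removed;
}

// Browser wiring over Server-Sent Events (the /api/events URL is an assumption):
//   const events = new EventSource("/api/events");
//   events.onmessage = (e) => applyInvalidation(e.data, responseCache);
```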

A real application typically uses different strategies for different endpoints. Static reference data uses long max-age. User-specific state uses stale-while-revalidate. Critical mutable state uses ETag plus server-pushed invalidation. The error is treating "caching" as a single decision rather than a per-endpoint design.

How the browser cache actually works

The HTTP caching specification, documented in RFC 9111, defines the rules every modern browser implements. The two headers that matter most are Cache-Control and ETag.

Cache-Control controls how long the response is considered fresh. The directives worth knowing:

max-age=N says the response is fresh for N seconds.
s-maxage=N says the same, but only applies to shared (CDN) caches.
no-cache says the response can be stored but must be revalidated before use.
no-store says the response must not be stored at all. Use this for sensitive data.
private says only the user's browser may cache, not any shared CDN.
stale-while-revalidate=N says the response can be served stale for N seconds while a background revalidation fires.

The combination most APIs should default to: Cache-Control: private, max-age=0, stale-while-revalidate=60. This says "the stored response is stale immediately, but for up to a minute you may serve it while a revalidation runs in the background; after that, revalidate before use." Fast for the user, and never more than about sixty seconds behind.
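
To make the directives concrete, here is a simplified model of the decision a private cache makes for a stored response. It is a sketch under assumptions: the function names are hypothetical, the age is passed in rather than derived from the Date and Age headers, and s-maxage, quoted values, and heuristic freshness are ignored.

```typescript
type CacheDecision =
  | "do-not-store"
  | "serve-fresh"
  | "serve-stale-and-revalidate"
  | "revalidate-first";

// Parses "private, max-age=0, stale-while-revalidate=60" into directive -> value.
function parseCacheControl(header: string): Map<string, string> {
  const directives = new Map<string, string>();
  for (const part of header.split(",")) {
    const [name, value = ""] = part.trim().split("=");
    if (name) directives.set(name.toLowerCase(), value);
  }
  return directives;
}

// ageSeconds is how long ago the response was stored.
function decide(header: string, ageSeconds: number): CacheDecision {
  const directives = parseCacheControl(header);
  if (directives.has("no-store")) return "do-not-store";
  if (directives.has("no-cache")) return "revalidate-first";

  const maxAge = Number(directives.get("max-age") ?? 0);
  const staleWindow = Number(directives.get("stale-while-revalidate") ?? 0);

  if (ageSeconds < maxAge) return "serve-fresh";
  if (ageSeconds < maxAge + staleWindow) return "serve-stale-and-revalidate";
  return "revalidate-first";
}
```

With the default above, decide("private, max-age=0, stale-while-revalidate=60", 30) falls in the stale window, while the same header at an age of 90 seconds requires revalidation before the response is used.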

ETag is an opaque identifier the server attaches to a response. The convention is to derive it from a hash of the response body, or from a version number on the underlying data. When the browser revalidates, it sends If-None-Match: <etag>. The server either returns 304 Not Modified (cache stays valid) or 200 with a new response and a new ETag.
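
On the server, the decision is a comparison between the validator the client sent and the current one. The sketch below assumes each record carries a version number. The function names and the ConditionalResponse shape are hypothetical and not tied to any framework.

```typescript
interface ConditionalResponse {
  status: 200 | 304;
  headers: Record<string, string>;
  body: string | null;
}

// Assumption: the underlying record has a version that changes on every write.
function etagForVersion(resource: string, version: number): string {
  return `"${resource}-v${version}"`; // ETag values are quoted strings
}

function respond(
  ifNoneMatch: string | null,
  resource: string,
  version: number,
  body: string,
): ConditionalResponse {
  const etag = etagForVersion(resource, version);
  const headers = {
    ETag: etag,
    "Cache-Control": "private, max-age=0, stale-while-revalidate=60",
  };

  // If-None-Match may list several validators, or "*". The comparison is
  // weak, so a leading W/ prefix is stripped before matching.
  const candidates = (ifNoneMatch ?? "")
    .split(",")
    .map((value) => value.trim().replace(/^W\//, ""));

  if (candidates.includes(etag) || candidates.includes("*")) {
    return { status: 304, headers, body: null }; // client copy is still valid
  }
  return { status: 200, headers, body };
}
```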

The Cache API and service workers

The HTTP cache is the browser's automatic layer. For more control, the Cache API (most often used from a service worker) lets you implement custom caching logic that lives entirely in JavaScript.

A service worker intercepts fetches from the pages it controls, decides whether to satisfy each one from Cache storage or to forward it to the network, and writes new responses back into the cache as appropriate. The pattern is more code than HTTP headers but it unlocks behavior the headers cannot express:

Caching responses keyed by a normalized request (ignoring query parameters that do not affect the response).
Returning cached responses on network error (offline-tolerant UI).
Pre-warming the cache with responses for routes the user has not visited yet.
Programmatic cache invalidation based on application events (a successful POST clears the cache for related GETs).

The Cache API has a large surface area, and not every team needs it. The point of mentioning it is that the browser's built-in caching is the floor, not the ceiling. If your needs go past the floor, the next layer is available without third-party dependencies.
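
Two of the behaviors listed above, normalized keys and an offline fallback, can be sketched as plain functions. KeyValueCache and NetworkFetch are hypothetical interfaces standing in for the Cache API and fetch(), and the ignored-parameter list is an assumption. In a real service worker this logic would sit inside a fetch event handler, with caches.open(...) and event.respondWith(...) around it.

```typescript
// Hypothetical stand-ins for the Cache API and fetch().
interface KeyValueCache {
  get(key: string): Promise<string | undefined>;
  put(key: string, body: string): Promise<void>;
}
type NetworkFetch = (url: string) => Promise<string>;

// Query parameters assumed not to affect the response body.
const IGNORED_PARAMS = ["utm_source", "utm_medium"];

// Builds a cache key that ignores irrelevant parameters and parameter order.
function normalizeKey(url: string): string {
  const parsed = new URL(url);
  for (const name of IGNORED_PARAMS) parsed.searchParams.delete(name);
  parsed.searchParams.sort();
  return parsed.toString();
}

// Network first; fall back to the cached copy when the network fails.
async function networkFirst(
  url: string,
  fetchFromNetwork: NetworkFetch,
  cache: KeyValueCache,
): Promise<string> {
  const key = normalizeKey(url);
  try {
    const body = await fetchFromNetwork(url);
    await cache.put(key, body); // keep the latest good response
    return body;
  } catch (networkError) {
    const cached = await cache.get(key);
    if (cached !== undefined) return cached;
    throw networkError; // nothing cached: surface the original failure
  }
}
```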

"Most API-caching bugs come from treating the cache as set-and-forget rather than as something the application architecture has to actively coordinate with. The freshness contract is part of the API design." - Dennis Traina, founder of 137Foundry

Invalidation patterns that actually fire when they should

The single largest category of caching bugs is invalidation that does not fire when it should. The user updates their profile, the cache entry for /api/user/profile should be invalidated, and somewhere in the code, it is not.

Three patterns reduce this failure mode:

Co-locate the invalidation with the mutation. Every place in your client code that calls a mutating API (POST, PUT, DELETE) should explicitly invalidate the cache entries that the mutation affects. Libraries like TanStack Query and SWR make this convention almost automatic, exposing a queryClient.invalidateQueries(...) or mutate(...) call right next to the mutation.
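
Without a library, the convention is simply that the function performing the mutation also clears what it affects. In this sketch, queryCache, updateProfile, and the /api/user/summary endpoint are hypothetical, and the PUT call is injected so the example stays self-contained.

```typescript
// Hypothetical application-level cache keyed by request path.
const queryCache = new Map<string, unknown>();

async function updateProfile(
  patch: { name: string },
  sendPut: (path: string, body: unknown) => Promise<void>,
): Promise<void> {
  // If the request fails, this throws and the cache is left untouched.
  await sendPut("/api/user/profile", patch);

  // Invalidation lives next to the mutation, so the two cannot drift apart.
  queryCache.delete("/api/user/profile");
  queryCache.delete("/api/user/summary"); // assumed to embed the profile name
}
```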

Tag-based invalidation. Instead of invalidating individual URLs, attach tags to cache entries. Every entry related to user 42 gets a user:42 tag. A mutation to that user's data invalidates everything tagged user:42 in one call. This scales better than maintaining per-URL invalidation lists.
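
A tag index is a second map from tag to keys. The class below is a minimal sketch with a hypothetical name (TaggedCache) and no expiry handling.

```typescript
class TaggedCache<T> {
  private values = new Map<string, T>();
  private keysByTag = new Map<string, Set<string>>();

  // Stores a value and records every tag it belongs to.
  set(key: string, value: T, tags: string[]): void {
    this.values.set(key, value);
    for (const tag of tags) {
      const keys = this.keysByTag.get(tag) ?? new Set<string>();
      keys.add(key);
      this.keysByTag.set(tag, keys);
    }
  }

  get(key: string): T | undefined {
    return this.values.get(key);
  }

  // Drops every entry carrying the tag, in one call.
  invalidateTag(tag: string): void {
    const keys = this.keysByTag.get(tag);
    if (!keys) return;
    keys.forEach((key) => this.values.delete(key));
    this.keysByTag.delete(tag);
  }
}
```

A mutation to user 42 then calls invalidateTag("user:42") without needing to know which URLs were cached for that user.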

Server-confirmed invalidation. For critical data, the mutation response includes an explicit list of invalidated keys. The client honors the list, not its own inference. The server is the source of truth for what changed; the client trusts it.

How to test caching behavior without going insane

The hardest thing about caching is testing it. The cache is invisible most of the time, and bugs only surface in production traffic patterns that local development rarely reproduces.

Three habits help:

Add cache-state logging in development. Every cache hit, miss, and revalidation should log to the console with the URL and a reason. The pattern of hits and misses becomes obvious within a few minutes of using the app.
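
One way to get that logging is to route every lookup through a single function. This sketch uses hypothetical names (Entry, loggedLookup) and a simple age check. The log sink defaults to console.debug and can be replaced in tests.

```typescript
interface Entry<T> {
  value: T;
  storedAt: number; // milliseconds since epoch
}

function loggedLookup<T>(
  cache: Map<string, Entry<T>>,
  url: string,
  maxAgeMs: number,
  now: number,
  log: (line: string) => void = (line) => console.debug(line),
): T | undefined {
  const entry = cache.get(url);
  if (!entry) {
    log(`[cache] MISS ${url} (no entry stored)`);
    return undefined;
  }
  const ageMs = now - entry.storedAt;
  if (ageMs > maxAgeMs) {
    // Too old to use directly; the caller is expected to refetch.
    log(`[cache] REVALIDATE ${url} (entry is ${ageMs}ms old, limit ${maxAgeMs}ms)`);
    return undefined;
  }
  log(`[cache] HIT ${url} (entry is ${ageMs}ms old)`);
  return entry.value;
}
```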

Use a deliberate stale-data tester page. A simple internal page that displays the same data fetched twice (once cached, once with ?nocache=1) side by side. Discrepancies become visible at a glance.

Run a synthetic test that mutates and immediately reads. A real end-to-end test creates a record, reads it, updates it, reads it again, and asserts that each read reflects the write that preceded it. If this fails in a staging environment, the caching invalidation logic has a bug. This is the kind of test that catches the class of bugs that production telemetry will never surface clearly.
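
The shape of that check, independent of any test runner, might look like the following. RecordApi is a hypothetical interface. In a real suite it would be backed by the same client code and caching layer the application uses, pointed at staging.

```typescript
// Hypothetical client interface; back it with the app's real API client.
interface RecordApi {
  create(name: string): Promise<string>; // resolves to the new record id
  read(id: string): Promise<{ name: string }>;
  update(id: string, name: string): Promise<void>;
}

// Returns a list of failures; an empty list means every read saw the latest write.
async function checkReadAfterWrite(api: RecordApi): Promise<string[]> {
  const failures: string[] = [];

  const id = await api.create("first");
  if ((await api.read(id)).name !== "first") {
    failures.push("read after create returned stale data");
  }

  await api.update(id, "second");
  if ((await api.read(id)).name !== "second") {
    failures.push("read after update returned stale data");
  }
  return failures;
}
```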

When to skip the browser cache entirely

Some endpoints should not be cached at all. The list is short but important:

Authentication endpoints. A token response contains credentials; caching it is a security risk.
Endpoints that include sensitive data the user should not see after logout. Use Cache-Control: no-store to ensure the browser does not write the response to disk.
Endpoints with side effects on the server. A POST should never be cached, and most browsers know this automatically, but be explicit anyway.
Real-time endpoints where every read should hit the source. Cache misses here are the goal.

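The default for these should be Cache-Control: no-store. Adding Pragma: no-cache is a common extra for legacy HTTP/1.0 proxies that still exist in some corporate networks, although Pragma is formally a request header and its effect in responses is not specified, so treat it as a supplement and never a substitute. Belt and suspenders for the cases where staleness has a real cost.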

A pragmatic default policy for most APIs

If you are designing an API caching policy from scratch and you do not have a strong opinion yet, the following default works for most cases:

GET endpoints returning user-specific data: Cache-Control: private, max-age=0, stale-while-revalidate=60, with ETags.
GET endpoints returning public reference data that changes daily: Cache-Control: public, max-age=3600, stale-while-revalidate=86400.
GET endpoints returning truly static assets: Cache-Control: public, max-age=31536000, immutable. Use content-hashed URLs.
POST, PUT, DELETE: no caching headers. Browsers do not cache these by default.

This policy gives you fast loads on the common case, reasonable freshness on user-specific data, and aggressive caching on the static assets that benefit from it. It is not the optimum for every situation, but it is a sensible starting point that you can tune from.
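
Expressed as code, the policy is a small lookup. The EndpointKind categories and the function name are hypothetical labels for the cases listed above, with a "sensitive" kind added for the no-store endpoints from the previous section.

```typescript
type EndpointKind = "user-specific" | "public-reference" | "static-asset" | "sensitive";

// Header values taken from the default policy described in the text.
const GET_POLICIES: Record<EndpointKind, string> = {
  "user-specific": "private, max-age=0, stale-while-revalidate=60",
  "public-reference": "public, max-age=3600, stale-while-revalidate=86400",
  "static-asset": "public, max-age=31536000, immutable",
  "sensitive": "no-store",
};

// Returns the Cache-Control value to send, or null for "send no caching header".
function cacheControlFor(method: string, kind: EndpointKind): string | null {
  if (method.toUpperCase() !== "GET") return null; // POST, PUT, DELETE
  return GET_POLICIES[kind];
}
```

Keeping the table in one place makes the per-endpoint decision visible in code review, which is where a wrong default tends to get caught.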

For the underlying HTTP specification, RFC 9110 covers the broader semantics, and the MDN HTTP caching documentation walks through every directive with examples. The 137Foundry web development service page covers some related architectural decisions on the API side as well, and the 137Foundry services hub lists the rest of the technical work we cover. For more on related topics, the 137Foundry homepage is the entry point.
