What is API Caching?
API caching is the practice of storing API responses in a fast-access storage layer, typically memory, so that subsequent requests for the same data can be served without another network call. From a system design perspective, caching shifts load away from the network and backend services toward memory, which is orders of magnitude faster and cheaper per request.
In modern frontend applications, caching is not limited to storing raw data. It also involves managing data freshness, invalidation strategies, background synchronization, and consistency guarantees across multiple components and views.
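The core idea can be sketched with a minimal in-memory cache; the `TtlCache` name and API below are illustrative, not from any particular library.

```javascript
// Minimal in-memory cache with time-to-live (TTL) expiry.
// TtlCache is an illustrative sketch, not a real library.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.entries = new Map(); // key -> { value, storedAt }
  }

  get(key, now = Date.now()) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (now - entry.storedAt > this.ttlMs) {
      this.entries.delete(key); // expired: evict and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key, value, now = Date.now()) {
    this.entries.set(key, { value, storedAt: now });
  }
}

// Wrap a fetch-like function so repeat calls within the TTL skip the network.
function cached(fetchFn, ttlMs) {
  const cache = new TtlCache(ttlMs);
  return async (url) => {
    const hit = cache.get(url);
    if (hit !== undefined) return hit;
    const data = await fetchFn(url);
    cache.set(url, data);
    return data;
  };
}
```

Everything the libraries below add — freshness tracking, invalidation, background revalidation — is layered on top of this basic get/set-with-expiry shape.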
Why API Caching Matters
- Latency reduction: Cached responses eliminate network round trips, letting the UI render instantly.
- Backend protection: Fewer requests reduce load on APIs, databases, and third-party services.
- Scalability: Read-heavy traffic can be absorbed without linear backend scaling.
- User experience: Cached data enables instant UI rendering and smoother transitions.
- Cost efficiency: Reduced API calls lower infrastructure and external service costs.
React Query (TanStack Query)
React Query treats server state as a cacheable resource rather than application state. Each query is identified by a stable query key, which maps to a cached value along with metadata such as freshness, error state, and last update time.
Its core strength lies in deterministic caching, background refetching, and precise invalidation. This makes React Query a strong choice for REST-based APIs and microservice architectures where consistency and predictability matter.
```javascript
const { data, isLoading, error } = useQuery({
  queryKey: ["users", userId],
  queryFn: () => fetchUser(userId),
  staleTime: 5 * 60 * 1000, // data is fresh for 5 minutes
  cacheTime: 30 * 60 * 1000 // cache persists for 30 minutes
})
```

- Query keys must be designed carefully to avoid cache collisions.
- staleTime controls freshness, not cache lifetime.
- cacheTime defines how long unused data stays in memory (renamed gcTime in TanStack Query v5).
- Targeted invalidation prevents unnecessary refetching.
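Targeted invalidation hinges on how query keys match. A plain-JavaScript sketch of prefix matching makes the idea concrete — this mirrors the spirit of `invalidateQueries({ queryKey: [...] })` but the helper names are hypothetical, not the library's code.

```javascript
// Hypothetical sketch of prefix-based query invalidation, mirroring how
// a key like ["users"] matches stored keys like ["users", 42].
// Not TanStack Query's actual implementation.
function keyMatchesPrefix(queryKey, prefix) {
  if (prefix.length > queryKey.length) return false;
  return prefix.every((part, i) => queryKey[i] === part);
}

function invalidate(cache, prefix) {
  // cache: Map of JSON-stringified query key -> { data, stale }
  for (const [keyJson, entry] of cache) {
    if (keyMatchesPrefix(JSON.parse(keyJson), prefix)) {
      entry.stale = true; // marked stale; refetched on next use
    }
  }
}
```

Invalidating `["users"]` touches every user query while leaving unrelated keys fresh — which is why hierarchical key design matters.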
SWR
SWR is based on the stale-while-revalidate caching strategy. It returns cached data immediately and triggers a background request to fetch fresh data, prioritizing perceived performance over strict freshness.
SWR maintains a global cache shared across components and supports custom cache providers, making it suitable for lightweight applications and UI-driven data flows.
```javascript
const { data, error, isLoading } = useSWR(
  "/api/profile",
  fetcher,
  { revalidateOnFocus: true }
)
```

- Best suited for read-heavy UI applications.
- Global cache enables automatic data sharing.
- Less control over cache lifecycle compared to React Query.
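The stale-while-revalidate flow itself can be sketched without the library: serve whatever is cached synchronously, then refresh in the background. `swrGet` here is a hypothetical helper, not SWR's API.

```javascript
// Hypothetical stale-while-revalidate helper (not SWR's actual API):
// returns the cached value immediately and kicks off a background
// refresh that updates the cache for the next reader.
function swrGet(key, fetcher, cache) {
  const data = cache.get(key); // may be stale, or undefined on first call
  const revalidation = Promise.resolve()
    .then(() => fetcher(key))
    .then((fresh) => {
      cache.set(key, fresh); // the next read sees the fresh value
      return fresh;
    });
  return { data, revalidation };
}
```

The trade-off is visible in the return shape: `data` is instant but possibly stale, while `revalidation` settles later with the fresh value.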
Apollo Client
Apollo Client uses a normalized in-memory cache tailored for GraphQL. Instead of caching entire responses, it stores individual entities indexed by unique identifiers, allowing partial query resolution without network calls.
This enables advanced patterns such as automatic UI updates when related entities change, optimistic UI updates, and fine-grained cache control. However, it also increases complexity and requires careful schema and cache policy design.
```javascript
useQuery(GET_USER, {
  fetchPolicy: "cache-and-network"
})
```

- cache-first: Default, serves from cache if available.
- network-only: Always fetches from the server.
- cache-and-network: Uses cache, then refreshes in background.
- no-cache: Bypasses cache completely.
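Normalization is easier to see in miniature: entities are stored once under `Typename:id` keys, and queries resolve by reference. The sketch below is hypothetical and in the spirit of Apollo's InMemoryCache, not its real implementation.

```javascript
// Hypothetical sketch of a normalized entity store, in the spirit of
// Apollo's InMemoryCache (not its actual implementation).
const store = new Map(); // "User:1" -> { __typename, id, ...fields }

function writeEntity(entity) {
  const ref = `${entity.__typename}:${entity.id}`;
  // Merge so a partial update touches exactly one record, and every
  // query that references this entity observes the change.
  store.set(ref, { ...(store.get(ref) || {}), ...entity });
  return ref;
}

function readEntity(ref) {
  return store.get(ref);
}
```

Because both a single-user query and a user-list query would point at the same `User:1` record, updating that one record updates every view that references it — the "automatic UI updates" described above.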
Choosing the Right Caching Strategy
There is no universal caching solution. REST-heavy applications typically benefit from React Query, UI-first applications align well with SWR, and GraphQL-based systems gain the most leverage from Apollo's normalized cache.
Final Takeaway
At scale, caching is not an optimization but a foundational architectural decision. Senior engineers are expected to reason about cache keys, invalidation strategies, data freshness, and consistency trade-offs. Mastery of frontend caching libraries directly translates to building performant, resilient, and scalable systems.