Case Study: How One Startup Cut TTFB by 60% with Layered Caching
A real-world case study showing how a SaaS startup used layered caching—browser, CDN, edge, and origin—to reduce TTFB, lower costs, and improve user satisfaction.
This case study describes how a mid-stage SaaS startup reduced Time To First Byte (TTFB) from 450ms to 180ms by adopting a layered caching strategy. The project combined front-end changes, CDN configuration, edge logic, and origin-side optimizations.
"Layered caching isn't about doing one thing perfectly—it's about composing layers so each reduces latency and origin load."
Initial Situation
The product was a collaborative web application with a global user base. Users reported slow initial loads, and the team saw origin CPU spikes during marketing-driven traffic surges. The application served HTML pages, large JS bundles, images, and JSON APIs for interactive features.
Goals
- Reduce median TTFB by at least 40%.
- Lower origin request volume by 50% to reduce operational cost.
- Maintain data consistency for user-specific content.
Strategy Overview
They implemented:
- Static asset versioning and long TTLs for CSS/JS/images.
- Edge caching for HTML with surrogate keys and short TTLs plus background revalidation.
- API-level caching for non-user-personalized endpoints and cache hints for client-side caching.
- Origin-side caching using Redis for computed fragments and session caching.
- Observability for cache hit ratios and purge latencies.
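The origin-side fragment caching from the strategy list follows a standard get-or-compute pattern. The sketch below uses a minimal in-memory stand-in with Redis-like get/setex semantics (in production this would be a real Redis client such as redis-py); the key names and TTLs are illustrative, not the startup's actual values:

```python
import json
import time

class InMemoryCache:
    """Minimal stand-in for a Redis client (get/setex semantics)."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() > expires_at:
            del self._store[key]
            return None
        return value

    def setex(self, key, ttl_seconds, value):
        self._store[key] = (value, time.monotonic() + ttl_seconds)

def get_or_compute(cache, key, ttl_seconds, compute):
    """Return the cached fragment, or compute it, store it, and return it."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    value = compute()
    cache.setex(key, ttl_seconds, json.dumps(value))
    return value
```

On a hit, the expensive computation (e.g., an aggregate query) is skipped entirely; on a miss, the result is stored with a TTL so stale fragments age out on their own.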
Implementation Details
Frontend builds were updated to produce hashed filenames, and the CDN served those assets with Cache-Control: public, max-age=31536000, immutable. HTML was treated as semi-dynamic: the CDN used s-maxage=60 with stale-while-revalidate=30, and pages were tagged with surrogate keys (e.g., page:user-123) to enable fine-grained purges.
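The header policy above can be sketched as a small mapping from content class to response headers. The class labels and the Surrogate-Key header name here are illustrative (the exact tagging header varies by CDN vendor), not the team's real configuration:

```python
def cache_headers(content_class: str, page_id: str = None) -> dict:
    """Return response headers for each caching tier described above.

    Content classes are illustrative labels, not the startup's real config.
    """
    if content_class == "hashed_asset":
        # Immutable, versioned assets: safe to cache for a year everywhere.
        return {"Cache-Control": "public, max-age=31536000, immutable"}
    if content_class == "html_page":
        headers = {
            # Semi-dynamic HTML: 60s shared-cache TTL, then serve stale
            # for up to 30s while revalidating in the background.
            "Cache-Control": "s-maxage=60, stale-while-revalidate=30",
        }
        if page_id:
            # Hypothetical surrogate-key header; the name depends on the CDN.
            headers["Surrogate-Key"] = "page:%s" % page_id
        return headers
    if content_class == "public_api":
        # Public catalog API: 120-second TTL (ETag handling shown separately).
        return {"Cache-Control": "public, max-age=120"}
    # Anything user-specific must not be stored by shared caches.
    return {"Cache-Control": "private, no-store"}
```

Keeping the policy in one place like this makes it auditable: every response path declares which tier it belongs to rather than setting ad-hoc headers.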
APIs exposing public catalog data used a 120-second TTL and ETag headers. User-specific APIs remained private and cached in Redis at the origin where applicable (e.g., computed aggregates), but client-side storage handled frequently read, low-sensitivity state.
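ETag revalidation for the public catalog endpoints works by hashing the serialized payload, comparing it against the client's If-None-Match header, and returning a bodyless 304 on a match. A minimal sketch (function and parameter names are illustrative):

```python
import hashlib
import json

def respond_with_etag(payload, if_none_match=None):
    """Build a (status, headers, body) tuple with ETag revalidation.

    payload: JSON-serializable response data.
    if_none_match: value of the client's If-None-Match header, if any.
    """
    # sort_keys makes serialization deterministic, so equal payloads
    # always produce the same ETag.
    body = json.dumps(payload, sort_keys=True)
    etag = '"%s"' % hashlib.sha256(body.encode()).hexdigest()[:16]
    headers = {"ETag": etag, "Cache-Control": "public, max-age=120"}
    if if_none_match == etag:
        # Client's cached copy is still valid: 304 with an empty body.
        return 304, headers, ""
    return 200, headers, body
```

After the 120-second TTL expires, clients revalidate with If-None-Match and, when the catalog hasn't changed, pay only for a header round trip instead of the full payload.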
Results
- Median TTFB improved from 450ms to 180ms (60% reduction).
- Origin request rate dropped by 58% during normal traffic windows due to high CDN hit ratios.
- Operational costs declined as origin CPU utilization dropped by roughly two-thirds.
- User satisfaction improved, as measured by longer session lengths and higher NPS scores for the loading experience.
Lessons Learned
- Instrument everything: Hit ratios and purge latencies were critical to validate improvements.
- Segment content: Not everything should be cached equally—identify safe-to-cache paths.
- Automate purges: Tie purge/tagging to deploys and content updates to avoid stale edge content.
- Use soft purges: They prevented user-facing latency spikes during revalidation windows.
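The "automate purges" and "use soft purges" lessons combine naturally into a deploy hook: map changed page IDs to their surrogate keys and issue a soft purge. The sketch below uses a hypothetical purge_keys callback standing in for a real CDN purge API (actual endpoints and soft-purge flags differ by vendor):

```python
def surrogate_keys_for_deploy(changed_pages):
    """Map changed page IDs to the surrogate keys to purge.

    The page:<id> key scheme mirrors the tagging described above.
    """
    return ["page:%s" % page_id for page_id in changed_pages]

def purge_on_deploy(changed_pages, purge_keys):
    """Run after each deploy: soft-purge only the affected surrogate keys.

    purge_keys is a hypothetical stand-in for the CDN's purge API client.
    """
    keys = surrogate_keys_for_deploy(changed_pages)
    if keys:
        # soft=True asks the CDN to mark entries stale rather than evict
        # them, so the next request revalidates in the background instead
        # of hitting the origin synchronously (no user-facing latency spike).
        purge_keys(keys, soft=True)
    return keys
```

Tying this to the deploy pipeline means stale HTML at the edge is bounded by deploy frequency rather than by anyone remembering to purge manually.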
Wrap-Up
Layered caching transformed the startup's user experience and cost profile. The key was combining good defaults (hashed static assets) with targeted, tactical approaches (CDN surrogate keys, Redis for fragments) and clear observability to iterate safely.