Spike Testing

Ticket release. Flash sale. Television mention. Government announcement. Some traffic doesn't ramp politely — it arrives all at once. Spike testing measures whether you survive the moment of impact.

What it is

A spike test applies a near-instantaneous step change in load — commonly 3× to 10× normal traffic within seconds — then observes three things: immediate behaviour during the surge, stability while it sustains, and recovery once it passes.

Why ramped tests don't cover this

Gradual ramps give every layer time to adapt: auto-scalers add capacity, caches populate, JIT compilers warm up, and connection pools grow gently. A real spike removes all of that grace. Auto-scaling typically takes minutes to add capacity, while much of the spike's damage is done in the first 30 seconds. Cold caches can turn 1 backend query per request into 10. Thread pools fill and requests queue at every tier simultaneously.
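To put rough numbers on that gap, here is a back-of-envelope sketch. The 600 req/s capacity, the 180 s scaling lag and the cache hit ratios are illustrative assumptions, not measurements; only the 1-to-10 query amplification and the 2,400 req/s spike come from this page.

```python
def excess_requests(spike_rps: float, capacity_rps: float, lag_s: float) -> float:
    """Requests arriving beyond capacity before new capacity comes online.

    These must be queued, shed, or failed; the model ignores retries.
    """
    return max(0.0, spike_rps - capacity_rps) * lag_s


def backend_qps(request_rps: float, hit_ratio: float,
                warm_queries: float = 1.0, cold_queries: float = 10.0) -> float:
    """Backend query rate when only a fraction of requests hit a warm cache."""
    per_request = hit_ratio * warm_queries + (1.0 - hit_ratio) * cold_queries
    return request_rps * per_request


# Assumed: 600 req/s provisioned capacity, 180 s until autoscaling helps.
print(excess_requests(spike_rps=2400, capacity_rps=600, lag_s=180))
# Fully cold cache versus fully warm cache at the spike rate.
print(backend_qps(2400, hit_ratio=0.0), backend_qps(2400, hit_ratio=1.0))
```

Under these assumed figures the surge outruns capacity by several hundred thousand requests before scaling reacts, which is why the protective mechanisms matter more than the scaler.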

What we evaluate

Time-to-first-error and error budget consumed during the surge; queue and backlog behaviour at each tier; auto-scaling reaction time versus damage window; effectiveness of protective mechanisms — rate limiting, waiting rooms, load shedding, static fallbacks; and whether the system returns to baseline cleanly or requires intervention.
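As an illustration of how the first and last of these can be derived from test output, the sketch below works on per-interval samples. The `Sample` shape, the SLO-based budget definition, and the 10% tolerance with a three-sample hold are all assumptions made for the example, not a fixed methodology.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Sample:
    t: float        # seconds since test start
    requests: int   # requests completed in this interval
    errors: int     # failed requests in this interval
    p95_ms: float   # 95th percentile latency for this interval


def time_to_first_error(samples: Sequence[Sample], spike_start: float) -> Optional[float]:
    """Seconds from spike start to the first interval containing an error."""
    for s in samples:
        if s.t >= spike_start and s.errors > 0:
            return s.t - spike_start
    return None


def error_budget_consumed(samples: Sequence[Sample], slo: float,
                          window_requests: int) -> float:
    """Fraction of the error budget burned, where the budget is
    (1 - slo) * window_requests (for example, a month of traffic)."""
    allowed = (1.0 - slo) * window_requests
    return sum(s.errors for s in samples) / allowed


def time_to_recover(samples: Sequence[Sample], release_t: float,
                    baseline_p95_ms: float, tolerance: float = 1.1,
                    hold: int = 3) -> Optional[float]:
    """Seconds after release until p95 stays within `tolerance` of baseline
    for `hold` consecutive samples. None means it never settled."""
    streak, streak_start = 0, 0.0
    for s in samples:
        if s.t < release_t:
            continue
        if s.p95_ms <= baseline_p95_ms * tolerance:
            if streak == 0:
                streak_start = s.t
            streak += 1
            if streak >= hold:
                return streak_start - release_t
        else:
            streak = 0
    return None
```

A `None` from `time_to_recover` is the signal that the system did not return to baseline on its own within the observation window.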

Profile: 8× spike, ticket-release scenario
  baseline : 300 req/s steady
  spike    : 300 → 2,400 req/s in < 10 s
  sustain  : 2,400 req/s for 10 min
  release  : back to 300 req/s
  observe  : errors in first 60 s, autoscale lag,
             queue depth, time to return to baseline p95
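
The profile above can be expressed as a target-rate function that a load generator samples once per tick. This is a minimal sketch: the 300 s warm-up and the 5 s linear ramp are assumptions (the profile only requires the step to complete in under 10 s), and the release is modelled as an instant drop.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpikeProfile:
    baseline_rps: float = 300.0
    spike_rps: float = 2400.0
    warmup_s: float = 300.0    # assumption: steady baseline before the spike
    ramp_s: float = 5.0        # assumption: any value under 10 s fits the profile
    sustain_s: float = 600.0   # 10 min at the spike rate

    def target_rps(self, t: float) -> float:
        """Target request rate at t seconds since test start."""
        if t < self.warmup_s:
            return self.baseline_rps
        ramp_end = self.warmup_s + self.ramp_s
        if t < ramp_end:
            # Linear interpolation across the short ramp.
            progress = (t - self.warmup_s) / self.ramp_s
            return self.baseline_rps + progress * (self.spike_rps - self.baseline_rps)
        if t < ramp_end + self.sustain_s:
            return self.spike_rps
        # Release: drop straight back to baseline.
        return self.baseline_rps


profile = SpikeProfile()
for t in (0, 302.5, 305, 900, 905):
    print(t, profile.target_rps(t))
```

Driving the generator from an arrival rate rather than a fixed number of virtual users keeps the offered load at 2,400 req/s even when the system under test slows down, which is the condition a real surge imposes.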

Common recommendations

Pre-scaling ahead of known events; queue-based waiting rooms with honest user feedback; aggressive caching of the surge path (often a single product or content page); request prioritisation so revenue-critical transactions survive when others are shed; and retry policies with jitter so clients don't synchronise into secondary spikes.
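
The last recommendation is the easiest to show in code. Below is a minimal sketch of exponential backoff with "full jitter", where each delay is drawn uniformly between zero and an exponentially growing ceiling. The function names, the 0.1 s base, the 5 s cap and the four-attempt limit are illustrative assumptions rather than recommended values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base_s: float = 0.1, cap_s: float = 5.0,
                  rng: Callable[[], float] = random.random) -> float:
    """Full jitter: a random delay in [0, min(cap_s, base_s * 2**attempt))."""
    ceiling = min(cap_s, base_s * (2 ** attempt))
    return rng() * ceiling


def call_with_retries(op: Callable[[], T], max_attempts: int = 4,
                      base_s: float = 0.1, cap_s: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call op(), retrying failures with jittered backoff.

    Catches every Exception for brevity; real clients should retry only
    errors known to be transient (timeouts, 429, 503).
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt, base_s, cap_s))
    raise ValueError("max_attempts must be at least 1")
```

Because each client draws its own random delay, clients that failed at the same instant spread their retries across the window instead of returning together as a second spike.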