
Key Performance Metrics

Four families of metrics describe almost everything that matters under load: latency, throughput, errors and saturation. Most performance-testing mistakes trace back to measuring the wrong one — or summarising it wrongly.

Latency: percentiles or nothing

Response-time distributions are long-tailed, so the mean is dominated by the bulk of fast requests and hides the slow tail entirely. A service can report a 120 ms average while 1% of users wait 8 seconds. We report:

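| Percentile | Meaning                          | Use                                |
|------------|----------------------------------|------------------------------------|
| p50        | Typical experience               | Sanity baseline                    |
| p90/p95    | Unlucky-but-common experience    | Primary SLO target                 |
| p99        | The tail users churn over        | Secondary SLO; tail-health signal  |
| max        | Worst single request             | Timeout and outlier investigation  |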
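To make the gap between the mean and the tail concrete, here is a minimal sketch that summarises a list of latency samples. It assumes the nearest-rank percentile definition (load-testing tools differ on interpolation), and the helper names `percentile` and `summarise` as well as the synthetic sample set are illustrative, not taken from any particular tool.

```python
import math
import statistics


def percentile(sorted_samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not sorted_samples:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarise(latencies_ms):
    """Return the mean alongside the percentiles listed in the table above."""
    ordered = sorted(latencies_ms)
    return {
        "mean": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p90": percentile(ordered, 90),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
        "max": ordered[-1],
    }


# Synthetic data (an assumption for illustration): 98.5% fast requests, 1.5% very slow ones.
samples = [40.0] * 985 + [8000.0] * 15
summary = summarise(samples)
for name, value in summary.items():
    print(f"{name:>4}: {value:8.1f} ms")
```

On this synthetic data the mean stays well under 200 ms and p50 through p95 all sit at 40 ms, while p99 and max both land on the 8-second responses, which is exactly the part of the distribution the average conceals.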

Tail latencies compound: a page that fans out to 10 backend calls waits on at least one p99-or-worse backend response far more often than 1% of the time. With 10 independent parallel calls, 1 - 0.99^10 ≈ 9.6% of pages, roughly one in ten, experience at least one such response. Tails are a fan-out problem, not an edge case.
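The fan-out arithmetic can be sketched in a few lines. This assumes backend latencies are independent, which real systems only approximate (shared dependencies tend to correlate slow responses); the function name `prob_any_tail_hit` is illustrative.

```python
def prob_any_tail_hit(fan_out, tail_fraction=0.01):
    """Probability that at least one of `fan_out` independent calls lands in the slowest `tail_fraction`."""
    if fan_out < 0 or not 0.0 <= tail_fraction <= 1.0:
        raise ValueError("fan_out must be >= 0 and tail_fraction must be in [0, 1]")
    # Complement of "every call avoided the tail".
    return 1.0 - (1.0 - tail_fraction) ** fan_out


for n in (1, 10, 50, 100):
    print(f"fan-out {n:>3}: {prob_any_tail_hit(n):.1%} of pages see a p99-or-worse call")
```

Under the independence assumption, the share of affected pages grows quickly with fan-out: about one page in ten at 10 calls, and well over half at 100.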

Throughput: requests vs transactions

Requests/second is what tools report; business transactions/second is what stakeholders mean. A checkout might be 14 requests. We model and report both, with the mapping explicit. Throughput is only meaningful alongside latency: any system can do 10,000 tps if you don't care how long responses take.
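One way to keep the mapping explicit is to store it as data and derive the request rate from the transaction targets. The sketch below is illustrative only: the 14-request checkout comes from the text, while the `search` transaction, its request count, and the target rates are hypothetical placeholders.

```python
# Requests issued per business transaction. Only "checkout" is from the text;
# "search" and its count are hypothetical.
REQUESTS_PER_TRANSACTION = {
    "checkout": 14,
    "search": 3,
}


def required_request_rate(target_tps):
    """Convert per-transaction targets (transactions/second) into total requests/second."""
    return sum(
        tps * REQUESTS_PER_TRANSACTION[name]
        for name, tps in target_tps.items()
    )


# Hypothetical workload model: 5 checkouts/s and 50 searches/s.
targets = {"checkout": 5, "search": 50}
total_rps = required_request_rate(targets)
print(f"{sum(targets.values())} transactions/s -> {total_rps} requests/s")
```

Reporting both numbers side by side, with the mapping table attached, lets a reader translate a tool's requests/second figure back into the business rate it represents.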

Errors: rate, type and honesty

Error rate under load is a first-class result, not a footnote. We classify by mechanism — timeouts vs connection refusals vs HTTP 5xx vs semantic failures (200-with-error-body) — because each implicates a different layer. Semantic failures are the silent killer: status-code-only checks under stress routinely miss them.
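A classification by mechanism might look like the following sketch. It uses only Python's built-in exception types; the category labels, the `classify` and `body_reports_error` helpers, and the convention that a failing response carries a JSON object with a non-empty `error` field are all assumptions for illustration, so a real check would have to match the application's actual error format.

```python
import json
from typing import Optional


def body_reports_error(body: str) -> bool:
    """Assumed convention: a JSON object with a truthy "error" field signals failure."""
    try:
        payload = json.loads(body)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False
    return isinstance(payload, dict) and bool(payload.get("error"))


def classify(exc: Optional[BaseException], status: Optional[int], body: str = "") -> str:
    """Map one request outcome to a failure mechanism (labels are illustrative)."""
    if isinstance(exc, TimeoutError):
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    if exc is not None:
        return "other_transport_error"
    if status is None:
        return "no_response"
    if 500 <= status <= 599:
        return "http_5xx"
    if not 200 <= status <= 299:
        return "other_http_status"
    if body_reports_error(body):
        return "semantic_failure"  # 200-with-error-body
    return "ok"


print(classify(TimeoutError(), None))
print(classify(None, 503))
print(classify(None, 200, '{"error": "out of stock"}'))
print(classify(None, 200, '{"order_id": 42}'))
```

The body check runs only after the status code has passed, which is the case a status-code-only assertion would count as a success.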

Saturation: the leading indicator

Latency and errors are symptoms; saturation is the cause. We track utilisation and queue depth on every constrained resource: CPU, memory/GC, connection pools, thread pools, disk and network I/O, database locks. Queue depth matters more than utilisation: a resource at 80% utilisation with a growing queue is in worse shape than one at 95% with no queue at all. (The mechanism is described by Little's Law and queueing theory.)
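Little's Law states that, in a stable system, the average number of items in the system equals the arrival rate multiplied by the average time each item spends there (L = λW). The sketch below applies it to a connection pool; the pool size, request rate, and hold times are hypothetical numbers chosen for illustration.

```python
def average_in_system(arrival_rate_per_s, avg_time_in_system_s):
    """Little's Law: L = lambda * W, valid for long-run averages in a stable system."""
    return arrival_rate_per_s * avg_time_in_system_s


POOL_SIZE = 20           # hypothetical connection pool limit
ARRIVAL_RATE = 200.0     # hypothetical requests/second needing a connection

for hold_time_s in (0.05, 0.10, 0.25):
    demand = average_in_system(ARRIVAL_RATE, hold_time_s)
    state = "fits in pool" if demand <= POOL_SIZE else "exceeds pool, queue grows"
    print(f"hold {hold_time_s * 1000:5.0f} ms -> ~{demand:.0f} connections needed ({state})")
```

At a fixed arrival rate, a fivefold increase in hold time means a fivefold increase in concurrent demand. Once that demand exceeds the pool, requests wait in a queue, and the wait adds to their time in the system, which is why a growing queue is the earlier and more reliable warning than a utilisation figure.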