Scalability Testing

"Just add more servers" is a hypothesis, not a strategy. Scalability testing measures how much capacity each added resource actually buys you — and where adding more stops helping.

What it is

Scalability testing runs the same workload across multiple resource configurations (varying instance counts, instance sizes, or both) and measures how maximum sustainable throughput changes, that is, the highest load the system carries while still meeting its latency and error-rate targets. The output is a scaling curve: capacity as a function of resources.
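As a rough illustration of how one point on that curve might be found, the sketch below steps the offered load upward for each configuration and keeps the last step that stays within a latency limit. Everything named here is an assumption made for the example: the p95 limit, the load steps, and `fake_runner`, which is a synthetic stand-in for a real load-test run.

```python
from typing import Callable, Dict, Iterable, List

# A "runner" takes an offered load in tps and returns the observed p95 latency in ms.
Runner = Callable[[int], float]


def max_sustainable_tps(run_step: Runner, offered_loads: Iterable[int],
                        p95_limit_ms: float) -> int:
    """Highest offered load whose p95 latency stays within the limit."""
    best = 0
    for load in sorted(offered_loads):
        if run_step(load) > p95_limit_ms:
            break  # first breach: treat all higher loads as unsustainable
        best = load
    return best


def scaling_curve(node_counts: List[int], make_runner: Callable[[int], Runner],
                  offered_loads: List[int], p95_limit_ms: float) -> Dict[int, int]:
    """Run the same load steps against every configuration."""
    return {
        n: max_sustainable_tps(make_runner(n), offered_loads, p95_limit_ms)
        for n in node_counts
    }


def fake_runner(nodes: int) -> Runner:
    """Synthetic system (hypothetical): 1,000 tps per node, then latency jumps."""
    return lambda load: 20.0 if load <= nodes * 1000 else 500.0


loads = list(range(1000, 6001, 1000))
print(scaling_curve([2, 4], fake_runner, loads, p95_limit_ms=200.0))
```

In a real engagement the runner would drive a load generator and hold each step long enough to reach steady state; the stepping logic is the only part this sketch tries to show.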

Linear in theory, sublinear in practice

Perfect scaling (2× nodes → 2× capacity) is rare. Shared resources (databases, caches, message brokers, distributed locks) serialise some fraction of every request, and that serial fraction caps total speed-up no matter how many nodes are added (Amdahl's law). Coordination costs also grow with node count. A typical finding: tested in isolation, the web tier scales near-linearly to 12 nodes, but the primary database saturates at the load that 7 web nodes generate, so web-tier spend beyond seven nodes buys no extra capacity.
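To make the Amdahl's law ceiling concrete, here is a minimal sketch of the formula: speed-up = 1 / (s + (1 - s) / n), where s is the serial fraction and n the node count. The 5% serial fraction is an illustrative assumption, not a measurement.

```python
def amdahl_speedup(nodes: int, serial_fraction: float) -> float:
    """Theoretical speed-up over one node for a given serial fraction."""
    if nodes < 1:
        raise ValueError("nodes must be >= 1")
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / nodes)


SERIAL_FRACTION = 0.05  # assumed for illustration only

for n in (2, 4, 8, 16, 64):
    print(f"{n:>3} nodes: {amdahl_speedup(n, SERIAL_FRACTION):.2f}x")

# As the node count grows, speed-up approaches 1 / serial_fraction.
print(f"ceiling: {1.0 / SERIAL_FRACTION:.0f}x")
```

With a 5% serial fraction the ceiling is 20×, however many nodes are added. Real systems usually fall short of even this bound, because the formula ignores coordination overhead.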

Scaling curve (example, measured):
  Nodes   Throughput    Per-node efficiency
      2    9,800 tps    1.00× (baseline)
      4   19,100 tps    0.97×
      8   34,400 tps    0.88×
     16   41,200 tps    0.53×  ← DB write saturation
  Conclusion: scale-out is effective to ~10 nodes;
  beyond that, invest in the database tier.
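The efficiency figures in the example curve can be reproduced with a few lines of arithmetic. The sketch below uses the throughput numbers from the curve and adds a marginal view (throughput gained per added node), which is our addition for illustration.

```python
# Measured example curve: node count -> sustained throughput in tps.
curve = {2: 9_800, 4: 19_100, 8: 34_400, 16: 41_200}


def per_node_efficiency(curve):
    """Per-node throughput relative to the smallest configuration."""
    base_nodes = min(curve)
    base_per_node = curve[base_nodes] / base_nodes
    return {n: (tps / n) / base_per_node for n, tps in sorted(curve.items())}


def marginal_efficiency(curve):
    """Throughput gained per added node between steps, relative to baseline."""
    points = sorted(curve.items())
    base_per_node = points[0][1] / points[0][0]
    return {
        n2: ((t2 - t1) / (n2 - n1)) / base_per_node
        for (n1, t1), (n2, t2) in zip(points, points[1:])
    }


for n, eff in per_node_efficiency(curve).items():
    print(f"{n:>2} nodes: {eff:.2f}x per node")
for n, eff in marginal_efficiency(curve).items():
    print(f"up to {n:>2} nodes: {eff:.2f}x per added node")
```

The marginal view shows the knee more sharply than the average: on these figures, each node added between 8 and 16 contributes about 17% of a baseline node.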

Auto-scaling validation

For cloud platforms we additionally test the dynamics of scaling: are the scaling metrics and thresholds right? How long from threshold breach to serving capacity? Does scale-in behave safely under sustained load? Do scaling events themselves cause latency spikes (cold starts, cache repopulation, connection storms against the database)?
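One of these questions, the time from threshold breach to serving capacity, can be read straight off a test timeline. The sketch below assumes samples of a scaling metric (average CPU here) and a count of instances that are actually serving traffic. The sample data, the 70% threshold, and the field names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    t: float        # seconds since test start
    cpu_pct: float  # scaling metric (assumed: average CPU utilisation)
    ready: int      # instances passing health checks and serving traffic


def scale_out_lag(samples: List[Sample], threshold: float) -> Optional[float]:
    """Seconds from the first threshold breach to the first added serving instance."""
    breach = next((s for s in samples if s.cpu_pct > threshold), None)
    if breach is None:
        return None  # the threshold was never breached
    served = next(
        (s for s in samples if s.t > breach.t and s.ready > breach.ready), None
    )
    return None if served is None else served.t - breach.t


# Hypothetical timeline sampled every 30 seconds during a ramp.
timeline = [
    Sample(0, 40, 4), Sample(30, 65, 4), Sample(60, 78, 4), Sample(90, 85, 4),
    Sample(120, 88, 4), Sample(150, 80, 5), Sample(180, 62, 6),
]
print(scale_out_lag(timeline, threshold=70.0))
```

The resolution of the result is limited by the sampling interval, so a real test would sample more finely than the lag it expects to measure.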

What you get

A measured scaling curve with per-node efficiency; identification of the first non-scaling bottleneck; cost-per-unit-of-capacity at each configuration so finance and engineering can agree on a target; and tuned auto-scaling policies with evidence behind every threshold.
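As a sketch of the cost-per-unit-of-capacity figure, the example below combines the throughput numbers from the example scaling curve with an hourly node price. The price is a hypothetical placeholder, and the calculation ignores database and other shared-tier costs.

```python
# Example curve from this page: node count -> sustained throughput in tps.
curve = {2: 9_800, 4: 19_100, 8: 34_400, 16: 41_200}

NODE_PRICE_PER_HOUR = 0.40  # hypothetical price, for illustration only


def cost_per_1000_tps(curve, node_price_per_hour):
    """Hourly web-tier cost per 1,000 tps of sustained capacity."""
    return {
        n: (n * node_price_per_hour) / (tps / 1000)
        for n, tps in sorted(curve.items())
    }


for n, cost in cost_per_1000_tps(curve, NODE_PRICE_PER_HOUR).items():
    print(f"{n:>2} nodes: {cost:.3f} per hour per 1,000 tps")
```

Unit cost rises as per-node efficiency falls, so the configuration with the most capacity is not the cheapest per transaction. That trade-off is what a shared target has to settle.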