What is Throughput?

Question

What is Throughput?

Accepted Answer

Throughput is the number of requests a system can handle per unit of time, usually expressed as requests per second. It measures volume capacity, as opposed to latency, which is how fast any single request completes. A system can have low latency (each request is fast) yet low throughput (it can only process a few requests at once). At production scale you need both: fast individual responses and the capacity to serve many users simultaneously. Throughput determines whether your AI feature works fine for 10 users and falls apart at 10,000.
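
To make the difference between latency and throughput concrete, here is a minimal sketch based on Little's Law (steady-state throughput is roughly concurrency divided by latency). The function names and the example numbers are hypothetical, chosen only for illustration, and the model assumes every request takes the same time and that workers are always busy.

```python
def throughput_rps(concurrency: int, latency_s: float) -> float:
    """Approximate steady-state requests per second.

    Assumes `concurrency` requests are processed in parallel and each one
    takes `latency_s` seconds (an idealized model, not a measurement).
    """
    if concurrency <= 0 or latency_s <= 0:
        raise ValueError("concurrency and latency_s must be positive")
    return concurrency / latency_s


def seconds_to_serve(total_requests: int, concurrency: int, latency_s: float) -> float:
    """Rough time needed to work through a backlog of requests."""
    return total_requests / throughput_rps(concurrency, latency_s)


# Low latency, low throughput: each request is fast, but only 2 run at once.
fast_but_narrow = throughput_rps(concurrency=2, latency_s=0.25)

# Higher latency, higher throughput: each request is slower, but 100 run at once.
slow_but_wide = throughput_rps(concurrency=100, latency_s=2.0)

print(f"fast but narrow: {fast_but_narrow} requests/s")
print(f"slow but wide:   {slow_but_wide} requests/s")
print(f"10,000 requests on the narrow system: "
      f"{seconds_to_serve(10_000, 2, 0.25)} s")
```

Under these assumptions the system with the faster individual responses still serves fewer requests per second, which is why a feature can feel quick for 10 users and still back up badly at 10,000.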