What is Latency?

Question

What is Latency?

Accepted Answer

Latency is the time between sending a request to Claude and getting a response back. Low latency makes an AI application feel fast and responsive; high latency leaves users waiting. Latency depends on the model (larger models are generally slower), the length of the response, and network conditions. Streaming, which shows Claude's response as it is being generated rather than waiting for the full answer, is the most common way to make high-latency responses feel faster: the total generation time stays the same, but the user sees the first words much sooner.
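To make the difference between "time until the first words appear" and "time until the full answer is done" concrete, here is a minimal Python sketch. It does not call Claude. `fake_stream` is a hypothetical stand-in that yields text chunks with an artificial delay, and `measure_stream` is an illustrative helper, not part of any SDK. The assumption is only that a streamed response can be consumed as an iterator of text chunks.

```python
import time
from typing import Iterable, Iterator, Tuple


def fake_stream(chunks: Iterable[str], delay_s: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streamed model response.

    Sleeps briefly before yielding each chunk to imitate generation time.
    """
    for chunk in chunks:
        time.sleep(delay_s)
        yield chunk


def measure_stream(stream: Iterator[str]) -> Tuple[str, float, float]:
    """Consume a stream and return (full_text, first_chunk_s, total_s)."""
    start = time.perf_counter()
    first_chunk_s = None
    parts = []
    for chunk in stream:
        if first_chunk_s is None:
            # Latency the user perceives when the response is streamed.
            first_chunk_s = time.perf_counter() - start
        parts.append(chunk)
    # Latency the user perceives when waiting for the full answer.
    total_s = time.perf_counter() - start
    if first_chunk_s is None:  # empty stream: nothing arrived early
        first_chunk_s = total_s
    return "".join(parts), first_chunk_s, total_s


text, first_s, total_s = measure_stream(
    fake_stream(["Low ", "latency ", "feels ", "fast."])
)
print(f"first chunk after {first_s:.3f}s, full response after {total_s:.3f}s")
```

With a real streaming client the same measurement applies: the time to the first chunk is what a user experiences when the response is streamed, and the total time is what they would experience without streaming.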