Tokenization
The step where AI breaks your text into small pieces, called tokens, before it can process anything. In English, a token averages roughly three-quarters of a word, so "Hello, how are you?" is about 6 tokens. This matters in practice because API costs and usage limits are measured in tokens, not words or characters: the more text you send and receive, the more tokens you use.
In practice
Before Claude can process your message, the message is broken into tokens: small text chunks that average about ¾ of a word each. A short, common word such as "AI" might be a single token, while a longer word such as "Unbelievable" might be one token or several, depending on the tokenizer. A 10,000-word document is roughly 13,000 tokens, which affects both cost and whether it fits in the context window.
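The arithmetic above can be sketched in a few lines of Python. This is only a rough estimate built on the ¾-word rule of thumb, not a real tokenizer: the function names, the whitespace-based word count, and the 200,000-token window in the usage lines are all illustrative assumptions, and actual counts depend on the model's tokenizer and the language of the text.

```python
import math

# Rule of thumb from the text: one token is about 3/4 of an English word.
# Real ratios vary by tokenizer and language; treat this as an assumption.
WORDS_PER_TOKEN = 0.75


def estimate_tokens(text: str) -> int:
    """Estimate the token count from a whitespace-separated word count."""
    word_count = len(text.split())
    return math.ceil(word_count / WORDS_PER_TOKEN)


def fits_in_context(text: str, context_window_tokens: int,
                    reserved_for_output: int = 0) -> bool:
    """Check whether the estimated input, plus room for a reply, fits the window."""
    return estimate_tokens(text) + reserved_for_output <= context_window_tokens


# Usage: the greeting from the definition and a stand-in 10,000-word document.
print(estimate_tokens("Hello, how are you?"))  # 4 words / 0.75, rounded up

document = "word " * 10_000
print(estimate_tokens(document))  # about 13,000, matching the figure above

# 200_000 is a hypothetical window size chosen only for illustration.
print(fits_in_context(document, context_window_tokens=200_000,
                      reserved_for_output=4_000))
```

For billing or hard limits, an estimate like this is best used only as a quick sanity check; an exact count has to come from the tokenizer or token-counting tool that matches the model in use.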
Related concepts