Glossary
What is a Token?
The basic unit of text processing in LLMs. Tokens can be words, subwords, or characters depending on the model's tokenizer.
LLMs process text as tokens, not characters or words. In English, a token averages roughly 4 characters, or about 3/4 of a word. API pricing is typically based on tokens: both input (prompt) and output (completion) tokens are counted, often at different rates. Understanding tokenization is therefore important for estimating cost and staying within context limits.
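The rule of thumb above can be turned into a rough estimator. The sketch below is an approximation only: it assumes English text and the 4-characters-per-token heuristic, and the function name `estimate_tokens` is hypothetical. A model's actual tokenizer will usually give a somewhat different count.

```python
import math

CHARS_PER_TOKEN = 4  # rule of thumb for English text, not an exact value


def estimate_tokens(text: str) -> int:
    """Rough token estimate from character count (a heuristic, not a tokenizer)."""
    if not text:
        return 0
    # Round up so a short non-empty string counts as at least one token.
    return math.ceil(len(text) / CHARS_PER_TOKEN)


print(estimate_tokens("Hello, world!"))  # 13 characters -> prints 4
```

For billing or hard context limits, count tokens with the tokenizer that matches the model rather than relying on this estimate.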
Examples
- 'Hello, world!' is typically 3-4 tokens, depending on the tokenizer
- GPT-4 has been priced at $30 per 1M input tokens and $60 per 1M output tokens; rates vary by model and change over time
- Context windows are measured in tokens (e.g., 128K tokens)
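To make the pricing and context-window examples concrete, here is a small sketch that computes the cost of a request from its token counts and checks it against a context limit. The rates and the 128K limit are simply the example figures listed above, not current prices, and `request_cost` and `fits_context` are hypothetical helper names. The context check also assumes that prompt and completion tokens share one window, which depends on the model.

```python
INPUT_PRICE_PER_M = 30.0   # USD per 1M input tokens (example rate, assumption)
OUTPUT_PRICE_PER_M = 60.0  # USD per 1M output tokens (example rate, assumption)
CONTEXT_WINDOW = 128_000   # tokens (example limit, assumption)


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD, billing input and output tokens at their separate rates."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def fits_context(input_tokens: int, max_output_tokens: int) -> bool:
    """True if the prompt plus the reserved completion budget fits the window."""
    return input_tokens + max_output_tokens <= CONTEXT_WINDOW


print(f"${request_cost(2_000, 500):.2f}")  # prints $0.09
print(fits_context(120_000, 10_000))       # prints False (130,000 > 128,000)
```

Because output tokens cost twice as much as input tokens at these example rates, capping completion length often has a larger effect on cost than trimming the prompt by the same number of tokens.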