Glossary

What is Context Window?

The maximum number of tokens an LLM can process in a single request, including both input and output.

The context window determines how much text you can send to and receive from an LLM in one request; input and output tokens share the same budget. Longer context windows let you process larger documents, but larger requests also cost more, since providers bill per token. GPT-4 Turbo supports 128K tokens, Claude 3 supports 200K, and Gemini 1.5 Pro supports 1M. AI gateways can route requests to different models based on context length to optimize costs.

Examples

  • GPT-4 Turbo: 128K tokens (~300 pages of text)
  • Claude 3: 200K tokens (~500 pages)
  • Gemini 1.5 Pro: 1M tokens (~2,500 pages)
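The routing idea above can be sketched in a few lines. This is a minimal illustration, not a production gateway: the model names and limits come from the list above, `route_by_context` is a hypothetical helper, and the 4-characters-per-token estimate is a rough heuristic (real gateways use the provider's tokenizer).

```python
# Rough context-window limits, per the figures listed above.
CONTEXT_LIMITS = {
    "gpt-4-turbo": 128_000,
    "claude-3": 200_000,
    "gemini-1.5-pro": 1_000_000,
}

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English text."""
    return len(text) // 4

def route_by_context(prompt: str, reserved_output: int = 4_000) -> str:
    """Pick the smallest model whose context window fits the prompt
    plus headroom for the response (input and output share the window)."""
    needed = estimate_tokens(prompt) + reserved_output
    for model, limit in sorted(CONTEXT_LIMITS.items(), key=lambda kv: kv[1]):
        if needed <= limit:
            return model
    raise ValueError(f"Prompt needs ~{needed} tokens; no model fits.")
```

Routing to the smallest model that fits is one common cost strategy; in practice gateways also weigh latency, quality, and per-token price.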

Ready to work with large context windows?

ScaleMind provides everything you need.

Get Started Free →