Glossary

Key concepts in AI infrastructure.

AI Gateway

A proxy layer between your application and LLM providers that handles routing, caching, failover, and observability.

Context Window

The maximum number of tokens an LLM can process in a single request, including both input and output.
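
For example, whether a request will fit can be checked by estimating input tokens and reserving room for the output. A minimal sketch, assuming an 8,192-token limit and a rough four-characters-per-token estimate; real limits and token counts vary by model and tokenizer:

```python
CONTEXT_WINDOW = 8192  # assumed model limit, counting input + output tokens

def fits_in_context(prompt: str, max_output_tokens: int) -> bool:
    """Return True if the estimated input plus the reserved output budget fits."""
    estimated_input_tokens = len(prompt) // 4  # crude heuristic, not exact
    return estimated_input_tokens + max_output_tokens <= CONTEXT_WINDOW
```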

Fallback / Failover

Automatically routing requests to a backup LLM provider when the primary provider fails or is unavailable.
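
A minimal failover sketch: try providers in priority order and fall through to the next on any error. The provider callables are hypothetical stand-ins for real SDK clients:

```python
def complete_with_failover(prompt: str, providers: list) -> str:
    """Try each provider in order; return the first successful response."""
    errors = []
    for provider in providers:       # ordered: primary first, backups after
        try:
            return provider(prompt)  # each provider is a callable returning text
        except Exception as exc:     # real code would catch provider-specific errors
            errors.append((provider, exc))
    raise RuntimeError(f"all providers failed: {errors}")
```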

LLM Observability

The ability to monitor, debug, and understand LLM application behavior through logging, metrics, and tracing.
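
Even a thin wrapper around an LLM call can capture the basics. A minimal sketch using Python's standard logging; the field names are illustrative, and production systems usually emit structured logs or distributed traces instead:

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("llm")

def observed_call(llm_fn, prompt: str) -> str:
    """Call llm_fn and log latency and payload sizes, including on failure."""
    start = time.perf_counter()
    try:
        response = llm_fn(prompt)
        log.info("llm_call ok latency_ms=%.0f prompt_chars=%d response_chars=%d",
                 (time.perf_counter() - start) * 1000, len(prompt), len(response))
        return response
    except Exception:
        log.exception("llm_call failed latency_ms=%.0f",
                      (time.perf_counter() - start) * 1000)
        raise
```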

LLM Proxy

A server that forwards LLM API requests on behalf of your application, adding features like caching, logging, and failover.
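
A minimal proxy sketch using Flask: accept a chat request, log it, and forward it to an upstream provider. The route and upstream URL are illustrative assumptions, not any specific provider's API:

```python
import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
UPSTREAM = "https://api.example-llm.com/v1/chat/completions"  # hypothetical endpoint

@app.post("/v1/chat/completions")
def proxy():
    payload = request.get_json()
    app.logger.info("forwarding request with %d messages",
                    len(payload.get("messages", [])))
    upstream = requests.post(UPSTREAM, json=payload, timeout=60)
    return jsonify(upstream.json()), upstream.status_code
```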

Model Routing

The practice of automatically selecting which LLM model handles each request based on criteria like cost, capability, or latency.
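
A minimal routing sketch, using prompt length as a crude stand-in for request complexity; the model names and threshold are illustrative assumptions:

```python
def route_model(prompt: str) -> str:
    """Send short requests to a cheap model, long ones to a stronger one."""
    if len(prompt) < 500:
        return "small-fast-model"    # cheaper, lower latency
    return "large-capable-model"     # costlier, better on long or complex inputs
```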

Prompt Engineering

The practice of designing and optimizing prompts to get better results from LLMs.

Rate Limiting

Controlling the number of API requests allowed within a time period to prevent overuse and manage costs.
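
One common approach is a token bucket, as in this minimal sketch: requests may burst up to `capacity`, and permits refill at `rate` per second. The parameters are illustrative:

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int):
        self.rate, self.capacity = rate, capacity
        self.tokens, self.last = float(capacity), time.monotonic()

    def allow(self) -> bool:
        """Return True if a request may proceed, consuming one permit."""
        now = time.monotonic()
        # Refill permits for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)  # ~5 requests/sec with bursts of 10
```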

Semantic Caching

A caching technique that returns stored responses for semantically similar queries, not just exact matches.
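
A minimal sketch of the lookup side: embed each query, then reuse a stored response when cosine similarity clears a threshold. The embeddings are assumed to come from elsewhere, and the 0.9 threshold is an illustrative choice:

```python
import numpy as np

cache: list[tuple[np.ndarray, str]] = []  # (query embedding, cached response)

def lookup(query_emb: np.ndarray, threshold: float = 0.9) -> str | None:
    """Return a cached response for a semantically similar query, if any."""
    for emb, response in cache:
        sim = np.dot(query_emb, emb) / (np.linalg.norm(query_emb) * np.linalg.norm(emb))
        if sim >= threshold:
            return response  # close enough: reuse the stored answer
    return None

def store(query_emb: np.ndarray, response: str) -> None:
    cache.append((query_emb, response))
```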

Token

The basic unit of text processing in LLMs. Tokens can be words, subwords, or characters depending on the model's tokenizer.
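
For a concrete look, the tiktoken library can show how one tokenizer splits a sentence; the "cl100k_base" encoding is one common choice, not universal:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
tokens = enc.encode("Tokenization splits text into subword units.")
print(len(tokens))                        # number of tokens, not words
print([enc.decode([t]) for t in tokens])  # the individual token strings
```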
