Sequence length (also called context length or context window) is the maximum number of tokens that can be fed into a transformer as input.
Here are a few examples:
| Model | Sequence Length | Reference |
|---|---|---|
| GPT-4o | 128K | |
| Llama-3.1 405b | 128K | |
| DeepSeek-R1 | 128K | |
| GPT-4.1 | 1M | 1 |
| Gemini 2.5 Pro | 1M | 2 |
Footnotes

1.
2. Gemini models. Gemini 2.5 Pro actually has a sequence length of 1,048,576 tokens.
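For open-weight models, the configured maximum sequence length can be read directly from the model configuration. A minimal sketch using Hugging Face `transformers`, assuming the `meta-llama/Llama-3.1-405B` repository id is correct and that you have accepted the license and are authenticated (the checkpoint is gated):

```python
from transformers import AutoConfig

# Load only the config (no weights); the model id here is illustrative.
config = AutoConfig.from_pretrained("meta-llama/Llama-3.1-405B")

# max_position_embeddings is the maximum sequence length the model supports.
print(config.max_position_embeddings)  # 131072 tokens, i.e. 128 * 1024 = "128K"
```

Note that the "128K" in the table above is shorthand for 131,072 (128 × 1024) tokens, just as "1M" is shorthand for 1,048,576 (1024 × 1024) tokens.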