Sequence length, also called context length or context window, is the maximum number of tokens that can be fed into a transformer at once.
Here are a few examples:
Model | Sequence Length | Reference |
---|---|---|
GPT-4o | 128K | |
Llama 3.1 405B | 128K | |
DeepSeek-R1 | 128K | |
GPT-4.1 | 1M | 1 |
Gemini 2.5 Pro | 1M | 2 |
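To make the limit concrete, here is a minimal sketch of how a context window shows up in practice, assuming the Hugging Face transformers library; the choice of gpt2 and the `max_position_embeddings` attribute are illustrative (gpt2 is small and ungated, with a 1,024-token window), not tied to the models in the table. The sketch reads the configured window from a model's config and truncates an over-long prompt to fit.

```python
from transformers import AutoConfig, AutoTokenizer

# Illustrative model: gpt2 is small and ungated, with a 1,024-token context window.
model_id = "gpt2"

# Most Hugging Face configs expose the context window as `max_position_embeddings`.
config = AutoConfig.from_pretrained(model_id)
max_len = config.max_position_embeddings
print(f"{model_id} context window: {max_len} tokens")

# Text longer than the window must be truncated (or chunked) before it can be
# processed by the model in a single pass.
tokenizer = AutoTokenizer.from_pretrained(model_id)
long_text = "sequence length " * 2000  # tokenizes to far more than 1,024 tokens
encoded = tokenizer(long_text, truncation=True, max_length=max_len)
print(len(encoded["input_ids"]))  # at most 1,024
```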
Footnotes

1.
2. Gemini models documentation. Gemini 2.5 Pro actually has a sequence length of 1,048,576 tokens.