Grok-1 is a 314-billion-parameter mixture-of-experts (MoE) LLM. Its architecture:
- Maximum sequence length of 8,192 tokens
- 8 experts, with 2 active per token
- 64 layers
- 48 attention heads for queries, 8 for keys/values
- Embedding size of 6,144 (48 heads × 128 head dimension)
- SentencePiece tokenizer with a 131,072-token (128Ki) vocabulary
- Rotary positional embeddings (RoPE)
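The "8 experts, 2 active per token" line describes top-2 gating: a router scores all experts per token, keeps the two highest scores, and renormalizes them into mixing weights. A minimal numpy sketch of that routing step (function names and gating details are illustrative, not Grok-1's actual implementation):

```python
import numpy as np

def top2_route(logits: np.ndarray):
    """Pick the two highest-scoring experts per token and renormalize
    their scores into gate weights with a softmax over just those two.

    logits: (num_tokens, num_experts) router scores.
    Returns (expert_ids, gates), both shaped (num_tokens, 2).
    """
    # argsort ascending, take the last two columns, reverse so the
    # best expert comes first
    top2 = np.argsort(logits, axis=-1)[:, -2:][:, ::-1]
    top2_scores = np.take_along_axis(logits, top2, axis=-1)
    # softmax over only the two selected scores
    exp = np.exp(top2_scores - top2_scores.max(axis=-1, keepdims=True))
    gates = exp / exp.sum(axis=-1, keepdims=True)
    return top2, gates

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))       # 4 tokens, 8 experts (as in Grok-1)
experts, gates = top2_route(logits)
print(experts.shape, gates.shape)      # (4, 2) (4, 2)
```

Each token's output is then the gate-weighted sum of its two selected experts' outputs, so only 2/8 of the expert parameters are exercised per token.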
The model was trained with JAX, and the released checkpoint ships as 770 shards totaling 318 GB, with parameters quantized to 8 bits.
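The 318 GB figure is roughly what 8-bit quantization predicts. A quick back-of-the-envelope check (assuming ~1 byte per parameter and ignoring any tensors kept at higher precision):

```python
# Rough consistency check: 314B parameters at 8 bits/parameter.
params = 314e9
bytes_per_param = 1                     # 8-bit quantization
size_gb = params * bytes_per_param / 1e9
print(round(size_gb))                   # 314
```

That lands within a few GB of the shipped 318 GB, the remainder plausibly coming from metadata or unquantized tensors.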