LatentMoE is a variant of the mixture-of-experts (MoE) architecture that reduces the activation matrix to a smaller latent dimension before it is routed to the experts.
Taking Nemotron 3 Super as an example,1 the hidden states are first projected down from the model's hidden size to a smaller latent size, the router and the experts operate on these latent activations, and the expert outputs are projected back up to the hidden size.

The above nomenclature comes from the Huggingface model configuration.2
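
To make the dataflow concrete, here is a minimal PyTorch sketch of a latent-MoE block under the description above: activations are down-projected to a latent size, routed and processed by the experts in that latent space, then projected back up to the hidden size. The class, parameter names (`hidden_size`, `latent_size`, `expert_ffn_size`, `num_experts`, `top_k`), and dimensions are illustrative placeholders, not the actual Nemotron 3 Super configuration fields or internals.

```python
# Minimal sketch of a latent-MoE block. Assumptions: the module layout, names,
# and the top-k routing scheme are illustrative, not the Nemotron 3 Super design.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LatentMoE(nn.Module):
    def __init__(self, hidden_size, latent_size, expert_ffn_size,
                 num_experts, top_k=2):
        super().__init__()
        self.top_k = top_k
        # Down-projection: shrink the activations before routing / expert compute.
        self.down_proj = nn.Linear(hidden_size, latent_size, bias=False)
        # Router scores each token against every expert in the latent space.
        self.router = nn.Linear(latent_size, num_experts, bias=False)
        # Experts are small FFNs that operate on the latent activations.
        self.experts = nn.ModuleList([
            nn.Sequential(
                nn.Linear(latent_size, expert_ffn_size, bias=False),
                nn.SiLU(),
                nn.Linear(expert_ffn_size, latent_size, bias=False),
            )
            for _ in range(num_experts)
        ])
        # Up-projection: restore the hidden size for the residual connection.
        self.up_proj = nn.Linear(latent_size, hidden_size, bias=False)

    def forward(self, x):
        # x: (batch, seq, hidden_size) -> flatten to one row per token.
        tokens = x.reshape(-1, x.shape[-1])
        latent = self.down_proj(tokens)                     # (T, latent_size)
        probs = F.softmax(self.router(latent), dim=-1)      # (T, num_experts)
        topk_p, topk_i = probs.topk(self.top_k, dim=-1)     # (T, top_k)
        topk_p = topk_p / topk_p.sum(dim=-1, keepdim=True)  # renormalize gates

        out = torch.zeros_like(latent)
        for e, expert in enumerate(self.experts):
            # Tokens (and their top-k slot) routed to expert e.
            token_idx, slot_idx = (topk_i == e).nonzero(as_tuple=True)
            if token_idx.numel() == 0:
                continue
            gate = topk_p[token_idx, slot_idx].unsqueeze(-1)
            out[token_idx] += gate * expert(latent[token_idx])

        return self.up_proj(out).reshape(x.shape)


# Example: 4096-dim hidden states compressed to a 1024-dim latent space.
moe = LatentMoE(hidden_size=4096, latent_size=1024,
                expert_ffn_size=2048, num_experts=8, top_k=2)
print(moe(torch.randn(2, 16, 4096)).shape)  # torch.Size([2, 16, 4096])
```

In this sketch the router and the expert FFNs only ever see the latent-sized activations; only the final up-projection returns to the full hidden size, which is what "reduces the size of the activation matrix before it is routed to the experts" refers to.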