Glenn's Digital Garden

SGLang

Jun 03, 2026

  • artificial-intelligence/inference
  • product

SGLang is an inferencing framework for serving large language models that can be thought of as a successor to vLLM. It shares its founding DNA with vLLM, which also originated at UC Berkeley, and is a collaboration between Berkeley, Stanford, UCSD, CMU, and MBZUAI.

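SGLang implements RadixAttention for KV cache reuse: cached KV entries are indexed in a radix tree keyed by token sequence, so requests that share a prompt prefix can reuse the cached entries instead of recomputing them.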
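
RadixAttention is generally described as indexing cached KV entries by token prefix so that a new request can skip prefill for whatever prefix it shares with earlier requests. The sketch below illustrates only that lookup idea, using a plain per-token trie rather than a compressed radix tree. It is not SGLang's implementation: `PrefixCache`, `Node`, and `kv_slot` are hypothetical names, and the integer slots stand in for real KV tensors. Eviction and memory management are left out.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Maps the next token ID to the child node that extends the prefix by it.
    children: dict = field(default_factory=dict)
    # Hypothetical handle to where this token's KV entries would live.
    kv_slot: Optional[int] = None


class PrefixCache:
    """Toy per-token trie standing in for a radix tree over token IDs."""

    def __init__(self) -> None:
        self.root = Node()
        self.next_slot = 0

    def match_prefix(self, tokens: list) -> list:
        """Return the KV slots for the longest cached prefix of `tokens`."""
        node, slots = self.root, []
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            slots.append(child.kv_slot)
            node = child
        return slots

    def insert(self, tokens: list) -> int:
        """Cache `tokens`; return how many tokens needed a new KV slot."""
        node, created = self.root, 0
        for tok in tokens:
            if tok not in node.children:
                node.children[tok] = Node(kv_slot=self.next_slot)
                self.next_slot += 1
                created += 1
            node = node.children[tok]
        return created


cache = PrefixCache()
system_prompt = [101, 7, 7, 42]          # made-up token IDs
cache.insert(system_prompt + [5, 6])     # first request fills the cache
reused = cache.match_prefix(system_prompt + [9, 9, 9])
print(len(reused))                       # 4: only the shared prompt is reused
```

A second request with the same system prompt only needs prefill for the three tokens after the shared prefix. A real radix tree would additionally merge runs of single-child nodes into one edge to keep the structure compact.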

Users

  • Microsoft AI used SGLang to fine-tune MAI-Thinking-1. See MAI-1 > Fine-tuning MAI-Thinking-1.
