SGLang

SGLang is an LLM inference and serving framework in the same lineage as vLLM, though it is a separate project rather than a formal successor. It shares founding DNA with vLLM and is a collaboration between Berkeley, Stanford, UCSD, CMU, and MBZUAI.

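SGLang implements RadixAttention, which reuses the KV cache across requests that share a common prefix by keeping cached prefixes in a radix tree. This is about cache reuse rather than offload.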
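The idea behind RadixAttention is that requests sharing a token prefix (for example, the same system prompt) can share the KV cache computed for that prefix. The sketch below is a toy illustration of that lookup, not SGLang's implementation: `PrefixCache`, `match_prefix`, and `insert` are hypothetical names, and it uses a plain per-token trie where a real radix tree would compress runs of tokens into single edges and also handle eviction.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Maps the next token id to the child node reached by that token.
    children: dict = field(default_factory=dict)


class PrefixCache:
    """Toy per-token trie (hypothetical; a real radix tree compresses edges)."""

    def __init__(self):
        self.root = Node()

    def match_prefix(self, tokens):
        """Return how many leading tokens already have cached KV entries."""
        node, matched = self.root, 0
        for tok in tokens:
            child = node.children.get(tok)
            if child is None:
                break
            node, matched = child, matched + 1
        return matched

    def insert(self, tokens):
        """Record that KV entries for this token sequence are now cached."""
        node = self.root
        for tok in tokens:
            node = node.children.setdefault(tok, Node())


cache = PrefixCache()
system_prompt = [1, 2, 3, 4]            # made-up token ids
cache.insert(system_prompt + [10, 11])  # first request populates the cache

# A second request with the same system prompt only needs prefill
# for the tokens after the matched prefix.
reused = cache.match_prefix(system_prompt + [20, 21])
```

Here the second request matches the four system-prompt tokens, so only its last two tokens would need fresh prefill computation.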

Users

Microsoft AI used SGLang to fine-tune MAI-Thinking-1. See MAI-1 > Fine-tuning MAI-Thinking-1.

