Ray

Ray is a framework developed by Ion Stoica’s group that provides abstractions that simplify scaling up training and inferencing.

Prominent users

OpenAI famously used Ray Train to train GPT-3.5 and GPT-4.¹
Thinking Machines Lab appears to use Ray for its data engineering.² Ray is what sits behind TML’s Tinker API to spawn RL trainer and sampler GPU clusters.³
Microsoft AI uses Ray for⁴
- training, which enables “fast recovery through in-job restarts”
- reinforcement learning within their Rocket architecture