Ray is a distributed computing framework, originally developed in Ion Stoica’s group at UC Berkeley, that provides abstractions for scaling up training and inference.
Prominent users
- OpenAI has reportedly used Ray to train its largest models, including GPT-3.5 and GPT-4.[1]
- Thinking Machines Lab (TML) appears to use Ray for its data engineering.[2] Ray also sits behind TML’s Tinker API, where it spawns the RL trainer and sampler GPU clusters.[3]
Footnotes
1. See “What is Ray?”. Greg Brockman also stated at the 2022 Ray Summit that Ray was used to train OpenAI’s largest models; the clip can be viewed here: https://www.youtube.com/clip/UgkxBHfDgDA-IThBcmkOSUhDqF7FgLUdwB2V
2. https://job-boards.greenhouse.io/thinkingmachines/jobs/5013919008
3. Ray Summit 2025 Keynote: “The Shift to LLM Fine-Tuning with Thinking Machines”