Project Ceiba is the next flagship GPU cluster that AWS is building for NVIDIA using GB200.
Specifications
It will have:

- 20,736 NVIDIA Blackwell (B200) GPUs connected to 10,368 NVIDIA Grace CPUs[2]
- a claimed 414 exaflops of AI compute (see Performance below)
- a claimed 1,600 Gbps of network bandwidth per “superchip” (see Networking below)
Performance
The website also claims this supercomputer will provide “414 exaflops of AI.”[1] This claim holds only if each B200 GPU provides 20 PF, which is the sparse FP4 performance rating of B200. Thus, “414 exaflops of AI” is only true for inference at FP4 with a model that has been fine-tuned for structured sparsity.
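As a sanity check, the arithmetic behind the headline figure follows directly from the GPU count and the per-GPU sparse FP4 rating:

$$
20{,}736 \text{ GPUs} \times 20 \text{ PF/GPU} = 414{,}720 \text{ PF} \approx 414 \text{ EF}
$$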
Networking
The official website refers to each B200 accelerator as a “superchip” in some parts and implies that a 4-way GB200 board is a “superchip” in other parts.[2] Because of this confusing nomenclature, it is unclear what the claim of “1,600 Gbps per superchip” of network bandwidth actually means. Each GPU is likely to have 400G NICs, though, since GB200 reference designs are being paired with either 400G ConnectX-7 or 800G ConnectX-8 adapters.
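For a rough sense of what each reading implies: if “superchip” here means a single B200, each GPU gets the full 1,600 Gbps; if it means the 4-way GB200 board, the per-GPU share would be

$$
1{,}600 \text{ Gbps} \div 4 \text{ GPUs} = 400 \text{ Gbps per GPU},
$$

which lines up with the 400G ConnectX-7 pairing mentioned above.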
Footnotes
1. Project Ceiba – Largest AI Super Computer Co-Built with NVIDIA - AWS
2. “Project Ceiba’s configuration includes 20,736 NVIDIA GB200 Grace Blackwell Superchips” and “scales to 20,736 Blackwell GPUs connected to 10,368 NVIDIA Grace CPUs,” both stated on Project Ceiba – Largest AI Super Computer Co-Built with NVIDIA - AWS, are contradictory: at two Blackwell GPUs per GB200 superchip, 20,736 GPUs would amount to 10,368 superchips, not 20,736.