B200 is NVIDIA’s top-shelf Blackwell-generation datacenter GPU. Each GPU has:

B100 GPUs are a lower-power (700 W) variant of B200 that is meant to be a “drop-in replacement” for HGX H100 platforms.[3] That is, you can take a server platform built for 8-way H100 baseboards, swap in B100 baseboards, and sell it without having to re-engineer power or thermals.

Performance

The following are the theoretical maximum performance figures, in TFLOPS (TOPS for INT8), for the GB200 variant of B200 when configured at 1,200 W [3]:

Data Type    VFMA     Matrix    Sparse
FP64         45       45        -
FP32         90       -         -
TF32         -        1250      2500
FP16         -        2500      5000
BF16         -        2500      5000
FP8          -        5000      10000
FP6          -        5000      10000
FP4          -        10000     20000
INT8         -        5000      10000
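
To put these peaks in perspective, here is a minimal back-of-envelope sketch (plain Python, not from NVIDIA; the GEMM shape is an arbitrary example) that estimates how long an ideal dense matrix multiplication would take at the FP16, FP8, and FP4 matrix peaks in the table above, assuming perfect utilization that no real kernel achieves.

```python
# Ideal time for one large dense GEMM at the GB200 (1,200 W) matrix peaks above.
# Assumes 100% of peak, which real kernels never reach; treat as a lower bound.

PEAK_TFLOPS = {"FP16": 2500, "FP8": 5000, "FP4": 10000}  # dense Matrix column

def gemm_time_ms(m: int, n: int, k: int, peak_tflops: float) -> float:
    """Milliseconds to run an (m x k) @ (k x n) GEMM, counting 2*m*n*k FLOPs."""
    flops = 2 * m * n * k
    return flops / (peak_tflops * 1e12) * 1e3

M = N = K = 16384  # arbitrary large square GEMM
for dtype, peak in PEAK_TFLOPS.items():
    print(f"{dtype}: {gemm_time_ms(M, N, K, peak):.2f} ms at peak")
```

Each halving of precision (FP16 → FP8 → FP4) doubles the dense matrix peak and therefore halves the ideal time; the Sparse column doubles throughput again, but only for operands that fit NVIDIA’s structured-sparsity pattern.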

The HGX variant of B200 supports the following when configured at 1,000 W [3]:

Data Type    VFMA     Matrix    Sparse
FP64         40       40        -
FP32         80       -         -
TF32         -        1100      2200
FP16         -        2250      4500
BF16         -        2250      4500
FP8          -        4500      9000
FP6          -        4500      9000
FP4          -        9000      18000
INT8         -        4500      9000
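
Comparing the two tables is a useful exercise; the sketch below (plain Python, values copied from the tables above, not an official comparison) computes the ratio of the HGX peaks to the GB200 peaks for the dense Matrix column.

```python
# Ratio of HGX B200 (1,000 W) to GB200 (1,200 W) dense matrix peaks, in TFLOPS,
# using the values from the two tables above.

gb200_1200w = {"FP64": 45, "TF32": 1250, "FP16": 2500, "FP8": 5000, "FP4": 10000}
hgx_1000w   = {"FP64": 40, "TF32": 1100, "FP16": 2250, "FP8": 4500, "FP4": 9000}

for dtype, gb_peak in gb200_1200w.items():
    print(f"{dtype}: {hgx_1000w[dtype] / gb_peak:.0%} of the GB200 peak")

print(f"Power: {1000 / 1200:.0%} of the GB200 configuration")
```

The 1,000 W configuration retains roughly 90% of the peak throughput on about 83% of the power, which suggests the last couple hundred watts buy relatively little additional peak.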

GB200

GB200 is a combination of Grace CPUs and B200 GPUs in a single coherent memory domain. There are a couple of variants which are described below.

GB200 NVL4

GB200 NVL4 is a single board with two Grace CPUs and four Blackwell GPUs, all soldered down. This is a photo of one shown at SC24.

This variant supports only four B200 GPUs per NVLink domain and appears to be the preferred choice for traditional scientific computing workloads that do not need (and cannot afford) the rack-scale NVL72 domain.

Further evidence of this affinity for HPC is the Cray EX154n blade, which puts this NVL4 node in the Cray EX form factor.

GB200 NVL72

GB200 NVL72 is the rack-scale implementation of GB200 which connects 72 B200 GPUs using a single NVLink domain. It is composed of:

  • GB200 superchip boards, each with two B200 GPUs and one Grace CPU
  • Server sleds, each containing two GB200 superchip boards
  • NVLink Switch sleds, each containing two NVLink 5 Switch ASICs
  • NVLink cable cartridges
  • Power shelves that feed the rear bus bar
  • Liquid cooling manifolds

The rack is referred to as the Oberon rack.[4]

The 1C:2G GB200 superchip looks like this:

From left to right are one Grace CPU surrounded by LPDDR5X, two B200 GPUs, and the NVLink connectors (orange) which mate with the cable cartridges. Two of the above superchips fit in a single server sled.

There are two variants of the rack-scale architecture:

  1. 1 rack with 72 GPUs (18 server sleds) and 9 NVLink Switch sleds. This consumes around 120 kW per rack and is the mainstream variant.
  2. 2 racks, each with 36 GPUs (9 server sleds).

The two-rack variant requires ugly cross-rack cabling in the front of the rack to join each rack’s NVLink Switches into a single NVL72 domain. This two-rack form factor exists for datacenters that cannot support 120 kW racks.
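
The counts quoted in this section are easy to sanity-check; the sketch below (plain Python, all quantities taken from the description above) tallies the GPUs, Grace CPUs, and NVLink Switch ASICs in the single-rack variant, along with the rack power budget per GPU implied by the ~120 kW figure.

```python
# Tally of the mainstream single-rack GB200 NVL72, using only the counts
# quoted in this section.

server_sleds        = 18  # server sleds per rack
superchips_per_sled = 2   # GB200 superchip boards per server sled
gpus_per_superchip  = 2   # B200 GPUs per superchip
cpus_per_superchip  = 1   # Grace CPUs per superchip

switch_sleds     = 9      # NVLink Switch sleds per rack
asics_per_switch = 2      # NVLink 5 Switch ASICs per switch sled

gpus  = server_sleds * superchips_per_sled * gpus_per_superchip  # 72
cpus  = server_sleds * superchips_per_sled * cpus_per_superchip  # 36
asics = switch_sleds * asics_per_switch                          # 18

rack_power_kw = 120  # approximate, mainstream single-rack variant
print(f"{gpus} B200 GPUs, {cpus} Grace CPUs, {asics} NVLink Switch ASICs")
print(f"~{rack_power_kw * 1000 / gpus:.0f} W of rack power per GPU "
      f"(including Grace CPUs and NVLink Switches)")
```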

Server sleds and NVLink Switch sleds are interconnected using four rear-mounted cable cartridges, illuminated in the mockup below of the DGX GB200 on display at GTC25:

See GB200 NVL72 | NVIDIA for more information.

Footnotes

  1. NVIDIA/open-gpu-kernel-modules (github.com)

  2. GB200 NVL2 | NVIDIA

  3. NVIDIA Blackwell Architecture Technical Brief

  4. Jensen Huang’s Keynote at GTC25.