NVLink is the interconnect used between NVIDIA GPUs for memory coherence. GPUs have NVLink ports, and NVLink Switches (also referred to as NVSwitches[1]) allow multiple GPUs to be connected into a switched fabric.
NVLink 5
NVLink 5 provides 800G per link.[2]
NVLink 5 switches have 72x800G ports.[2]
One B200 GPU has up to 1.8 TB/s (14,400 Gbps) of NVLink 5 bandwidth,[1] which works out to 18 NVLink 5 ports at 800 Gbps each.
A GB200 NVL72 rack has:
- 72 GPUs (4 per tray, 18 trays)
- 18 NVSwitch ASICs (2 per tray, 9 trays)
Since each GPU has 18 NVLink ports and each rack has 18 NVSwitches, every GPU can connect to every NVSwitch in the rack with one link, forming a single-layer, non-blocking fabric.
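A minimal Python sketch of that arithmetic (my own check, assuming only the per-link rate and port counts quoted above), confirming the per-GPU bandwidth and the rack-level port budget both close:
```python
# Rough check of the NVLink 5 / GB200 NVL72 figures quoted above.
GBPS_PER_LINK = 800      # NVLink 5 per-link rate
LINKS_PER_GPU = 18       # NVLink 5 ports per B200
GPUS_PER_RACK = 72       # GB200 NVL72
SWITCHES_PER_RACK = 18   # NVSwitch ASICs (2 per switch tray x 9 trays)
PORTS_PER_SWITCH = 72    # 72x800G per NVLink 5 switch

# 18 links x 800 Gbps = 14,400 Gbps = 1.8 TB/s per GPU
assert LINKS_PER_GPU * GBPS_PER_LINK == 14_400

# One link from every GPU to every switch exactly consumes the switch port budget
gpu_side_links = GPUS_PER_RACK * LINKS_PER_GPU              # 1,296
switch_side_ports = SWITCHES_PER_RACK * PORTS_PER_SWITCH    # 1,296
assert gpu_side_links == switch_side_ports
```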
NVLink-C2C
NVLink-C2C (chip-to-chip) is the interconnect that comes off NVIDIA’s Grace CPUs to carry coherence traffic to either other Grace CPUs or NVIDIA GPUs that implement NVLink-C2C. It was positioned against PCIe Gen 5 and can carry the Arm AMBA CHI protocol.[3]
Physically, NVLink-C2C is implemented using:[4]
- single-ended (ground-referenced) NRZ signaling, not low-voltage differential signaling
- nine data signals per link per direction, each at 40 Gbps, for 45 GB/s per direction per link
Grace and H100 both implement ten NVLink-C2C links per socket, for 450 GB/s per direction or 900 GB/s bidirectional.[4]
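As a quick check on those per-pin numbers (a sketch of my own arithmetic, using only the figures quoted above):
```python
# NVLink-C2C bandwidth from the per-pin figures above (rough check).
PINS_PER_LINK = 9     # single-ended data signals per link, per direction
GBPS_PER_PIN = 40     # NRZ at 40 Gbps per pin
LINKS = 10            # NVLink-C2C links on Grace and H100

per_link_GBps = PINS_PER_LINK * GBPS_PER_PIN / 8   # 45 GB/s per direction per link
per_dir_GBps = LINKS * per_link_GBps               # 450 GB/s per direction
bidir_GBps = 2 * per_dir_GBps                      # 900 GB/s total
assert (per_link_GBps, per_dir_GBps, bidir_GBps) == (45, 450, 900)
```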
cNVLink
Coherent NVLink (cNVLink) is something that appeared in the marketing material surrounding the launch of the Grace CPU. The only reference to it I could find is in the Hot Chips 34 talk about Grace,[5] where it is described as a protocol supported alongside PCIe Gen 5 on two of Grace’s PCIe Gen 5 x16 blocks. Maybe it’s for some future NVLink-capable add-in card like a Mellanox NIC.
NVLink 4
This is the generation of NVLink supported on H100.
- Each H100 has 18 NVLinks, each at 50 GB/s (400 Gbps)[6]
- Each H100 is connected to the four NVSwitches unevenly:
  - 5 NVLinks to each of two NVSwitches
  - 4 NVLinks to each of the other two NVSwitches
  - 5 + 5 + 4 + 4 = 18 NVLinks per H100 GPU across the four NVSwitches (checked in the sketch below)
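A small sketch of that H100 link math, assuming only the 5/5/4/4 split and 50 GB/s per link quoted above:
```python
# HGX H100 / NVLink 4 link math from the figures above (rough check).
GBPS_PER_LINK = 400                    # 50 GB/s per NVLink 4 link
LINKS_TO_EACH_SWITCH = [5, 5, 4, 4]    # uneven split across the four NVSwitches

links_per_gpu = sum(LINKS_TO_EACH_SWITCH)           # 18
per_gpu_GBps = links_per_gpu * GBPS_PER_LINK / 8    # 900 GB/s per H100
assert (links_per_gpu, per_gpu_GBps) == (18, 900)
```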
Each NVSwitch has:[6]
- 100 Gbps per lane, implemented as 50 GBd PAM4 signaling
- 64 ports, each two lanes wide, or 64x200G ports
- Each “NVLink network port”[6] is 400 Gbps, implying one “NVLink network port” is four lanes (two ports) wide
- This means each NVLink switch effectively has 32 NVLink ports at 400G each
NVSwitch in this generation also supports scale-up via external OSFP connectors.[6]
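The port arithmetic above, as a quick sketch (my own check, using only the numbers quoted here):
```python
# NVLink4 NVSwitch port math from the figures above (rough check).
GBPS_PER_LANE = 100    # 50 GBd PAM4
LANES_PER_PORT = 2     # 64x200G physical ports
PORTS = 64

aggregate_gbps = PORTS * LANES_PER_PORT * GBPS_PER_LANE   # 12,800 Gbps
nvlink_network_ports = aggregate_gbps // 400              # 400G "NVLink network ports"
assert nvlink_network_ports == 32
```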
NVLink 3
This is the generation of NVLink supported on A100.
- Each A100 GPU has 12 NVLink ports[7]
- Each A100 is connected to each of the six NVSwitches via two links
- 8 A100s with 2 links to each switch implies 16 ports used per NVSwitch
- Each A100 connects to the NVSwitch fabric at 600 GB/s bidirectional in total
  - This is 300 GB/s per direction, or 2,400 Gbps per direction
  - It is carried over 12 NVLink connections
  - This implies each NVLink connection runs at 200 Gbps per direction (see the sketch below)
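And the corresponding NVLink 3 sketch, again just restating the arithmetic above in code:
```python
# HGX A100 / NVLink 3 link math from the figures above (rough check).
LINKS_PER_GPU = 12
LINKS_PER_GPU_PER_SWITCH = 2
GPUS = 8

switches = LINKS_PER_GPU // LINKS_PER_GPU_PER_SWITCH      # 6 NVSwitches
ports_used_per_switch = GPUS * LINKS_PER_GPU_PER_SWITCH   # 16

per_dir_gbps = (600 / 2) * 8                  # 600 GB/s bidirectional -> 2,400 Gbps per direction
per_link_gbps = per_dir_gbps / LINKS_PER_GPU  # 200 Gbps per direction per link
assert (switches, ports_used_per_switch, per_link_gbps) == (6, 16, 200)
```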
Footnotes
1. NVIDIA NVLink and NVIDIA NVSwitch Supercharge Large Language Model Inference | NVIDIA Technical Blog
2. NVIDIA GB200 NVL72 Delivers Trillion-Parameter LLM Training and Real-Time Inference | NVIDIA Technical Blog
3. NVIDIA Opens NVLink for Custom Silicon Integration | NVIDIA Newsroom
4. 9.3 NVLink-C2C: A Coherent Off Package Chip-to-Chip Interconnect with 40Gbps/pin Single-ended Signaling | IEEE Conference Publication | IEEE Xplore
5. NVIDIA Grace CPU presentation at Hot Chips 34
6. NVIDIA NVLink4 NVSwitch at Hot Chips 34 - ServeTheHome
7. Introducing NVIDIA HGX A100: The Most Powerful Accelerated Server Platform for AI and High Performance Computing | NVIDIA Technical Blog