ND H100 v5 is Microsoft’s eight-way H100 VM and node. It resembles the NVIDIA DGX H100 reference architecture and uses a similar Intel Sapphire Rapids host platform.
Each VM has:[1]
- 96 Intel Xeon (Sapphire Rapids) cores (physically two 56-core sockets)
- 1,900 GiB DDR5
- 28,000 GiB of local storage (physically 8x NVMe drives)
- 1x 80G Ethernet NIC (physically a 100G Azure SmartNIC)
- 8x 400G ConnectX-7 NDR InfiniBand HCAs
- 8x NVIDIA H100 GPUs
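A few per-node figures follow directly from the list above. The sketch below only does that arithmetic; the constant and function names are illustrative and do not come from any Azure API, and the HCA-to-GPU figure is a ratio of counts, not a statement about how the devices are wired together.

```python
# Figures copied from the spec list above; all names here are illustrative.
EXPOSED_CORES = 96
SOCKETS = 2
CORES_PER_SOCKET = 56
LOCAL_STORAGE_GIB = 28_000
NVME_DRIVES = 8
IB_HCAS = 8
IB_GBPS_PER_HCA = 400  # NDR InfiniBand
ETHERNET_GBPS = 80
GPUS = 8


def derived_figures():
    """Return figures implied by the spec list (simple arithmetic only)."""
    aggregate_ib_gbps = IB_HCAS * IB_GBPS_PER_HCA
    return {
        # Physical cores present minus the cores exposed to the VM
        "cores_not_exposed": SOCKETS * CORES_PER_SOCKET - EXPOSED_CORES,
        # Local storage divided evenly across the NVMe drives
        "gib_per_nvme_drive": LOCAL_STORAGE_GIB / NVME_DRIVES,
        # Sum of all InfiniBand HCA line rates
        "aggregate_ib_gbps": aggregate_ib_gbps,
        # Ratio of HCA count to GPU count
        "ib_hcas_per_gpu": IB_HCAS / GPUS,
        # How much larger the InfiniBand aggregate is than the Ethernet NIC
        "ib_to_ethernet_ratio": aggregate_ib_gbps / ETHERNET_GBPS,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:g}")
```

By this arithmetic, each VM has 3,200 Gb/s of aggregate InfiniBand bandwidth, 3,500 GiB per NVMe drive, and 16 physical cores that are not exposed to the guest.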
This is what the node looks like from the top:[2]
Of note:
- At the far end are the intake fans
- The HGX baseboard is just behind them, with its eight tall GPU heatsinks visible. Underneath the far grab bar are the NVLink Switches, under equally tall heatsinks.
- At the near end is the Host Interface Board, which connects the CPU board (not visible), the HGX baseboard, the NICs, and the SSDs.
This is the rear of the node:
From top to bottom:
- Two groups of four E1.S SSD carriers
- One RJ45 management port and four OSFP ports. Each OSFP port carries two ports of NDR InfiniBand
- A two-port Azure SmartNIC
- A one-port Ethernet NIC that works in conjunction with the SmartNIC
- Six power supplies
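The rear-panel counts can be cross-checked against the VM's advertised InfiniBand and NVMe counts. This is a minimal sketch of that check, assuming each SSD carrier holds one NVMe drive and each NDR port corresponds to one HCA; the names are illustrative.

```python
# Counts read off the rear panel description; names are illustrative.
OSFP_PORTS = 4
NDR_PORTS_PER_OSFP = 2
SSD_GROUPS = 2
SSD_CARRIERS_PER_GROUP = 4

# Counts advertised for the VM.
SPEC_IB_HCAS = 8
SPEC_NVME_DRIVES = 8


def rear_panel_counts():
    """Return (InfiniBand ports, SSD carriers) implied by the rear panel."""
    ib_ports = OSFP_PORTS * NDR_PORTS_PER_OSFP
    ssd_carriers = SSD_GROUPS * SSD_CARRIERS_PER_GROUP
    return ib_ports, ssd_carriers


def rear_panel_matches_spec():
    """True when the rear-panel counts agree with the advertised counts."""
    ib_ports, ssd_carriers = rear_panel_counts()
    return ib_ports == SPEC_IB_HCAS and ssd_carriers == SPEC_NVME_DRIVES


if __name__ == "__main__":
    print(rear_panel_counts(), rear_panel_matches_spec())
```

Four OSFP cages at two NDR ports each account for all eight InfiniBand HCAs, and the eight E1.S carriers account for the eight NVMe drives.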
The Eagle supercomputer is built from these nodes.
Footnotes
1. ND-H100-v5 size series - Azure Virtual Machines | Microsoft Learn
2. I took these photos myself at Microsoft’s booth at SC’23.