Cray EX254n is the blade for Cray EX platforms that hosts GH200.

Each blade has two nodes, and each node has

  • 4x GH200 superchips, each with
    • 1 Grace CPU (72 cores, Arm Neoverse V2)
    • 1 Hopper H100 GPU
  • 128 GB LPDDR5X DRAM per superchip
    • Although Grace supports “up to” 480 GB with ECC, the HPE spec sheet only offers a 128 GB option.[1]
    • Similarly, Alps’s nodes have only 128 GB per superchip (or 120 GB[2] with ECC).
    • This blade appears to ship with significantly less LPDDR5X than the maximum Grace supports, probably reflecting an optimal cost/performance ratio for HPC applications.
  • 4x Slingshot-11 NICs[3]
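
As a quick sanity check on the figures above, here is a minimal sketch that multiplies the per-node counts out to a whole blade and compares the 128 GB configuration against Grace's 480 GB ceiling. The constant and function names are my own, chosen for illustration; the numbers are just the ones quoted in the list.

```python
# Figures quoted in the list above; the names are illustrative, not from HPE.
NODES_PER_BLADE = 2
SUPERCHIPS_PER_NODE = 4
CORES_PER_GRACE = 72
NICS_PER_NODE = 4
LPDDR5X_GB = 128            # the option HPE's spec sheet offers
LPDDR5X_ECC_GB = 120        # figure reported for Alps with ECC
GRACE_MAX_LPDDR5X_GB = 480  # the "up to" figure for Grace


def blade_summary():
    """Derive per-blade totals and memory ratios from the per-node figures."""
    superchips = NODES_PER_BLADE * SUPERCHIPS_PER_NODE
    return {
        "superchips_per_blade": superchips,
        "gpus_per_blade": superchips,  # one Hopper GPU per superchip
        "cpu_cores_per_node": SUPERCHIPS_PER_NODE * CORES_PER_GRACE,
        "cpu_cores_per_blade": superchips * CORES_PER_GRACE,
        "nics_per_blade": NODES_PER_BLADE * NICS_PER_NODE,
        # Share of Grace's maximum LPDDR5X that the 128 GB option provides.
        "fraction_of_grace_max": LPDDR5X_GB / GRACE_MAX_LPDDR5X_GB,
        # Ratio of the ECC figure to the nominal capacity.
        "ecc_to_nominal_ratio": LPDDR5X_ECC_GB / LPDDR5X_GB,
    }


if __name__ == "__main__":
    for key, value in blade_summary().items():
        print(f"{key}: {value}")
```

This works out to 8 superchips, 8 GPUs, 576 Grace cores, and 8 NICs per blade, with the 128 GB option being a bit over a quarter of what Grace can address.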

I’ve never seen this blade with NVMe SSDs, unlike some other Cray EX blades, but HPE has documentation suggesting there is a kit for adding M.2 SSDs to the blade.[4]

Logically, each GH200 appears as its own NUMA domain within a node.[2]
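
If you have a shell on a node, one way to see the NUMA layout for yourself is to read the standard Linux sysfs tree. The sketch below is a minimal example that assumes a Linux host with sysfs mounted at the usual place; `list_numa_nodes` is a hypothetical helper name, and I have not confirmed what it prints on an EX254n node.

```python
import os
import re


def list_numa_nodes(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: cpulist string} for each NUMA node Linux exposes."""
    nodes = {}
    if not os.path.isdir(sysfs_root):
        return nodes  # not Linux, or sysfs is not mounted here
    for entry in os.listdir(sysfs_root):
        # NUMA nodes appear as directories named node0, node1, ...
        match = re.fullmatch(r"node(\d+)", entry)
        if not match:
            continue
        node_id = int(match.group(1))
        try:
            with open(os.path.join(sysfs_root, entry, "cpulist")) as f:
                nodes[node_id] = f.read().strip()
        except OSError:
            nodes[node_id] = ""  # unreadable; treat as no CPUs listed
    return dict(sorted(nodes.items()))


if __name__ == "__main__":
    for node_id, cpus in list_numa_nodes().items():
        print(f"NUMA node {node_id}: CPUs {cpus or '(none)'}")
```

A NUMA node with an empty CPU list is memory-only, so the number of nodes listed is not necessarily the number of CPUs.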

Here’s a photo I took of the blade:[5]

Footnotes

  1. NVIDIA Accelerators for HPE (hpe.com)

  2. [2408.14090] Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects (arxiv.org)

  3. 8x NVIDIA Grace Hopper Superchips in a Blade HPE Cray EX254n at GTC 2024 (servethehome.com)

  4. HPE Cray EX254n SSD M.2 Support Kit (S2B62A)

  5. HPE’s booth at ISC’24.