Cray EX254n is the blade for Cray EX platforms that hosts GH200.

Each blade has two nodes, and each node has

  • 4x GH200 superchips
    • 1 Grace CPU (72 cores, Arm Neoverse V2)
    • 1 Hopper H100 GPU
  • 128 GB LPDDR5X DRAM
    • Although Grace supports “up to” 480 GB with ECC, the HPE spec sheet only offers a 128 GB option.[1]
    • Similarly, Alps’s nodes only have 128 GB (or 120 GB with ECC).[2]
    • This node appears to have significantly less LPDDR5X than the maximum Grace supports, probably reflecting a better cost/performance trade-off for HPC applications.
  • 4x Slingshot-11 NICs[3]

I don’t think this blade can accept NVMe SSDs, unlike some other Cray EX blades.

Logically, each GH200 appears as its own NUMA domain within a node.[2]
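
One way to check the NUMA layout on a running node is to read the standard Linux sysfs topology files. The following is a minimal sketch, not something taken from the sources cited here: the helper names parse_cpulist and numa_cpu_map are my own, and it assumes the usual /sys/devices/system/node/nodeN/cpulist layout. I would expect each Grace CPU to show up as a NUMA node with 72 CPUs, but the kernel may also report extra CPU-less NUMA nodes (for example, for GPU memory), so the number of entries is not necessarily the number of superchips.

```python
from pathlib import Path


def parse_cpulist(text):
    """Expand a Linux cpulist string such as '0-3,8' into a list of CPU ids."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # an empty cpulist means the node has no CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def numa_cpu_map(sysfs_root="/sys/devices/system/node"):
    """Return {numa_node_id: [cpu ids]} for every nodeN directory under sysfs_root."""
    result = {}
    for node_dir in Path(sysfs_root).glob("node[0-9]*"):
        cpulist = node_dir / "cpulist"
        if cpulist.is_file():
            node_id = int(node_dir.name[len("node"):])
            result[node_id] = parse_cpulist(cpulist.read_text())
    return result


if __name__ == "__main__":
    for node_id, cpus in sorted(numa_cpu_map().items()):
        print(f"NUMA node {node_id}: {len(cpus)} CPUs")
```

The sysfs root is a parameter only so the function can be pointed at a copy of the directory tree for testing; on a real node the default path should be the right one.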

Here’s a photo I took of the blade:[4]

Footnotes

  1. NVIDIA Accelerators for HPE (hpe.com)

  2. [2408.14090] Exploring GPU-to-GPU Communication: Insights into Supercomputer Interconnects (arxiv.org)

  3. 8x NVIDIA Grace Hopper Superchips in a Blade HPE Cray EX254n at GTC 2024 (servethehome.com)

  4. HPE’s booth at ISC’24.