Glenn's Digital Garden
Tag: model
2 items with this tag.
Aug 31, 2024: OPT-175B (tags: anecdotes, model)
Aug 30, 2024: Meta Llama-3.1 (tags: model)