This page serves as a locus for all things Anthropic.

Software stack

Anthropic has disclosed the following about their inference stack:

  • They run inference on NVIDIA GPUs, Google TPUs, and AWS Trainium.1
  • They use Kubernetes in their serving environment.2
  • They appear to use JAX.1
  • They pin sessions to specific servers to improve KV cache hit rate.1
  • They provision pools of GPUs specifically to handle long contexts.1 This makes sense if, as seems likely, they use context parallelism to support long contexts.
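
Anthropic has not said how sessions are pinned to servers. As a rough illustration of the general idea only, the sketch below uses rendezvous (highest-random-weight) hashing, a common way to get sticky routing so that repeat requests from one session reach the server that already holds its KV cache. The function name `pick_server`, the server names, and the session ID are all hypothetical and not taken from the sources.

```python
import hashlib


def pick_server(session_id: str, servers: list[str]) -> str:
    """Return the server a session should stick to (hypothetical helper).

    Rendezvous hashing: score every (server, session) pair and pick the
    highest score. The same session always maps to the same server while
    that server stays in the pool.
    """
    if not servers:
        raise ValueError("no servers available")

    def score(server: str) -> int:
        digest = hashlib.sha256(f"{server}|{session_id}".encode()).digest()
        return int.from_bytes(digest[:8], "big")

    return max(servers, key=score)


if __name__ == "__main__":
    pool = ["server-a", "server-b", "server-c"]  # made-up server names
    # Repeated requests from one session land on the same server, so the
    # cached prefix for that session can be reused.
    print(pick_server("session-123", pool))
    print(pick_server("session-123", pool))
```

One reason to pick this scheme for a sketch: removing a server only remaps the sessions that were pinned to it, so the rest keep their warm caches.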

Their observability stack was described at ClickHouse Open House 2025.2

Anthropic also uses Hex for something.3 Maybe observability?2

Footnotes

  1. A postmortem of three recent issues \ Anthropic

  2. How Anthropic is using ClickHouse to scale observability for the AI era

  3. The AI Analytics Platform for your whole team | Hex - Under the “trusted by” logo wall.