NVIDIA Vera CPU (2026)

The Vera CPU is NVIDIA's second-generation custom Arm CPU, officially launched at GTC 2026 (March 2026) as the central processor of the Vera Rubin supercomputing platform. It is the world's first CPU with native FP8 support and is purpose-built for agentic AI inference and reinforcement learning workloads.

Key Specifications

Item | Specification
Architecture | NVIDIA custom Arm, compatible with the Armv9.2 ISA
Cores | 88 cores
Threads | 176 threads (NVIDIA Spatial Multithreading)
Release | March 2026 (GTC 2026)
Production | Q3 2026 (shipping with Rubin platform)

Memory & Bandwidth

Item | Specification
Memory Type | LPDDR5X
Max Capacity | 1.5 TB (per CPU)
Memory Bandwidth | 1.2 TB/s
Vera Rubin NVL72 Total | 36 Vera CPUs = 54 TB LPDDR5X system memory
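
The NVL72 total follows directly from the per-CPU figures. As a minimal sketch, assuming decimal units (1 TB = 1000 GB) and treating aggregate bandwidth as a simple sum across CPUs (a derived figure, not one quoted on this page), the arithmetic can be checked as follows. All names are hypothetical.

```python
# Per-CPU figures from the table above, expressed in GB to keep the math exact.
# Assumption: decimal units, 1 TB = 1000 GB.
CPUS_PER_NVL72 = 36
MEM_PER_CPU_GB = 1500        # 1.5 TB LPDDR5X per Vera CPU
MEM_BW_PER_CPU_GBPS = 1200   # 1.2 TB/s per Vera CPU


def rack_memory_gb(cpus: int = CPUS_PER_NVL72) -> int:
    """Total LPDDR5X capacity across all CPUs in the rack."""
    return cpus * MEM_PER_CPU_GB


def rack_memory_bandwidth_gbps(cpus: int = CPUS_PER_NVL72) -> int:
    """Sum of per-CPU memory bandwidth (derived; ignores any cross-CPU effects)."""
    return cpus * MEM_BW_PER_CPU_GBPS


def full_sweep_seconds() -> float:
    """Idealized lower bound on the time for one CPU to read its whole memory once."""
    return MEM_PER_CPU_GB / MEM_BW_PER_CPU_GBPS


if __name__ == "__main__":
    print(f"Rack memory: {rack_memory_gb() / 1000} TB")
    print(f"Summed memory bandwidth: {rack_memory_bandwidth_gbps() / 1000} TB/s")
    print(f"Full-memory sweep per CPU: {full_sweep_seconds()} s")
```

The sweep time is only a peak-bandwidth bound; real workloads will see less than the quoted 1.2 TB/s.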

CPU-GPU Interconnect

Item | Specification
Interconnect | NVLink-C2C (2nd gen)
Coherent Bandwidth | 1.8 TB/s (CPU↔GPU)
vs. PCIe Gen6 | 7× the bandwidth of PCIe Gen6
Architecture | Unified virtual address space (CPU + GPU)
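
The 7× figure can be reproduced from the two link speeds. The sketch below assumes the 256 GB/s PCIe Gen6 figure that this page quotes in its comparison table, and uses a hypothetical 100 GB payload to show what the ratio means for an idealized transfer. Function names are illustrative only.

```python
# Link speeds in GB/s. The PCIe number is the one quoted in this page's
# comparison table; the payload size below is a hypothetical example.
NVLINK_C2C_GBPS = 1800   # 1.8 TB/s coherent CPU<->GPU bandwidth
PCIE_GEN6_GBPS = 256


def bandwidth_ratio(fast_gbps: float, slow_gbps: float) -> float:
    """How many times faster the first link is than the second."""
    return fast_gbps / slow_gbps


def transfer_seconds(payload_gb: float, link_gbps: float) -> float:
    """Idealized transfer time at peak bandwidth (no latency or protocol overhead)."""
    return payload_gb / link_gbps


if __name__ == "__main__":
    ratio = bandwidth_ratio(NVLINK_C2C_GBPS, PCIE_GEN6_GBPS)
    print(f"NVLink-C2C vs PCIe Gen6: {ratio:.2f}x")
    payload_gb = 100  # hypothetical payload
    print(f"NVLink-C2C: {transfer_seconds(payload_gb, NVLINK_C2C_GBPS) * 1000:.1f} ms")
    print(f"PCIe Gen6:  {transfer_seconds(payload_gb, PCIE_GEN6_GBPS) * 1000:.1f} ms")
```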

AI Inference Performance

The Vera CPU is heavily optimized for the AI inference pipeline:

  • World's first native FP8 CPU: 6× 128-bit SVE2 SIMD units per core
  • Data processing: the performance of previous-gen Grace CPU
  • Agentic inference: A single rack of 256 liquid-cooled Vera CPUs can run 22,500 concurrent CPU sandboxes
  • Long context: Up to 1.5 TB of memory per CPU to cache contexts of 1M+ tokens
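
Two of the figures above lend themselves to a quick sanity check. The sketch below assumes each SIMD unit handles one full 128-bit vector at a time (so it counts lanes, not measured throughput) and uses the 88-cores-per-CPU figure from the specifications. The helper names are hypothetical.

```python
# Figures from this page; the helper names are illustrative only.
SIMD_UNITS_PER_CORE = 6
SIMD_WIDTH_BITS = 128
CORES_PER_CPU = 88
CPUS_PER_RACK = 256
SANDBOXES_PER_RACK = 22_500


def lanes_per_core(element_bits: int) -> int:
    """Number of elements of the given width that fit across all SIMD units of one core."""
    return SIMD_UNITS_PER_CORE * SIMD_WIDTH_BITS // element_bits


def sandboxes_per_cpu() -> float:
    """Average concurrent sandboxes hosted by each CPU in the rack."""
    return SANDBOXES_PER_RACK / CPUS_PER_RACK


def sandboxes_per_core() -> float:
    """Average concurrent sandboxes per physical core."""
    return SANDBOXES_PER_RACK / (CPUS_PER_RACK * CORES_PER_CPU)


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(f"{bits}-bit lanes per core: {lanes_per_core(bits)}")
    print(f"Sandboxes per CPU:  {sandboxes_per_cpu():.1f}")
    print(f"Sandboxes per core: {sandboxes_per_core():.3f}")
```

Under these assumptions, FP8 doubles the lane count relative to 16-bit types, and the rack figure works out to roughly one sandbox per physical core.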

Vera Rubin Platform Integration

The Vera CPU and Rubin R100 GPU use CoWoS-L (Chip-on-Wafer-on-Substrate with Local Silicon Interconnect) packaging:

┌─────────────────────────────────────────────┐
│ Vera Rubin NVL72 Rack (1 rack)              │
├─────────────────────────────────────────────┤
│ 36 × Vera CPU + 72 × Rubin R100 GPU         │
│ NVLink-C2C 1.8 TB/s full interconnect       │
│ Total: 54 TB LPDDR5X + 576 GB HBM4          │
└─────────────────────────────────────────────┘

Use Cases

Scenario | Description
Agentic AI Inference | Multi-step reasoning, tool use, environment interaction
Reinforcement Learning | High-throughput CPU sandbox parallel simulation
LLM Training | MoE model training: same performance as Blackwell with ¼ the GPU count
Data Preprocessing | Data compression/decompression, tokenization, feature engineering

Competitive Comparison

Feature | Vera CPU | Grace CPU (prev-gen) | AMD EPYC 9005
Cores | 88 | 72 | 192
Architecture | Custom Armv9.2 | Custom Armv9 | x86-64 (Zen 5)
Memory | LPDDR5X, 1.5 TB | LPDDR5X, 960 GB | DDR5, 6 TB
Memory Bandwidth | 1.2 TB/s | 1.0 TB/s | ~0.6 TB/s
CPU-GPU Interconnect | NVLink-C2C, 1.8 TB/s | NVLink-C2C, 900 GB/s | PCIe Gen6, 256 GB/s
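
Raw totals hide how much memory and bandwidth each core gets. The sketch below derives per-core figures from the table, assuming decimal units (1 TB = 1000 GB) and taking the approximate EPYC bandwidth at face value. The class and field names are hypothetical, and the derived numbers are illustrative rather than measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuSpec:
    """One column of the comparison table, normalized to GB and GB/s."""
    name: str
    cores: int
    mem_gb: int
    mem_bw_gbps: int

    @property
    def bw_per_core_gbps(self) -> float:
        return self.mem_bw_gbps / self.cores

    @property
    def mem_per_core_gb(self) -> float:
        return self.mem_gb / self.cores


# Values copied from the table above; "~0.6 TB/s" is treated as 600 GB/s.
SPECS = [
    CpuSpec("Vera CPU", 88, 1500, 1200),
    CpuSpec("Grace CPU", 72, 960, 1000),
    CpuSpec("AMD EPYC 9005", 192, 6000, 600),
]

if __name__ == "__main__":
    for s in SPECS:
        print(f"{s.name:<14} {s.bw_per_core_gbps:6.2f} GB/s per core "
              f"{s.mem_per_core_gb:6.2f} GB per core")
```

On these numbers, Vera and Grace offer similar bandwidth per core, while the EPYC part trades per-core bandwidth for higher core count and capacity.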

References