NVIDIA Vera CPU

NVIDIA Vera CPU is NVIDIA's CPU for AI workloads, announced at GTC 2026. It has 88 custom Olympus cores, implements the ARMv9.2 instruction set, and supports up to 1.5 TB of LPDDR5X memory with 1.2 TB/s of bandwidth. It serves as the host CPU of the Vera Rubin platform, where it handles data movement scheduling, memory management, and system control orchestration.

Key Specifications

Specification	Value
CPU Architecture	ARM architecture (Olympus cores)
Instruction Set	ARMv9.2 (fully compatible)
Core Count	88 Olympus cores
Thread Count	176 threads (spatial multithreading)
Single-Core Performance	2× previous generation
Max Memory Capacity	1.5 TB (LPDDR5X)
Memory Bandwidth	1.2 TB/s
Interconnect	NVLink-C2C (1.8 TB/s)
On-Chip Interconnect	2nd-gen NVIDIA SCF (3.4 TB/s bisection bandwidth)
TDP	Not disclosed (estimated 350-500W)
Announcement Date	March 17, 2026
Mass Production	Second half of 2026

Architecture & Specifications

Vera CPU uses a monolithic compute die, which avoids cross-chiplet communication latency. As a result, latency and throughput stay stable and predictable under all-core load.

Key Technical Innovations

88 Custom Olympus Cores
- Supports spatial multithreading (176 threads)
- Single-core performance 2× previous generation
- Industry-leading energy efficiency
World's First CPU Supporting FP8 Precision
- Fully compatible with ARMv9.2 instruction set
- Hardware-level FP8 compute support
2nd-Gen NVIDIA SCF (Scalable Coherent Fabric)
- Provides 3.4 TB/s bisection bandwidth
- On-chip mesh + unified cache
- Scales to all 88 cores with consistent latency
NVLink-C2C Interconnect
- Coherent bandwidth up to 1.8 TB/s
- Enables seamless data sharing between CPUs and between CPU and GPU
- Supports unified memory system
Full Confidential Computing
- Supports hardware-enforced security isolation
- Protects sensitive data and code

Memory Subsystem

Max Memory Capacity: 1.5 TB (3× previous generation)
Memory Type: LPDDR5X
Memory Bandwidth: 1.2 TB/s (2× the bandwidth at half the power of a traditional CPU)
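
The headline memory figures can be broken down per core to give a rough sense of balance. The following is a minimal sketch that uses only the numbers quoted on this page; the decimal unit conversion (1 TB = 1000 GB) and the even split across cores are assumptions made for illustration, not published per-core specifications.

```python
# Figures quoted on this page.
CORES = 88
THREADS = 176
MEM_CAPACITY_TB = 1.5
MEM_BANDWIDTH_TBPS = 1.2
NVLINK_C2C_TBPS = 1.8

# Assumption: decimal units (1 TB = 1000 GB) and an even split across cores.
threads_per_core = THREADS // CORES                            # 2
capacity_per_core_gb = MEM_CAPACITY_TB * 1000 / CORES          # about 17 GB
bandwidth_per_core_gbps = MEM_BANDWIDTH_TBPS * 1000 / CORES    # about 13.6 GB/s

# Time to stream the full memory once at peak bandwidth (an idealized lower bound).
full_sweep_seconds = MEM_CAPACITY_TB / MEM_BANDWIDTH_TBPS      # about 1.25 s

# How much faster the NVLink-C2C link is than local LPDDR5X bandwidth.
c2c_to_memory_ratio = NVLINK_C2C_TBPS / MEM_BANDWIDTH_TBPS     # about 1.5

print(f"threads per core:        {threads_per_core}")
print(f"capacity per core:       {capacity_per_core_gb:.1f} GB")
print(f"bandwidth per core:      {bandwidth_per_core_gbps:.1f} GB/s")
print(f"full memory sweep:       {full_sweep_seconds:.2f} s")
print(f"NVLink-C2C / memory BW:  {c2c_to_memory_ratio:.2f}x")
```

These are peak, evenly divided figures; real workloads will see less, and a single core is unlikely to be limited to exactly its proportional share.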

Companion Platform

Vera Rubin NVL72

72× Rubin R200 GPUs
36× Vera CPUs
Total Memory: 54 TB LPDDR5X
TDP: ~180 kW (full liquid cooling required)
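
The rack-level totals follow directly from the per-CPU figures. The sketch below checks that arithmetic; the per-GPU power share is only a rough average derived from the approximate rack figure, and it includes CPUs, switches, and other components, so it should not be read as a GPU power specification.

```python
# Figures quoted on this page for Vera Rubin NVL72.
GPUS = 72
CPUS = 36
CORES_PER_CPU = 88
MEM_PER_CPU_TB = 1.5
RACK_POWER_KW = 180  # approximate figure

gpus_per_cpu = GPUS // CPUS                   # 2 GPUs hosted per Vera CPU
total_lpddr5x_tb = CPUS * MEM_PER_CPU_TB      # 54.0 TB, matching the stated total
total_cpu_cores = CPUS * CORES_PER_CPU        # 3168 Olympus cores per rack

# Rough average only: whole-rack power divided by GPU count.
rack_power_per_gpu_kw = RACK_POWER_KW / GPUS  # 2.5 kW

print(f"GPUs per CPU:           {gpus_per_cpu}")
print(f"total LPDDR5X:          {total_lpddr5x_tb} TB")
print(f"total CPU cores:        {total_cpu_cores}")
print(f"rack power per GPU:     {rack_power_per_gpu_kw} kW")
```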

HGX Rubin NVL8

8× Rubin R200 GPUs
2× Vera CPUs
For small-to-medium scale AI training and inference

Deployment Formats

High-Density Liquid-Cooled Vera CPU Rack
- Based on NVIDIA MGX
- Supports up to 256 Vera CPUs
- Supports over 22,500 concurrent environments
- For AI factory-scale reinforcement learning and agentic AI
Standard Server Configuration
- Supports dual-socket and single-socket standard configurations
- Adaptable to general data center needs
Independent CPU Platform
- Can be used as a high-performance standalone CPU
- Supports hyperscale cloud, data analytics, storage, enterprise workloads, and HPC
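
The "over 22,500 concurrent environments" figure for the high-density rack lines up with the rack's total core count. The sketch below shows that arithmetic; reading it as one environment per physical core is an assumption, since the page does not say how environments map to cores or threads.

```python
# Figures quoted on this page for the high-density liquid-cooled rack.
CPUS_PER_RACK = 256
CORES_PER_CPU = 88
THREADS_PER_CPU = 176
QUOTED_ENVIRONMENTS = 22_500  # "over 22,500 concurrent environments"

total_cores = CPUS_PER_RACK * CORES_PER_CPU      # 22528
total_threads = CPUS_PER_RACK * THREADS_PER_CPU  # 45056

# Assumption: one environment pinned to one physical core.
one_env_per_core_matches = total_cores >= QUOTED_ENVIRONMENTS
envs_per_cpu = QUOTED_ENVIRONMENTS / CPUS_PER_RACK  # just under 88

print(f"total cores:       {total_cores}")
print(f"total threads:     {total_threads}")
print(f"envs per CPU:      {envs_per_cpu:.1f}")
print(f"one env per core:  {one_env_per_core_matches}")
```

Under that assumption, each environment would get a dedicated core, and the second hardware thread on each core would remain available for other work.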

Performance Advantages

Software environment runtime speed: Up to 50% faster than traditional-architecture CPUs
Efficiency: 2× that of traditional-architecture CPUs
RL evaluation cycle: Up to 50% shorter under full load
AI workflow: Tight integration with NVIDIA GPUs keeps AI workflows running at full speed
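
"50% faster" and "50% shorter" are not the same claim, so it helps to convert between them. The sketch below assumes that "50% faster" means 1.5× throughput and that "shortened by 50%" means half the wall-clock time; the page does not define either term, and the helper names are hypothetical.

```python
def time_fraction_from_speedup(speedup: float) -> float:
    """Fraction of the original wall-clock time left after a given speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


def speedup_from_time_reduction(reduction: float) -> float:
    """Speedup implied by cutting wall-clock time by `reduction` (0 <= reduction < 1)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in [0, 1)")
    return 1.0 / (1.0 - reduction)


# Assumed reading: "up to 50% faster" is 1.5x throughput.
runtime_fraction = time_fraction_from_speedup(1.5)  # about 0.667 of the original time
runtime_saved_pct = (1.0 - runtime_fraction) * 100  # about 33%

# Assumed reading: "shortened by 50%" is half the time.
rl_cycle_speedup = speedup_from_time_reduction(0.5)  # 2.0

print(f"1.5x speed saves about {runtime_saved_pct:.0f}% of runtime")
print(f"a 50% shorter cycle is a {rl_cycle_speedup:.1f}x speedup")
```

Under these readings, the RL evaluation claim (2×) is the stronger of the two, while the environment runtime claim corresponds to roughly a one-third reduction in time.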

Application Scenarios

Vera CPU is designed for AI-era workloads and is suited to:

Reinforcement learning (RL) and agentic AI
Data center host CPU (data movement scheduling, memory management, system control orchestration)
Hyperscale cloud
Data analytics and storage
Enterprise workloads and HPC
