KRS8500V3

Liquid-Cooled Exascale AI Rack Solution based on NVIDIA GB300 NVL72

1.5x

LLM Training Speed vs. GB200

1.4x

LLM Inference Throughput vs. GB200

Supporting Extreme Performance Applications

AI Training & Inference

Optimized for trillion-scale models, transformer networks, and real-time inference.

Big Data Analytics

Accelerates data pipelines, reduces storage cost, and improves query performance.

Scientific & Engineering Simulations

Enhances compute-intensive modeling, from CFD to circuit design.

NVIDIA® GB300 NVL72
Configuration	72x Blackwell Ultra GPUs, 36x Grace CPUs
NVLink Bandwidth	130TB/s
Fast Memory	Up to 40 TB
GPU Memory | Bandwidth	Up to 21 TB | Up to 576 TB/s
CPU Memory | Bandwidth	Up to 18 TB LPDDR5X | Up to 14.3 TB/s
CPU Core Count	2,592 Arm® Neoverse V2 cores
FP4 Tensor Core	1,080–1,400 PFLOPS (rack-level)
FP8/FP6 Tensor Core	720 PFLOPS
INT8 Tensor Core	23 PFLOPS
FP16/BF16 Tensor Core	360 PFLOPS
TF32 Tensor Core	180 PFLOPS
FP32 Tensor Core	6 PFLOPS
FP64 / FP64 Tensor Core	100 TFLOPS
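
The rack-level totals above can be cross-checked against per-device figures. The following is a minimal sketch, not vendor tooling: it assumes decimal units (1 TB = 1,000 GB) and takes the 288 GB per-GPU HBM3e capacity from the feature summary of this datasheet. Variable names are illustrative only.

```python
# Rack-level figures as quoted in the NVIDIA GB300 NVL72 table.
GPU_COUNT = 72
CPU_COUNT = 36
CPU_CORES_TOTAL = 2592
GPU_MEM_BW_TOTAL_TBS = 576   # TB/s ("up to")
NVLINK_BW_TOTAL_TBS = 130    # TB/s

# Per-GPU HBM3e capacity as quoted in the feature summary.
HBM_PER_GPU_GB = 288

# Derived per-device values (assumption: totals divide evenly across devices).
cores_per_cpu = CPU_CORES_TOTAL // CPU_COUNT
gpus_per_cpu = GPU_COUNT // CPU_COUNT
gpu_mem_total_tb = GPU_COUNT * HBM_PER_GPU_GB / 1000   # decimal TB
gpu_mem_bw_per_gpu_tbs = GPU_MEM_BW_TOTAL_TBS / GPU_COUNT
nvlink_bw_per_gpu_tbs = NVLINK_BW_TOTAL_TBS / GPU_COUNT

print(f"Cores per Grace CPU:        {cores_per_cpu}")
print(f"GPUs per Grace CPU:         {gpus_per_cpu}")
print(f"Total GPU memory:           {gpu_mem_total_tb:.1f} TB")
print(f"HBM bandwidth per GPU:      {gpu_mem_bw_per_gpu_tbs:.1f} TB/s")
print(f"NVLink bandwidth per GPU:   {nvlink_bw_per_gpu_tbs:.2f} TB/s")
```

The computed GPU memory total of about 20.7 TB is consistent with the "Up to 21 TB" figure in the table.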

Rack Specifications
Dimensions (W x H x D)	600mm x 2285mm x 1200mm (23.62” x 89.96” x 47.24”)
Weight	1,360–1,590 kg (3,000–3,500 lbs)
NVL Config	72x1
NV OOB Switch	Option 1: 2x SN2201_M; Option 2: 3x SN2201_M; Option 3: 4x SN2201_M
NVL Cartridge	4
Rack Type (Per Rack)	9x 1U NVLink Switch Trays; 18x 1U Compute Trays; 8x 1U Power Shelves
Power-Shelf	(6+2)x 33kW
Busbar	1,400A
Rack Manifold	Option 1: 44RU, BF; Option 2: 44RU, TF
CDU	Option 1: L2L In-Row; Option 2: L2L In-Rack; Option 3: L2A Sidecar
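
As a rough illustration of the power-shelf and tray figures above, the sketch below computes the installed and redundancy-protected power budget and the rack units occupied by the listed 1U trays. It assumes that "(6+2)" means six load-carrying shelves plus two redundant ones; the datasheet does not spell this out, so treat the redundant budget as an estimate.

```python
# Figures quoted in the Rack Specifications table.
SHELF_KW = 33
ACTIVE_SHELVES = 6       # assumption: shelves that carry the rated load
REDUNDANT_SHELVES = 2    # assumption: spare shelves in the (6+2) scheme
TRAYS_1U = {"nvlink_switch": 9, "compute": 18, "power_shelf": 8}

total_shelves = ACTIVE_SHELVES + REDUNDANT_SHELVES
installed_kw = total_shelves * SHELF_KW        # all shelves populated
redundant_budget_kw = ACTIVE_SHELVES * SHELF_KW  # usable with two shelves in reserve
occupied_u = sum(TRAYS_1U.values())            # rack units used by the listed trays

print(f"Power shelves:            {total_shelves}")
print(f"Installed capacity:       {installed_kw} kW")
print(f"Budget with redundancy:   {redundant_budget_kw} kW")
print(f"1U trays listed:          {occupied_u} U")
```

Under these assumptions, the eight shelves match the "8x 1U Power Shelves" entry, and the listed trays account for 35 U of the rack.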

Compute Tray
CPU/GPU	2x Grace CPUs + 4x Blackwell Ultra GPUs
Count	18 per Rack
Cooling	CPU, GPU, and CX8 liquid-cooled; other components air-cooled
Storage	8x E1.S NVMe SSDs
M.2	1x M.2 NVMe SSD
Front I/O	1x USB 3.0, 1x Mgmt I/O, 1x RJ45, 1x Mini DisplayPort
N-S Networking	1x FHFL PCIe Gen 5 x16 (BF3)
E-W Networking	2x CX7 Mezzanine on board (2x CX8 per Mezzanine) for 4x 800G OSFP connectors
Fan	CPU region: 8x 12V 4056 hot-swap fans with N+1 redundancy
Management	DC-SCM BMC management module
TPM	Supports TPM 2.0
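
The per-tray figures above roll up to the rack-level configuration as follows. This is a minimal sketch using only numbers from this datasheet; it assumes all 18 compute trays are identically populated and that each 800G OSFP connector serves one GPU.

```python
# Per-tray figures quoted in the Compute Tray table.
TRAYS_PER_RACK = 18
CPUS_PER_TRAY = 2
GPUS_PER_TRAY = 4
E1S_SSDS_PER_TRAY = 8
OSFP_PORTS_PER_TRAY = 4
OSFP_PORT_GBPS = 800

# Rack-level roll-up (assumption: every tray is identically populated).
rack_cpus = TRAYS_PER_RACK * CPUS_PER_TRAY
rack_gpus = TRAYS_PER_RACK * GPUS_PER_TRAY
rack_e1s_ssds = TRAYS_PER_RACK * E1S_SSDS_PER_TRAY

# East-west (scale-out) bandwidth.
tray_ew_gbps = OSFP_PORTS_PER_TRAY * OSFP_PORT_GBPS
ew_gbps_per_gpu = tray_ew_gbps / GPUS_PER_TRAY
rack_ew_tbps = TRAYS_PER_RACK * tray_ew_gbps / 1000

print(f"Grace CPUs per rack:      {rack_cpus}")
print(f"Blackwell Ultra GPUs:     {rack_gpus}")
print(f"E1.S NVMe SSDs per rack:  {rack_e1s_ssds}")
print(f"E-W bandwidth per tray:   {tray_ew_gbps} Gb/s")
print(f"E-W bandwidth per GPU:    {ew_gbps_per_gpu:.0f} Gb/s")
print(f"E-W bandwidth per rack:   {rack_ew_tbps:.1f} Tb/s")
```

The roll-up reproduces the 72 GPU and 36 CPU rack configuration and the 800 Gb/s per GPU scale-out figure quoted for the ConnectX-8 SuperNIC.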

Ultra Tensor Cores: 2x attention-layer acceleration and a 1.5x AI compute FLOPS uplift. Optimized for trillion-scale LLM and mixture-of-experts (MoE) workloads.

Expanded HBM3e Memory: 288 GB per GPU, 1.5x larger than the previous generation. Improved context-length support and inference efficiency.

NVIDIA ConnectX-8 SuperNIC: Dual-device I/O module delivering 800 Gb/s per GPU. Enables high-throughput, low-latency scale-out networking.

Rack-Scale Liquid Cooling: Modular CDU options (In-Row, In-Rack, Sidecar). Full-lifecycle DLC service for deployment and operations.