KRS8000V4

Liquid-Cooled Rack-Scale Agentic AI Solution based on NVIDIA Vera Rubin NVL72

The Aivres KRS8000V4 rack-scale AI solution is based on the NVIDIA Vera Rubin NVL72 architecture, integrating 36 NVIDIA Vera CPUs and 72 NVIDIA Rubin GPUs into a fully liquid-cooled design. Optimized for trillion-parameter LLM training, large-scale inference, and emerging agentic AI workloads, KRS8000V4 builds on the architectural advancements of NVIDIA Rubin to deliver significantly improved GPU efficiency and inference cost economics, establishing a scalable foundation for next-generation AI factories.

4x

GPU Performance Efficiency vs GB300

10x

Token Cost Efficiency vs GB300

Accelerating Agentic and Next-Generation AI Applications

Large Language Model Training & Inference

Training and deployment of trillion-parameter transformer and MoE-based models

Agentic AI & Deep Reasoning Systems

Highly interactive AI agents requiring low latency and optimized cost per token

AI Factory & HPC Deployments

Scalable AI factory infrastructures spanning single-rack to multi-rack systems

Rubin-Based AI Compute Architecture

NVIDIA Rubin GPUs with next-generation tensor cores optimized for trillion-scale LLM and MoE workloads

Designed to maximize compute density within a single NVL72 rack


High-Bandwidth Memory & Data Path

Expanded GPU memory capacity and bandwidth to support long-context inference

75 TB fast memory tier for checkpointing, data staging, and KV-cache expansion


Rack-Scale & Data Center–Scale Interconnect

NVLink™ 6 switch system enabling low-latency, rack-scale GPU communication

NVIDIA ConnectX-9 and BlueField-4 enabling InfiniBand/Ethernet scale-out for AI factories


Liquid-Cooled, Deployment-Ready Design

Fully liquid-cooled compute and switch trays optimized for high power density

Modular rack, manifold, and CDU options to support diverse data center environments

Specifications

NVIDIA® Vera Rubin NVL72
Configuration 72x Rubin GPUs, 36x Vera CPUs
NVLink Bandwidth 260 TB/s
NVLink-C2C Bandwidth 65 TB/s
Fast Memory 75 TB
GPU Memory | Bandwidth 20.7 TB | 1,580 TB/s
CPU Memory | Bandwidth 54 TB LPDDR5X
CPU Core Count 3,168 NVIDIA Olympus cores
NVFP4 Inference 3,600 PFLOPS
NVFP4 Training 2,520 PFLOPS
FP8/FP6 Training 1,260 PFLOPS
INT8 18 POPS
FP16/BF16 18 POPS
TF32 144 PFLOPS
FP32 9,360 TFLOPS
FP64 2,400 TFLOPS
FP32 SGEMM 28,800 TFLOPS
FP64 DGEMM 14,400 TFLOPS
Rack Specifications
Dimensions (W x H x D) 600mm x 2300mm x 1200mm (23.62” x 90.6” x 47.2”)
Weight ~1,600 kg (~3,527.4 lbs)
NVL Config 72
NV OOB Switch Option 1: 2x SN2201_M; Option 2: 3x SN2201_M; Option 3: 4x SN2201_M
NVL Cartridge 4
Rack Type (Per Rack) 9x 1U NVLink Switch Trays; 18x 1U Compute Trays; 4x 3U Power Shelves
Power-Shelf (3+1) x 3U 110kW
Power Cap Shelf (Option) Up to 4 x 1U
Busbar 5,000A+
Rack Manifold Option 1: VR MGX Rack Manifold – Bottom Feed; Option 2: VR MGX Rack Manifold – Top Feed
CDU Option 1: L2L In-Row; Option 2: L2A Sidecar
Compute Tray
CPU/GPU 2 x Vera CPUs + 4 x Rubin GPUs
Cooling 1U fully liquid-cooled
Data Storage 4 x E1.S NVMe SSDs
Boot Storage 1 x E1.S NVMe SSD
Front I/O 1 x USB 3.0 Type-C, 1 x Mgmt I/O, 1 x RJ45
N-S Networking 1 x PCIe Gen6 X16 (BlueField-4)
E-W Networking 4 × ConnectX-9 modules (8 × 800G OSFP)
Management DC-SCM BMC
TPM Supports TPM 2.0
Switch Tray
Type N6100_LD
Bandwidth 72x400Gb/s
Cooling 1U fully liquid-cooled
Front I/O 2 x RJ45, 1 x USB Type-C, 1 x BMC ETH, 2 x CPU ETH
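
The rack-level figures above can be cross-checked against the per-tray configuration and broken down per device. The following is a minimal Python sketch, not vendor tooling: the function and key names are hypothetical, the inputs are copied from the specification rows above, decimal units (1 TB = 1,000 GB) are assumed, and an even split across 72 GPUs and 36 CPUs is assumed. Because the published totals are rounded, the per-device results are approximations.

```python
# Figures copied from the specification table; all names are illustrative.
COMPUTE_TRAYS = 18          # 18x 1U Compute Trays per rack
CPUS_PER_TRAY = 2           # 2 x Vera CPUs per compute tray
GPUS_PER_TRAY = 4           # 4 x Rubin GPUs per compute tray
OSFP_PORTS_PER_TRAY = 8     # 8 x 800G OSFP (E-W networking)
OSFP_PORT_GBPS = 800

RACK = {
    "gpus": 72,
    "cpus": 36,
    "gpu_memory_tb": 20.7,
    "gpu_memory_bw_tbps": 1580,
    "nvlink_bw_tbps": 260,
    "nvfp4_inference_pflops": 3600,
    "cpu_cores": 3168,
}


def tray_totals(trays=COMPUTE_TRAYS):
    """CPU and GPU counts implied by the per-tray configuration."""
    return {"cpus": trays * CPUS_PER_TRAY, "gpus": trays * GPUS_PER_TRAY}


def per_unit(rack=RACK):
    """Approximate per-GPU and per-CPU figures, assuming an even split."""
    gpus, cpus = rack["gpus"], rack["cpus"]
    return {
        # Assumes decimal units: 1 TB = 1,000 GB.
        "gpu_memory_gb": rack["gpu_memory_tb"] * 1000 / gpus,
        "gpu_memory_bw_tbps": rack["gpu_memory_bw_tbps"] / gpus,
        "nvlink_bw_tbps": rack["nvlink_bw_tbps"] / gpus,
        "nvfp4_inference_pflops": rack["nvfp4_inference_pflops"] / gpus,
        "cores_per_cpu": rack["cpu_cores"] // cpus,
    }


def east_west_gbps(trays=COMPUTE_TRAYS):
    """Aggregate E-W port bandwidth as (per tray, per rack), in Gb/s."""
    per_tray = OSFP_PORTS_PER_TRAY * OSFP_PORT_GBPS
    return per_tray, per_tray * trays


if __name__ == "__main__":
    print("Tray-derived totals:", tray_totals())
    for name, value in per_unit().items():
        print(f"{name}: {value:.2f}")
    print("E-W Gb/s (tray, rack):", east_west_gbps())
```

The tray-derived totals (18 trays with 2 CPUs and 4 GPUs each) reproduce the 36 CPUs and 72 GPUs listed in the configuration row, which is a quick consistency check when adapting these numbers for capacity planning.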
