Training and deployment of trillion-parameter transformer and MoE-based models
The Aivres KRS8000V4 rack-scale AI solution is based on the NVIDIA Vera Rubin NVL72 architecture, integrating 36 NVIDIA Vera CPUs and 72 NVIDIA Rubin GPUs into a fully liquid-cooled design. Optimized for trillion-parameter LLM training, large-scale inference, and emerging agentic AI workloads, KRS8000V4 builds on the architectural advancements of NVIDIA Rubin to deliver significantly improved GPU efficiency and inference cost economics, establishing a scalable foundation for next-generation AI factories.
GPU Performance Efficiency vs GB300
Token Cost Efficiency vs GB300
Accelerating Agentic and Next-Generation AI Applications
Rubin-Based AI Compute Architecture
- NVIDIA Rubin GPUs with next-generation tensor cores optimized for trillion-scale LLM and MoE workloads
- Designed to maximize compute density within a single NVL72 rack

High-Bandwidth Memory & Data Path
- Expanded GPU memory capacity and bandwidth to support long-context inference
- 75 TB fast memory tier for checkpointing, data staging, and KV-cache expansion

Rack-Scale & Data Center–Scale Interconnect
- NVLink™ 6 switch system enabling low-latency, rack-scale GPU communication
- NVIDIA ConnectX-9 and BlueField-4 enabling InfiniBand/Ethernet scale-out for AI factories

Liquid-Cooled, Deployment-Ready Design
- Fully liquid-cooled compute and switch trays optimized for high power density
- Modular rack, manifold, and CDU options to support diverse data center environments

Specifications
| NVIDIA® Vera Rubin NVL72 | |
|---|---|
| Configuration | 72x Rubin GPUs, 36x Vera CPUs |
| NVLink Bandwidth | 260 TB/s |
| NVLink-C2C Bandwidth | 65 TB/s |
| Fast Memory | 75 TB |
| GPU Memory / Bandwidth | 20.7 TB / 1,580 TB/s |
| CPU Memory | 54 TB LPDDR5X |
| CPU Core Count | 3,168 NVIDIA Olympus cores |
| NVFP4 Inference | 3,600 PFLOPS |
| NVFP4 Training | 2,520 PFLOPS |
| FP8/FP6 Training | 1,260 PFLOPS |
| INT8 | 18 POPS |
| FP16/BF16 | 288 PFLOPS |
| TF32 | 144 PFLOPS |
| FP32 | 9,360 TFLOPS |
| FP64 | 2,400 TFLOPS |
| FP32 SGEMM | 28,800 TFLOPS |
| FP64 DGEMM | 14,400 TFLOPS |
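
The rack-level totals above can be broken down into rough per-device averages, which is often where capacity planning starts. The sketch below is only arithmetic on the table's own numbers. The dictionary keys and the `per_device` helper are illustrative names, and the results are averages, not published per-GPU or per-CPU specifications.

```python
# Rack-level totals copied from the specification table.
RACK = {
    "gpus": 72,
    "cpus": 36,
    "gpu_memory_tb": 20.7,
    "gpu_memory_bw_tb_s": 1580,
    "nvlink_bw_tb_s": 260,
    "cpu_cores": 3168,
    "nvfp4_inference_pflops": 3600,
}


def per_device(total: float, count: int) -> float:
    """Average share of a rack-level total per device (hypothetical helper)."""
    return total / count


per_gpu = {
    # 1 TB is treated as 1000 GB here (decimal units, an assumption).
    "memory_gb": per_device(RACK["gpu_memory_tb"] * 1000, RACK["gpus"]),
    "memory_bw_tb_s": per_device(RACK["gpu_memory_bw_tb_s"], RACK["gpus"]),
    "nvlink_bw_tb_s": per_device(RACK["nvlink_bw_tb_s"], RACK["gpus"]),
    "nvfp4_inference_pflops": per_device(
        RACK["nvfp4_inference_pflops"], RACK["gpus"]
    ),
}
cores_per_cpu = per_device(RACK["cpu_cores"], RACK["cpus"])

for name, value in per_gpu.items():
    print(f"per GPU {name}: {value:.2f}")
print(f"cores per CPU: {cores_per_cpu:.0f}")
```

With these inputs, each GPU averages 287.5 GB of memory and 50 PFLOPS of NVFP4 inference throughput, and each Vera CPU works out to 88 cores.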

| Rack Specifications | |
|---|---|
| Dimensions (W x H x D) | 600mm x 2300mm x 1200mm (23.62” x 90.6” x 47.2”) |
| Weight | ~1,600 kg (~3,527.4 lbs) |
| NVL Config | 72 |
| NV OOB Switch | Option 1: 2x SN2201_M; Option 2: 3x SN2201_M; Option 3: 4x SN2201_M |
| NVL Cartridge | 4 |
| Rack Type (Per Rack) | 9x 1U NVLink Switch Trays; 18x 1U Compute Trays; 4x 3U Power Shelves |
| Power-Shelf | (3+1) x 3U 110kW |
| Power Cap Shelf (Option) | Up to 4 x 1U |
| Busbar | 5,000A+ |
| Rack Manifold | Option 1: VR MGX Rack Manifold – Bottom Feed; Option 2: VR MGX Rack Manifold – Top Feed |
| CDU | Option 1: L2L In-Row; Option 2: L2A Sidecar |
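
The tray and shelf counts in the rack table imply a rack-unit budget and a power budget. The following is a minimal sketch of that arithmetic. The `TrayGroup` class and all variable names are hypothetical, and the power figure assumes that "(3+1) x 3U 110kW" means four 110 kW shelves with one held as a redundant spare, which the table does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrayGroup:
    """A group of identical rack-mounted units (illustrative type)."""
    name: str
    count: int
    height_u: int


# Counts and heights taken from the "Rack Type (Per Rack)" row.
BASE_CONTENTS = [
    TrayGroup("NVLink switch tray", 9, 1),
    TrayGroup("Compute tray", 18, 1),
    TrayGroup("Power shelf", 4, 3),
]
# Optional item from the "Power Cap Shelf (Option)" row, at its maximum.
OPTIONAL_CONTENTS = [TrayGroup("Power cap shelf", 4, 1)]


def occupied_u(groups: list[TrayGroup]) -> int:
    """Total rack units taken up by the given groups."""
    return sum(g.count * g.height_u for g in groups)


base_u = occupied_u(BASE_CONTENTS)
max_u = base_u + occupied_u(OPTIONAL_CONTENTS)

# Assumption: 3 active shelves + 1 redundant shelf, 110 kW each.
SHELF_KW = 110
ACTIVE_SHELVES = 3
usable_kw = ACTIVE_SHELVES * SHELF_KW

print(f"base contents: {base_u}U, with all optional shelves: {max_u}U")
print(f"usable power under the assumed 3+1 reading: {usable_kw} kW")
```

Under these assumptions the listed trays and shelves occupy 39U, or 43U with all four optional power cap shelves, and the usable power is 330 kW.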

| Compute Tray | |
|---|---|
| CPU/GPU | 2 x Vera CPUs + 4 x Rubin GPUs |
| Cooling | 1U fully liquid-cooled |
| Data Storage | 4 x E1.S NVMe SSDs |
| Boot Storage | 1 x E1.S NVMe SSD |
| Front I/O | 1 x USB 3.0 type-C, 1x Mgmt I/O, 1x RJ45 |
| N-S Networking | 1 x PCIe Gen6 X16 (BlueField-4) |
| E-W Networking | 4 × ConnectX-9 modules (8 × 800G OSFP) |
| Management | DC-SCM BMC |
| TPM | Supports TPM 2.0 |

| Switch Tray | |
|---|---|
| Type | N6100_LD |
| Bandwidth | 72 x 400 Gb/s |
| Cooling | 1U fully liquid-cooled |
| Front IO | 2 x RJ45, 1 x USB type-C, 1 x BMC ETH, 2 x CPU ETH |
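
As a consistency check, the per-tray figures can be rolled up to the rack and compared with the headline configuration of 72 GPUs and 36 CPUs. The sketch below does that and also estimates east-west (scale-out) bandwidth. It assumes that "8 × 800G OSFP" is the total port count per compute tray and not a per-module count. All constant names are illustrative.

```python
# Per-tray values from the Compute Tray table; tray count from the rack table.
COMPUTE_TRAYS = 18
CPUS_PER_TRAY = 2
GPUS_PER_TRAY = 4

# Assumption: 8 x 800G OSFP is the east-west port total for one tray.
EW_PORTS_PER_TRAY = 8
EW_PORT_GBPS = 800

rack_cpus = COMPUTE_TRAYS * CPUS_PER_TRAY
rack_gpus = COMPUTE_TRAYS * GPUS_PER_TRAY

ew_gbps_per_tray = EW_PORTS_PER_TRAY * EW_PORT_GBPS
ew_gbps_per_gpu = ew_gbps_per_tray // GPUS_PER_TRAY
ew_gbps_per_rack = COMPUTE_TRAYS * ew_gbps_per_tray

# The roll-up should match the rack's stated configuration.
assert (rack_cpus, rack_gpus) == (36, 72)

print(f"rack totals: {rack_cpus} CPUs, {rack_gpus} GPUs")
print(f"east-west per tray: {ew_gbps_per_tray / 1000} Tb/s")
print(f"east-west per GPU: {ew_gbps_per_gpu} Gb/s")
print(f"east-west per rack: {ew_gbps_per_rack / 1000} Tb/s")
```

The tray counts reproduce the 36 CPU and 72 GPU totals. Under the stated port assumption, each tray has 6.4 Tb/s of east-west bandwidth, or 1,600 Gb/s per GPU.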