AceleMax DGS-428AS

4U Dual AMD EPYC Processor 8x NVIDIA A100 SXM4 GPU Server

High-density 4U system with eight NVIDIA HGX A100 SXM4 GPUs
High-bandwidth GPU peer-to-peer communication via NVIDIA NVLink
Lower latency with direct-attached GPUs and next-gen PCIe 4.0 support
Eight NICs for GPU Direct Attach RDMA and four NVMe drives for GPU Direct Storage
Supports two AMD EPYC™ 7002 or 7003 series processors

Request a Quote

Reference # 4B28330

Overview

The AceleMax DGS-428AS is a high-density 4U system with eight NVIDIA® HGX™ A100 40GB or 80GB SXM4 GPUs and dual AMD EPYC 7002 or 7003 series processors. It supports high-bandwidth GPU peer-to-peer communication via NVIDIA NVLink, direct-attached GPUs over next-gen PCIe 4.0, eight NICs for GPU Direct Attach RDMA, and four NVMe drives for GPU Direct Storage. Together these deliver massive parallel computing power that helps customers accelerate their digital transformation.

NVIDIA A100 SXM for HGX

The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration and flexibility to power the world’s highest-performing elastic data centers for AI, data analytics, and HPC applications. As the engine of the NVIDIA data center platform, the A100 GPU provides up to 20X higher performance and 2.5X higher AI performance than V100 GPUs. It can efficiently scale up to thousands of GPUs, or be partitioned into seven isolated GPU instances with the new Multi-Instance GPU (MIG) capability, to accelerate workloads of all sizes.

The NVIDIA A100 GPU features third-generation Tensor Core technology that supports a broad range of math precisions, providing a unified workload accelerator for data analytics, AI training, AI inference, and HPC. It also introduces several new features: Multi-Instance GPU (MIG), which improves utilization with right-sized GPU partitions and up to seven simultaneous instances per GPU; sparsity acceleration, which exploits sparsity in AI models for up to 2X AI performance; and third-generation NVLink and NVSwitch, which enable efficient scaling into a "super GPU" with 2X the bandwidth of the V100 GPU. Accelerating both scale-up and scale-out workloads on one platform enables elastic data centers that can dynamically adjust to shifting application workload demands. This simultaneously boosts throughput and drives down the cost of data centers.
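As a rough illustration of the MIG figures above, the sketch below counts how many isolated GPU instances a fully populated chassis could expose. It is arithmetic only and assumes nothing beyond the numbers quoted on this page (eight GPUs, up to seven instances each); the function names are hypothetical and are not part of any NVIDIA tool.

```python
# Illustrative arithmetic only; the constants restate figures quoted on this page.
GPUS_PER_SYSTEM = 8          # eight A100 SXM4 GPUs in the chassis
MIG_INSTANCES_PER_GPU = 7    # maximum MIG instances per A100


def max_mig_instances(gpus: int = GPUS_PER_SYSTEM,
                      per_gpu: int = MIG_INSTANCES_PER_GPU) -> int:
    """Upper bound on isolated GPU instances if every GPU is fully partitioned."""
    return gpus * per_gpu


def mixed_layout(partitioned_gpus: int, gpus: int = GPUS_PER_SYSTEM) -> dict:
    """Hypothetical split: some GPUs partitioned seven ways, the rest left whole."""
    if not 0 <= partitioned_gpus <= gpus:
        raise ValueError("partitioned_gpus must be between 0 and the GPU count")
    return {
        "mig_instances": partitioned_gpus * MIG_INSTANCES_PER_GPU,
        "whole_gpus": gpus - partitioned_gpus,
    }


if __name__ == "__main__":
    print(max_mig_instances())   # 56
    print(mixed_layout(2))       # {'mig_instances': 14, 'whole_gpus': 6}
```

A mixed layout like the second call is one way to serve many small inference jobs while keeping whole GPUs free for training; the actual partition sizes available depend on the MIG profiles the installed driver offers.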

Combined with the NVIDIA software stack, the A100 GPU accelerates all major deep learning and data analytics frameworks and over 700 HPC applications. NVIDIA NGC, a hub for GPU-optimized software containers for AI and HPC, simplifies application deployments so researchers and developers can focus on building their solutions.

Applications:

HPC, VDI, machine intelligence, deep learning, machine learning, artificial intelligence, neural networks, advanced rendering, and compute.

Product Highlights

SYSTEM

4U Rackmount

PROCESSORS

Dual AMD EPYC™ 7002 or 7003 Processors

GPU

NVIDIA HGX A100 8-GPU, 40GB or 80GB

MEMORY

32 DIMM slots, up to 8TB of DDR4-3200 memory

DRIVES

6x U.2 NVMe (4x via PCIe switch, 2x via CPU)
2x M.2 NVMe

I/O

8x PCIe 4.0 x16 from PCIe switch
2x PCIe 4.0 x16 LP from CPUs
AIOM support

COOLING FANS

4x Removable heavy-duty fans

POWER SUPPLIES

3000W Redundant Power Supplies, Titanium Level

Specifications

Processor

Dual AMD EPYC™ 7002 or 7003 series processors, 7nm, Socket SP3, up to 64 cores, 128 threads, and 256MB L3 cache per processor
TDP up to 280W

Memory

32x DDR4 DIMM slots
8-Channel memory architecture
Up to 8TB RDIMM/LRDIMM DDR4-3200 memory
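The memory figures above imply a few derived numbers that can help when sizing a configuration. The sketch below works them out; it assumes the 8-channel architecture is per processor socket and that each DDR4 channel has a 64-bit (8-byte) data path, and the bandwidth it prints is a theoretical peak rather than a measured value. Whether DDR4-3200 speed is sustained with two DIMMs per channel depends on the platform and DIMM type.

```python
# Figures quoted in the specification list.
SOCKETS = 2
DIMM_SLOTS = 32
CHANNELS_PER_SOCKET = 8      # assumption: "8-channel" is per processor socket
MAX_MEMORY_TB = 8
TRANSFER_RATE_MT_S = 3200    # DDR4-3200

# Assumption: standard 64-bit DDR4 channel, i.e. 8 bytes per transfer.
BYTES_PER_TRANSFER = 8


def dimms_per_channel() -> int:
    """DIMM slots sharing each memory channel."""
    return DIMM_SLOTS // (SOCKETS * CHANNELS_PER_SOCKET)


def max_dimm_size_gb() -> int:
    """Largest DIMM needed to reach the quoted maximum capacity."""
    return MAX_MEMORY_TB * 1024 // DIMM_SLOTS


def peak_bandwidth_gb_s_per_socket() -> float:
    """Theoretical peak memory bandwidth per socket, in GB/s (10^9 bytes/s)."""
    mb_per_s = CHANNELS_PER_SOCKET * TRANSFER_RATE_MT_S * BYTES_PER_TRANSFER
    return mb_per_s / 1000


if __name__ == "__main__":
    print(dimms_per_channel())                 # 2
    print(max_dimm_size_gb())                  # 256
    print(peak_bandwidth_gb_s_per_socket())    # 204.8
```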

Graphics Processing Unit (GPU):

Supports 8 NVIDIA A100 40GB or 80GB SXM4 GPUs
Supports PCIe Gen4: 64 GB/sec
Third-generation NVIDIA® NVLink®: 600 GB/sec interconnect interface
Up to 7 Multi-Instance GPU (MIG) instances per GPU
Delivers 100% of rated A100 performance for top applications
With eight A100 SXM4 GPUs in a 4U chassis: up to 55,296 FP32 CUDA cores, 27,648 FP64 CUDA cores, and 3,456 Tensor Cores; 77.6 TF peak FP64 double-precision performance; 156 TF peak FP64 Tensor Core double-precision performance; 156 TF peak FP32 single-precision performance; 2,496 TF peak BFloat16 performance; 2,496 TF peak FP16 Tensor Core half-precision performance; 9,984 TOPS peak INT8 Tensor Core inference performance (with sparsity); and 320GB of GPU memory with 40GB GPUs (640GB with 80GB GPUs)
On-board Aspeed AST2600 graphics controller
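The system-level GPU figures in this list are per-GPU figures multiplied by eight. The sketch below reproduces that arithmetic as a sanity check. The per-GPU values are assumptions taken from NVIDIA's published A100 SXM4 specifications, not from this page; note that the INT8 value used is NVIDIA's sparsity-enabled rating, which is the one that matches the 9,984 TOPS quoted here.

```python
import math

GPU_COUNT = 8

# Per-GPU figures (assumption: NVIDIA's published A100 SXM4 specifications).
PER_GPU = {
    "fp32_cuda_cores": 6912,
    "fp64_cuda_cores": 3456,
    "tensor_cores": 432,
    "fp64_tflops": 9.7,
    "fp64_tensor_tflops": 19.5,
    "fp32_tflops": 19.5,
    "bf16_tensor_tflops": 312,
    "fp16_tensor_tflops": 312,
    "int8_tensor_tops_sparse": 1248,   # sparsity-enabled rating
}

# System-level figures as quoted in the specification list.
QUOTED = {
    "fp32_cuda_cores": 55296,
    "fp64_cuda_cores": 27648,
    "tensor_cores": 3456,
    "fp64_tflops": 77.60,
    "fp64_tensor_tflops": 156,
    "fp32_tflops": 156,
    "bf16_tensor_tflops": 2496,
    "fp16_tensor_tflops": 2496,
    "int8_tensor_tops_sparse": 9984,
}


def aggregate(per_gpu: dict, gpu_count: int = GPU_COUNT) -> dict:
    """Scale each per-GPU figure by the number of GPUs in the chassis."""
    return {name: value * gpu_count for name, value in per_gpu.items()}


def mismatches(computed: dict, quoted: dict) -> list:
    """Names of figures where the computed total differs from the quoted one."""
    return [name for name in quoted
            if not math.isclose(computed[name], quoted[name], rel_tol=1e-9)]


def total_gpu_memory_gb(per_gpu_gb: int, gpu_count: int = GPU_COUNT) -> int:
    """Total GPU memory for a given per-GPU capacity (40 or 80 GB)."""
    return per_gpu_gb * gpu_count


if __name__ == "__main__":
    print(mismatches(aggregate(PER_GPU), QUOTED))   # []
    print(total_gpu_memory_gb(40))                  # 320
    print(total_gpu_memory_gb(80))                  # 640
```

These are peak theoretical totals; delivered performance depends on the workload and on how well it scales across the eight GPUs.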

Expansion Slots

8x PCIe 4.0 x16 from PCIe switch
2x PCIe 4.0 x16 LP from CPUs

Server Management

Supports Intelligent Platform Management Interface (IPMI) v2.0
IPMI 2.0 with virtual media over LAN and KVM-over-LAN support
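Because the management controller speaks IPMI 2.0, it can typically be queried over the management LAN with a standard client such as the open-source `ipmitool`. The sketch below is a minimal example under several assumptions: `ipmitool` is installed on the admin workstation, the BMC address and user name are hypothetical placeholders, and the password is supplied through the `IPMI_PASSWORD` environment variable (which `ipmitool` reads when given `-E`). It is not vendor-supplied tooling.

```python
import subprocess


def build_ipmitool_command(host: str, user: str, *args: str) -> list:
    """Build an argv list for ipmitool over the IPMI 2.0 'lanplus' interface.

    -E makes ipmitool read the password from the IPMI_PASSWORD environment
    variable, so it never appears on the command line.
    """
    if not args:
        raise ValueError("at least one ipmitool subcommand is required")
    return ["ipmitool", "-I", "lanplus", "-H", host, "-U", user, "-E", *args]


def chassis_status(host: str, user: str) -> str:
    """Run 'ipmitool chassis status' against the BMC and return its output."""
    cmd = build_ipmitool_command(host, user, "chassis", "status")
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Hypothetical BMC address and user name, shown for illustration only.
    print(build_ipmitool_command("192.0.2.10", "admin", "chassis", "status"))
```

Only the command is printed here; calling `chassis_status` requires network access to a real BMC and valid credentials.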

Storage

6x U.2 NVMe bays (4 via PCIe switch, 2 via CPU)
2x M.2 NVMe bays

Network Controller

Dual RJ45 10GbE-aggregate host LAN
1x GbE management LAN

Power Supply

3000W Titanium level Redundant Power Supplies with PMBus

System Dimension

7.0″ x 17.2″ x 29″ / 178mm x 437mm x 737mm (H x W x D)
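For rack planning, the quoted dimensions can be cross-checked with a short conversion. The sketch below assumes the standard rack unit of 1.75 inches and rounds millimetres to the nearest whole number, as the figures above appear to do.

```python
MM_PER_INCH = 25.4
RACK_UNIT_INCHES = 1.75   # assumption: standard rack unit height (1U)


def inches_to_mm(inches: float) -> int:
    """Convert inches to millimetres, rounded to the nearest millimetre."""
    return round(inches * MM_PER_INCH)


def rack_units_to_inches(units: int) -> float:
    """Height in inches of a chassis occupying the given number of rack units."""
    return units * RACK_UNIT_INCHES


if __name__ == "__main__":
    # Quoted dimensions in inches (H x W x D).
    print([inches_to_mm(v) for v in (7.0, 17.2, 29)])   # [178, 437, 737]
    print(rack_units_to_inches(4))                       # 7.0
```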

Optimized for Turnkey Solutions

Enable powerful design, training, and visualization with built-in software tools including TensorFlow, Caffe, Torch, Theano, BIDMach, cuDNN, NVIDIA CUDA Toolkit, and NVIDIA DIGITS.