GPU overview
Networke's infrastructure is purpose-built for large-scale, GPU-accelerated workloads, specifically designed to support the most demanding AI and machine learning applications. Networke takes pride in being one of the few cloud platforms globally to offer NVIDIA's most advanced end-to-end AI supercomputing solutions.
NVIDIA HGX H100
The NVIDIA HGX H100, available on Networke, delivers up to 7x higher efficiency for high-performance computing (HPC) applications, up to 9x faster AI training on large models, and up to 30x faster AI inference compared to the NVIDIA HGX A100.
NVIDIA H200 On-Demand
The NVIDIA H200, available on Networke, offers cutting-edge performance for on-demand workloads. For models the size of Fairseq 2.7B or GPT Neo 2.7B and smaller, this GPU provides exceptional value and efficiency for less intensive inference tasks. For larger contexts or more demanding workloads, upgrading to higher-tier GPUs, such as the NVIDIA H200-SXM or H200 Tensor Core, can improve performance, especially when inference requests saturate GPU resources. These options are designed for high-performance needs, ensuring optimal efficiency and scalability for your use case.
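As a rough illustration of the kind of inference workload this tier suits, the sketch below loads GPT Neo 2.7B in FP16 on a single GPU using the Hugging Face transformers library. The library, checkpoint name, and prompt are illustrative assumptions, not Networke-specific tooling:

```python
# Minimal single-GPU inference sketch, assuming torch and transformers
# are installed and an NVIDIA GPU is available.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "EleutherAI/gpt-neo-2.7B"  # ~2.7B params; ~5.3 GB in FP16
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # FP16 fits comfortably in a single GPU's memory
).to("cuda")

inputs = tokenizer("The NVIDIA H200 is designed for", return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```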
The table below presents peak-performance specifications for the two NVIDIA HGX H100 configurations (4-GPU and 8-GPU):
| Specification | HGX H100 4-GPU | HGX H100 8-GPU |
| --- | --- | --- |
| FP64 | 134 TFLOPS | 268 TFLOPS |
| FP64 Tensor Core | 268 TFLOPS | 535 TFLOPS |
| FP32 | 268 TFLOPS | 535 TFLOPS |
| TF32 Tensor Core | 3,958 TFLOPS* | 7,915 TFLOPS* |
| FP16 Tensor Core | 7,915 TFLOPS* | 15,830 TFLOPS* |
| FP8 Tensor Core | 15,830 TFLOPS* | 31,662 TFLOPS* |
| INT8 Tensor Core | 15,830 TOPS* | 31,662 TOPS* |
| GPU Memory | 320 GB | 640 GB |
| Aggregate GPU Memory Bandwidth | 13 TB/s | 27 TB/s |
| Maximum Number of MIG Instances | 28 | 56 |
| NVIDIA NVLink | Fourth-generation, 900 GB/s | Fourth-generation, 900 GB/s |
| NVIDIA NVSwitch | N/A | Third-generation |
| NVSwitch GPU-GPU Bandwidth | N/A | 900 GB/s |
| In-Network Compute | N/A | 3.6 TFLOPS |
| Total Aggregate Network Bandwidth | 3.6 TB/s | 7.2 TB/s |
* with sparsity.
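To confirm which configuration an instance actually exposes, the device count and memory from the table can be verified at runtime. A minimal sketch using PyTorch (an assumption on our part, not a Networke-provided tool):

```python
# Query the GPUs visible to this instance and print their key properties.
import torch

if torch.cuda.is_available():
    for i in range(torch.cuda.device_count()):
        props = torch.cuda.get_device_properties(i)
        print(f"GPU {i}: {props.name}, "
              f"{props.total_memory / 1024**3:.0f} GB memory, "
              f"compute capability {props.major}.{props.minor}")
else:
    print("No CUDA-capable GPU detected.")
```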
The 8-GPU model provides significantly higher computational power and is better suited for highly demanding tasks that require intense GPU-GPU communication. It's ideal for large-scale AI training and for applications that involve massive data volumes.
The 4-GPU model, while still highly capable, targets somewhat less intensive tasks. It focuses on maximizing GPU density while minimizing required space and power.
Networke leverages the unparalleled speed of NVIDIA HGX H100 GPUs combined with the NVIDIA Quantum-2 InfiniBand platform, delivering the lowest GPUDirect network latency in the industry. This combination reduces AI model training times dramatically—from months to just days or even hours.
In a world where AI drives nearly every industry, these speeds and efficiencies are critical for high-performance computing (HPC) applications.
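In practice, training jobs tap this fabric through a collective-communication backend such as NCCL, which uses NVLink within a node and GPUDirect RDMA over InfiniBand across nodes when available. The sketch below shows a typical multi-node setup with PyTorch DistributedDataParallel; the stand-in model, training loop, and torchrun launch line are illustrative assumptions:

```python
# Minimal multi-node DDP sketch; launch with torchrun, e.g.:
#   torchrun --nnodes=2 --nproc-per-node=8 train.py
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend="nccl")  # NCCL routes over NVLink/InfiniBand
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    model = torch.nn.Linear(1024, 1024).cuda(local_rank)  # stand-in model
    model = DDP(model, device_ids=[local_rank])

    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
    for _ in range(10):  # stand-in training loop with random data
        x = torch.randn(32, 1024, device=f"cuda:{local_rank}")
        loss = model(x).square().mean()
        optimizer.zero_grad()
        loss.backward()   # DDP overlaps the gradient all-reduce with backward
        optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```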
The NVIDIA Transformer Engine, an open-source Python library, unlocks the power of the FP8 (8-bit floating point) format on GPUs built on the Hopper architecture, which underpins the HGX H100. This innovation enables faster, more memory-efficient AI model training on Networke's advanced infrastructure.
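A minimal sketch of what FP8 training with Transformer Engine looks like in practice: te.Linear, te.fp8_autocast, and the DelayedScaling recipe are part of the library's public PyTorch API, while the layer sizes and input tensor here are arbitrary placeholders:

```python
# FP8 forward/backward pass with NVIDIA Transformer Engine (Hopper GPU required).
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# HYBRID uses E4M3 for the forward pass and E5M2 for gradients.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

layer = te.Linear(4096, 4096, bias=True).cuda()  # drop-in FP8-aware linear layer
x = torch.randn(16, 4096, device="cuda", dtype=torch.bfloat16)

with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)  # matmul runs in FP8 on Hopper Tensor Cores

y.sum().backward()  # gradients flow through the FP8-aware layer as usual
```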