Nvidia unveils H200, its newest high-end chip for training AI models

Technology
Monday, November 13th, 2023 7:16 pm EST

Key Points

  • Introduction of the H200 GPU: Nvidia has revealed its latest graphics processing unit, the H200, designed for training and deploying artificial intelligence models, particularly those driving the generative AI field. This GPU represents an upgrade from the H100, previously used by OpenAI for training its advanced language model, GPT-4.
  • Key Improvements and Specifications: The H200 ships with 141GB of next-generation “HBM3e” memory, an upgrade that chiefly benefits “inference” tasks, where a trained model generates text, images, or predictions. Nvidia claims the H200 delivers output nearly twice as fast as its predecessor, the H100, based on a test using Meta’s Llama 2 LLM. The chip is expected to ship in the second quarter of 2024 and will compete with AMD’s MI300X GPU, which likewise offers extra memory for fitting large models on the chip during inference.
  • Market Dynamics and Future Prospects: The H200 is expected to see heavy demand, continuing the enthusiasm for Nvidia’s AI GPUs that has driven the company’s stock up more than 230% in 2023. Even so, the article suggests it may not hold the title of Nvidia’s fastest AI chip for long: the company has historically introduced a new architecture roughly every two years, and it has signaled a shift to a one-year release pattern amid heightened GPU demand, with plans to introduce the B100 chip, based on the forthcoming Blackwell architecture, in 2024.

Nvidia has introduced the H200, an advanced graphics processing unit (GPU) built for training and deploying the artificial intelligence (AI) models behind the generative AI boom. It is an upgrade over its predecessor, the H100, the chip OpenAI used to train its GPT-4 language model. H100 chips, estimated to cost between $25,000 and $40,000 each, are in high demand among major corporations, startups, and government agencies, which use them to build large AI models through a process called “training.”

Notably, the H200 incorporates 141GB of next-generation “HBM3e” memory, up from the H100’s 80GB, an upgrade that chiefly improves performance in “inference” tasks, where an already-trained model generates text, images, or predictions. Nvidia says the H200 generates output nearly twice as fast as the H100, based on a test using Meta’s Llama 2 LLM. The H200 is set to ship in the second quarter of 2024, where it will compete with AMD’s MI300X GPU, which takes a similar approach of adding memory to fit large models on the chip during inference.
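For context, measuring inference speed usually comes down to timing generated tokens per second. The sketch below shows that generic pattern with the Hugging Face transformers library; it is an illustrative stand-in, not Nvidia’s actual benchmark, and the model ID, prompt, and generation settings are assumptions (Llama 2 weights are gated and require approval from Meta on the Hugging Face Hub).

```python
# Illustrative tokens-per-second inference measurement, NOT Nvidia's benchmark.
# Model ID, prompt, and settings are assumptions for illustration only.
import time

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # assumed; any causal LM works here
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to fit in GPU memory
    device_map="auto",          # place weights on the available GPU(s)
)

prompt = "Explain what GPU inference means in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

start = time.perf_counter()
outputs = model.generate(**inputs, max_new_tokens=256)
elapsed = time.perf_counter() - start

new_tokens = outputs.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens} tokens in {elapsed:.2f}s -> {new_tokens / elapsed:.1f} tokens/sec")
```

A faster chip simply produces more new tokens in the same wall-clock window, which is the sense in which Nvidia’s “nearly twice as fast” claim is stated.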

Excitement around Nvidia’s AI GPUs has been a driving force behind the company’s stock, which has risen more than 230% in 2023, and Nvidia expects revenue of roughly $16 billion for its fiscal third quarter, up about 170% from a year earlier. Importantly, the H200 is compatible with systems already built around the H100, so AI companies using the earlier model can adopt the new chip without changing their server systems or software.
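That drop-in compatibility is plausible because GPU frameworks typically target “the current CUDA device” rather than a specific chip model. Here is a minimal sketch of the pattern, assuming PyTorch; it is a generic illustration, not code from Nvidia or the article.

```python
# Generic pattern: the same script runs unchanged whether the installed
# GPU is an H100 or an H200, because the code never names a chip model.
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if device.type == "cuda":
    props = torch.cuda.get_device_properties(device)
    # Prints e.g. "NVIDIA H100 80GB HBM3" today; an H200 after an upgrade.
    print(f"Running on {props.name} ({props.total_memory / 1e9:.0f} GB)")

x = torch.randn(4096, 4096, device=device)
y = x @ x  # identical code path regardless of GPU generation
print(y.sum().item())
```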

The H200 will be available in four-GPU and eight-GPU server configurations on Nvidia’s HGX complete systems, as well as in a chip called GH200, which pairs the H200 GPU with an Arm-based processor. Still, the H200’s reign as Nvidia’s fastest AI chip may be short-lived: in response to high GPU demand, the company has said it will move from its traditional two-year architecture cadence to a one-year release pattern, and it has already previewed the B100, a chip based on the forthcoming Blackwell architecture, for 2024. Future chips may well outpace the H200 soon after it ships.
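For a sense of how software sees those four- and eight-GPU configurations, the short sketch below enumerates the visible devices in a server; again this assumes PyTorch, and the counts and device names are whatever the machine reports, not specifics from the article.

```python
# Hedged sketch: list the GPUs visible in a multi-GPU server, e.g. a
# four- or eight-GPU HGX configuration. Generic PyTorch, not from the article.
import torch

count = torch.cuda.device_count()
print(f"{count} CUDA GPU(s) visible")

for i in range(count):
    props = torch.cuda.get_device_properties(i)
    print(f"  GPU {i}: {props.name}, {props.total_memory / 1e9:.0f} GB")
```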

For the full original article on CNBC, please click here: https://www.cnbc.com/2023/11/13/nvidia-unveils-h200-its-newest-high-end-chip-for-training-ai-models.html