Nvidia CEO Jensen Huang announces availability of “Hopper” GPU, cloud service for large AI language models

Nvidia co-founder and CEO Jensen Huang opened the company’s fall GTC conference by announcing the general availability of the company’s new “Hopper” GPU next month in systems from Dell and others. The keynote also introduced computers for healthcare, robotics, industrial automation and automotive applications, as well as several cloud services, including a cloud service hosted by Nvidia for deep learning language models such as GPT-3.


Nvidia CEO Jensen Huang opened the company’s fall GTC conference on Tuesday by announcing that the company’s “Hopper” graphics processing unit (GPU) is in volume production and will begin shipping next month in partner systems from Dell, Hewlett Packard Enterprise and Cisco Systems. Nvidia’s own systems with the GPU will be available in the first quarter of next year, Huang said.

The Hopper chip, also known as the H100, targets data center tasks such as artificial intelligence. Nvidia says the H100 can “dramatically” reduce the cost of deploying AI programs. For example, the work of 320 of the previous top-of-the-line GPU, the A100, can be done with just 64 H100 chips, which would require only one-fifth the number of server computers and cut power consumption by a factor of 3.5.
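The consolidation claim is straightforward arithmetic; a quick sketch (the GPU counts and power factor come from Nvidia's claim, while the GPUs-per-server figure is an illustrative assumption, typical of an 8-GPU DGX-style node):

```python
# Sketch of the consolidation arithmetic behind Nvidia's claim
# (illustrative only, not a benchmark).
a100_gpus = 320          # previous-generation GPUs for the workload (claimed)
h100_gpus = 64           # H100 GPUs said to match that work (claimed)
gpus_per_server = 8      # assumption: typical 8-GPU node

a100_servers = a100_gpus // gpus_per_server   # 40 nodes
h100_servers = h100_gpus // gpus_per_server   # 8 nodes

server_ratio = h100_servers / a100_servers    # 0.2, i.e. one-fifth
power_reduction_factor = 3.5                  # factor claimed by Nvidia

print(f"server count shrinks to {server_ratio:.0%} of before")  # prints "20%"
```

The same one-fifth ratio holds at the GPU level (64/320), so the server reduction follows directly when nodes hold a fixed number of GPUs.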

Hopper was first introduced by Nvidia in March. The company showed some benchmark results for the chip in the MLPerf suite of machine learning tasks this month.

Also: NVIDIA unveils Hopper, its new hardware architecture to turn data centers into AI factories

In addition to the availability of Hopper, Huang discussed a new “as-a-service” cloud offering, Nvidia NeMo LLM, for large language models, which will let customers run very large natural language processing models such as OpenAI’s GPT-3 and Megatron-Turing NLG 530B, the language model that Nvidia developed with Microsoft.

Also: Neural Magic’s sparsity, Nvidia’s Hopper, and Alibaba’s network among the firsts in the latest MLPerf AI benchmarks

The NeMo service, available next month as “early access,” offers GPT-3 and other models in pre-trained form. They can be tuned by a developer using a method Nvidia calls “prompt learning,” which Nvidia adapted from a technique that Google scientists introduced last year.
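Nvidia has not published the internals of NeMo's tuning here, but the general prompt-tuning technique the Google researchers introduced can be sketched: a small matrix of trainable "soft prompt" embeddings is prepended to each input's token embeddings while the large model's weights stay frozen. All names and dimensions below are illustrative, not Nvidia's API:

```python
import numpy as np

# Minimal sketch of prompt tuning: only the soft prompt is trained;
# the backbone model (represented here by its token embeddings) is frozen.
rng = np.random.default_rng(0)
embed_dim, prompt_len, seq_len = 16, 4, 10

soft_prompt = rng.normal(size=(prompt_len, embed_dim))       # trainable parameters
token_embeddings = rng.normal(size=(seq_len, embed_dim))     # from the frozen model

# The frozen model consumes the prompt-prefixed sequence.
model_input = np.concatenate([soft_prompt, token_embeddings], axis=0)
print(model_input.shape)  # (14, 16)
```

Because only `prompt_len * embed_dim` parameters are updated per task, a single frozen copy of a multi-billion-parameter model can serve many customers' tuned workloads, which is what makes the approach attractive for a hosted service.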

A version of NeMo hosted specifically for biomedical uses of large language models, such as drug discovery, is called the NVIDIA BioNeMo Service.


According to Nvidia, the Hopper GPU can reduce the required compute nodes to run large AI tasks by four-fifths compared to the previous data center GPU, the A100.


During a press briefing on the announcements, ZDNet asked Nvidia’s head of accelerated computing, Ian Buck, what guardrails the company will put in place to prevent abuse of large language models, which is well documented in the AI ethics literature.

“Yes, good question. This service is an entry point, going to EA [early access] starting next month, and it goes straight to the enterprise. We’re actually doing this to work with different companies to develop their workflows,” Buck said.

Buck added:

They’ll be doing everything in partnership with Nvidia, of course, and each user will be requesting access to the service, so we’re going to understand a little more about what they’re doing versus a generic, open-ended way. Again, we’re trying to focus on bringing big language models to businesses, and that’s our customers’ go-to-market.

ZDNet then remarked to Buck that the misuses documented in the large language model literature involve biases embedded in the training data, not necessarily malicious use.

“Are you claiming that, with access limited to corporate partners, there won’t be documented abuses such as bias in the training materials and the final product?” asked ZDNet.

Buck replied:

Yes. I mean, the customers are going to bring their own datasets into the domain training, so they certainly have some of that responsibility, and we all as a community have to have that responsibility. This stuff was trained on the internet. It needs to be understood and constrained to what it is now for domain specific solutions. The problem is a bit more limited as we are trying to provide a single service for a specific use case. That bias will persist. Humans have prejudices, and unfortunately this was naturally trained on input from humans.

The presentation also included dedicated computing platforms for a variety of industries, including healthcare, and for Omniverse, Nvidia’s software platform for the metaverse.

Platforms include IGX for healthcare applications, IGX for factories, a new version of the OVX computing systems for Omniverse development, and a new version of Nvidia’s DRIVE computing platform for cars, called DRIVE Thor, which will combine the Hopper GPU with Nvidia’s upcoming CPU, “Grace.”

Also: Nvidia clarifies Megatron-Turing scaling claim
