Nvidia CEO Jensen Huang opened the company’s fall GTC conference on Tuesday by announcing that its “Hopper” graphics processing unit (GPU) is in volume production, with partner systems from Dell, Hewlett Packard and Cisco Systems shipping next month. Nvidia’s own systems with the GPU will be available in the first quarter of next year, Huang said.
The Hopper chip, also known as the “H100,” targets data center tasks such as artificial intelligence. Nvidia says the H100 can “dramatically” reduce the cost of deploying AI programs: the work of 320 of the previous top-of-the-line A100 GPUs can be done with just 64 H100 chips, which the company says requires only one-fifth the number of server computers and cuts power consumption by 3.5 times.
Nvidia first introduced Hopper in March, and this month the company showed benchmark results for the chip on the MLPerf suite of machine learning tasks.
Also: NVIDIA unveils Hopper, its new hardware architecture to turn data centers into AI factories
In addition to Hopper’s availability, Huang discussed a new “as-a-service” cloud offering for “large language models,” Nvidia NeMo LLM, which will make it easy for customers to use very large natural language processing models such as OpenAI’s GPT-3 and Megatron-Turing NLG 530B, the language model Nvidia developed with Microsoft.
Also: Neural Magic’s Sparsity, Nvidia’s Hopper and Alibaba’s Network among the first in the latest MLPerf AI benchmarks
The NeMo service, available next month in “early access,” offers GPT-3 and other models in pre-trained form; a developer can then tune them using a method Nvidia calls “prompt learning,” which the company adapted from a technique that scientists at Google introduced last year.
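To make the idea concrete, here is a minimal, generic sketch of what prompt learning (soft prompt tuning) means: the pre-trained model’s weights stay frozen, and only a small set of “virtual token” embeddings prepended to the input is trained. This is a toy illustration in PyTorch, not Nvidia’s NeMo API; every name and dimension below is hypothetical.

```python
# Toy sketch of soft prompt tuning: freeze the base model, train only a
# short learnable prompt. All names/sizes here are illustrative, not NeMo's.
import torch

torch.manual_seed(0)

EMBED_DIM, PROMPT_LEN, VOCAB = 16, 4, 10

# Stand-in for a frozen pretrained model: an embedding table and output head.
embed = torch.nn.Embedding(VOCAB, EMBED_DIM)
head = torch.nn.Linear(EMBED_DIM, VOCAB)
for p in list(embed.parameters()) + list(head.parameters()):
    p.requires_grad_(False)  # base model stays frozen

# The only trainable parameters: a few "virtual token" embeddings.
soft_prompt = torch.nn.Parameter(torch.randn(PROMPT_LEN, EMBED_DIM) * 0.1)
opt = torch.optim.Adam([soft_prompt], lr=0.1)

tokens = torch.tensor([1, 2, 3])  # toy input sequence
target = torch.tensor([4])        # toy next-token label

losses = []
for step in range(50):
    # Prepend the learned prompt to the (frozen) input embeddings.
    x = torch.cat([soft_prompt, embed(tokens)], dim=0)
    logits = head(x.mean(dim=0, keepdim=True))  # toy pooled "model"
    loss = torch.nn.functional.cross_entropy(logits, target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    losses.append(loss.item())

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The appeal of this approach for a hosted service is that the large base model is shared and never modified; each customer only stores and trains a tiny prompt tensor per task.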
A version of NeMo hosted specifically for biomedical uses of large language models, such as drug discovery, is called the Nvidia BioNeMo Service.
During a press conference ahead of the announcement, ZDNet asked Nvidia’s head of accelerated computing, Ian Buck, what guardrails the company will put in place to prevent the abuses of large language models that are well documented in the AI ethics literature.
“Yes, good question. This service goes into EA [early access] starting next month, and it goes directly to enterprises; we’re deliberately doing it that way to work with different companies on developing their workflows,” Buck said.
“They’ll be doing everything in partnership with Nvidia, of course, and each user will be requesting access to the service, so we’re going to understand a little more about what they’re doing, versus a generic, open-ended way. Again, we’re trying to focus on bringing big language models to businesses, and that’s our customers’ go-to-market.”
ZDNet then remarked to Buck that the misuse documented in the large language model literature involves biases embedded in the training data, not necessarily malicious use.
“Are you claiming that with access limited to corporate partners, documented abuses such as bias in the training materials won’t appear in the final product?” asked ZDNet.
“Yes. I mean, the customers are going to bring their own datasets into the domain training, so they certainly have some of that responsibility, and we all as a community have to have that responsibility. This stuff was trained on the internet. It needs to be understood and constrained to what it is now for domain-specific solutions. The problem is a bit more limited because we are trying to provide a single service for a specific use case. That bias will persist. Humans have biases, and unfortunately this was naturally trained on input from humans.”
The presentation also included a variety of dedicated computing platforms for various industries, including healthcare, and for Omniverse, Nvidia’s software platform for the metaverse.
The platforms include IGX for healthcare applications, IGX for factories, a new version of the OVX computing systems for Omniverse development, and a new version of Nvidia’s DRIVE computing platform for cars, called DRIVE Thor, which will combine the Hopper GPU with Nvidia’s upcoming CPU, “Grace.”
Also: Nvidia clarifies Megatron-Turing scaling claim