Nvidia launches chip aimed at data center economics

On May 14, 2020

Semiconductor firm Nvidia on Thursday (May 14) announced a new chip that can be digitally split up to run several different programs on one physical chip, a first for the company that matches a key capability on many of Intel’s chips.

The notion behind what the Santa Clara, California-based company calls its A100 chip is simple: Help the owners of data centers get every bit of computing power possible out of the physical chips they purchase by ensuring the chip never sits idle. The same principle helped power the rise of cloud computing over the past two decades and helped Intel build a massive data center business.

When software developers turn to a cloud computing provider such as Amazon.com or Microsoft Corp for computing power, they do not rent a full physical server inside a data center. Instead they rent a software-based slice of a physical server called a “virtual machine.”

Such virtualization technology came about because software developers realized that powerful and pricey servers often ran far below full computing capacity. By slicing physical machines into smaller virtual ones, developers could cram more software onto them, much like fitting pieces together in the puzzle game Tetris. Amazon, Microsoft and others built profitable cloud businesses out of wringing every bit of computing power from their hardware and selling that power to millions of customers.

But the technology has been mostly limited to processor chips from Intel and similar chips such as those from Advanced Micro Devices Inc. Nvidia said Thursday that its new A100 chip can be split into seven “instances.”

For Nvidia, that solves a practical problem. Nvidia sells chips for artificial intelligence tasks. The market for those chips breaks into two parts. “Training” requires a powerful chip to, for example, analyze millions of images to train an algorithm to recognize faces. But once the algorithm is trained, “inference” tasks need only a fraction of the computing power to scan a single image and spot a face.

Nvidia is hoping the A100 can serve both parts of that market, working as one big chip for training and being split into smaller chips for inference.

Customers who want to test the theory will pay a steep price of US$200,000 for Nvidia’s DGX server built around the A100 chips. In a call with reporters, Chief Executive Jensen Huang argued the math will work in Nvidia’s favor, saying the computing power in the DGX A100 was equal to that of 75 traditional servers that would cost US$5,000 each.

“Because it’s fungible, you don’t have to buy all these different types of servers. Utilization will be higher,” he said. “You’ve got 75 times the performance of a US$5,000 server, and you don’t have to buy all the cables.”
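
Huang's math can be checked with a quick back-of-the-envelope sketch. The three figures below are the ones quoted on the call; the variable and function names are illustrative only, and the comparison leaves out power, cabling and other running costs, which Nvidia did not put numbers on.

```python
# Figures quoted by Nvidia on the call with reporters.
DGX_A100_PRICE_USD = 200_000           # one DGX server built around A100 chips
TRADITIONAL_SERVER_PRICE_USD = 5_000   # one traditional server
EQUIVALENT_SERVER_COUNT = 75           # traditional servers said to match one DGX A100


def traditional_fleet_cost(server_count: int, price_per_server: int) -> int:
    """Purchase price of a fleet of identical traditional servers."""
    return server_count * price_per_server


fleet_cost = traditional_fleet_cost(EQUIVALENT_SERVER_COUNT, TRADITIONAL_SERVER_PRICE_USD)
savings = fleet_cost - DGX_A100_PRICE_USD
price_ratio = DGX_A100_PRICE_USD / TRADITIONAL_SERVER_PRICE_USD

print(f"75 traditional servers: US${fleet_cost:,}")
print(f"One DGX A100:           US${DGX_A100_PRICE_USD:,}")
print(f"Difference:             US${savings:,}")
print(f"DGX price as a multiple of one server: {price_ratio:.0f}x")
```

On those figures, the 75 traditional servers would cost US$375,000 against US$200,000 for the DGX A100. Put another way, Nvidia is claiming 75 times the performance for 40 times the price of a single US$5,000 server.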