Nvidia’s Fastest Supercomputer Uses 560 AMD Epyc 7742 CPUs
Nvidia has revealed details of Selene, its supercomputer equipped with 560 AMD Epyc 7742 CPUs, each with 64 cores, and 2,240 Nvidia A100 GPUs. It can reach a theoretical peak performance of 35,000 teraflops (35 petaflops).
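The figures quoted above can be cross-checked with some quick arithmetic. The short Python sketch below uses only the numbers stated in the article; the variable names are illustrative, and the per-GPU figure rests on the simplifying assumption that the entire theoretical peak is attributed to the GPUs, which is not something Nvidia has stated here.

```python
# Figures as quoted in the article.
CPU_COUNT = 560          # AMD Epyc 7742 processors
CORES_PER_CPU = 64       # cores per Epyc 7742
GPU_COUNT = 2240         # Nvidia A100 GPUs
PEAK_TERAFLOPS = 35_000  # theoretical peak, as stated

# Total CPU cores across the machine.
total_cpu_cores = CPU_COUNT * CORES_PER_CPU

# Ratio of GPUs to CPUs.
gpus_per_cpu = GPU_COUNT / CPU_COUNT

# Unit conversion: 1 petaflop = 1,000 teraflops.
peak_petaflops = PEAK_TERAFLOPS / 1000

# ASSUMPTION: the whole peak is attributed to the GPUs, ignoring the
# CPUs' contribution. This is a rough share, not a measured figure.
implied_tflops_per_gpu = PEAK_TERAFLOPS / GPU_COUNT

print(f"Total CPU cores:        {total_cpu_cores}")
print(f"GPUs per CPU:           {gpus_per_cpu:g}")
print(f"Peak (petaflops):       {peak_petaflops:g}")
print(f"Implied TFLOPS per GPU: {implied_tflops_per_gpu:.3f}")
```

The four-to-one ratio of GPUs to CPUs illustrates how GPU-heavy the design is compared with the CPU-heavy machines that dominate the rest of the list.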
Selene was built by a socially distanced team of six engineers and a handy robot called Trip. It uses the fourth generation of Nvidia’s commercially available DGX SuperPOD architecture rather than the CPU-heavy designs used by most other supercomputers on the TOP500 list. Selene is also rated as the second most power-efficient supercomputer on the Green500 list.
Older Nvidia supercomputers took months to assemble and were difficult to maintain and upgrade. The company learned its lesson and made Selene both simpler and modular, building it from scalable units. Each standardized DGX Pod contains Nvidia A100 GPUs and AMD Epyc CPUs. These pods are stacked together in a cabinet, and cabinets are then combined in groups of sixteen to form a SuperPod.
Setting up and wiring a supercomputer is always tricky, but Selene’s homogeneity made it a lot easier. Nvidia also used Mellanox’s InfiniBand switches to reduce the number of cables required while also increasing bandwidth.
To keep the computer’s thermals in check, all SuperPods are kept in a giant air-conditioned warehouse and are slightly raised off the ground to help dissipate heat. The assembly team only needed to install the flooring and seal up the SuperPods to control the airflow.
Selene is monitored remotely with the help of Trip, the little robot. Trip roams around the facility and observes the computer to check that everything is functioning normally. It notifies the team if something goes wrong, for instance if some hardware is misbehaving or a cable is loose.
Selene is aimed mainly at AI development and deep learning research, but it will also be used in the fight against the coronavirus.