2 ExaFLOPS, Tens of Thousands of CPUs and GPUs


Argonne National Laboratory and Intel said on Thursday that they had installed all 10,624 blades for the Aurora supercomputer, a machine announced back in 2015 that has had a very bumpy history. The system promises to deliver a peak theoretical compute performance of over 2 FP64 ExaFLOPS using its array of tens of thousands of Xeon Max ‘Sapphire Rapids’ CPUs with on-package HBM2E memory as well as Data Center GPU Max ‘Ponte Vecchio’ compute GPUs. The system will come online later this year.

“Aurora is the first deployment of Intel’s Max Series GPU, the biggest Xeon Max CPU-based system, and the largest GPU cluster in the world,” said Jeff McVeigh, Intel corporate vice president and general manager of the Super Compute Group.

The Aurora supercomputer looks quite impressive by the numbers. The machine is powered by 21,248 general-purpose processors with over 1.1 million cores for workloads that require traditional CPU horsepower, and 63,744 compute GPUs that can serve AI and HPC workloads. On the memory side of things, Aurora has 1.36 PB of on-package HBM2E memory and 19.9 PB of DDR5 memory used by the CPUs, as well as 8.16 PB of HBM2E carried by the Ponte Vecchio compute GPUs.
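For a rough sense of how those totals decompose, the figures are consistent with each of the 10,624 blades carrying two Xeon Max CPUs and six Ponte Vecchio GPUs, with about 64 GB of HBM2E per CPU and 128 GB per GPU. The short back-of-envelope sketch below (Python; the per-blade layout is inferred from the totals rather than stated in the article) checks that arithmetic:

```python
# Back-of-envelope check of Aurora's published totals.
# The per-blade layout (2 CPUs, 6 GPUs) is inferred from the totals, not stated above.
blades = 10_624

cpus = blades * 2                            # 21,248 Xeon Max CPUs
gpus = blades * 6                            # 63,744 Ponte Vecchio GPUs

cpu_hbm_total_pb = cpus * 64 / 1_000_000     # ~1.36 PB of on-package HBM2E (64 GB per CPU)
gpu_hbm_total_pb = gpus * 128 / 1_000_000    # ~8.16 PB of GPU HBM2E (128 GB per GPU)

print(cpus, gpus)                            # 21248 63744
print(round(cpu_hbm_total_pb, 2))            # 1.36
print(round(gpu_hbm_total_pb, 2))            # 8.16
```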

The Aurora machine uses 166 racks that house 64 blades each. It spans eight rows and occupies a space equal to two basketball courts. That does not count the storage subsystem of Aurora, which employs 1,024 all-flash storage nodes offering 220 PB of storage capacity and a total bandwidth of 31 TB/s. For now, Argonne National Laboratory does not publish official power consumption numbers for Aurora or its storage subsystem.
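A similar sanity check works for the physical layout and the storage subsystem: 166 racks at 64 blades per rack account for exactly the 10,624 installed blades, and 31 TB/s spread across 1,024 storage nodes works out to roughly 30 GB/s per node. A minimal sketch (Python; the even per-node bandwidth split is an assumption, not an article figure):

```python
# Rack and storage arithmetic from the figures quoted above.
racks, blades_per_rack = 166, 64
assert racks * blades_per_rack == 10_624     # matches the installed blade count

storage_nodes = 1_024
total_bandwidth_tb_s = 31
per_node_gb_s = total_bandwidth_tb_s * 1_000 / storage_nodes
print(round(per_node_gb_s, 1))               # ~30.3 GB/s per all-flash node, assuming an even split
```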

The supercomputer, which will be used for a wide variety of workloads from nuclear fusion simulations to weather prediction and from aerodynamics to medical research, uses HPE’s Shasta supercomputer architecture with Slingshot interconnects. Meanwhile, before the system passes ANL’s acceptance tests, it will be used for large-scale scientific generative AI models.

“While we work toward acceptance testing, we’re going to be using Aurora to train some large-scale open-source generative AI models for science,” said Rick Stevens, Argonne National Laboratory associate laboratory director. “Aurora, with over 60,000 Intel Max GPUs, a very fast I/O system, and an all-solid-state mass storage system, is the perfect environment to train these models.”

Although the Aurora blades have been installed, the supercomputer still has to undergo and pass a series of acceptance tests, a standard procedure for supercomputers. Once it successfully clears these and comes online later in the year, it is projected to reach a theoretical performance exceeding 2 ExaFLOPS (two billion billion floating point operations per second). With that massive performance, it is expected to secure the top spot in the Top500 list.
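To put the headline number in per-device terms: if essentially all of that FP64 throughput comes from the GPUs (a simplifying assumption; the CPUs also contribute), 2 ExaFLOPS spread over 63,744 GPUs implies a bit over 31 FP64 TFLOPS per Ponte Vecchio. A quick check in Python:

```python
# Per-GPU share of the 2 ExaFLOPS peak, assuming (simplistically) the GPUs supply it all.
peak_flops = 2e18                  # 2 ExaFLOPS = 2 * 10**18 FP64 operations per second
gpus = 63_744
print(peak_flops / gpus / 1e12)    # ~31.4 TFLOPS per GPU under that assumption
```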

The installation of the Aurora supercomputer marks several milestones: it is the industry’s first supercomputer with performance greater than 2 ExaFLOPS and the first Intel-based ExaFLOPS-class machine. Finally, it marks the conclusion of the Aurora saga that began eight years ago, as the supercomputer’s journey has seen its fair share of bumps.

Initially unveiled in 2015, Aurora was originally meant to be powered by Intel’s Xeon Phi co-processors and was projected to deliver roughly 180 PetaFLOPS in 2018. However, Intel decided to abandon the Xeon Phi in favor of compute GPUs, resulting in the need to renegotiate the agreement with Argonne National Laboratory to provide an ExaFLOPS system by 2021.

Delivery of the system was further delayed due to problems with the compute tile of Ponte Vecchio caused by the delay of Intel’s 7 nm (now known as Intel 4) manufacturing node and the need to redesign the tile for TSMC’s N5 (5 nm-class) process technology. Intel finally launched its Data Center GPU Max products late last year and has now shipped over 60,000 of these compute GPUs to ANL.
