Frontera: The Evolution of Leadership Computing at the National Science Foundation
Abstract
As part of the NSF's cyberinfrastructure vision for a robust mix of high-capability and high-capacity HPC systems, Frontera is the most recent evolution of trans-petascale resources available to all open science research projects in the U.S. Debuting as the fifth largest supercomputer in the world, Frontera is a robust and well-balanced HPC system designed to enable large-scale, productive science on day one of operations. The system provides a primary compute capability of nearly 39 PF, delivered entirely by more than 8,000 dual-socket servers with conventional Intel 8280 ("Cascade Lake") processors. A unique configuration of both desktop GPUs and advanced floating-point units from NVIDIA supports both machine learning and scientific workloads. The system delivers nearly 2 TB/s of total filesystem bandwidth, with 55 PB of usable disk-based Lustre storage and 3 PB of all-flash Lustre storage. A Mellanox InfiniBand (IB) interconnect provides very low latency, with 100 Gbps to each node and 200 Gbps between switches, in a fat-tree topology with minimal oversubscription for efficient communication, even in jobs that use the full system with complex communication patterns. The system hardware is complemented by a robust set of software services, including application programming interfaces (APIs), to support an evolving user base that increasingly demands productive access via science gateways and automated workflows. A first-of-its-kind partnership with the three major cloud service providers further complements the system, creating a bridge between "traditional" HPC and the cloud infrastructure upon which research increasingly depends.
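The "nearly 39 PF" figure can be sanity-checked with a back-of-the-envelope calculation. The sketch below is an illustration only, not the authors' method. It takes the node count as the lower bound of "more than 8,000" from the abstract. The per-processor values (28 cores per socket, a 2.7 GHz base clock, and 32 double-precision floating-point operations per core per cycle from two AVX-512 fused multiply-add units) are assumptions drawn from the published Xeon Platinum 8280 specifications, not from this text.

```python
# Rough theoretical-peak estimate for the primary compute partition.
# Only NODES and SOCKETS_PER_NODE come from the abstract; the rest are
# assumptions about the Intel Xeon Platinum 8280 ("Cascade Lake").
NODES = 8_000              # lower bound of "more than 8,000" servers
SOCKETS_PER_NODE = 2       # "dual-socket"
CORES_PER_SOCKET = 28      # assumed: 8280 core count
BASE_CLOCK_HZ = 2.7e9      # assumed: 8280 base frequency
FLOPS_PER_CYCLE = 32       # assumed: 2 AVX-512 FMA units x 8 doubles x 2 ops


def peak_petaflops(nodes: int = NODES) -> float:
    """Return the theoretical double-precision peak in PF for `nodes` servers."""
    cores = nodes * SOCKETS_PER_NODE * CORES_PER_SOCKET
    flops = cores * BASE_CLOCK_HZ * FLOPS_PER_CYCLE
    return flops / 1e15


if __name__ == "__main__":
    print(f"{peak_petaflops():.1f} PF")  # prints "38.7 PF"
```

Under these assumptions, 8,000 nodes give about 38.7 PF, so a node count slightly above 8,000 is consistent with the stated "nearly 39 PF". Sustained performance on real workloads would be lower than this theoretical peak.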