An advisory board for the new high-performance computing hub will make decision-making a shared experience, says Dr. Vladimir Florinski.
Michael Mercier / UAH
The University of Alabama in Huntsville (UAH), part of the University of Alabama System, will become the center for statewide high-performance computing (HPC) in Alabama under a two-year grant from the National Science Foundation (NSF).
“This funding will allow us to establish a high-performance computing facility to serve a consortium of 10 public and private universities in Alabama,” said Dr. Vladimir Florinski, the principal investigator for the effort, which attracted $972,261 from NSF’s Established Program to Stimulate Competitive Research (EPSCoR).
The proposal leveraged two UAH-administered NSF EPSCoR Track-1 grants, Future Technologies enabled by Plasma Processes (FTPP) and Connecting the Plasma Universe to Plasma Technology in Alabama, to develop the HPC consortium and align it with statewide FTPP needs for HPC support.
“As the host site, UAH will ultimately have control over hardware design, usage policies, and resource allocation,” says Dr. Florinski, a professor of space science and researcher at UAH’s Center for Space Plasma and Aeronomic Research (CSPAR), who is one of the most prolific HPC users on campus.
“However, we plan to invite representatives from each participating institution to an advisory board and thus make decision-making a shared experience,” he says. “As a result, UAH will form closer ties with the other state universities, which will allow greater collaboration on scientific and technical projects involving computer modeling or data analysis.”
The advisory board will meet remotely every semester and make recommendations for sharing computing and storage resources among the participating research institutions. Resource allocation can be based on groups, projects, or individual users.
“Faculty, researchers and students from all participating universities can apply for accounts and run their applications on the system,” says Dr. Florinski. “A central web portal will be created to simplify account applications.”
Eighty percent of the grant money will be used to purchase a new HPC cluster that will dramatically improve the consortium’s computing capabilities, he says. Once installed in the server room in Cramer Research Hall, the system will consist of three to four racks densely packed with individual servers called nodes. Each node will have 64 to 128 central processing unit cores, for a total of around 3,000 cores for the entire system.
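As a rough check on those figures, the quoted core counts imply a range for the number of nodes. The Python sketch below is illustrative only: the helper name `node_range` is hypothetical, and it assumes every node carries the same number of cores, which the article does not state.

```python
import math


def node_range(total_cores, min_cores_per_node, max_cores_per_node):
    """Return (fewest, most) nodes needed to reach total_cores.

    Assumes a uniform core count per node (an assumption, not a
    detail given in the article).
    """
    fewest = math.ceil(total_cores / max_cores_per_node)
    most = math.ceil(total_cores / min_cores_per_node)
    return fewest, most


# Figures quoted in the article: about 3,000 cores, 64 to 128 per node.
print(node_range(3000, 64, 128))  # (24, 47)
```

Under these assumptions the cluster would hold somewhere between 24 and 47 nodes.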
“The real power of the machine will come from its GPU subsystem, which consists of 20-24 Nvidia Ampere units with a total of about 160,000 CUDA cores,” says Dr. Florinski. “Theoretical maximum double precision performance will be in the range of 240 to 360 teraflops.”
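The quoted GPU totals can be turned into approximate per-unit averages. The Python sketch below is only a back-of-the-envelope illustration: `average_per_unit` is a hypothetical helper, and pairing the low end of the teraflops range with 20 units and the high end with 24 units is an assumption, since the article does not say how the two ranges correspond.

```python
def average_per_unit(total, units):
    """Spread a system-wide total evenly over a number of GPU units."""
    if units <= 0:
        raise ValueError("units must be positive")
    return total / units


TOTAL_CUDA_CORES = 160_000  # "about 160,000" in the article

for units in (20, 24):
    cores = average_per_unit(TOTAL_CUDA_CORES, units)
    print(f"{units} units: about {cores:,.0f} CUDA cores each")

# Assumption: 240 teraflops goes with 20 units, 360 teraflops with 24.
low_tflops = average_per_unit(240, 20)
high_tflops = average_per_unit(360, 24)
print(f"about {low_tflops:.0f} to {high_tflops:.0f} teraflops per unit")
```

Because the source figures are themselves rounded, these averages indicate scale rather than the specification of any particular GPU model.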
Users across the state will connect remotely to the new flagship HPC cluster using the Secure Shell (SSH) protocol. The project will rely mainly on existing network connections between the sites. CSPAR’s IT staff will be responsible for the operation and maintenance of the new HPC system.
Before the new facility goes live, a series of network bandwidth tests will be performed to determine connection speeds between the hub and its users. The UAH Office of the Vice President for Research and Economic Development and CSPAR staff will work with the UAH Office of Information Technology and its counterparts at each campus to optimize routing and achieve the best possible throughput.
Resource sharing policies and support structures must be implemented and tested before other users are allowed onto the system.
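To illustrate what such a test measures, a timed transfer can be converted into a throughput figure. The Python sketch below is a generic illustration, not the tooling the project will use: the function names are hypothetical, and `send` stands in for whatever transfer routine a real test would call.

```python
import time


def throughput_mbps(num_bytes, seconds):
    """Convert bytes moved in a given time to megabits per second."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return num_bytes * 8 / seconds / 1e6  # 1 Mbit = 1e6 bits


def time_transfer(send, payload):
    """Time a caller-supplied send(payload) call and return Mbit/s."""
    start = time.perf_counter()
    send(payload)
    elapsed = time.perf_counter() - start
    return throughput_mbps(len(payload), elapsed)


# 125 MB moved in one second corresponds to 1,000 Mbit/s (1 Gbit/s).
print(throughput_mbps(125_000_000, 1.0))  # 1000.0
```

Repeating such measurements between the hub and each campus would show which routes limit throughput.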
“The system will be integrated into a national association network that will allow resource sharing with out-of-state users as required by NSF policy,” says Dr. Florinski. “When the system goes into production, my role will shift to helping participating institutions – particularly those serving minority groups – develop and improve their HPC skills to fully utilize this important resource.”
He says the project’s success depends on the existence of regional fiber optic networks.
“These include the Alabama Research and Education Network, as well as the University of Alabama System Regional Optical Network and the Georgia Institute of Technology’s Southern Crossroads,” he says.
The UAH College of Science and College of Engineering worked together on the grant application. Departments that submitted lists of research projects that would benefit from the system include life sciences, chemical and materials engineering, computer science, mechanical and aerospace engineering, and space science.
For CSPAR, the new system will enable much larger physics-based numerical simulations, ranging from space weather forecasting to cosmic ray science, that previously could be performed only at national-scale supercomputing facilities.
“Of course we hope to win over HPC users from other departments,” says Dr. Florinski. “In addition, the Department of Space Science plans to introduce a graduate certificate in computational physics that will incorporate the new system in a pedagogical capacity. This offers the opportunity to work on the new curriculum together with the departments of computer science and electrical engineering and information technology.”
UAH is working with several existing programs to introduce HPC to graduate and undergraduate students as part of their summer research projects, he says.
“We also plan to reach current and potential HPC users across the state by offering a series of webinars focused on specific advanced topics, such as using GPUs in a distributed memory environment.”