DaCe OMEN Framework Could Help Industry Design Better and More Efficient Computer Chips
ACM, the Association for Computing Machinery, named a six-member team from the Swiss Federal Institute of Technology (ETH) Zurich recipients of the 2019 ACM Gordon Bell Prize for their project, “A Data-Centric Approach to Extreme-Scale Ab initio Dissipative Quantum Transport Simulations.”
The ETH Zurich team introduced DaCe OMEN, a new framework for simulating the transport of electrical signals through nanoscale materials (such as the silicon atoms used in transistors). To better understand the thermal properties of transistors, the team simulated how electricity would be transported through a two-dimensional slice of a transistor consisting of 10,000 atoms. The ETH Zurich researchers simulated the 10,000-atom system 14 times faster than an earlier framework that was used for a 1,000-atom system. The DaCe OMEN code they developed for the simulation has been run on two top-6 hybrid supercomputers, reaching a sustained performance of 85.45 Pflop/s on 4,560 nodes of Summit (42.55% of the peak) in double precision, and 90.89 Pflop/s in mixed precision.
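To put the Summit figures in perspective, the short Python sketch below derives two quantities the article does not state directly: the peak performance implied by the quoted 42.55% efficiency, and the sustained throughput per node. The function names are illustrative only, and the results are simple arithmetic on the numbers quoted above, not figures taken from the team's paper.

```python
# Figures quoted in the article for the double-precision run on Summit.
SUSTAINED_PFLOPS = 85.45   # sustained performance, Pflop/s
NODES = 4560               # number of Summit nodes used
FRACTION_OF_PEAK = 0.4255  # 42.55% of the peak


def implied_peak_pflops(sustained_pflops: float, fraction_of_peak: float) -> float:
    """Peak performance (Pflop/s) implied by a sustained rate and its share of peak."""
    return sustained_pflops / fraction_of_peak


def per_node_tflops(total_pflops: float, nodes: int) -> float:
    """Average throughput per node in Tflop/s (1 Pflop/s = 1,000 Tflop/s)."""
    return total_pflops * 1000.0 / nodes


if __name__ == "__main__":
    peak = implied_peak_pflops(SUSTAINED_PFLOPS, FRACTION_OF_PEAK)
    node_rate = per_node_tflops(SUSTAINED_PFLOPS, NODES)
    print(f"Implied peak of the 4,560 nodes: {peak:.1f} Pflop/s")
    print(f"Sustained rate per node: {node_rate:.2f} Tflop/s")
```

Under these assumptions, the 4,560 nodes have an implied peak of roughly 200 Pflop/s, and each node sustains a little under 19 Tflop/s on average.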
The ACM Gordon Bell Prize tracks the progress of parallel computing and rewards innovation in applying high performance computing to challenges in science, engineering, and large-scale data analytics. The award was presented by ACM President Cherri Pancake and Arndt Bode, Chair of the 2019 Gordon Bell Prize Award Committee, during the International Conference for High Performance Computing, Networking, Storage and Analysis (SC19) in Denver, Colo.
Today’s commercial microchips contain 100,000,000 transistors in the span of a single millimeter, and managing heat generation and dissipation is one of the central problems in computer architecture. As the transistors on each microchip have become smaller and more densely packed, the amount of heat they generate has steadily increased. The cooling systems needed to keep supercomputers and data centers from overheating have become increasingly expensive. The ETH Zurich researchers estimate that cooling can consume up to 40% of the total electricity needed for data centers, amounting to cumulative costs of many billions of dollars per year.
Today’s supercomputers, which can perform up to 200 quadrillion calculations per second, allow scientists in many disciplines to gain new insights by processing a staggering number of variables. The ETH Zurich team used their simulation to develop a map of where heat is produced in a single transistor, how it is generated, and how it is evacuated. It is hoped that a deeper understanding of these thermal characteristics could inform the development of new semiconductors with optimal heat-evacuating properties.
In recent years, the OMEN framework has been a popular quantum transport simulator for modeling nanoscale materials, but it has suffered from scaling bottlenecks. The ETH Zurich team wrote a data-centric variation of OMEN, called DaCe OMEN. “We show that the key to eliminating the scaling bottleneck is in formulating a communication-avoiding algorithm,” the team writes in their paper. The ETH Zurich team’s solver yields data movement characteristics that can be used for performance and communication modeling, communication avoidance, and dataflow transformations. They go on to note that the DaCe OMEN framework is two orders of magnitude faster per atom than the original OMEN code.
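The "two orders of magnitude faster per atom" claim can be related to the "14 times faster" figure quoted earlier with a small calculation. The Python sketch below assumes that "14 times faster" compares the total time-to-solution of the 10,000-atom run against the earlier 1,000-atom run; the function name is hypothetical, and this is a back-of-the-envelope reading rather than the team's own derivation.

```python
import math


def per_atom_speedup(total_speedup: float, atoms_new: int, atoms_old: int) -> float:
    """Per-atom speedup when a system of atoms_new atoms is solved
    total_speedup times faster than a system of atoms_old atoms.

    If t_old is the earlier run's time, the new run takes t_old / total_speedup.
    Per-atom time falls from t_old / atoms_old to
    t_old / (total_speedup * atoms_new), so the ratio of the two is
    total_speedup * atoms_new / atoms_old.
    """
    return total_speedup * atoms_new / atoms_old


if __name__ == "__main__":
    speedup = per_atom_speedup(14, 10_000, 1_000)
    print(f"Per-atom speedup: {speedup:.0f}x")
    print(f"Orders of magnitude: {math.log10(speedup):.2f}")
```

On this reading, a system with ten times as many atoms solved 14 times faster works out to about a 140-fold gain per atom, which is consistent with the two orders of magnitude the team reports.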
The ETH Zurich team also built a graphical interface for the DaCe OMEN framework that visualizes the dataflow instead of relying on a simple textual description. Anyone running the code can use the image representation to interact with the data directly. The team believes this innovation could be applied to numerous scientific disciplines beyond nanoelectronics.
Winning team members include Alexandros Nikolaos Ziogas, Tal Ben-Nun, Timo Schneider and Torsten Hoefler, from ETH Zurich’s Scalable Parallel Computing Laboratory, as well as Guillermo Indalecio Fernández and Mathieu Luisier from ETH Zurich’s Integrated Systems Laboratory.