
EECS Colloquium: Tunable and Low-cost Adaptivity for HPC Software to Easily Speed Up Applications on Supercomputers by Vivek Kale, Computer Scientist, Brookhaven National Laboratory

Online
Zoom

About the event

Abstract: Emerging and next-generation supercomputers have nodes that are increasingly powerful and complex, each containing a set of hosts (e.g., CPUs) and a set of devices (e.g., GPUs). During an application’s execution on such a supercomputer, load imbalances within a node (whether induced by the application itself or by system noise and other complexities of the node’s hardware) can slow the application’s execution significantly, and this slowdown is projected to be dramatic when the application runs on a very large number of nodes. To minimize this slowdown with little programmer effort, systems software for High Performance Computing (HPC systems software) must support novel adaptive intra-node load balancing strategies that are both low-overhead and tunable. In this talk, I first characterize such load imbalances and discuss tunable, locality-sensitive loop scheduling strategies that I have developed within HPC systems software to handle them. I then demonstrate that these loop scheduling strategies have the potential to significantly speed up applications on emerging and next-generation supercomputers, particularly the U.S. Department of Energy’s supercomputers. Building on this, I explain how these strategies motivate the development of a general set of strategies involving tunable, low-cost adaptivity within HPC software. I conclude by putting forth a vision and ideas for continued research in HPC guided by this software design philosophy and by identifying domains I expect these ideas to impact.
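
The abstract does not spell out the scheduling strategies themselves; as a rough illustration of the general idea, the sketch below (in C with OpenMP) splits a loop’s iterations into a statically scheduled portion, which preserves locality and incurs no scheduling overhead, and a dynamically scheduled remainder, which absorbs load imbalance. The function hybrid_schedule, the work function compute, and the tuning knob static_fraction are hypothetical names used only for illustration, not APIs or results from the talk.

#include <stdio.h>
#include <omp.h>

/* Illustrative sketch: a tunable hybrid static/dynamic loop schedule.
 * A fraction of the iterations is assigned statically per thread;
 * the rest are claimed dynamically by whichever threads finish early.
 * `static_fraction` is a hypothetical tuning parameter. */

static void compute(int i, double *a) { a[i] = a[i] * 2.0 + 1.0; }

void hybrid_schedule(int n, double *a, double static_fraction) {
    int n_static = (int)(n * static_fraction);
    int next = n_static;   /* index of the first dynamically scheduled iteration */

    #pragma omp parallel
    {
        int tid = omp_get_thread_num();
        int nthreads = omp_get_num_threads();

        /* Static part: one contiguous block per thread, good for locality. */
        int chunk = (n_static + nthreads - 1) / nthreads;
        int lo = tid * chunk;
        int hi = (lo + chunk < n_static) ? lo + chunk : n_static;
        for (int i = lo; i < hi; i++)
            compute(i, a);

        /* Dynamic part: threads atomically grab leftover iterations
         * one at a time, balancing any remaining load. */
        int i;
        while (1) {
            #pragma omp atomic capture
            i = next++;
            if (i >= n) break;
            compute(i, a);
        }
    }
}

int main(void) {
    enum { N = 1000 };
    double a[N] = {0};
    hybrid_schedule(N, a, 0.8);  /* tune: 80% static, 20% dynamic */
    printf("a[0] = %f\n", a[0]);
    return 0;
}

In this sketch, raising static_fraction favors locality and low overhead on well-balanced loops, while lowering it gives the dynamic portion more room to absorb imbalance; making that trade-off tunable, rather than fixed, is the kind of adaptivity the abstract describes.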

Bio: Vivek Kale is a Computer Scientist at Brookhaven National Laboratory who works on the U.S. Department of Energy’s (DoE) Exascale Computing Project through the SOLLVE project and the Lattice QCD project. His work in these projects focuses on using LLVM’s OpenMP to let applications harness the computational power of the nodes of the DoE’s upcoming exascale supercomputers, along with research on tunable, low-overhead runtime systems for such supercomputers. During his PhD, Vivek worked under advisor William Gropp and collaborated with DoE laboratories, in particular Lawrence Livermore National Laboratory. His dissertation developed locality-sensitive multi-core loop scheduling strategies to improve the performance and scalability of bulk-synchronous MPI+OpenMP applications. He was also involved in work on the shared-memory extensions for MPI-3. Since his PhD, his work has primarily involved (1) synergizing locality-sensitive loop scheduling strategies with inter-node load balancing strategies, (2) applying and evaluating loop transformations for code run on GPUs, and (3) developing tunable, locality-sensitive loop scheduling strategies across the multiple devices of a node. Vivek is also involved in the broader HPC community, serving as a reviewer for publication venues such as the International Conference on Parallel Processing and the Journal of Parallel Computing.
