Winner of the R&D 100 Award

Legion: High-Productivity High-Performance Computing

The vast majority of programs are sequential. Programmers are productive when developing sequential code because they can construct more powerful programs simply by composing functionality from one or more software modules (e.g. libraries) in sequence, without worrying about parallelism, data coherence, or synchronization. The productivity engendered by this facet of sequential programming is vital to the success of many popular software ecosystems such as Python, R, and MATLAB. However, the implementations of these environments struggle to achieve high performance on parallel and distributed hardware without resorting to explicit parallelism. Ideally, users want to write programs in a high-productivity sequential programming model and have those programs automatically executed with high performance on parallel hardware. Achieving this end requires a nuanced programming model and sophisticated programming systems capable of analyzing and transforming sequential programs into parallel ones.

High Productivity High Performance Computing

Fortunately, there already exist many well-known techniques for implicitly parallelizing sequential programs to target parallel hardware. In most systems, however, these algorithms are deployed only to exploit fine-grained instruction-level parallelism. The primary thesis of the Legion project is that these same techniques can and should be deployed hierarchically, at coarser granularities, in software to leverage modern parallel hardware (multi-core CPUs, GPUs, supercomputers, etc.) without compromising the productivity of developing sequential programs.

The basis for this thesis rests upon the fundamental observation that implicitly mapping sequential programs onto parallel hardware looks similar at many different scales. At the finest granularity, hardware or compilers can extract parallelism from a stream of instructions by analyzing register usage and mapping independent instructions onto parallel hardware units. At coarser granularities, the same principles apply to a stream of demarcated functions, called tasks, that operate on logical regions of data: by analyzing region usage, independent tasks can be mapped onto the parallel execution units of a workstation or a supercomputer. (Legion derives its name from the concatenation of the words in 'logical region'.)
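To make the analogy concrete, the following sketch (plain Python, not Legion's actual API; the task names and regions are hypothetical) shows how the register-hazard analysis performed by hardware carries over to tasks that declare which regions they read and write:

```python
# Sketch: hazard detection over a stream of tasks. Each "task" declares
# which logical regions it reads and writes, just as an instruction names
# its source and destination registers. Two tasks must stay ordered iff
# they conflict on a region (read-after-write, write-after-read, or
# write-after-write); all other pairs may run in parallel.

def dependences(stream):
    """Return ordered pairs (i, j), i < j, where task j depends on task i."""
    edges = []
    for j, (_, reads_j, writes_j) in enumerate(stream):
        for i, (_, reads_i, writes_i) in enumerate(stream[:j]):
            raw = writes_i & reads_j   # read-after-write hazard
            war = reads_i & writes_j   # write-after-read hazard
            waw = writes_i & writes_j  # write-after-write hazard
            if raw or war or waw:
                edges.append((i, j))
    return edges

# A sequential stream: (task name, regions read, regions written).
stream = [
    ("init_a",  set(),      {"A"}),
    ("init_b",  set(),      {"B"}),   # independent of init_a
    ("scale_a", {"A"},      {"A"}),   # conflicts with init_a on A
    ("combine", {"A", "B"}, {"C"}),   # reads both A and B
]

print(dependences(stream))  # [(0, 2), (0, 3), (1, 3), (2, 3)]
```

Note that `init_a` and `init_b` produce no edge between them: exactly as with independent instructions, a runtime is free to execute them concurrently while preserving the program's sequential semantics.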

(Figure: Implicit Parallelism Analogy)

This analogy forms the basis of the Legion project, and its two primary software artifacts can be understood as direct analogs of existing systems. The Legion runtime endeavors to be a full reimplementation of a pipelined, out-of-order, superscalar processor in software, dynamically extracting task parallelism from the stream of tasks generated by the execution of a sequential program. Similarly, the Regent compiler strives to be an optimizing compiler, performing static analyses and transformations of programs at the coarser granularity of tasks before mapping them onto the Legion runtime. Armed with these systems that automatically parallelize and distribute sequential programs, we aim to facilitate the creation of high-productivity, high-performance computing ecosystems so that all users can leverage modern massively parallel machines.
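The out-of-order-processor analogy can also be sketched in a few lines of Python (again illustrative only, with hypothetical tasks and regions, not Legion's scheduler): a task issues as soon as every earlier conflicting task has finished, so a sequential stream of five tasks collapses into fewer parallel steps.

```python
# Sketch: issuing a task stream the way an out-of-order processor issues
# instructions. A task may run as soon as all earlier tasks it conflicts
# with (on some region) have completed; non-conflicting tasks share a step.
# Task format: (name, regions read, regions written).

def schedule(stream):
    """Assign each task the earliest parallel step consistent with conflicts."""
    step_of = []
    for j, (_, reads_j, writes_j) in enumerate(stream):
        deps = [step_of[i] for i, (_, r_i, w_i) in enumerate(stream[:j])
                if (w_i & (reads_j | writes_j)) or (r_i & writes_j)]
        step_of.append(max(deps) + 1 if deps else 0)
    return step_of

stream = [
    ("init_a",  set(),      {"A"}),
    ("init_b",  set(),      {"B"}),
    ("scale_a", {"A"},      {"A"}),
    ("scale_b", {"B"},      {"B"}),
    ("combine", {"A", "B"}, {"C"}),
]

print(schedule(stream))  # [0, 0, 1, 1, 2] -- five tasks finish in three steps
```

The program remains written, and reasoned about, as a straight-line sequence; the reordering is recovered automatically from the declared region usage, which is the division of labor between Regent's static analysis and the Legion runtime's dynamic analysis.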

The Key Ideas

In order to realize the above vision, the Legion project has developed several novel technologies.

Many of these ideas are intertwined and reinforce one another in the design, and we encourage you to explore them further.

Get Started

To learn more about Legion, you can:

Acknowledgments

Legion is developed as an open source project, with major contributions from LANL, NVIDIA Research, SLAC, and Stanford. This research was supported by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of two U.S. Department of Energy organizations (Office of Science and the National Nuclear Security Administration) responsible for the planning and preparation of a capable exascale ecosystem, including software, applications, hardware, advanced system engineering, and early testbed platforms, in support of the nation’s exascale computing imperative. Additional support has been provided to LANL and SLAC via the Department of Energy Office of Advanced Scientific Computing Research and to NVIDIA, LANL and Stanford from the U.S. Department of Energy National Nuclear Security Administration Advanced Simulation and Computing Program. Previous support for Legion has included the U.S. Department of Energy’s ExaCT Combustion Co-Design Center and the Scientific Data Management, Analysis and Visualization (SDMAV) program, DARPA, the Army High Performance Computing Research Center, and NVIDIA, and grants from OLCF, NERSC, and the Swiss National Supercomputing Centre (CSCS).

Legion Contributors

Stanford: Alex Aiken, Rohan Yadav, David Zhang
SLAC: Elliott Slaughter, Seema Mirchandaney, Seshu Yamajala, Xi Luo
LANL: Pat McCormick, Galen Shipman, Wei Wu, Jonathan Graham, Nirmal Prajapati, Irina Demeshko
NVIDIA: Michael Bauer, Sean Treichler, Wonchan Lee, Manolis Papadakis
CMU: Zhihao Jia