Converged computing: minimizing the difference between HPC and cloud



1/06/2023 - 13:30 Daniel Milroy (Lawrence Livermore National Lab) IMAG 406 (Different room than usual!!)

Computing is undergoing fundamental changes. The end of Dennard scaling and the tapering of Moore’s law have led to economic conditions that favor large-scale demand aggregators like public cloud providers. Gartner projects cloud to be the largest sector of computing by 2025, reaching nearly $1T USD in revenue with 20% annual growth. This tremendous growth is translating into substantial investment in research and development to manage the complexity of systems with post-Dennard and post-Moore architectures.
Current HPC tools and techniques are unsuitable for managing the vastly increased resource heterogeneity and dynamism of the distributed systems used by emerging scientific workflows. Adoption of cloud technologies will enable support for new workflows and keep HPC from becoming a technology island. However, direct implementation of current cloud technologies will not satisfy scientific software demands for performance. Converged computing, an environment that seamlessly combines the performance of HPC with the portability, reproducibility, and automation of the cloud, is an active area of research and development that seeks to benefit both worlds. Successful realization of converged computing requires a clear-eyed appraisal of each world’s strengths and weaknesses.
In this talk I will provide a brief background on the economic and technical trends that led to the dominance of cloud computing and the related increase in scientific software complexity. I will identify aspects of cloud computing and HPC that are immediately compatible, such as the coupling of simulation and services, and those that require research for deeper integration, such as MPI and elasticity. Many academic research groups, laboratories, and companies are working to bring cloud and HPC together, which translates to a large space for collaboration.
I will provide an overview of converged computing R&D themes and present highlights of my team’s work on converged computing at the Lawrence Livermore National Laboratory. I will conclude by suggesting potentially fruitful R&D directions and identifying collaborative work to enable cloud and HPC to integrate more completely and benefit both communities.