HPCC’16 Paper: HPC Interconnect Model

Scalable Interconnection Network Models for Rapid Performance Prediction of HPC Applications, Kishwar Ahmed, Jason Liu, Stephan Eidenbenz, and Joe Zerr. In Proceedings of the 18th International Conference on High Performance Computing and Communications (HPCC 2016), December 2016. [paper] [slides]

The Performance Prediction Toolkit (PPT) is a simulator developed mainly at Los Alamos National Laboratory to facilitate rapid and accurate performance prediction of large-scale scientific applications on existing and future HPC architectures. In this paper, we present three interconnect models for performance prediction of large-scale HPC applications. They are based on interconnect topologies widely used in HPC systems: torus, dragonfly, and fat-tree. We conduct extensive validation tests of our interconnect models, in particular using configurations of existing HPC systems. Results show that our models provide good accuracy for predicting the network behavior. We also present a performance study of a parallel computational physics application to show that our models can accurately predict the parallel behavior of large-scale applications.
author={K. Ahmed and J. Liu and S. Eidenbenz and J. Zerr},
booktitle={Proceedings of the IEEE 18th International Conference on High Performance Computing and Communications (HPCC)},
title={Scalable Interconnection Network Models for Rapid Performance Prediction of HPC Applications},

PADS’16 Paper: Integrated Interconnect Model

An Integrated Interconnection Network Model for Large-Scale Performance Prediction, Kishwar Ahmed, Mohammad Obaida, Jason Liu, Stephan Eidenbenz, Nandakishore Santhi, and Guillaume Chapuis. In Proceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (SIGSIM-PADS 2016), May 2016. [paper]

The interconnection network is a critical component of high-performance computing architecture and application co-design. For many scientific applications, the increasing communication complexity poses a serious concern, as it may hinder the scaling properties of these applications on novel architectures. A scalable, efficient, and accurate interconnect model is therefore essential for performance evaluation studies. In this paper, we present an interconnect model for predicting the performance of large-scale applications on high-performance architectures. In particular, we present a sufficiently detailed interconnect model for Cray’s Gemini 3-D torus network. The model has been integrated with an implementation of the Message-Passing Interface (MPI) that can mimic most of its functions with packet-level accuracy on the target platform. Extensive experiments show that our integrated model provides good accuracy for predicting the network behavior, while at the same time allowing for good parallel scaling performance.
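For readers unfamiliar with torus networks, the wraparound links mentioned above can be illustrated with a minimal Python sketch. It only computes the shortest hop count between two nodes of a generic 3-D torus; it is not the paper's packet-level Gemini model, and the function name `torus_hops` and the 8x8x8 dimensions are hypothetical, chosen purely for illustration.

```python
def torus_hops(src, dst, dims):
    """Minimal hop count between two nodes of a torus with wraparound links.

    src, dst: coordinate tuples, e.g. (x, y, z)
    dims:     number of nodes along each dimension, e.g. (8, 8, 8)
    """
    hops = 0
    for s, d, n in zip(src, dst, dims):
        delta = abs(s - d)
        # Each dimension is a ring, so a message can travel either way
        # around it; the shorter direction gives the minimal distance.
        hops += min(delta, n - delta)
    return hops


# Hypothetical 8x8x8 torus: 1 hop in x (via wraparound), 1 in y, 4 in z.
print(torus_hops((0, 0, 0), (7, 1, 4), (8, 8, 8)))  # prints 6
```

A hop count like this is only a first-order latency proxy; a packet-level model of the kind the abstract describes would also need to account for link bandwidth, buffering, and contention.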
author = {Ahmed, Kishwar and Obaida, Mohammad and Liu, Jason and Eidenbenz, Stephan and Santhi, Nandakishore and Chapuis, Guillaume},
title = {An Integrated Interconnection Network Model for Large-Scale Performance Prediction},
booktitle = {Proceedings of the 2016 Annual ACM Conference on SIGSIM Principles of Advanced Discrete Simulation},
series = {SIGSIM-PADS '16},
year = {2016},
isbn = {978-1-4503-3742-7},
location = {Banff, Alberta, Canada},
pages = {177--187},
numpages = {11},
url = {http://doi.acm.org/10.1145/2901378.2901396},
doi = {10.1145/2901378.2901396},
acmid = {2901396},
publisher = {ACM},
address = {New York, NY, USA},