HPCC’16 Paper: HPC Interconnect Model

Scalable Interconnection Network Models for Rapid Performance Prediction of HPC Applications, Kishwar Ahmed, Jason Liu, Stephan Eidenbenz, and Joe Zerr. In Proceedings of the 18th International Conference on High Performance Computing and Communications (HPCC 2016), December 2016. [paper] [slides]


Performance Prediction Toolkit (PPT) is a simulator mainly developed at Los Alamos National Laboratory to facilitate rapid and accurate performance prediction of large-scale scientific applications on existing and future HPC architectures. In this paper, we present three interconnect models for performance prediction of large-scale HPC applications. They are based on interconnect topologies widely used in HPC systems: torus, dragonfly, and fat-tree. We conduct extensive validation tests of our interconnect models, in particular, using configurations of existing HPC systems. Results show that our models provide good accuracy for predicting the network behavior. We also present a performance study of a parallel computational physics application to show that our model can accurately predict the parallel behavior of large-scale applications.


author={K. Ahmed and J. Liu and S. Eidenbenz and J. Zerr},
booktitle={Proceedings of the IEEE 18th International Conference on High Performance Computing and Communications (HPCC)},
title={Scalable Interconnection Network Models for Rapid Performance Prediction of HPC Applications},

WSC’16 Paper: Simulation Reproducibility

Panel – Reproducible Research in Discrete-Event Simulation – A Must or Rather a Maybe? Adelinde M. Uhrmacher, Sally Brailsford, Jason Liu, Markus Rabe, and Andreas Tolk. In Proceedings of the 2016 Winter Simulation Conference (WSC 2016), T. M. K. Roeder, P. I. Frazier, R. Szechtman, E. Zhou, T. Huschka, and S. E. Chick, eds., December 2016. [paper]


Scientific research should be reproducible, and as such also simulation research. However, the question is – is this really the case? In some application areas of simulation, e.g., cell biology, simulation studies cannot be published without data, models, methods, including computer code being made available for evaluation. With the applications and methodological areas of modeling and simulation, how the problem of reproducibility is assessed and addressed differs. The diversity of answers to this question will be illuminated by looking into the area of network simulations, simulation in logistics, in military, and health. Making different scientific cultures, different challenges, and different solutions in discrete event simulation explicit is central to improving the reproducibility and thus quality of discrete event simulation research.


author={A. M. Uhrmacher and S. Brailsford and J. Liu and M. Rabe and A. Tolk},
booktitle={2016 Winter Simulation Conference (WSC)},
title={Panel--Reproducible research in discrete event simulation--A must or rather a maybe?},

PADS’16 Paper: Integrated Interconnect Model

An Integrated Interconnection Network Model for Large-Scale Performance Prediction, Kishwar Ahmed, Mohammad Obaida, Jason Liu, Stephan Eidenbenz, Nandakishore Santhi, and Guillaume Chapuis. In Proceedings of the 2016 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation (SIGSIM-PADS 2016), May 2016. [paper]


Interconnection network is a critical component of high-performance computing architecture and application co-design. For many scientific applications, the increasing communication complexity poses a serious concern as it may hinder the scaling properties of these applications on novel architectures. It is apparent that a scalable, efficient, and accurate interconnect model would be essential for performance evaluation studies. In this paper, we present an interconnect model for predicting the performance of large-scale applications on high-performance architectures. In particular, we present a sufficiently detailed interconnect model for Cray’s Gemini 3-D torus network. The model has been integrated with an implementation of the Message-Passing Interface (MPI) that can mimic most of its functions with packet-level accuracy on the target platform. Extensive experiments show that our integrated model provides good accuracy for predicting the network behavior, while at the same time allowing for good parallel scaling performance.


 author = {Ahmed, Kishwar and Obaida, Mohammad and Liu, Jason and Eidenbenz, Stephan and Santhi, Nandakishore and Chapuis, Guillaume},
 title = {An Integrated Interconnection Network Model for Large-Scale Performance Prediction},
 booktitle = {Proceedings of the 2016 Annual ACM Conference on SIGSIM Principles of Advanced Discrete Simulation},
 series = {SIGSIM-PADS '16},
 year = {2016},
 isbn = {978-1-4503-3742-7},
 location = {Banff, Alberta, Canada},
 pages = {177--187},
 numpages = {11},
 url = {http://doi.acm.org/10.1145/2901378.2901396},
 doi = {10.1145/2901378.2901396},
 acmid = {2901396},
 publisher = {ACM},
 address = {New York, NY, USA},

TOMACS’15 Paper: Symbiotic Network Simulation and Emulation

Symbiotic Network Simulation and Emulation, Miguel Erazo, Rong Rong, and Jason Liu. ACM Transactions on Modeling and Computer Simulation (TOMACS), 26(1), Article No. 2, December 2015. [paper]

A testbed capable of representing detailed operations of complex applications under diverse network conditions is invaluable for understanding the design and performance of new protocols and applications before their real deployment. We introduce a novel method that combines high-performance large-scale network simulation and high-fidelity network emulation, and thus enables real instances of network applications and protocols to run in real operating environments and be tested under simulated network settings. Using our approach, network simulation and emulation can form a symbiotic relationship, through which they are synchronized for an accurate representation of the network-scale traffic behavior. We introduce a model downscaling method along with an efficient queuing model and a traffic reproduction technique, which can significantly reduce the synchronization overhead and improve accuracy. We validate our approach with extensive experiments via simulation and with a real-system implementation. We also present a case study using our approach to evaluate a multipath data transport protocol.
author = {Erazo, Miguel A. and Rong, Rong and Liu, Jason},
title = {Symbiotic Network Simulation and Emulation},
journal = {ACM Trans. Model. Comput. Simul.},
issue_date = {December 2015},
volume = {26},
number = {1},
month = jun,
year = {2015},
issn = {1049-3301},
pages = {2:1--2:25},
articleno = {2},
numpages = {25},
url = {http://doi.acm.org/10.1145/2717308},
doi = {10.1145/2717308},
acmid = {2717308},
publisher = {ACM},
address = {New York, NY, USA},

DSRT’15 Paper: Scalable Emulation with Simulation Symbiosis

Toward Scalable Emulation of Future Internet Applications with Simulation Symbiosis, Jason Liu, Cesar Marcondes, Musa Ahmed, and Rong Rong. In Proceedings of the 19th IEEE/ACM International Symposium on Distributed Simulation and Real Time Applications (DS-RT 2015), October 2015. [paper]

Mininet is a popular container-based emulation environment built on Linux for testing OpenFlow applications. Using Mininet, one can compose an experimental network using a set of virtual hosts and virtual switches with flexibility. However, it is well understood that Mininet can only provide a limited capacity, both for CPU and network I/O, due to its underlying physical constraints. We propose a method for combining simulation and emulation to improve the scalability of network experiments. This is achieved by applying the symbiotic approach to effectively integrate emulation and simulation for hybrid experimentation. In this case, one can use Mininet to directly run OpenFlow applications on the virtual machines and software switches, with network connectivity represented by detailed simulation at scale.
author={J. Liu and C. Marcondes and M. Ahmed and R. Rong},
booktitle={Proceedings of the 2015 IEEE/ACM 19th International Symposium on Distributed Simulation and Real Time Applications (DS-RT)},
title={Toward Scalable Emulation of Future Internet Applications with Simulation Symbiosis},

TOMACS’15 Paper: Cluster-Based Spatiotemporal Background Traffic

Cluster-Based Spatiotemporal Background Traffic Generation for Network Simulation, Ting Li and Jason Liu. ACM Transactions on Modeling and Computer Simulation (TOMACS), 25(1), Article No. 4, January 2015. [paper]

To reduce the computational complexity of large-scale network simulation, one needs to distinguish foreground traffic generated by the target applications one intends to study from background traffic that represents the bulk of the network traffic generated by other applications. Background traffic competes with foreground traffic for network resources and consequently plays an important role in determining the behavior of network applications. Existing background traffic models either operate only at coarse time granularity or focus only on individual links. There is little insight on how to meaningfully apply realistic background traffic over the entire network. In this article, we propose a method for generating background traffic with spatial and temporal characteristics observed from real traffic traces. We apply data clustering techniques to describe the behavior of end hosts as a function of multidimensional attributes and group them into distinct classes, and then map the classes to simulated routers so that we can generate traffic in accordance with the cluster-level statistics. The proposed traffic generator makes no assumption on the target network topology. It is also capable of scaling the generated traffic so that the traffic intensity can be varied accordingly in order to test applications under different and yet realistic network conditions. Experiments show that our method is able to generate traffic that maintains the same spatial and temporal characteristics as in the observed traffic traces.
author = {Li, Ting and Liu, Jason},
title = {Cluster-Based Spatiotemporal Background Traffic Generation for Network Simulation},
journal = {ACM Trans. Model. Comput. Simul.},
issue_date = {January 2015},
volume = {25},
number = {1},
month = nov,
year = {2014},
issn = {1049-3301},
pages = {4:1--4:25},
articleno = {4},
numpages = {25},
url = {http://doi.acm.org/10.1145/2667222},
doi = {10.1145/2667222},
acmid = {2667222},
publisher = {ACM},
address = {New York, NY, USA},