Simulation of HPC Job Scheduling and Large-Scale Parallel Workloads, Mohammad Abu Obaida and Jason Liu. In Proceedings of the 2017 Winter Simulation Conference (WSC 2017), W. K. V. Chan, A. D’Ambrogio, G. Zacharewicz, N. Mustafee, G. Wainer, and E. Page, eds., December 2017. [paper]
Abstract
This paper presents a simulator designed specifically for evaluating job scheduling algorithms on large-scale HPC systems. The simulator is built on the Performance Prediction Toolkit (PPT), a parallel discrete-event simulator written in Python for rapid assessment and performance prediction of large-scale scientific applications on supercomputers. The job scheduler simulator incorporates PPT’s application models and, when coupled with sufficiently detailed architecture models, can represent job runtime behavior more realistically. Consequently, it can evaluate job scheduling and task mapping algorithms on specific target HPC platforms more accurately.
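To make the idea of simulating a job scheduler with discrete events concrete, here is a minimal sketch in plain Python. It does not use PPT or reproduce the paper's simulator. The function name `simulate_fcfs`, the job tuple layout, and the strict first-come-first-served policy are all assumptions chosen for illustration. Each job's runtime is a fixed input here, whereas the abstract describes deriving runtime behavior from application and architecture models.

```python
import heapq
from collections import deque

# Event kinds: finishes sort before arrivals at the same timestamp,
# so nodes freed at time t are visible to jobs arriving at time t.
FINISH, ARRIVAL = 0, 1


def simulate_fcfs(jobs, total_nodes):
    """Simulate strict first-come-first-served scheduling (illustrative only).

    jobs: list of (job_id, arrival_time, nodes_requested, runtime) tuples.
    Returns a dict mapping job_id -> (start_time, finish_time).
    """
    events = []  # min-heap of (time, kind, sequence, payload)
    for seq, (job_id, arrival, nodes, runtime) in enumerate(jobs):
        if nodes > total_nodes:
            raise ValueError(f"job {job_id} requests more nodes than exist")
        heapq.heappush(events, (arrival, ARRIVAL, seq, (job_id, nodes, runtime)))

    seq = len(jobs)       # unique tie-breaker so payloads are never compared
    waiting = deque()     # jobs queued in arrival order
    free_nodes = total_nodes
    schedule = {}

    while events:
        now, kind, _, payload = heapq.heappop(events)
        if kind == ARRIVAL:
            waiting.append(payload)
        else:
            free_nodes += payload  # payload is the node count being released

        # Strict FCFS: only the job at the head of the queue may start.
        while waiting and waiting[0][1] <= free_nodes:
            job_id, nodes, runtime = waiting.popleft()
            free_nodes -= nodes
            schedule[job_id] = (now, now + runtime)
            heapq.heappush(events, (now + runtime, FINISH, seq, nodes))
            seq += 1

    return schedule


if __name__ == "__main__":
    demo_jobs = [("a", 0, 4, 10), ("b", 1, 4, 5), ("c", 2, 2, 3)]
    print(simulate_fcfs(demo_jobs, total_nodes=6))
```

In this toy run, job `c` would fit in the idle nodes at time 2 but waits behind job `b`. Comparing alternative policies on effects like this is the kind of question a scheduling simulator is meant to answer.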
BibTeX
@inproceedings{wsc17-jobsched,
  title = {Simulation of HPC Job Scheduling and Large-Scale Parallel Workloads},
  author = {Mohammad Abu Obaida and Jason Liu},
  booktitle = {Proceedings of the 2017 Winter Simulation Conference (WSC 2017)},
  editor = {W. K. V. Chan and A. D’Ambrogio and G. Zacharewicz and N. Mustafee and G. Wainer and E. Page},
  month = {December},
  year = {2017}
}