WSC’17 Paper: HPC Job Scheduling Simulation

Simulation of HPC Job Scheduling and Large-Scale Parallel Workloads, Mohammad Abu Obaida and Jason Liu. In Proceedings of the 2017 Winter Simulation Conference (WSC 2017), W. K. V. Chan, A. D’Ambrogio, G. Zacharewicz, N. Mustafee, G. Wainer, and E. Page, eds., December 2017. To appear. [paper]

The paper presents a simulator designed specifically for evaluating job scheduling algorithms on large-scale HPC systems. The simulator was developed based on the Performance Prediction Toolkit (PPT), which is a parallel discrete-event simulator written in Python for rapid assessment and performance prediction of large-scale scientific applications on supercomputers. The proposed job scheduler simulator incorporates PPT’s application models, and when coupled with the sufficiently detailed architecture models, can represent more realistic job runtime behaviors. Consequently, the simulator can evaluate different job scheduling and task mapping algorithms on the specific target HPC platforms more accurately.
BibTeX entry not yet available.

HPPAC’17 Paper: Energy-Aware Scheduling

When Good Enough Is Better: Energy-Aware Scheduling for Multicore Servers, Xinning Hui, Zhihui Du, Jason Liu, Hongyang Sun, Yuxiong He, David A. Bader. In Proceedings of the 13th Workshop on High-Performance, Power-Aware Computing (HPPAC 2017), held in conjunction with the 31st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2017), May 2017. [paper]

Power is a primary concern for mobile, cloud, and high-performance computing applications. Approximate computing refers to running applications to obtain results with tolerable errors under resource constraints, and it can be applied to balance energy consumption with service quality. In this paper, we propose a “Good Enough (GE)” scheduling algorithm that uses approximate computing to provide satisfactory QoS (Quality of Service) for interactive applications with significant energy savings. Given a user-specified quality level, the GE algorithm works in the AES (Aggressive Energy Saving) mode for the majority of the time, neglecting the low-quality portions of the workload. When the perceived quality falls below the required level, the algorithm switches to the BQ (Best Quality) mode with a compensation policy. To avoid core speed thrashing between the two modes, GE employs a hybrid power distribution scheme that uses the Equal-Sharing (ES) policy to distribute power among the cores when the workload is light (to save energy) and the Water-Filling (WF) policy when the workload is high (to improve quality). We conduct simulations to compare the performance of GE with existing scheduling algorithms. Results show that the proposed algorithm can provide large energy savings with satisfactory user experience.
@INPROCEEDINGS{Hui2017:approx,
  author    = {X. Hui and Z. Du and J. Liu and H. Sun and Y. He and D. A. Bader},
  title     = {When Good Enough Is Better: Energy-Aware Scheduling for Multicore Servers},
  booktitle = {Proceedings of the 13th Workshop on High-Performance, Power-Aware Computing (HPPAC 2017), held in conjunction with 31st IEEE International Parallel and Distributed Processing Symposium (IPDPS 2017)},
  year      = {2017},
  month     = {May},
  pages     = {984--993},
  doi       = {10.1109/IPDPSW.2017.38},
}