Understanding & Improving I/O Performance on HPC Systems
Date: Sunday, June 18, 2017, 09:00 AM - 06:00 PM
Room: Analog 1, Forum
Description: I/O is a key part of all applications, whether it is reading in data to start a simulation, writing checkpoint files to protect against hardware failures, or outputting the results of a simulation. Because I/O is often infrequent, especially in computational simulation applications that run at scale on HPC resources, it is often neglected when considering application performance and optimisation. However, as we scale to larger HPC systems, the fraction of application time spent in I/O is increasing. We are also now encountering a new type of application on HPC resources: data-intensive applications, where I/O is a dominant part of the workload. Therefore, understanding and optimising the I/O performance of applications is crucial to enabling efficient computational simulations. Furthermore, whilst compute resources tend to be assigned exclusively to an individual job on an HPC machine, I/O hardware is shared between running jobs, so I/O performance can be variable and understanding the I/O performance of an application in isolation is often difficult. This tutorial will address how users can assess the I/O performance and capabilities of the systems they are using and of individual applications, and which parallel I/O software and strategies can be used to optimise I/O.
Links: Official link from ISC 2017