IOscope: A Flexible I/O Tracer for Workloads' I/O Pattern Characterization

Buttress: A toolkit for flexible and high fidelity I/O benchmarking

static.usenix.org

In benchmarking I/O systems, it is important to accurately generate the I/O access pattern that one intends to generate. However, timing accuracy (issuing I/Os at the desired time) at high I/O rates is difficult to achieve on stock operating systems. We currently lack tools to easily and accurately generate complex I/O workloads on modern storage systems. As a result, we may be introducing substantial errors in observed system metrics when we benchmark I/O systems using inaccurate tools for replaying traces or for producing synthetic workloads with known inter-arrival times. In this paper, we demonstrate the need for timing accuracy for I/O benchmarking in the context of replaying I/O traces. We also quantitatively characterize the impact of error in issuing I/Os on measured system parameters. For instance, we show that the error in perceived I/O response times can be as much as +350% or -15% when using naive benchmarking tools that have timing inaccuracies. To address this problem, we present Buttress, a portable and flexible toolkit that can generate I/O workloads with microsecond accuracy at the I/O throughputs of high-end enterprise storage arrays. In particular, Buttress can issue I/O requests within 100µs of the desired issue time even at rates of 10000 I/Os per second (IOPS).
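The notion of issue-time error in the abstract above can be made concrete with a small sketch. It is not Buttress code: the function names `issue_errors_us` and `fraction_within` and the sample timestamps are hypothetical. It only shows how one might compare desired and actual issue times from a replay log against the 100µs bound quoted above.

```python
from typing import List


def issue_errors_us(desired_us: List[float], actual_us: List[float]) -> List[float]:
    """Signed issue error per request (actual minus desired), in microseconds."""
    if len(desired_us) != len(actual_us):
        raise ValueError("desired and actual traces must have the same length")
    return [a - d for d, a in zip(desired_us, actual_us)]


def fraction_within(errors_us: List[float], tolerance_us: float = 100.0) -> float:
    """Share of requests issued within +/- tolerance_us of their desired time."""
    if not errors_us:
        return 0.0
    return sum(1 for e in errors_us if abs(e) <= tolerance_us) / len(errors_us)


# Hypothetical data: at 10000 IOPS, desired issue times are 100 µs apart.
desired = [i * 100.0 for i in range(5)]
actual = [3.0, 140.0, 205.0, 450.0, 401.0]  # made-up replay timestamps

errors = issue_errors_us(desired, actual)
print(errors)                   # [3.0, 40.0, 5.0, 150.0, 1.0]
print(fraction_within(errors))  # 0.8
```

In this made-up trace, one request out of five misses the 100µs window, so the replay would fall short of the accuracy the abstract reports for Buttress.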

Exploring the Impacts of Multiple I/O Metrics in Identifying I/O Bottlenecks

SC'23: International Conference for High Performance Computing, Networking, Storage and Analysis, 2023

HPC systems, driven by the rise of workloads with significant data requirements, face challenges in I/O performance. To address this, a thorough I/O analysis is crucial to identify potential bottlenecks. However, the multitude of metrics makes it difficult to pinpoint the causes of low I/O performance. In this work, we analyze three scientific workloads using three widely accepted I/O metrics. We demonstrate that different metrics uncover different I/O bottlenecks, highlighting the importance of considering multiple metrics for comprehensive I/O analysis.

I/O System Performance Debugging Using Model-driven Anomaly Characterization

2005

It is challenging to identify performance problems and pinpoint their root causes in complex systems, especially when the system supports wide ranges of workloads and when performance problems only materialize under particular workload conditions. This paper proposes a model-driven anomaly characterization approach and uses it to discover operating system performance bugs when supporting disk I/O-intensive online servers. We construct a whole-system I/O throughput model as the reference of expected performance and we use statistical clustering and characterization of performance anomalies to guide debugging. Unlike previous performance debugging methods offering detailed statistics at specific execution settings, our approach focuses on comprehensive anomaly characterization over wide ranges of workload conditions and system configurations.
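As a rough illustration of using a throughput model as the reference for expected performance, the sketch below flags workload settings whose measured throughput falls well short of the modeled value. It is a simplification under stated assumptions: the `Setting` tuple, the `find_anomalies` helper, the 20% shortfall threshold, and all numbers are hypothetical, and the statistical clustering step described in the abstract is not reproduced.

```python
from typing import Dict, List, Tuple

# Hypothetical workload setting: (concurrency, request size in KB).
Setting = Tuple[int, int]


def find_anomalies(expected: Dict[Setting, float],
                   measured: Dict[Setting, float],
                   max_shortfall: float = 0.2) -> List[Tuple[Setting, float]]:
    """Return settings whose measured throughput is more than max_shortfall
    (as a fraction) below the modeled throughput, worst shortfall first."""
    anomalies = []
    for setting, exp in expected.items():
        if setting not in measured or exp <= 0:
            continue  # no measurement, or no usable model value
        shortfall = (exp - measured[setting]) / exp
        if shortfall > max_shortfall:
            anomalies.append((setting, shortfall))
    return sorted(anomalies, key=lambda item: item[1], reverse=True)


# Made-up throughput values in MB/s.
expected = {(1, 64): 50.0, (8, 64): 80.0, (8, 256): 100.0}
measured = {(1, 64): 48.0, (8, 64): 40.0, (8, 256): 70.0}

for setting, shortfall in find_anomalies(expected, measured):
    print(setting, f"{shortfall:.0%} below model")
```

Ranking by shortfall gives a simple starting point for deciding which workload conditions to examine first; the approach in the abstract goes further by characterizing anomalies across many conditions and configurations.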

Vidya: Performing Code-Block I/O Characterization for Data Access Optimization

2018 IEEE 25th International Conference on High Performance Computing (HiPC), 2018

Understanding, characterizing and tuning scientific applications' I/O behavior is an increasingly complicated process in HPC systems. Existing tools use either offline profiling or online analysis to get insights into the applications' I/O patterns. However, there is a lack of a clear formula to characterize applications' I/O. Moreover, these tools are application-specific and do not account for multi-tenant systems. This paper presents Vidya, an I/O profiling framework which can predict an application's I/O intensity using a new formula called Code-Block I/O Characterization (CIOC). Using CIOC, developers and system architects can tune an application's I/O behavior and better match the underlying storage system to maximize performance. Evaluation results show that Vidya can predict an application's I/O intensity with a variance of 0.05%. Vidya can profile applications with a high accuracy of 98% while reducing profiling time by 9x. We further show how Vidya can o...

IOSIG+: On the Role of I/O Tracing and Analysis for Hadoop Systems

2015 IEEE International Conference on Cluster Computing, 2015

Hadoop, as one of the most widely accepted MapReduce frameworks, is naturally data-intensive. Its several dependent projects, such as Mahout and Hive, inherit this characteristic. Meanwhile, I/O optimization becomes a daunting task, since applications' source code is not always available. I/O traces for Hadoop and its dependents are increasingly important, because they can faithfully reveal intrinsic I/O behaviors without knowledge of the source code. This method can not only help to diagnose system bottlenecks but also further optimize performance. To achieve this goal, we propose a transparent tracing and analysis tool suite, namely IOSIG+, which can be plugged into a Hadoop system. We make several contributions: 1) we describe our approach to tracing; 2) we release the tracer, which can trace I/O operations without modifying targets' source code; 3) this work adopts several techniques to mitigate the execution overhead introduced at runtime; 4) we create an analyzer, which helps to discover new approaches to address I/O problems according to access patterns. The experimental results and analysis confirm its effectiveness, and the observed overhead can be as low as 1.97%.

Performance analysis of distributed storage clusters based on kernel and userspace traces

Software: Practice and Experience, 2020

Distributed storage systems are commonly used in modern computing. They are highly scalable and offer data replication and fault tolerance. The complexity of those systems makes them difficult to debug using traditional tools. The existing tools are able to evaluate the overall performance of such systems but they do not provide enough information to find the root cause of performance issues. In this article, we propose a tracing-based performance analysis framework for storage clusters. We use a tracing strategy that reduces the tracing overhead in production systems. The traces collected from the different storage nodes are correlated and used to generate a data model that represents the cluster. Userspace tracing is used to gather data from the storage daemons, while kernel tracing is used to provide detailed information about operating system internals such as disk queues, network queues and process scheduling. Efficient data structures are used to store the model and to generate metrics and graphical views. Our tool is used in different real-world scenarios and is able to investigate interesting performance problems including I/O latencies, data replication and storage node failures.

Towards I/O analysis of HPC systems and a generic architecture to collect access patterns

2012

In high-performance computing (HPC) applications, a high-level I/O call will trigger activities on a multitude of hardware components such as massively parallel systems supported by huge storage systems and internal software layers. Currently, their complex interplay makes it impossible to identify the causes for and the locations of I/O bottlenecks. Existing tools indicate the bottleneck but provide little guidance to identify the cause and how to improve the situation. Our project Scalable I/O for Extreme Performance (SIOX) was initiated to find solutions for this problem. To achieve this goal in SIOX, we will build a system to record access information on all layers and components, recognize access patterns, and characterize the I/O system. Ultimately, it will localize the reasons for I/O bottlenecks and propose optimizations for the I/O middleware that improve I/O performance, such as through-

We want to express our gratitude to the "Deutsches Zentrum für Luft- und Raumfahrt e.V." as responsible project agency and to the "Bundesministerium für Bildung und Forschung" for the financial support under grant 01 IH 11008 A-C.