Irene Finocchi - Academia.edu

Papers by Irene Finocchi

Clique counting in MapReduce: algorithms and experiments

We tackle the problem of counting the number q_k of k-cliques in large-scale graphs, for any constant k ≥ 3. Clique counting is essential in a variety of applications, including social network analysis. Our algorithms make it possible to compute q_k for several real-world graphs and shed light on its growth rate as a function of k. Even for small values of k, the number q_k of k-cliques can be in the order of tens or hundreds of trillions. As k increases, different graph instances show different behaviors: while on some graphs q_{k+1} < q_k, on other benchmarks q_{k+1} ≫ q_k, up to two orders of magnitude larger in our observations. Graphs with steep clique growth rates represent particularly tough instances in practice.
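
The abstract does not detail the MapReduce algorithms themselves; as a point of reference, here is a minimal sequential sketch of exact k-clique counting based on a degree ordering of the nodes (a standard enumeration technique, not necessarily the one used in the paper).

```python
from collections import defaultdict

def count_k_cliques(edges, k):
    """Count k-cliques (k >= 2) by growing cliques only toward higher-ranked
    neighbours, so every clique is enumerated exactly once."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    # Order nodes by (degree, id) and keep, for each node, only the
    # neighbours that come later in this order.
    order = sorted(adj, key=lambda v: (len(adj[v]), v))
    rank = {v: i for i, v in enumerate(order)}
    higher = {v: {w for w in adj[v] if rank[w] > rank[v]} for v in adj}

    def extend(candidates, depth):
        if depth == k:
            return 1
        return sum(extend(candidates & higher[w], depth + 1) for w in candidates)

    return sum(extend(higher[v], 1) for v in adj)

# toy example: the complete graph K4 has 4 triangles and one 4-clique
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(count_k_cliques(edges, 3))  # 4
print(count_k_cliques(edges, 4))  # 1
```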

On data skewness, stragglers, and MapReduce progress indicators

Proceedings of the Sixth ACM Symposium on Cloud Computing - SoCC '15, 2015

We tackle the problem of predicting the performance of MapReduce applications by designing accurate progress indicators, which keep programmers informed on the percentage of completed computation time during the execution of a job. This is especially important in pay-as-you-go cloud environments, where slow jobs can be aborted in order to avoid excessive costs. Performance predictions can also serve as a building block for several profile-guided optimizations. By assuming that the running time depends linearly on the input size, state-of-the-art techniques can be seriously harmed by data skewness, load unbalancing, and straggling tasks. We thus design a novel profile-guided progress indicator, called NearestFit, that operates without the linearity assumption in a fully online way (i.e., without resorting to profile data collected from previous executions). NearestFit exploits a careful combination of nearest neighbor regression and statistical curve fitting techniques. Fine-grained profiles required by our theoretical progress model are approximated through space- and time-efficient data streaming algorithms. We implemented NearestFit on top of Hadoop 2.6.0. An extensive empirical assessment over the Amazon EC2 platform on a variety of benchmarks shows that its accuracy is very good, even when competitors incur nonnegligible errors and wide prediction fluctuations.
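
As a rough illustration of the idea behind NearestFit (the names and the fallback rule below are illustrative assumptions, not the paper's implementation), one can predict a task's running time from (input size, running time) points of already-completed tasks, using nearest-neighbor regression near observed sizes and a fitted power law when extrapolating:

```python
import math

def predict_time(profile, size, k=3):
    """Predict the running time of a task of the given input size from
    (input_size, running_time) points observed for finished tasks.
    Nearest-neighbour regression: average the times of the k profile points
    whose input size is closest to the queried one.  Fall back to a power-law
    fit (least squares in log-log space) when the query lies outside the
    observed size range."""
    profile = sorted(profile)
    sizes = [s for s, _ in profile]
    if sizes[0] <= size <= sizes[-1]:
        neighbours = sorted(profile, key=lambda p: abs(p[0] - size))[:k]
        return sum(t for _, t in neighbours) / len(neighbours)
    # power-law fit t ~ a * size^b via least squares on (log s, log t)
    xs = [math.log(s) for s, _ in profile]
    ys = [math.log(t) for _, t in profile]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    a = math.exp(my - b * mx)
    return a * size ** b

# toy profile: tasks whose cost grows roughly quadratically with input size
profile = [(10, 1.1), (20, 4.2), (40, 15.8), (80, 65.0)]
print(predict_time(profile, 35))    # interpolation via nearest neighbours
print(predict_time(profile, 200))   # extrapolation via curve fitting
```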

Conflict-free star-access in parallel memory systems

We study conflict-free data distribution schemes in parallel memories in multiprocessor system architectures. Given a host graph G, the problem is to map the nodes of G into memory modules such that any instance of a template type T in G can be accessed without memory conflicts. A conflict occurs if two or more nodes of T are mapped to the same memory module.
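
For the star template (a node together with all of its neighbors), one simple sufficient scheme, sketched below for illustration and not necessarily the mapping proposed in the paper, is a greedy distance-2 coloring: any two nodes of a star are within distance two of each other, so assigning distinct modules to all nodes at distance at most two makes every star conflict-free.

```python
from collections import defaultdict

def star_conflict_free_mapping(edges):
    """Greedy distance-2 colouring: any two nodes at distance <= 2 receive
    different colours (memory modules), so every star -- a node together with
    all of its neighbours -- touches each module at most once."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    module = {}
    for v in sorted(adj):
        # modules already assigned within distance 2 of v
        forbidden = {module[w] for u in adj[v] | {v} for w in adj[u] if w in module}
        c = 0
        while c in forbidden:
            c += 1
        module[v] = c
    return module

edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
mapping = star_conflict_free_mapping(edges)
print(mapping)

# sanity check: every star uses pairwise distinct modules
neighbours = defaultdict(set)
for u, v in edges:
    neighbours[u].add(v)
    neighbours[v].add(u)
for v in neighbours:
    star = {v} | neighbours[v]
    assert len({mapping[x] for x in star}) == len(star)
```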

Graph sketches

We introduce the notion of Graph Sketches. They can be thought of as visual indices that guide the navigation of a multi-graph too large to fit on the available display. We adhere to the Visual Information-Seeking Mantra: Overview first, zoom and filter, then details on demand. Graph Sketches are incorporated into MGV, an integrated visualization and exploration system for massive multi-digraph navigation.

An experimental analysis of simple, distributed vertex coloring algorithms

We perform an extensive experimental evaluation of very simple, distributed, randomized algorithms for (Δ + 1) and so-called Brooks–Vizing vertex colorings, i.e., colorings using considerably fewer than Δ colors (here Δ denotes the maximum degree of the graph). We consider variants of algorithms known from the literature, boosting them with a distributed independent set computation.
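
A minimal sketch of a simple synchronous randomized (Δ + 1)-coloring in the spirit of the algorithms evaluated (the exact variants and the independent-set boosting of the paper are not reproduced here):

```python
import random
from collections import defaultdict

def randomized_delta_plus_one_coloring(edges, seed=0):
    """Simple synchronous randomized (Δ+1)-colouring: in each round every
    still-uncoloured node proposes a colour chosen uniformly among those not
    used by already-coloured neighbours, and keeps it only if no uncoloured
    neighbour proposed the same colour in this round."""
    rng = random.Random(seed)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    delta = max(len(adj[v]) for v in adj)
    palette = range(delta + 1)
    color = {}
    uncolored = set(adj)
    while uncolored:
        proposal = {}
        for v in uncolored:
            taken = {color[w] for w in adj[v] if w in color}
            proposal[v] = rng.choice([c for c in palette if c not in taken])
        for v in list(uncolored):
            if all(proposal.get(w) != proposal[v] for w in adj[v]):
                color[v] = proposal[v]
                uncolored.remove(v)
    return color

edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)]
coloring = randomized_delta_plus_one_coloring(edges)
assert all(coloring[u] != coloring[v] for u, v in edges)
print(coloring)
```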

A unified approach to coding labeled trees

We consider the problem of coding labeled trees by means of strings of node labels and we present a unified approach based on a reduction of both coding and decoding to integer (radix) sorting. Applying this approach to four well-known codes introduced by Prüfer [18], Neville [17], and Deo and Micikevicius [5], we close some open problems. With respect to coding, our general sequential algorithm requires optimal linear time, thus solving the problem of optimally computing the second code presented by Neville.
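
For concreteness, a straightforward Prüfer encoder/decoder is sketched below; this textbook version uses a heap and runs in O(n log n), whereas the unified approach of the paper reduces coding and decoding to integer sorting and achieves optimal linear time.

```python
import heapq
from collections import defaultdict

def prufer_encode(edges):
    """Prüfer code of a labelled tree on nodes 0..n-1: repeatedly remove the
    smallest-labelled leaf and record its unique neighbour."""
    n = len(edges) + 1
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = [v for v in range(n) if len(adj[v]) == 1]
    heapq.heapify(leaves)
    code = []
    for _ in range(n - 2):
        leaf = heapq.heappop(leaves)
        parent = adj[leaf].pop()
        adj[parent].discard(leaf)
        code.append(parent)
        if len(adj[parent]) == 1:
            heapq.heappush(leaves, parent)
    return code

def prufer_decode(code):
    """Rebuild the unique labelled tree whose Prüfer code is `code`."""
    n = len(code) + 2
    degree = [1] * n
    for v in code:
        degree[v] += 1
    leaves = [v for v in range(n) if degree[v] == 1]
    heapq.heapify(leaves)
    edges = []
    for v in code:
        leaf = heapq.heappop(leaves)
        edges.append((leaf, v))
        degree[v] -= 1
        if degree[v] == 1:
            heapq.heappush(leaves, v)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

tree = [(0, 3), (1, 3), (2, 3), (3, 4)]
code = prufer_encode(tree)
print(code)                                          # [3, 3, 3]
print(sorted(prufer_decode(code)) == sorted(tree))   # True
```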

Reactive imperative programming with dataflow constraints

Dataflow languages provide natural support for specifying constraints between objects in dynamic applications, where programs need to react efficiently to changes of their environment. Researchers have long investigated how to take advantage of dataflow constraints by embedding them into procedural languages. Previous mixed imperative/dataflow systems, however, require syntactic extensions or libraries of ad hoc data types for binding the imperative program to the dataflow solver.
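
A toy example of a one-way dataflow constraint embedded in imperative code (plain Python with hypothetical names, not the system described in the paper): a constraint records the cells it reads and is re-executed whenever one of them is written.

```python
class Cell:
    """A mutable value that notifies dependent constraints when written."""
    def __init__(self, value):
        self._value = value
        self._observers = set()

    def get(self, reader=None):
        if reader is not None:
            self._observers.add(reader)   # record the dependency
        return self._value

    def set(self, value):
        self._value = value
        for constraint in list(self._observers):
            constraint()                  # re-evaluate dependents

def constraint(fn):
    """Run fn once, tracking which cells it reads; rerun it on every change."""
    def run():
        fn(run)
    run()
    return run

# imperative code freely mixes plain assignments with reactive updates
width, height = Cell(3), Cell(4)
area = Cell(0)
constraint(lambda reader: area.set(width.get(reader) * height.get(reader)))
print(area.get())   # 12
width.set(10)
print(area.get())   # 40
```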

Input-Sensitive Profiling

In this article we present a building-block technique and a toolkit for the automatic discovery of workload-dependent performance bottlenecks. From one or more runs of a program, our profiler automatically measures how the performance of individual routines scales as a function of the input size, yielding clues to their growth rate. The output of the profiler is, for each executed routine of the program, a set of tuples that aggregate performance costs by input size. The collected profiles can be used to produce performance plots and derive trend functions by statistical curve fitting techniques. A key feature of our method is the ability to automatically measure the size of the input given to a generic code fragment: to this aim, we propose an effective metric for estimating the input size of a routine and show how to compute it efficiently. We discuss several examples, showing that our approach can reveal asymptotic bottlenecks that other profilers may fail to detect and can provide useful characterizations of the workload and behavior of individual routines in the context of mainstream applications, yielding several code optimizations as well as algorithmic improvements. To prove the feasibility of our techniques, we implemented a Valgrind tool called aprof and performed an extensive experimental evaluation on the SPEC CPU2006 benchmarks. Our experiments show that aprof delivers performance comparable to other prominent Valgrind tools, and can generate informative plots even from single runs on typical workloads for most algorithmically critical routines.
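
A minimal illustration of a read-before-write input-size metric in the spirit of the one described (the real aprof metric and its efficient computation are more involved): count the distinct memory locations whose first access in an activation is a read.

```python
def read_memory_size(trace):
    """Estimate the input size of a routine activation from its memory access
    trace: count the distinct locations whose first access is a read, i.e.
    values the routine consumed but did not produce itself."""
    first = {}
    for op, addr in trace:        # op is 'r' or 'w'
        first.setdefault(addr, op)
    return sum(1 for op in first.values() if op == "r")

# a routine that reads an input buffer of 4 cells and uses one scratch cell
trace = [("w", 100), ("r", 0), ("r", 1), ("r", 100), ("w", 0), ("r", 2), ("r", 3)]
print(read_memory_size(trace))   # 4
```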

Trading off space for passes in graph streaming problems

Data stream processing has recently received increasing attention as a computational paradigm for dealing with massive data sets. Surprisingly, no algorithm using both sublinear space and a sublinear number of passes is known for natural graph problems in classical read-only streaming. Motivated by technological factors of modern storage systems, some authors have recently started to investigate the computational power of less restrictive models where writing streams is allowed.

A portable virtual machine for program debugging and directing

Directors are reactive systems that monitor the run-time environment and react to the emitted events. Typical examples of directors are debuggers and tools for program analysis and software visualization. In this paper we describe a cross-platform virtual machine that provides advanced facilities for implementing directors with low effort.

Resilient Dynamic Programming

Resilient dictionaries

ACM Transactions on Algorithms, 2009

We address the problem of designing data structures in the presence of faults that may arbitrarily corrupt memory locations. More precisely, we assume that an adaptive adversary can arbitrarily overwrite the content of up to δ memory locations, that corrupted locations cannot be detected, and that only O(1) memory locations are safe. In this framework, we call a data structure resilient if it is able to operate correctly (at least) on the set of uncorrupted values. We present a resilient dictionary, implementing search, insert and delete operations. Our dictionary has O(log n + δ) expected amortized time per operation, and O(n) space complexity, where n denotes the current number of keys in the dictionary. We also describe a deterministic resilient dictionary, with the same amortized cost per operation over a sequence of at least δ^ε operations, where ε > 0 is an arbitrary constant. Finally, we show that any resilient comparison-based dictionary must take Ω(log n + δ) expected time per search. Our results are achieved by means of simple, new techniques, which might be of independent interest for the design of other resilient algorithms.
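
A basic building block in this line of work is storing a value in 2δ + 1 copies so that a majority vote survives up to δ corruptions; the sketch below illustrates only this replication idea, not the dictionary itself.

```python
from collections import Counter

DELTA = 2          # assumed upper bound on the number of corruptions

class ResilientValue:
    """Store 2*DELTA + 1 copies of a value; a majority read returns the
    correct value as long as at most DELTA copies are corrupted."""
    def __init__(self, value):
        self.copies = [value] * (2 * DELTA + 1)

    def read(self):
        return Counter(self.copies).most_common(1)[0][0]

    def write(self, value):
        self.copies = [value] * (2 * DELTA + 1)

x = ResilientValue(42)
x.copies[0] = 13      # simulate up to DELTA adversarial corruptions
x.copies[3] = 99
print(x.read())       # still 42
```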

Estimating the Empirical Cost Function of Routines with Dynamic Workloads

Proceedings of Annual IEEE/ACM International Symposium on Code Generation and Optimization - CGO '14, 2014

A crucial aspect in software development is understanding how an application's performance scales as a function of its input data. Estimating the empirical cost function of individual routines of a program can help developers predict the runtime on larger workloads and pinpoint asymptotic inefficiencies in the code. While this has been the target of extensive research in performance profiling, a major limitation of state-of-the-art approaches is that the input size is assumed to be determinable from the program's state prior to the invocation of the routine to be profiled, failing to characterize the scenario where routines dynamically receive input values during their activations. This results in missing workloads generated by kernel system calls (e.g., in response to I/O or network operations) or by other threads, which play a crucial role in modern concurrent and interactive applications. Measuring dynamic workloads poses several challenges, requiring shared-memory communication between threads to be efficiently traced. In this paper we present a new metric and an efficient algorithm for automatically estimating the size of the input of each routine activation. We provide examples showing that our metric allows the empirical cost functions of complex applications to be estimated more precisely than with previous approaches. An extensive experimental investigation on a variety of benchmarks shows that our metric can be integrated in a Valgrind-based profiler incurring overheads comparable to other prominent heavyweight dynamic analysis tools.
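
A toy illustration of the dynamic-workload idea (hypothetical trace format, not the paper's algorithm): a location counts toward the input of an activation if the routine reads it without having produced it itself, including values written by other threads while the routine is running.

```python
def dynamic_input_size(trace, routine_thread):
    """Toy input-size estimate with dynamic workloads: a location counts as
    input if the routine's thread reads it before writing it, or reads it
    after another thread (e.g. a producer or the kernel) overwrote it."""
    last_writer = {}       # addr -> thread that wrote it last
    counted = set()
    for thread, op, addr in trace:
        if thread != routine_thread:
            if op == "w":
                last_writer[addr] = thread
            continue
        if op == "r":
            if last_writer.get(addr) != routine_thread:
                counted.add((addr, last_writer.get(addr)))
        else:
            last_writer[addr] = routine_thread
    return len(counted)

trace = [
    ("T1", "r", 0),   # value already in memory when the routine starts
    ("T2", "w", 1),   # another thread produces new data...
    ("T1", "r", 1),   # ...which the routine consumes: counts as input
    ("T1", "w", 2),
    ("T1", "r", 2),   # produced by the routine itself: not input
]
print(dynamic_input_size(trace, "T1"))   # 2
```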

k-Calling context profiling

Proceedings of the ACM international conference on Object oriented programming systems languages and applications - OOPSLA '12, 2012

Calling context trees are one of the most fundamental data structures for representing the interprocedural control flow of a program, providing valuable information for program understanding and optimization. Nodes of a calling context tree associate performance metrics with whole distinct paths in the call graph starting from the root function. However, no explicit information is provided for detecting short hot sequences of activations, which may be a better optimization target in large modular programs where groups of related functions are reused in many different parts of the code. Furthermore, calling context trees can grow prohibitively large in some scenarios. Another classical approach, called edge profiling, collects performance metrics for caller-callee pairs in the call graph, allowing hot paths of fixed length one to be detected. We study a generalization of edge and context-sensitive profiles by introducing a novel data structure called the k-calling context forest (k-CCF). Nodes in a k-CCF associate performance metrics with paths of length at most k that lead to each distinct routine of the program, providing edge profiles for k = 1, full context-sensitive profiles for k = ∞, as well as any other intermediate point in the spectrum. We study the properties of the k-CCF both theoretically and experimentally on a large suite of prominent Linux applications, showing how to construct it efficiently and discussing its relationships with the calling context tree. Our experiments show that the k-CCF can provide effective space-accuracy tradeoffs for interprocedural contextual profiling, yielding useful clues to the hot spots of a program that may be hidden in a calling context tree and using less space for small values of k, which appear to be the most interesting in practice.
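
A toy profiler illustrating the underlying idea (hypothetical API, not the paper's construction): charging each cost to the last k routines on the call stack yields edge profiles for k = 1 and approaches full calling-context profiles as k grows.

```python
from collections import defaultdict

class KContextProfiler:
    """Toy k-context profiler: costs are aggregated by the last k routines on
    the call stack (k = 1 gives caller-callee edge profiles; a large k
    approaches full calling-context profiles)."""
    def __init__(self, k):
        self.k = k
        self.stack = []
        self.cost = defaultdict(int)

    def enter(self, routine):
        self.stack.append(routine)

    def charge(self, units):
        context = tuple(self.stack[-self.k:])
        self.cost[context] += units

    def exit(self):
        self.stack.pop()

prof = KContextProfiler(k=2)
prof.enter("main"); prof.enter("parse"); prof.enter("read_token")
prof.charge(5)
prof.exit(); prof.exit()
prof.enter("compile"); prof.enter("read_token")
prof.charge(7)
for context, units in prof.cost.items():
    print(" -> ".join(context), units)
# ('parse', 'read_token') and ('compile', 'read_token') stay separate,
# while anything above the last 2 frames is collapsed
```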

Sorting and searching in faulty memories

In this paper we investigate the design and analysis of algorithms resilient to memory faults. We focus on algorithms that, despite the corruption of some memory values during their execution, are nevertheless able to produce a correct output at least on the set of uncorrupted values. In this framework, we consider two fundamental problems: sorting and searching. In particular, we prove that any O(n log n) comparison-based sorting algorithm can tolerate the corruption of at most O((n log n)^{1/2}) keys.

Designing reliable algorithms in unreliable memories

Computer Science Review, 2007

Some of today's applications run on computer platforms with large and inexpensive memories, which are also error-prone. Unfortunately, the appearance of even very few memory faults may jeopardize the correctness of the computational results. An algorithm is resilient to memory faults if, despite the corruption of some memory values before or during its execution, it is nevertheless able to get a correct output at least on the set of uncorrupted values. In this paper we survey some recent work on reliable computation in the presence of memory faults.

Experimental Study of Resilient Algorithms and Data Structures

Large and inexpensive memory devices may suffer from faults, where some bits may arbitrarily flip and corrupt the values of the affected memory cells. The appearance of such faults may seriously compromise the correctness and performance of computations. In recent years, several algorithms for computing in the presence of memory faults have been introduced in the literature: in particular, we say that an algorithm or a data structure is resilient if it is able to work correctly on the set of uncorrupted values. In this invited talk, we contribute carefully engineered implementations of recent resilient algorithms and data structures and report the main results of a preliminary experimental evaluation of our implementations.

The Price of Resiliency: A Case Study on Sorting with Memory Faults

We address the problem of sorting in the presence of faults that may arbitrarily corrupt memory locations, and investigate the impact of memory faults both on the correctness and on the running times of mergesort-based algorithms. To achieve this goal, we develop a software testbed that simulates different fault injection strategies, and we perform a thorough experimental study using a combination of several fault parameters. Our experiments give evidence that simple-minded approaches to this problem are largely impractical, while the design of more sophisticated resilient algorithms seems really worth the effort. Another contribution of our computational study is a carefully engineered implementation of a resilient sorting algorithm, which appears robust to different memory fault patterns.
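
A drastically simplified sketch of this kind of experiment (illustrative only; the paper's testbed and fault models are richer): inject random corruptions between the passes of a bottom-up mergesort and measure how out of order the surviving keys end up.

```python
import random

FAULT_BASE = 10 ** 6     # input keys are < FAULT_BASE, injected junk is >= FAULT_BASE

def merge(left, right):
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

def mergesort_with_faults(values, faults_per_pass, seed=0):
    """Bottom-up mergesort on a copy of `values`; after every pass the fault
    injector overwrites a few random cells with junk values, mimicking memory
    corruptions striking during the computation."""
    rng = random.Random(seed)
    a = list(values)
    width = 1
    while width < len(a):
        for lo in range(0, len(a), 2 * width):
            mid, hi = min(lo + width, len(a)), min(lo + 2 * width, len(a))
            a[lo:hi] = merge(a[lo:mid], a[mid:hi])
        for _ in range(faults_per_pass):
            a[rng.randrange(len(a))] = FAULT_BASE + rng.randrange(FAULT_BASE)
        width *= 2
    return a

def unordered_pairs(sequence):
    """How far from sorted the surviving (uncorrupted) keys are."""
    ok = [x for x in sequence if x < FAULT_BASE]
    return sum(1 for i in range(len(ok)) for j in range(i + 1, len(ok)) if ok[i] > ok[j])

gen = random.Random(1)
data = [gen.randrange(FAULT_BASE) for _ in range(512)]
out = mergesort_with_faults(data, faults_per_pass=2)
print("unordered pairs among surviving keys:", unordered_pairs(out))
```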

Infinite Trees and the Future (Extended Abstract)

Graph drawing: 7th international symposium, GD'99, Štiřín Castle, Czech Republic, September 15-19, 1999: proceedings, 1999
