Renato Ferreira - Academia.edu
Papers by Renato Ferreira
The International Journal of High Performance Computing Applications, 2017
We carry out a comparative performance study of multi-core CPUs, GPUs, and the Intel Xeon Phi (Many Integrated Core, MIC) with a microscopy image analysis application. We experimentally evaluate the performance of these computing devices on the application's core operations. We correlate the observed performance with the characteristics of the devices and with the data access patterns, computation complexities, and parallelization forms of the operations. The results show significant variability in the performance of operations with respect to the device used. The performance of operations with regular data access is comparable on a MIC to that on a GPU, and sometimes better. GPUs are more efficient than MICs for operations that access data irregularly, because of the lower bandwidth of the MIC for random data accesses. We propose new performance-aware scheduling strategies that take these variabilities in operation speedups into account. Our scheduling strategies significantly improve application performance.
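The core idea behind performance-aware scheduling can be illustrated with a small sketch (this is not the paper's actual algorithm; the operation names, device list, and speedup table below are invented for illustration): each operation is dispatched to the device where it would finish earliest, given its measured speedup on that device.

```python
# Hypothetical per-(operation, device) speedups over a CPU baseline; a real
# system would measure these at runtime, the numbers below are invented.
SPEEDUP = {
    ("segmentation", "gpu"): 11.0,  # irregular access: GPU wins
    ("segmentation", "mic"): 6.0,
    ("morphology",   "gpu"): 4.0,
    ("morphology",   "mic"): 5.5,   # regular access: MIC competitive
}

def schedule(ops, devices, speedup):
    """Greedy performance-aware assignment: each operation goes to the
    device on which it would finish earliest, given its speedup there."""
    free_at = {d: 0.0 for d in devices}           # when each device is idle
    plan = []
    for op, base_cost in ops:                     # base_cost: CPU-baseline time
        dev = min(devices,
                  key=lambda d: free_at[d] + base_cost / speedup[(op, d)])
        free_at[dev] += base_cost / speedup[(op, dev)]
        plan.append((op, dev, round(free_at[dev], 2)))
    return plan

ops = [("segmentation", 10.0), ("morphology", 8.0), ("segmentation", 10.0)]
print(schedule(ops, ["gpu", "mic"], SPEEDUP))
```

With these made-up speedups, both segmentation operations land on the GPU while the regular-access morphology operation runs concurrently on the MIC, which is the kind of device-aware placement the abstract describes.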
Procedia Computer Science, 2016
MSS, Apr 24, 2000
Increasingly powerful computers have made it possible for computational scientists and engineers to model physical phenomena in great detail. As a result, overwhelming amounts of data are being generated by scientific and engineering simulations. In addition, large amounts of ...
Proceedings of the 2003 ACM Symposium on Applied Computing, 2003
A Component-Based Implementation of Multiple Sequence Alignment. Umit Catalyurek, Mike Gray, Tahsin Kurc, Joel Saltz, Eric Stahlberg, Renato Ferreira. Dept. of Biomedical Informatics, The ...
Processing and analyzing large volumes of data plays an increasingly important role in many domains of scientific research. We are developing a compiler which processes data-intensive applications written in a dialect of Java and compiles them for efficient execution on clusters of workstations or distributed memory machines. In this paper, we focus on data-intensive applications with two important properties: 1) data elements have spatial coordinates associated with them, and the distribution of the data is not regular, with ...
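The kind of spatial mapping such a compiler must generate can be sketched as follows (an illustrative reduction, not the compiler's actual output; the strip-binning scheme and all names are assumptions): elements carrying spatial coordinates are assigned to cluster nodes by where they fall in the domain.

```python
def partition_by_space(points, n_nodes, bounds):
    """Assign spatially-coordinated data elements to nodes by slicing the
    bounding box along x into equal-width strips. This is a deliberately
    simple illustrative scheme; the irregular distributions the paper
    targets would need a load-aware partitioner instead."""
    xmin, xmax = bounds
    width = (xmax - xmin) / n_nodes
    assignment = {}
    for pid, (x, y) in points.items():
        node = min(int((x - xmin) // width), n_nodes - 1)  # clamp right edge
        assignment[pid] = node
    return assignment

pts = {0: (0.1, 0.5), 1: (0.9, 0.2), 2: (0.45, 0.8)}
print(partition_by_space(pts, 2, (0.0, 1.0)))
```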
Solving problems that have large computational and storage requirements is becoming increasingly critical for advances in many domains of science and engineering. By allowing algorithms for such problems to be programmed in widely used or rapidly emerging high-level paradigms, like object-oriented and declarative programming models, rapid prototyping and easy development of computational techniques can be facilitated. Our research focuses on an important class of scientific and engineering problems, data ...
... 1-23. [21] Veloso, A., Meira, W., Ferreira, R., et al., "Asynchronous and Anticipatory Filter-Stream Based Parallel Algorithm for ..." [27] Ferreira, R. A., Meira Jr., W., Guedes, D., Drumond, D., "Anthill: A Scalable Run-Time Environment for Data Mining Applications", SBAC-PAD, 2005.
We describe a project that employs an object-relational programming paradigm to support computation on and spatial subsetting of very large disk- or tape-based datasets. The runtime support for this project will be adapted from that developed in the context of the Maryland Active Data Repository (ADR) project.
Processing and analyzing large volumes of data plays an increasingly important role in many ...
In this paper, we concentrate on how the system manipulates and displays high-power, high-resolution histopathology datasets.
In this work we address the design of a database system to explore, process, and visualize very large (multi-terabyte) multi-resolution image datasets obtained from MRI, CT, ultrasound, and digitized microscopy images. The basic requirements for such a ...
This paper describes the design of a complete software system for ...
Sixth IEEE International Symposium on Cluster Computing and the Grid (CCGRID'06), 2006
Assessing Data Virtualization for Irregularly Replicated Large Datasets. Bruno Diniz, Diêgo L. Nogueira, André Cardoso, Renato A. Ferreira, Dorgival Guedes, Wagner Meira Jr. Department of Computer Science, Federal ...
Proceedings of the AMIA Annual Fall Symposium, 1997
We present the design of the Virtual Microscope, a software system employing a client/server architecture to provide a realistic emulation of a high-power light microscope. We discuss several technical challenges related to providing the performance necessary to achieve rapid response time, mainly in dealing with the enormous amounts of data (tens to hundreds of gigabytes per slide) that must be retrieved from secondary storage and processed. To effectively implement the data server, the system design relies on the computational power and high I/O throughput available from an appropriately configured parallel computer.
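The data-access pattern behind such a system can be sketched as a multi-resolution tile lookup (a minimal illustration under assumed conventions; the tile size, pyramid layout, and function names are hypothetical, and the real Virtual Microscope served tiles from a parallel machine):

```python
TILE = 256  # tile edge length in pixels (assumed layout)

def tiles_for_view(x, y, w, h, level):
    """Return the (level, row, col) tile indices covering a viewport given
    in full-resolution pixel coordinates, read at pyramid level `level`.
    Each level halves the resolution of the one below it."""
    scale = 2 ** level
    x0, y0 = x // scale, y // scale                     # top-left at this level
    x1, y1 = (x + w - 1) // scale, (y + h - 1) // scale  # bottom-right, inclusive
    return [(level, r, c)
            for r in range(y0 // TILE, y1 // TILE + 1)
            for c in range(x0 // TILE, x1 // TILE + 1)]
```

A 1000x1000 viewport at full resolution touches a 4x4 grid of tiles, but the same viewport at level 2 (4x downsampled) fits in a single tile, which is why the server only needs to fetch high-resolution data for the region actually on screen.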
Proceedings of the IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2009
Accurate segmentation of tissue microarrays is a challenging topic because of some of the similarities exhibited by normal tissue and tumor regions. Processing speed is another consideration when dealing with imaged tissue microarrays, as each microscopic slide may contain hundreds of digitized tissue discs. In this paper, a fast and accurate image segmentation algorithm is presented. Both a whole-disc delineation algorithm and a learning-based tumor region segmentation approach which utilizes multiple-scale texton histograms are introduced. The algorithm is completely automatic and computationally efficient. The mean pixel-wise segmentation accuracy is about 90%. It requires about 1 second for whole-disc (1024×1024 pixels) segmentation and less than 5 seconds for segmenting tumor regions. In order to enable remote access to the algorithm and collaborative studies, an analytical service is implemented using the caGrid infrastructure. This service wraps the algorithm and provides inte...
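The multiple-scale texton-histogram feature the abstract mentions can be sketched as follows (a sketch under stated assumptions, not the paper's implementation: texton labels are assumed to have been assigned already, e.g. by clustering filter-bank responses, and the vocabulary size and window sizes below are made up):

```python
import numpy as np

K = 32  # size of the texton vocabulary (assumption)

def texton_histogram(labels, cx, cy, window):
    """Normalized histogram of texton labels in a square window around (cx, cy)."""
    h = window // 2
    patch = labels[max(cy - h, 0):cy + h + 1, max(cx - h, 0):cx + h + 1]
    hist = np.bincount(patch.ravel(), minlength=K).astype(float)
    return hist / hist.sum()

def multiscale_feature(labels, cx, cy, windows=(9, 17, 33)):
    """Concatenate histograms at several window sizes into one descriptor,
    which a trained classifier would then label as tumor / non-tumor."""
    return np.concatenate([texton_histogram(labels, cx, cy, w) for w in windows])

labels = np.random.randint(0, K, size=(128, 128))  # fake texton label map
feat = multiscale_feature(labels, 64, 64)
assert feat.shape == (3 * K,)
```

Combining several window sizes lets the descriptor capture both fine texture and broader tissue context around each pixel, which is the usual motivation for a multi-scale formulation.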