Large-scale biomedical image analysis in grid environments
Related papers
Image Processing for the Grid: A Toolkit for Building Grid-enabled Image Processing Applications
Cluster Computing and the Grid, 2003
Analyzing large and distributed image datasets is a crucial step in understanding the structural and functional characteristics of biological systems. In this paper, we present the design and implementation of a toolkit that allows rapid and efficient development of biomedical image analysis applications in a distributed environment. This toolkit employs the Insight Segmentation and Registration Toolkit (ITK) and ...
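A minimal sketch of the kind of ITK-based building block such a toolkit would distribute: a per-tile filter that can be dispatched to workers independently. The file names, the choice of median filtering, and the snake_case functional ITK Python interface are assumptions for illustration, not details taken from the paper.

```python
# Hypothetical per-tile task built on ITK's Python wrapping: each grid worker
# would receive its own (input, output) pair and run the same filter.
import itk

def process_tile(in_path: str, out_path: str) -> None:
    """Read one image tile, apply a median filter, and write the result."""
    image = itk.imread(in_path)
    # Median filtering stands in for whatever ITK pipeline the application
    # actually needs (segmentation, registration, ...).
    filtered = itk.median_image_filter(image, radius=2)
    itk.imwrite(filtered, out_path)

if __name__ == "__main__":
    process_tile("tile_0001.tif", "tile_0001_filtered.tif")  # placeholder names
```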
Using grid technologies to face medical image analysis challenges
CCGrid 2003. 3rd IEEE/ACM International Symposium on Cluster Computing and the Grid, 2003. Proceedings., 2003
The availability of digital imagers inside hospitals and their ever-growing inspection capabilities have established digital medical images as a key component of the diagnosis, follow-up and treatment of many pathologies. To face the growing image analysis requirements, automated medical image processing algorithms have been developed over the past two decades. In parallel, medical image databases have been set up in health centers. Some attempts have been made to cross data coming from different origins for studies involving large databases. Grid technologies appear to be a promising tool for facing the rising challenges of computational medicine. They offer wide area access to distributed databases in a secured environment and they bring the computational power needed to complete large scale statistical studies involving image processing. In this paper, we review grid-related requirements of medical applications, which we illustrate through two real examples.
Grid computing in image analysis
Diagnostic pathology, 2011
Diagnostic surgical pathology or tissue-based diagnosis still remains the most reliable and specific diagnostic medical procedure. The development of whole slide scanners permits the creation of virtual slides and work on so-called virtual microscopes. In addition to interactive work on virtual slides, approaches have been reported that introduce automated virtual microscopy, which is composed of several tools focusing on quite different tasks. These include evaluation of image quality and image standardization, analysis of potentially useful thresholds for object detection and identification (segmentation), dynamic segmentation procedures, adjustable magnification to optimize feature extraction, and texture analysis including image transformation and evaluation of elementary primitives. Grid technology seems to possess all the features needed to efficiently target and control these specific tasks of image information and detection in order to obtain a detailed and accurate diagnosis. Grid techn...
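The "analysis of potentially useful thresholds" step mentioned above can be pictured with a small, hedged sketch; Otsu's method via scikit-image is used here purely as a stand-in for whichever thresholding strategy a concrete virtual-microscopy pipeline applies, and the file name is a placeholder.

```python
# Stand-in for the threshold analysis / segmentation step of automated
# virtual microscopy: a data-driven global threshold and the resulting mask.
from skimage import io
from skimage.filters import threshold_otsu

def segment_tile(path: str):
    """Return a binary mask separating candidate objects from background."""
    tile = io.imread(path, as_gray=True)
    t = threshold_otsu(tile)   # global threshold estimated from the tile itself
    mask = tile > t            # True where an object candidate lies
    return t, mask

if __name__ == "__main__":
    threshold, mask = segment_tile("virtual_slide_tile.png")  # placeholder file
    print(f"Otsu threshold = {threshold:.3f}, object pixels = {int(mask.sum())}")
```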
High-throughput Analysis of Large Microscopy Image Datasets on CPU-GPU Cluster Platforms
2013 IEEE 27th International Symposium on Parallel and Distributed Processing, 2013
Analysis of large pathology image datasets offers significant opportunities for the investigation of disease morphology, but the resource requirements of analysis pipelines limit the scale of such studies. Motivated by a brain cancer study, we propose and evaluate a parallel image analysis application pipeline for high throughput computation of large datasets of high resolution pathology tissue images on distributed CPU-GPU platforms. To achieve efficient execution on these hybrid systems, we have built runtime support that allows us to express the cancer image analysis application as a hierarchical data processing pipeline. The application is implemented as a coarse-grain pipeline of stages, where each stage may be further partitioned into another pipeline of fine-grain operations. The fine-grain operations are efficiently managed and scheduled for computation on CPUs and GPUs using performance aware scheduling techniques along with several optimizations, including architecture aware process placement, data locality conscious task assignment, data prefetching, and asynchronous data copy. These optimizations are employed to maximize the utilization of the aggregate computing power of CPUs and GPUs and minimize data copy overheads. Our experimental evaluation shows that the cooperative use of CPUs and GPUs achieves significant improvements on top of GPU-only versions (up to 1.6×) and that the execution of the application as a set of fine-grain operations provides more opportunities for runtime optimizations and attains better performance than coarser-grain, monolithic implementations used in other works. An implementation of the cancer image analysis pipeline using the runtime support was able to process an image dataset consisting of 36,848 4Kx4K-pixel image tiles (about 1.8TB uncompressed) in less than 4 minutes (150 tiles/second) on 100 nodes of a state-of-the-art hybrid cluster system.
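A toy sketch of the "performance aware" placement idea described in the abstract: each fine-grain operation carries an estimated GPU speedup, and the operations that benefit most are placed on the GPU. Operation names and speedup figures are invented for illustration and do not come from the paper's measurements.

```python
# Greedy device assignment: give the GPU the fine-grain operations with the
# largest estimated speedups, and keep the rest on the CPUs.
from dataclasses import dataclass

@dataclass
class Operation:
    name: str
    gpu_speedup: float  # estimated CPU time / GPU time (illustrative values)

def assign_devices(ops, gpu_slots):
    ranked = sorted(ops, key=lambda o: o.gpu_speedup, reverse=True)
    return ranked[:gpu_slots], ranked[gpu_slots:]   # (gpu_ops, cpu_ops)

ops = [
    Operation("segmentation", gpu_speedup=12.0),
    Operation("feature_extraction", gpu_speedup=3.5),
    Operation("normalization", gpu_speedup=1.2),
]
gpu_ops, cpu_ops = assign_devices(ops, gpu_slots=1)
print("GPU:", [o.name for o in gpu_ops], "| CPU:", [o.name for o in cpu_ops])
```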
Grid-enabling medical image analysis
Journal of Clinical Monitoring and Computing, 2005
Digital medical image processing is a promising application area for grids. Given the volume of data, the sensitivity of medical information, and the joint complexity of medical datasets and computations expected in clinical practice, the challenge is to fill the gap between the grid middleware and the requirements of clinical applications. The research project AGIR (Grid Analysis of Radiological Data) presented in this paper addresses this challenge through a combined approach: on one hand, leveraging the grid middleware through core grid medical services which target the requirements of medical data processing applications; on the other hand, grid-enabling a panel of applications ranging from algorithmic research to clinical applications.
Emerging trends: grid technology in pathology
Studies in health technology and informatics, 2012
Grid technology has enabled clustering of, access to, and interaction among a wide variety of geographically distributed resources such as supercomputers, storage systems, data sources, instruments, and special devices and services, realizing network-centric operations. Its main applications include large scale computational and data intensive problems in science and engineering. Grids are likely to have a deep impact on health related applications. Moreover, they seem well suited to tissue-based diagnosis. They offer a powerful tool for dealing with current challenges in many biomedical domains involving complex anatomical and physiological modeling of structures from images, or the assembly and analysis of large image databases. This chapter analyzes the general structures and functions of a Grid environment implemented for tissue-based diagnosis on digital images. Moreover, it presents a Grid middleware implemented by the authors for diagnostic pathology applications. The chap...
A Grid Information Infrastructure for Medical Image Analysis
Computing Research Repository, 2004
The storage and manipulation of digital images and the analysis of the information held in those images are essential requirements for next-generation medical information systems. The medical community has been exploring collaborative approaches for managing image data and exchanging knowledge and Grid technology [1] is a promising approach to enabling distributed analysis across medical institutions and for developing new collaborative and cooperative approaches for image analysis without the necessity for clinicians to co-locate. The EU-funded MammoGrid project [2] is one example of this and it aims to develop a Europe-wide database of mammograms to support effective co-working between healthcare professionals across the EU. The MammoGrid prototype comprises a high-quality clinician visualization workstation (for data acquisition and inspection), a DICOM-compliant interface to a set of medical services (annotation, security, image analysis, data storage and querying services) residing on a so-called Grid-box and secure access to a network of other Grid-boxes connected through Grid middleware. One of the main deliverables of the project is a Grid-enabled infrastructure that manages federated mammogram databases across Europe. This paper outlines the MammoGrid Information Infrastructure (MII) for meta-data analysis and knowledge discovery in the medical imaging domain.
Efficient Execution of Microscopy Image Analysis on CPU, GPU, and MIC Equipped Cluster Systems
2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing, 2014
High performance computing is experiencing a major paradigm shift with the introduction of accelerators, such as graphics processing units (GPUs) and the Intel Xeon Phi (MIC). These processors have made tremendous computing power available at low cost, and are transforming machines into hybrid systems equipped with CPUs and accelerators. Although these systems can deliver very high peak performance, making full use of their resources in real-world applications is a complex problem. Most current applications deployed to these machines are still executed on a single processor, leaving the other devices underutilized. In this paper we explore a scenario in which applications are composed of hierarchical data flow tasks that are allocated to nodes of a distributed memory machine at a coarse grain, but each of which may be composed of several finer-grain tasks that can be allocated to different devices within the node. We propose and implement novel performance aware scheduling techniques that can be used to allocate tasks to devices. We evaluate our techniques using a pathology image analysis application used to investigate brain cancer morphology, and our experimental evaluation shows that the proposed scheduling strategies significantly outperform other efficient scheduling techniques, such as Heterogeneous Earliest Finish Time (HEFT), in cooperative executions using CPUs, GPUs, and MICs. We also show experimentally that our strategies are less sensitive to inaccuracy in the scheduling input data and that the performance gains are maintained as the application scales.
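For contrast with the proposed strategies, the HEFT-style baseline named in the abstract can be sketched as earliest-finish-time device selection over a heterogeneous pool; the per-device task costs below are assumed values, not results from the paper.

```python
# Earliest-finish-time placement across a CPU/GPU/MIC pool: each task goes to
# the device that would complete it soonest, given when that device frees up.
def schedule(tasks, devices):
    """tasks: dicts mapping device kind -> estimated runtime (seconds).
    devices: (time_free, kind) tuples describing the device pool."""
    pool = [[time_free, kind, idx] for idx, (time_free, kind) in enumerate(devices)]
    placement = []
    for cost in tasks:
        best = min(pool, key=lambda d: d[0] + cost[d[1]])  # earliest finish
        best[0] += cost[best[1]]                           # device busy until then
        placement.append((best[1], best[2], best[0]))
    return placement

tasks = [{"cpu": 4.0, "gpu": 0.5, "mic": 1.0},
         {"cpu": 1.0, "gpu": 0.9, "mic": 0.8},
         {"cpu": 2.0, "gpu": 0.3, "mic": 0.6}]
devices = [(0.0, "cpu"), (0.0, "gpu"), (0.0, "mic")]
for kind, idx, finish in schedule(tasks, devices):
    print(f"task -> {kind}#{idx}, finishes at t={finish:.1f}s")
```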
Tissue MicroArray: a distributed Grid approach for image analysis
Studies in health technology and informatics, 2007
The Tissue MicroArray (TMA) technique is assuming ever more importance. Digital image acquisition is fundamental to providing an automatic system for subsequent analysis. The accuracy of the results depends on the image resolution, which has to be very high in order to provide as many details as possible. Lossless formats are better suited to preserving information, but file size becomes a critical factor researchers have to deal with. This affects not only storage methods but also computing times and performance. Pathologists and researchers who work with biological tissues, in particular with the TMA technique, need to consider a large number of case studies to formulate and validate their hypotheses. The importance of sharing images between institutes worldwide, to increase the amount of data available to work with, is clear. In this context, preserving the security of sensitive data is a fundamental issue. In most cases, copying patient data to locations other than the original database is forbidden by the owning institutes. Storage, computing and security are key problems of the TMA methodology. In our system we tackle all these aspects using the EGEE (Enabling Grids for E-sciencE) Grid infrastructure. The Grid platform provides adequate storage, good image processing performance and safety of sensitive patient information: this architecture offers hundreds of Storage and Computing Elements and enables users to handle images without copying them to physical disks other than where they have been archived by the owner, returning only the processed, anonymized images to end users. The efficiency of the TMA analysis process is obtained by implementing algorithms based on functions provided by the Parallel IMAge processing Genoa Library (PIMA(GE)2 Lib). The acquisition of remotely distributed TMA images is done using specialized I/O functions based on the Grid File Access Library (GFAL) API. In our opinion this approach may represent an important contribution to the development of tele-pathology.
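The "process without copying images off the owner's storage" pattern can be illustrated with a hedged sketch against the gfal2 Python bindings (a successor of the GFAL API named above); the storage URL is a placeholder and the exact binding calls are my assumption, to be checked against the gfal2-python documentation.

```python
# Stream a remotely archived TMA image straight into worker memory, leaving
# the archived copy where the owning institute stored it.
import gfal2  # Python bindings of the Grid File Access Library (assumed API)

def read_remote_image(url: str) -> bytes:
    ctx = gfal2.creat_context()      # note the POSIX-style 'creat' spelling
    size = ctx.stat(url).st_size     # remote metadata, no local copy made
    handle = ctx.open(url, "r")
    return handle.read(size)         # bytes live only in the worker's memory

if __name__ == "__main__":
    raw = read_remote_image("srm://storage.example.org/tma/slide_0042.tiff")  # placeholder URL
    print(f"fetched {len(raw)} bytes")
```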
IM.Grid, a Grid computing approach for Image Mining of High Throughput-High Content Screening
2008 9th IEEE/ACM International Conference on Grid Computing, 2008
Image processing and analysis have become essential for both cell biology research and drug discovery since the advent of High Content Screening (HCS) technologies. In this context, Grid technology is a good opportunity to solve compute-intensive problems over large data sets. However, exploiting the Grid is not a simple task for many users, so providing a simple way to use Grid resources is an important issue. In this paper, we present IM.Grid, a grid computing extension of our in-house image analysis software IM (Image Mining), which provides capabilities to simultaneously access visual data located on NAS (Network-Attached Storage) and extract knowledge from the raw information through a customizable image processing pipeline executed in parallel. A user writes a plug-in that defines their own image mining pipeline using specific built-in image processing libraries. The plug-in then becomes the actual processing unit when the Grid starts analyzing multiple images, retrieving them from the NAS. The user receives output results as fast as the number of available computing nodes allows. We apply this method to reduce the image processing and analysis time of cell biological images for drug discovery within a High Throughput-High Content Screening (HT-HCS) context, since the processing time grows dramatically as images become very large due to factors such as multiple channels and high resolution. To deal with these constraints, we propose a high-performance computing environment on the .NET framework that helps to improve productivity not only in the development phase but also on HT-HCS platforms.
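A language-agnostic sketch of the plug-in model just described (IM.Grid itself is built on .NET): a user-supplied pipeline function becomes the unit of work applied in parallel to images pulled from shared storage. Function names and the NAS path are hypothetical, and local multiprocessing stands in for the actual Grid dispatch.

```python
# "One plug-in, many images": the same user-written function is applied to
# every image found on the shared storage, in parallel.
from multiprocessing import Pool
from pathlib import Path

def user_pipeline(image_path: Path) -> dict:
    """A stand-in for a user plug-in: read one image, return extracted features."""
    data = image_path.read_bytes()
    # A real plug-in would run segmentation / feature extraction here.
    return {"file": image_path.name, "bytes": len(data)}

def run_screening(nas_dir: str, workers: int = 8):
    images = sorted(Path(nas_dir).glob("*.tif"))
    with Pool(processes=workers) as pool:
        return pool.map(user_pipeline, images)

if __name__ == "__main__":
    for record in run_screening("/mnt/nas/hcs_plate_01"):  # placeholder path
        print(record)
```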