Beth Plale | Indiana University

Papers by Beth Plale

School of Informatics Strategic Plan for Research

Metadata and Preservation in Geosciences: Issues at Scale

Dynamic Querying of Streaming Data with the dQUOB System

Short Paper on Synthetic Workload for Grid Information Services/Registries

Towards Quantifying Limits of Automated Curation of Geospatial Data

Workflow systems are an increasingly popular e-Science tool for executing complex sequences of tasks. The large volumes of data created during these computationally intense and data-driven scientific investigations drive research in techniques to automate metadata capture, relieving the user of the burden of manual annotation. In this paper we describe our experience to date in quantifying the limits of automated metadata collection in e-Science workflow systems.

Unified Programming Model for Instrument-driven e-Science Workflows?

The scientific knowledge discovery process has been aided recently by advances in cyberinfrastructure that automate the execution of data retrieval, modeling, and analysis tasks typically undertaken during scientific exploration. But these infrastructures lack a generalized programming model for integrating real-time data from sensors and instruments into the analysis process. In this ongoing work we examine two approaches to stream processing, a rule system and a query language-based system, and argue that either can be made suitable for our user base with enough hand-written code. But can either become as accepted as workflow systems in the science community? What will that take?

Data Management Strategies for Scientific Applications in Cloud Environments

Clouds are increasingly being used for running data-intensive scientific applications. However, science applications need to contend with the I/O and network performance characteristics of cloud environments. Additionally, managing data effectively and efficiently over these cloud resources is challenging due to the myriad storage choices with different performance-cost trade-offs, complex application choices, and the complexity associated with elasticity and failure rates. In this paper, we evaluate various aspects of data management strategies in cloud environments. Our evaluation is performed in the context of two frameworks, Hadoop and FRIEDA, and conducted on four cloud testbeds: FutureGrid, ExoGeni, Grid5000, and Amazon. Our experiments highlight the different performance implications of storage, file system, provis

forecast_20100504170000Z_run001

forecast_20100505150000Z_run001

forecast_20100508110000Z_run001

forecast_20100508150000Z_run001

forecast_20100509140000Z_run001

forecast_20100510120000Z_run001

forecast_20100510150000Z_run001

forecast_20100514150000Z_run001

forecast_20100515140000Z_run001

forecast_20100517160000Z_run001

forecast_20100517170000Z_run001

forecast_20100519160000Z_run001

forecast_20100519170000Z_run001