Distributed Storage Research Papers - Academia.edu

Vast quantities of information are now being stored online. Web applications currently rely on monolithic storage structures which place the sole responsibility for data storage, protection and maintenance on the web application provider. This research introduces the concept of a de-centralised approach to information storage online. Distributed storage techniques are used to address concerns with the classic monolithic approach, as well as issues such as data ownership of personal information. The research results in the presentation of an API that allows distributed storage of information with seamless integration of data into the traditional Web 2.0 model.

Weighted Reference Counting is a low-communication distributed storage reclamation scheme for loosely-coupled multiprocessors. The algorithm we present herein extends weighted reference counting to allow the collection of cyclic data structures. To do so, the algorithm identifies candidate objects that may be part of cycles and performs a tricolour mark-scan on their subgraph in a lazy manner to discover whether the subgraph is still in use. The algorithm is concurrent in the sense that multiple useful computation processes and garbage collection processes can be performed simultaneously.
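
For context, a minimal sketch of plain weighted reference counting - the base scheme this algorithm extends - is shown below; all names are illustrative. Copying a reference splits its weight locally with no message to the object's owner; deleting a reference returns its weight, and the object is reclaimed when its total weight reaches zero. Plain weighted reference counting cannot reclaim cycles, which is the gap the tricolour mark-scan fills.

```python
# A minimal sketch of plain weighted reference counting (illustrative names,
# not taken from the paper). Copying a reference splits its weight locally;
# deleting one sends its weight back to the object's owner.

class Object:
    def __init__(self, total_weight=64):
        self.weight = total_weight      # sum of all outstanding reference weights

class Ref:
    def __init__(self, obj, weight):
        self.obj, self.weight = obj, weight

def new_object():
    obj = Object(total_weight=64)
    return obj, Ref(obj, 64)            # the creating reference holds all weight

def copy_ref(ref):
    """Duplicate a reference without contacting the owner: split the weight."""
    assert ref.weight > 1               # a real scheme handles weight-1 refs via indirection
    half = ref.weight // 2
    ref.weight -= half
    return Ref(ref.obj, half)

def delete_ref(ref):
    """Return the weight to the object; reclaim it when no weight remains."""
    ref.obj.weight -= ref.weight        # in a distributed setting: one message
    if ref.obj.weight == 0:
        print("object reclaimed")

obj, r1 = new_object()
r2 = copy_ref(r1)                       # no communication needed
delete_ref(r1)
delete_ref(r2)                          # prints "object reclaimed"
```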

Cloud computing is an emerging technology that allows users to utilize on-demand computation, storage, data and services from around the world. However, Cloud service providers charge users for these services. Specifically, to access data from their globally distributed storage edge servers, providers charge users depending on the user's location and the amount of data transferred. When deploying data-intensive applications in a Cloud computing environment, optimizing the cost of transferring data to and from these edge servers is a priority, as data play the dominant role in the application's execution. In this paper, we formulate a non-linear programming model to minimize the data retrieval and execution cost of data-intensive workflows in Clouds. Our model retrieves data from Cloud storage resources such that the amount of data transferred is inversely proportional to the communication cost. We take the example of an 'intrusion detection' application workflow, where the data logs are made available from globally distributed Cloud storage servers. We construct the application as a workflow and experiment with Cloud-based storage and compute resources. We compare the cost of multiple executions of the workflow given by a solution of our non-linear program against that given by Amazon CloudFront's 'nearest' single data source selection. Our results show that our model saves three-quarters of the total cost.
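
The allocation rule the abstract describes - transfer amounts inversely proportional to communication cost - has a simple closed form. The sketch below (hypothetical server names and prices) illustrates just that rule, leaving out the execution-cost terms of the paper's full non-linear program.

```python
# Illustrative only: splits a required data volume across storage servers so
# that the amount fetched from each is inversely proportional to its per-unit
# communication cost. The paper's actual non-linear program also models
# execution cost and constraints not shown here.

def split_retrieval(total_gb, cost_per_gb):
    """cost_per_gb: {server: price}; returns {server: gigabytes to fetch}."""
    inv = {s: 1.0 / c for s, c in cost_per_gb.items()}
    norm = sum(inv.values())
    return {s: total_gb * w / norm for s, w in inv.items()}

costs = {"us-east": 0.09, "eu-west": 0.12, "ap-south": 0.18}   # hypothetical prices
plan = split_retrieval(100.0, costs)
for server, gb in plan.items():
    print(f"{server}: fetch {gb:.1f} GB, cost ${gb * costs[server]:.2f}")
```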

This review surveyed recent literature focused on factors that affect myoglobin chemistry, meat color, pigment redox stability, and methodology used to evaluate these properties. The appearance of meat and meat products is a complex topic involving animal genetics, ante- and postmortem conditions, fundamental muscle chemistry, and many factors related to meat processing, packaging, distribution, storage, display, and final preparation for consumption. These factors vary globally, but the variables that affect basic pigment chemistry are reasonably consistent between countries. Essential for maximizing meat color life is an understanding of the combined effects of two fundamental muscle traits, oxygen consumption and metmyoglobin reduction. In the antemortem sector of research, meat color is being related to genomic quantitative loci, numerous pre-harvest nutritional regimens, and housing and harvest environment. Our knowledge of postmortem chilling and pH effects, atmospheres used for packaging, antimicrobial interventions, and the quality and safety of cooked color is now more clearly defined. The etiology of bone discoloration is now available. New color measurement methodology, especially digital imaging techniques, and improved modifications to existing methodology are now available. Nevertheless, unanswered questions regarding meat color remain. Meat scientists should continue to develop novel ways of improving muscle color and color stability while also focusing on the basic principles of myoglobin chemistry.

Proxy re-encryption (PRE) allows a semi-trusted proxy to convert a ciphertext originally intended for Alice into one encrypting the same plaintext for Bob. The proxy only needs a re-encryption key given by Alice, and cannot learn anything about the plaintext encrypted. This adds flexibility in various applications, such as confidential email, digital rights management and distributed storage. In this paper, we study unidirectional PRE, in which the re-encryption key enables delegation in one direction only, but not the opposite. In PKC 2009, Shao and Cao proposed a unidirectional PRE assuming the random oracle model. However, we show that it is vulnerable to chosen-ciphertext attack (CCA). We then propose an efficient unidirectional PRE scheme (without resorting to pairings). We gain high efficiency and CCA-security using the "token-controlled encryption" technique, under the computational Diffie-Hellman assumption, in the random oracle model and a relaxed but reasonable definition.

We describe two extensions to the three-dimensional magnetotelluric inversion program WSINV3DMT (Siripunvaraporn, W., Egbert, G., Lenbury, Y., Uyeshima, M., 2005, Three-dimensional magnetotelluric inversion: data-space method. Phys. Earth Planet. Interiors 150, 3-14), including modifications to allow inversion of the vertical magnetic transfer functions (VTFs), and parallelization of the code. The parallel implementation, which is most appropriate for small clusters, uses MPI to distribute forward solutions for different frequencies, as well as some linear algebraic computations, over multiple processors. In addition to reducing run times, the parallelization reduces memory requirements by distributing storage of the sensitivity matrix. Both new features are tested on synthetic and real datasets, revealing nearly linear speedup for a small number of processors (up to 8). Experiments on synthetic examples show that the horizontal position and lateral conductivity contrasts of anomalies can be recovered by inverting VTFs alone. However, vertical positions and absolute amplitudes are not well constrained unless an accurate host resistivity is imposed a priori. On very simple synthetic models, including VTFs in a joint inversion had little impact on the inverse solution computed with impedances alone. However, in experiments with real data, inverse solutions obtained from joint inversion of VTFs and impedances, and from impedances alone, differed in important ways, suggesting that for structures with more realistic levels of complexity the VTFs will in general provide useful additional constraints.

Governance, Risk, and Compliance (GRC) Management is on the verge of becoming one of the most important business activities for enterprises. Consequently, IT departments and IT service providers must sharpen their alignment to business processes and demands. Fulfilling these new requirements is supplemented by best practice frameworks, such as ITIL, which define a complete set of IT Service Management (ITSM)

The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications. In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. We describe the architecture of HDFS and report on experience using HDFS to manage 25 petabytes of enterprise data at Yahoo!.

High-performance liquid chromatography was used to study the stability of folate vitamers in two types of rye breads after baking and 16 weeks of frozen storage. Bread made using sourdough seeds contained less total folate (74.6 microg/100 g dry basis, expressed as folic acid) than the whole rye flour (79.8 microg/100 g dry basis) and bread leavened only with baker's yeast (82.8 microg/100 g dry basis). Most importantly, this difference was generated by a significant decrease in the 5-CH3-H4folate form. The baking process caused some changes in folate distribution. Storage of breads at -18 degrees C for 2 weeks did not have a significant effect (p < 0.05) on total folates compared to the content directly after baking. After a 5-week storage period, a significant decrease (p < 0.05) in the content of total folates was recorded, dropping on average by 14% for both types of bread. After a longer period of storage (16 weeks), a 25% loss of folates in the bread made with baker's yeast...

Vehicular sensing, where vehicles on the road continuously gather, process, and share location-relevant sensor data (e.g., road condition, traffic flow), is emerging as a new network paradigm for sensor information sharing in urban environments. Recently, smartphones have also received a lot of attention for their potential as portable vehicular urban sensing platforms, as they are equipped with a variety of environment and motion sensors (e.g., audio/video, accelerometer, and GPS) and multiple wireless interfaces (e.g., WiFi, Bluetooth and 2/3G). The ability to take a smartphone on board a vehicle and to complement the sensors of the latter with advanced smartphone capabilities is of immense interest to the industry. In this paper we survey recent vehicular sensor network developments and identify new trends. In particular we review the way sensor information is collected, stored and harvested using inter-vehicular communications (e.g., mobility-assisted dissemination and geographic storage), as well as using the infrastructure (e.g., centralized and distributed storage in the wired Internet). The comparative performance of the various sensing schemes is important to us. Thus, we review key results by carefully examining and explaining the evaluation methodology, in the process gaining insight into vehicular sensor network design. Our comparative study confirms that system performance is impacted by a variety of factors such as wireless access methods, mobility, user location, and popularity of the information.

As the number of user-managed devices continues to increase, the need for synchronizing multiple file hierarchies distributed over devices with ad hoc connectivity is becoming a significant problem. In this paper, we propose a new approach for efficient cloud-based synchronization of an arbitrary number of distributed file system hierarchies. Our approach combines the advantages of peer-to-peer synchronization with those of the cloud-based approach that stores a master replica online. In contrast to the latter, we do not assume storage of any of the user's data in the cloud, so we address the related capacity, cost, security, and privacy limitations. Finally, the proposed system performs data synchronization in a peer-to-peer manner, eliminating the cost and bandwidth concerns that arise in the "cloud master-replica" approach.

This paper reviews different aggregation approaches that can be applied for the integration of distributed energy resources and loads in electrical power systems. Based on this review we define a set of terms that allow a clear differentiation of aggregation approaches. These definitions provide a framework for electrical power systems analyses that also concern the interaction of different control approaches. In particular, Controllable Distributed Energy (CDE) units - comprising controllable distributed generators, controllable distributed storage units and controllable distributed loads - are analysed concerning their potential to provide ancillary services for network operation.

Bigtable is a distributed storage system for managing structured data that is designed to scale to a very large size: petabytes of data across thousands of commodity servers. Many projects at Google store data in Bigtable, including web indexing, Google Earth, and Google ...

Massively Multiplayer Online Games (MMOGs) are increasing in both popularity and scale, and while classical Client/Server architectures convey some benefits, they suffer from significant technical and commercial drawbacks. This realisation has sparked intensive research interest in adapting MMOGs to Peer-to-Peer (P2P) architectures.

Because of the dynamic and heterogeneous nature of a grid infrastructure, the client/server paradigm is a common programming model for these environments, where the client submits requests to several geographically remote servers for executing already deployed applications on its own data. According to this model, the applications are usually decomposed into independent tasks that are solved concurrently by the servers (the so-called Data Grid applications). On the other hand, as many scientific applications are characterized by very large sets of input data and dependencies among subproblems, avoiding unnecessary synchronizations and data transfers is a difficult task. This work addresses the problem of implementing a strategy for efficient task scheduling and data management in case of data dependencies among subproblems in the same Linear Algebra application. For the purpose of the experiments, the NetSolve distributed computing environment has been used and some minor changes h...

New technologies are deeply transforming the broadcasting industry. What we have seen so far is only the beginning of a long story. Inevitably, industry regulations must adapt, which means that a wide-ranging rethink of current practices is required. In order to assess the likely evolution of the industry, this article decomposes it into a number of components, from conception of programmes to their broadcasting, including distribution, storage and licensing. Contrary to popular expectations, the analysis suggests that the current high degree of concentration will, if anything, increase. The policy implication is that regulation, so far driven by now obsolete technological constraints, should increasingly emphasize promoting competition.

Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective on the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security related capabilities. Features driving the ability to scale read requests and write requests, or to scale data storage, are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and the nonexistence of standardized query languages.

In this paper we describe HIS, a system that enables efficient storage and querying of data organized into concept hierarchies and dispersed over a network. Our scheme utilizes an adaptive algorithm that automatically adjusts the level of indexing according to the granularity of the incoming queries, without assuming any prior knowledge of the query workload. Efficient roll-up and drill-down operations
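
As a rough illustration of the roll-up operation over a concept hierarchy (the data and hierarchy below are invented; HIS additionally distributes the hierarchy over a network and adapts the indexing level, which this sketch does not show):

```python
# A toy roll-up over a concept hierarchy (all data made up for illustration).

from collections import defaultdict

# city -> [country, continent]: one path per leaf, coarsest level last
hierarchy = {
    "Athens": ["Greece", "Europe"],
    "Patras": ["Greece", "Europe"],
    "Boston": ["USA", "N.America"],
}
measurements = {"Athens": 10, "Patras": 4, "Boston": 7}

def roll_up(values, level):
    """Aggregate leaf values up to `level` (0 = country, 1 = continent)."""
    out = defaultdict(int)
    for leaf, v in values.items():
        out[hierarchy[leaf][level]] += v
    return dict(out)

print(roll_up(measurements, 0))   # {'Greece': 14, 'USA': 7}
print(roll_up(measurements, 1))   # {'Europe': 14, 'N.America': 7}
```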

Random linear network coding (RLNC) has been demonstrated to be able to improve the performance of many peer-to-peer (P2P) applications, such as content distribution, multimedia streaming, distributed storage, wireless communications, etc. We first survey recent research progress on applying RLNC in P2P content distribution and multimedia streaming, respectively. We then study the pollution attack, a primary type of security threat particularly relevant to RLNC, and present and compare some existing countermeasures. We finally discuss confidentiality and privacy issues in RLNC-enabled P2P applications as future research challenges.
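
To make the primitive concrete, here is a compact RLNC encode/decode sketch over the prime field GF(257); deployed systems usually work over GF(2^8), so the field choice here is purely for brevity. Any k independent coded packets suffice to recover the k source blocks, which is what makes RLNC attractive for P2P distribution and distributed storage.

```python
# RLNC illustration over GF(257). Each coded packet carries its random
# coefficient vector plus the combined symbols; decoding is Gaussian
# elimination mod P. With random coefficients, k packets are independent
# with high probability (a dependent set would make decode() fail here).

import random

P = 257  # small prime field, illustration only

def encode(blocks):
    """Return one coded packet: (coefficients, combined symbols)."""
    k, n = len(blocks), len(blocks[0])
    coeffs = [random.randrange(P) for _ in range(k)]
    combo = [sum(c * blk[j] for c, blk in zip(coeffs, blocks)) % P
             for j in range(n)]
    return coeffs, combo

def decode(packets, k):
    """Gauss-Jordan elimination mod P over k coded packets."""
    rows = [list(c) + list(s) for c, s in packets[:k]]
    for i in range(k):
        piv = next(r for r in range(i, len(rows)) if rows[r][i])  # find pivot
        rows[i], rows[piv] = rows[piv], rows[i]
        inv = pow(rows[i][i], P - 2, P)            # Fermat inverse
        rows[i] = [x * inv % P for x in rows[i]]
        for r in range(k):                          # eliminate column i elsewhere
            if r != i and rows[r][i]:
                f = rows[r][i]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[i])]
    return [row[k:] for row in rows]

blocks = [[104, 105], [33, 7], [200, 1]]            # three 2-symbol source blocks
packets = [encode(blocks) for _ in range(3)]
print(decode(packets, 3) == blocks)                 # True (with high probability)
```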

Two studies aiming to analyze the vaccine cold chain throughout Spain, performed from 1988-89, sparked interest in vaccine logistics among public health authorities. The studies were performed using evaluation methodology based on cold chain monitor cards with a time-temperature indicator (Battersby's report) and cross-sectional surveys on the conditions of the cold chain (Ferrando's report) in the second and third levels (provincial and local stores). The technical reports revealed the precariousness of the situation, identified the weak points that were jeopardizing the vaccines' efficiency, and favored awareness of an activity that constitutes the backbone of any immunization program. The improvements proposed were gradually implemented by regional governments. More funds for equipment and personnel training were provided and specific management protocols were established.

The ability to accommodate all types of distributed storage options and renewable energy sources is one of the main characteristics of the smart grid. The smart grid integrates advanced sensing technologies, control methodologies and communication technologies into current power distribution systems to provide electricity to customers in a better way. Infrastructure for the implementation and utilization of renewable energy sources requires distributed storage systems with high power density and high energy density. Currently, some research investigates energy management and dynamic control of distributed storage systems to offer not only high power density and high energy density storage but also high efficiency and long system life. In this paper, an intelligent energy management system is proposed to meet the short-term requirements of a distributed storage system in the smart grid. The energy management of a distributed storage system is formulated as a nonlinear mixed-integer optimization problem. A hybrid algorithm combining an evolutionary algorithm with linear programming was developed to solve the problem. Outcomes of simulation studies show the potential of the proposed algorithm for solving the problem.
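
A hedged sketch of the hybrid approach the abstract outlines - an evolutionary outer loop over the integer decisions, with a linear program dispatching the continuous ones - is given below; the storage model, prices and loads are invented for illustration and are far simpler than the paper's formulation.

```python
# Outer loop: (1+1)-style evolutionary search over a binary hourly activity
# vector. Inner loop: an LP dispatches charge/discharge power for each
# candidate. All numbers are hypothetical.

import random
import numpy as np
from scipy.optimize import linprog

T = 4
price = np.array([0.10, 0.30, 0.05, 0.40])   # $/kWh, hypothetical
load  = np.array([2.0, 3.0, 2.0, 4.0])       # kWh demand per hour
CMAX, DMAX, SCAP, S0 = 3.0, 3.0, 5.0, 1.0    # power/energy limits, initial SOC
ACT_COST = 0.20                              # fixed cost per active hour

def dispatch_cost(u):
    """Inner LP: min energy cost given on/off vector u; vars = [c, d, s]."""
    c_obj = np.concatenate([price, -price, np.zeros(T)])  # grid cost terms
    A_eq = np.zeros((T, 3 * T)); b_eq = np.zeros(T)
    for t in range(T):             # SOC balance: s_t - s_{t-1} - c_t + d_t = 0
        A_eq[t, t] = -1.0; A_eq[t, T + t] = 1.0; A_eq[t, 2 * T + t] = 1.0
        if t: A_eq[t, 2 * T + t - 1] = -1.0
        else: b_eq[t] = S0
    A_ub = np.zeros((T, 3 * T)); b_ub = load.copy()
    for t in range(T):             # cannot discharge more than the load
        A_ub[t, T + t] = 1.0; A_ub[t, t] = -1.0
    bounds = ([(0, CMAX * u[t]) for t in range(T)] +
              [(0, DMAX * u[t]) for t in range(T)] +
              [(0, SCAP)] * T)
    res = linprog(c_obj, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    base = float(price @ load)     # cost with no storage action at all
    return base + res.fun + ACT_COST * sum(u) if res.success else float("inf")

best = [random.randint(0, 1) for _ in range(T)]
best_cost = dispatch_cost(best)
for _ in range(200):
    child = [b ^ (random.random() < 0.3) for b in best]   # bit-flip mutation
    cost = dispatch_cost(child)
    if cost < best_cost:
        best, best_cost = child, cost
print(best, round(best_cost, 3))
```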

Secret sharing and erasure coding-based approaches have been used in distributed storage systems to ensure the confidentiality, integrity, and availability of critical information. To achieve performance goals in data accesses, these data fragmentation approaches can be combined with dynamic replication. In this paper, we consider data partitioning (both secret sharing and erasure coding) and dynamic replication in data grids, in which security and data access performance are critical issues. More specifically, we investigate the problem of optimal allocation of sensitive data objects that are partitioned by using a secret sharing scheme or an erasure coding scheme and/or replicated. The grid topology we consider consists of two layers. In the upper layer, multiple clusters form a network topology that can be represented by a general graph. The topology within each cluster is represented by a tree graph. We decompose the share replica allocation problem into two subproblems: the Optimal Intercluster Resident Set Problem (OIRSP), which determines which clusters need share replicas, and the Optimal Intracluster Share Allocation Problem (OISAP), which determines the number of share replicas needed in a cluster and their placements. We develop two heuristic algorithms for the two subproblems. Experimental studies show that the heuristic algorithms achieve good performance in reducing communication cost and are close to optimal solutions.

Mandatory access control (MAC) enforcement is becoming available for commercial environments. For example, Linux 2.6 includes the Linux Security Modules (LSM) framework that enables the enforcement of MAC policies (e.g., Type Enforcement or Multi-Level Security) for individual systems. While this is a start, we envision that MAC enforcement should span multiple machines. The goal is to be able to control interaction between applications on different machines based on MAC policy. In this paper, we describe a recent extension of the LSM framework that enables labeled network communication via IPsec, which is available in mainline Linux as of version 2.6.16. This functionality enables machines to control communication with processes on other machines based on the security label assigned to an IPsec security association. We outline a security architecture based on labeled IPsec to enable distributed MAC authorization. In particular, we examine the construction of a xinetd service that uses labeled IPsec to limit client access on Linux 2.6.16 systems. We also discuss the application of labeled IPsec to distributed storage and virtual machine access control.

Chelonia is a novel grid storage system designed to fill the requirements gap between those of large, sophisticated scientific collaborations which have adopted the grid paradigm for their distributed storage needs, and those of corporate business communities gravitating towards the cloud paradigm. Chelonia is an integrated system of heterogeneous, geographically dispersed storage sites which is easily and dynamically expandable and optimized for high availability and scalability. The architecture and implementation in terms of web services running inside the Advanced Resource Connector Hosting Environment Daemon (ARC HED) are described, and results of tests in both local-area and wide-area networks that demonstrate the fault tolerance, stability and scalability of Chelonia are presented. In addition, example setups for production deployments for small and medium-sized VOs are described.

In a proxy re-encryption (PRE) scheme, a proxy is given special information that allows it to translate a ciphertext under one key into a ciphertext of the same message under a different key. The proxy cannot, however, learn anything about the messages encrypted under either key. PRE schemes have many practical applications, including distributed storage, email, and DRM. Previously proposed re-encryption schemes achieved only semantic security; in contrast, applications often require security against chosen ciphertext attacks. We propose a definition of security against chosen ciphertext attacks for PRE schemes, and present a scheme that satisfies the definition. Our construction is efficient and based only on the Decisional Bilinear Diffie-Hellman assumption in the standard model. We also formally capture CCA security for PRE schemes via both a game-based definition and simulation-based definitions that guarantee universally composable security. We note that, simultaneously with our work, Green and Ateniese proposed a CCA-secure PRE, discussed herein.

Highly available cloud storage is often implemented with complex, multi-tiered distributed systems built on top of clusters of commodity servers and disk drives. Sophisticated management, load balancing and recovery techniques are needed to achieve high performance and availability amidst an abundance of failure sources that include software, hardware, network connectivity, and power issues. While there is a relative wealth of failure studies of individual components of storage systems, such as disk drives, ...

We prototype a storage system that provides the access performance of a well-endowed GridFTP deployment (e.g., using a cluster and a parallel file-system) at the modest cost of a single desktop. To this end, we integrate GridFTP and a combination of dedicated but low-bandwidth (thus cheap) storage nodes and scavenged storage from LAN-connected desktops that participate intermittently in the storage pool. The main advantage of this setup is that it alleviates the server I/O access bottleneck. Additionally, the specific ...

Peer-to-Peer networks have attracted a significant amount of interest because of their capacity for resource sharing and content distribution. Content distribution applications allow personal computers to function in a coordinated manner as a distributed storage medium by contributing, searching, and obtaining digital content. Searching in unstructured P2P networks is an important problem, which has received considerable research attention. Acceptable searching techniques must provide a large coverage rate, low traffic load, and optimum latency. This paper reviews flooding-based search techniques in unstructured P2P networks. It then analytically compares their coverage rates and traffic loads. Our simulation experiments have validated the analytical results.
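
A small simulation along the lines the paper analyzes - TTL-bounded flooding over a random unstructured overlay, reporting coverage and traffic - might look like this (topology and parameters invented):

```python
# TTL-bounded flooding over a roughly degree-regular random overlay.
# Every forwarded query counts as one message, duplicates included,
# which is exactly what makes flooding traffic-heavy.

import random
from collections import deque

def random_overlay(n, degree):
    peers = {i: set() for i in range(n)}
    for i in peers:
        while len(peers[i]) < degree:
            j = random.randrange(n)
            if j != i:
                peers[i].add(j); peers[j].add(i)
    return peers

def flood(peers, origin, ttl):
    """Breadth-first flood with a hop limit; returns (coverage, messages)."""
    seen, messages = {origin}, 0
    frontier = deque([(origin, ttl)])
    while frontier:
        node, t = frontier.popleft()
        if t == 0:
            continue
        for nb in peers[node]:
            messages += 1                 # each forwarded query is one message
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, t - 1))
    return len(seen), messages

overlay = random_overlay(n=1000, degree=4)
for ttl in (2, 3, 4, 5):
    covered, msgs = flood(overlay, origin=0, ttl=ttl)
    print(f"TTL={ttl}: coverage={covered / 1000:.1%}, messages={msgs}")
```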

Structured peer-to-peer overlay networks provide a substrate for the construction of large-scale, decentralized applications, including distributed storage, group communication, and content distribution. These overlays are highly resilient; they can route messages correctly even when a large fraction of the nodes crash or the network partitions. But current overlays are not secure; even a small fraction of malicious nodes can prevent correct message delivery throughout the overlay. This problem is particularly serious in open peer-to-peer systems, where many diverse, autonomous parties without preexisting trust relationships wish to pool their resources. This paper studies attacks aimed at preventing correct message delivery in structured peer-to-peer overlays and presents defenses to these attacks. We describe and evaluate techniques that allow nodes to join the overlay, to maintain routing state, and to forward messages securely in the presence of malicious nodes.

The Academy of Motion Picture Arts and Sciences (AMPAS) report "The Digital Dilemma" describes the issues caused by the rapid increase of storage requirements for long-term preservation and access of high quality digital media content. As one of the research communities focusing on very high quality digital content, CineGrid addresses these issues by building a global-scale distributed storage platform suitable for handling high quality digital media, which we call CineGrid Exchange (CX). Today, the CX connects seven universities and research laboratories in five countries, managing 400TB of storage, of which 250TB are dedicated to CineGrid. All of these sites are interconnected through a 10 Gbps dedicated optical network. The CX distributed repository holds digital motion pictures at HD, 2K and 4K resolutions, digital still images and digital audio in various formats. The goals of the CX are: (1) providing a 10 Gbps interconnected distributed platform for the CineGrid community to study digital content related issues, e.g., digital archiving, the movie production process, and network transfer/streaming protocols; (2) building a tool with which people can securely store, easily share and transfer very high definition digital content worldwide for exhibition and real-time collaboration; (3) automating digital policies through middleware and metadata management. In this publication, we introduce the architecture of the CX, resources managed by the CX and the implementation of the first series of CX management policies using the iRODS programmable middleware. We evaluate the first phase of CX platform implementation. We show that the CX has the potential to be a reliable and scalable digital management system.

This paper describes a Micro Grid Management System developed using agent-based technologies and its application to the effective management of generation and storage devices connected to a LV network forming a micro grid. The micro grid is defined as a set of generation, storage and load systems electrically connected and complemented by a communication system to enable control actions and follow-up surveillance. The effectiveness of the proposed architecture has been tested on laboratory facilities under different micro grid configurations. The performance and scalability issues related to the agent framework have also been considered and verified.

A self-stabilizing distributed file system is presented. The system constructs and maintains a spanning tree for each file volume. The spanning tree consists of the servers that have volume replicas and caches for the specific file volume. The spanning trees are constructed and maintained by self-stabilizing distributed algorithms. File system updates use the tree to implement file read and write operations.
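
For intuition, a minimal sketch of the classic self-stabilizing BFS spanning-tree rule (in the style of Dolev, Israeli, and Moran) is shown below; per-volume trees like the ones described are maintained by algorithms of this flavor, though this sketch is not the paper's algorithm.

```python
# Self-stabilizing BFS tree rule: each server repeatedly recomputes its
# distance as 1 + the minimum distance among its neighbors and adopts
# that neighbor as parent. Starting from *any* corrupted state, the
# configuration converges to a correct BFS spanning tree.

def stabilize(adjacency, root, dist):
    """Run rounds until no server changes; dist may start arbitrary."""
    parent = {v: None for v in adjacency}
    changed = True
    while changed:
        changed = False
        for v in adjacency:
            if v == root:
                if dist[v] != 0:
                    dist[v], changed = 0, True
                continue
            best = min(adjacency[v], key=lambda u: dist[u])
            if dist[v] != dist[best] + 1 or parent[v] != best:
                dist[v], parent[v], changed = dist[best] + 1, best, True
    return parent

adjacency = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
corrupt = {"A": 7, "B": -3, "C": 99, "D": 0}      # arbitrary initial state
print(stabilize(adjacency, root="A", dist=corrupt))
# e.g. {'A': None, 'B': 'A', 'C': 'A', 'D': 'B'}
```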

Power quality (PQ) has become a non-negligible issue for distribution network operators (DNOs). More and more sites are being monitored permanently throughout their networks, covering almost all voltage levels. Due to the continuously increasing number of measurement sites, certain issues become more and more evident. PQ analyzers usually run only in combination with proprietary software and data transfer methods. The interoperability between different brands is very limited. The efficiency of data handling and data analysis decreases significantly with a growing number of measurement sites.

Electric energy storage (EES) installations in power systems are migrating from large centralized systems to more distributed installations for microgrid applications. This trend signifies modular EES installations for the local control of buildings and processes. A centralized EES system is often dispatched by grid operators to increase overall efficiency and enhance the security of power systems. Distributed EES (DEES) is locally managed by aggregators to maximize the local impact of EES before the aggregators' adjusted load profiles are submitted to grid operators for day-ahead scheduling. In this paper, we present and analyze two models for the hourly scheduling of centralized and distributed EES systems in day-ahead electricity markets. The proposed models take into account specific characteristics and intertemporal constraints of EES systems in transmission-constrained power systems. The proposed models are applied to a 6-bus system and the IEEE-RTS, and the results are presented to compare the impacts of utilizing the two EES models on system operations and to quantify the operational benefits of EES in power systems.
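
Intertemporal EES constraints in such hourly scheduling models typically take the following form (an illustrative formulation; the paper's exact model may differ):

```latex
% State-of-charge dynamics and charge/discharge limits for one EES unit
s_{t} = s_{t-1} + \eta_{c}\, p^{c}_{t}\,\Delta t - \frac{p^{d}_{t}\,\Delta t}{\eta_{d}},
\qquad
\underline{S} \le s_{t} \le \overline{S},
\quad
0 \le p^{c}_{t} \le u_{t}\, P^{c}_{\max},
\quad
0 \le p^{d}_{t} \le (1 - u_{t})\, P^{d}_{\max}
```

Here s_t is the state of charge, p^c_t and p^d_t are charge and discharge power, eta_c and eta_d are efficiencies, and the binary u_t prevents simultaneous charging and discharging; these couplings across hours are what make the scheduling problem intertemporal.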

The contribution of renewable energies (in particular of wind power) to electrical power generation has been continuously increasing in recent decades. This article focuses on the options necessary to manage the variability of wind turbine output and enable the large-scale integration of wind power with the current electricity system, such as additional power reserves, distributed storage technologies, in particular electric vehicles, and cross-border power transmission. The influence of the geographical distribution of wind turbines on the variability of the produced power is described as well. The article highlights that even though state-of-the-art technologies for higher wind integration are present, proper management and integration of the mentioned options is necessary.

The rapid development and diversification of Cloud services occurs in a very competitive environment. The number of actors providing Infrastructure as a Service (IaaS) remains limited, while the number of PaaS (Platform as a Service) and SaaS (Software as a Service) providers is rapidly increasing. In this context, the ubiquity and the variety of Cloud services impose a form of collaboration between all these actors. For this reason, Cloud Service Providers (CSPs) rely on the availability of computing, storage, and network resources generally provided by various administrative entities. This multi-tenant environment raises multiple challenges such as confidentiality and scalability issues. To address these challenges, resource (network, computing, and storage) abstraction is introduced. In this paper, we focus on network resource abstraction algorithms used by a Network Service Provider (NSP) for sharing its network topology without exposing details of its physical resources. In this context, we propose two network resource abstraction techniques. First, we formulate the network topology abstraction problem as a Mixed-Integer Linear Program (MILP). Solving this formulation provides an optimal abstracted topology to the CSP in terms of availability of the underlying resources. Second, we propose an innovative scalable algorithm called SILK-ALT, inspired by the SImple LinK (SILK) algorithm previously proposed by Abosi et al. We compare the MILP formulation, the SILK-ALT algorithm, and the SILK algorithm in terms of rejection ratio of users' requests at both the Cloud provider and the network provider levels. The numerical results obtained using our proposed algorithms show that resource abstraction in general, and network topology abstraction in particular, can effectively hide details of the underlying infrastructure. Moreover, these algorithms represent a scalable and sufficiently accurate way of advertising the resources in a multi-tenant environment.

Mammalian genomes encode only a small number of cuproenzymes. The many genes involved in coordinating copper uptake, distribution, storage and efflux make gene/nutrient interactions especially important for these cuproenzymes. Copper deficiency and copper excess both disrupt neural function. Using mice heterozygous for peptidylglycine α-amidating monooxygenase (PAM), a cuproenzyme essential for the synthesis of many neuropeptides, we identified alterations in anxiety-like behavior, thermoregulation and seizure sensitivity. Dietary copper supplementation reversed a subset of these deficits. Wild-type mice maintained on a marginally copper-deficient diet exhibited some of the same deficits observed in PAM +/− mice and displayed alterations in PAM metabolism. Altered copper homeostasis in PAM +/− mice suggested a role for PAM in the cell-type-specific regulation of copper metabolism. Physiological functions sensitive to genetic limitations of PAM that are reversed by supplemental copper and mimicked by copper deficiency may serve as indicators of marginal copper deficiency.

In 1998, Blaze, Bleumer, and Strauss (BBS) proposed an application called atomic proxy re-encryption, in which a semi-trusted proxy converts a ciphertext for Alice into a ciphertext for Bob without seeing the underlying plaintext. We predict that fast and secure re-encryption will become increasingly popular as a method for managing encrypted file systems. Although efficiently computable, the widespread adoption of BBS re-encryption has been hindered by considerable security risks. Following recent work of Dodis and Ivan, we present new re-encryption schemes that realize a stronger notion of security, and we demonstrate the usefulness of proxy re-encryption as a method of adding access control to a secure file system. Performance measurements of our experimental file system demonstrate that proxy re-encryption can work effectively in practice.
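
A toy version of the original BBS ElGamal-based re-encryption conveys the idea (parameters far too small to be secure; illustration only). It also exposes the bidirectionality weakness the later schemes address: the inverse of the re-encryption key converts ciphertexts in the other direction.

```python
# Toy BBS-style atomic proxy re-encryption over a small ElGamal group.
# The proxy holds only rk = b / a mod q and converts Alice's ciphertext
# into Bob's without seeing the message.

import random

q = 1019                      # prime group order (toy size, insecure)
p = 2 * q + 1                 # safe prime; squares mod p form a group of order q
g = 4                         # generator of the order-q subgroup (2^2 mod p)

def keygen():
    sk = random.randrange(1, q)
    return sk, pow(g, sk, p)

def encrypt(pk, m):                        # BBS form: (m * g^r, pk^r) = (m*g^r, g^{ar})
    r = random.randrange(1, q)
    return (m * pow(g, r, p)) % p, pow(pk, r, p)

def rekey(sk_a, sk_b):                     # rk = b / a mod q
    return (sk_b * pow(sk_a, -1, q)) % q

def reencrypt(rk, ct):                     # (g^{ar})^{b/a} = g^{br}
    c1, c2 = ct
    return c1, pow(c2, rk, p)

def decrypt(sk, ct):                       # recover g^r, then divide it out
    c1, c2 = ct
    gr = pow(c2, pow(sk, -1, q), p)
    return (c1 * pow(gr, p - 2, p)) % p    # p-2 exponent gives the inverse mod p

a_sk, a_pk = keygen()
b_sk, b_pk = keygen()
msg = pow(g, 123, p)                       # messages live in the subgroup
ct_alice = encrypt(a_pk, msg)
ct_bob = reencrypt(rekey(a_sk, b_sk), ct_alice)
print(decrypt(b_sk, ct_bob) == msg)        # True
```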

An adaptation of one popular neural network algorithm (the ART model) to the field of wireless sensor networks is demonstrated in this paper. The important advantages of the ART class of algorithms, such as simple parallel distributed computation, distributed storage, data robustness and auto-classification of sensor readings, are confirmed within the proposed architecture, which consists of one clusterhead that collects only classified input data from the other units.

Ada 95 was the first standardized language to include distribution in the core language itself. However, the set of features required by the Distributed Systems Annex of the Reference Manual is very limited and does not take into account advanced needs such as fault tolerance, code migration or persistent distributed storage. This article describes how we have extended the basic model without abandoning compatibility in GLADE, our implementation of the Distributed Systems Annex. Extensions include restart on failure, easy code migration, hot code upgrade, a restricted run time for use on embedded systems with limited processing power, as well as distributed storage capabilities and persistent storage handling.

Distributed sensor data storage and retrieval have gained increasing popularity in recent years for supporting various applications. While a distributed architecture enjoys a more robust and fault-tolerant wireless sensor network (WSN), such an architecture also poses a number of security challenges, especially when applied in mission-critical applications such as the battlefield and e-healthcare. First, as sensor data are stored and maintained by individual sensors, and unattended sensors are easily subject to strong attacks such as physical compromise, it is significantly harder to ensure data security. Second, in many mission-critical applications, fine-grained data access control is a must, as illegal access to the sensitive data may cause disastrous results and/or is prohibited by law. Last but not least, sensors are usually resource-scarce, which limits the direct adoption of expensive cryptographic primitives. To address the above challenges, we propose in this paper a distributed data access control scheme that is able to fulfill fine-grained access control over sensor data and is resilient against strong attacks such as sensor compromise and user collusion. The proposed scheme exploits a novel cryptographic primitive called attribute-based encryption (ABE), and tailors and adapts it for WSNs with respect to both performance and security requirements. The feasibility of the scheme is demonstrated by experiments on real sensor platforms. To the best of our knowledge, this paper is the first to realize distributed fine-grained data access control for WSNs.

Many archival storage systems rely on keyed encryption to ensure privacy. A data object in such a system is exposed once the key used to encrypt the data is compromised. When storing data for as long as a few decades or centuries, the use of keyed encryption becomes a real concern. The exposure of a key is bounded only by computational effort, and management of encryption keys becomes as much of a problem as the management of the data the keys protect. POTSHARDS is a secure, distributed, very long-term archival storage system that eliminates the use of keyed encryption through the use of unconditionally secure secret sharing. An (m, n) unconditionally secure secret sharing scheme splits an object into n shares; the shares provably give no information about the object unless m of them collaborate.
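
A minimal Shamir-style (m, n) secret sharing sketch of the primitive described follows; the field size and API are illustrative, not POTSHARDS' actual scheme.

```python
# (m, n) Shamir secret sharing: any m shares reconstruct the secret; fewer
# than m reveal nothing, information-theoretically.

import random

P = 2**127 - 1                     # a Mersenne prime large enough for the demo

def split(secret, m, n):
    """Evaluate a random degree-(m-1) polynomial with f(0) = secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(m - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, n + 1)]

def combine(shares):
    """Lagrange interpolation at x = 0 over any m distinct shares."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num = den = 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, -1, P)) % P
    return secret

shares = split(secret=424242, m=3, n=5)
print(combine(random.sample(shares, 3)))   # 424242, from any 3 of the 5 shares
```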

A high penetration of solar photovoltaic (PV) resources into distribution networks may create a voltage rise problem when the generation from PV resources substantially exceeds the load demand. To reduce the voltage rise, the excess amount of power from the solar PV units needs to be reduced. In this paper, distributed storage systems are proposed for the mitigation of the voltage rise problem. The surplus energy from the solar PV is used to charge the distributed storage units during midday, when the power from the solar PV would typically be higher than the load level. This stored energy is then used to reduce the peak load in the evening. An intelligent charging and discharging control strategy to make effective use of the storage capacity is discussed. The proposed voltage rise mitigation strategy is verified on a practical low voltage distribution feeder in Australia.
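
A simplified version of such a charging/discharging rule - absorb the midday PV surplus, release it into the evening peak - is sketched below with invented profiles and limits (one-hour steps, so kW and kWh coincide per step).

```python
# Toy rule: charge from PV surplus (reducing reverse power flow and hence
# voltage rise), discharge against the evening peak. All numbers invented.

pv   = [0, 1, 4, 6, 7, 6, 3, 1, 0, 0, 0, 0]   # kW, hours 8:00-19:00
load = [2, 2, 2, 3, 3, 3, 3, 4, 6, 7, 6, 4]   # kW household demand
CAP, RATE = 10.0, 3.0                         # kWh capacity, kW charge/discharge limit

soc = 0.0
for hour, (p, l) in enumerate(zip(pv, load), start=8):
    surplus = p - l
    if surplus > 0:                           # midday: absorb reverse power flow
        charge = min(surplus, RATE, CAP - soc)
        soc += charge
        export = surplus - charge             # what still flows back to the grid
        print(f"{hour:02d}:00 charge {charge:.1f} kW, export {export:.1f} kW")
    else:                                     # evening: shave the peak
        discharge = min(-surplus, RATE, soc)
        soc -= discharge
        grid = -surplus - discharge
        print(f"{hour:02d}:00 discharge {discharge:.1f} kW, grid {grid:.1f} kW")
```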

The explosion of user-generated data along with the evolution of Web 2.0 applications (e.g. social networks, blogs, podcasts, etc.) has resulted in a tremendous demand for storage. With cloud computing posing as a possible all-in-one solution, "storage clouds" focus on providing distributed storage capability. We discuss the creation of a storage cloud using edge devices, based on Peer-to-Peer resource provisioning. In this approach, mobile phones, PCs/media centers, set-top boxes, modems and networked storage devices can all contribute as storage within these storage clouds. Combining all end-user edge devices may result in a scalable, very flexible storage capability that keeps the data comparatively close to the user, increasing availability while reducing latency. This work addresses the issue of Quality of Service (QoS)-aware scheduling in a P2P storage cloud built with edge devices, by designing an optimization scheme that minimizes energy from a system perspective while simultaneously maximizing user satisfaction from the individual user's perspective.

We describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating precision event logs that can be used to provide detailed end-to-end application and system level monitoring; a Java agent-based system for managing the large amount of logging data; and tools for visualizing the log data and real-time state of the distributed system. We developed these tools for analyzing a high-performance distributed system centered around the transfer of large amounts of data at high speeds from a distributed storage server to a remote visualization client. However, this methodology should be generally applicable to any distributed system. This methodology, called NetLogger, has proven invaluable for diagnosing problems in networks and in distributed systems code. This approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system.
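
A minimal logger in the spirit of the precision event logs described - a high-resolution timestamp, host, program, and event name on every record so start/end events can be correlated across hosts - might look like this (the field names are illustrative, not the actual NetLogger wire format):

```python
# Emits one key=value record per event with a microsecond-resolution
# timestamp, so matching *.start / *.end pairs from different hosts can be
# lined up for end-to-end analysis.

import socket
import sys
import time

def log_event(event, **fields):
    record = {
        "ts": f"{time.time():.6f}",        # microsecond-resolution timestamp
        "host": socket.gethostname(),
        "prog": sys.argv[0] or "demo",
        "event": event,
        **{k: str(v) for k, v in fields.items()},
    }
    print(" ".join(f"{k}={v}" for k, v in record.items()))

log_event("transfer.start", file="frame-0001.raw", size=8_388_608)
time.sleep(0.05)                            # stand-in for the actual transfer
log_event("transfer.end", file="frame-0001.raw", status="ok")
```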