Steven R Emmerson - Profile on Academia.edu
Papers by Steven R Emmerson
Advances in Reliable File-Stream Multicasting over Multi-Domain Software Defined Networks (SDN)
In prior work, we proposed a cross-layer architecture called Multicast-Push Unicast-Pull (MPUP) for Software Defined Networks (SDN) to support a reliable file-stream multicast application. In this work, we improved the algorithms used to set three parameters: the transport-layer sender retransmission timer, the VLAN rate (which is also the sending rate), and the sender-buffer size. We evaluated the system experimentally using feeds with metadata collected from real meteorology file streams. A significant finding is that the throughput achieved is smaller than the VLAN/sending rate even though file blocks are multicast continuously in UDP datagrams. Sender-buffer waiting times and propagation delays are the main reasons for the degraded throughput. For example, increasing the VLAN rate from 20 Mbps to 500 Mbps reduced the degradation from 90% to 45%. However, the degradation increased from 45% to 58% when the VLAN rate was increased from 500 Mbps to 1 Gbps. We found an increase in the number of block retransmissions at the higher rates, which explains this increased degradation. Increasing RTT from 0.1 ms to 100 ms caused throughput to drop from 274.8 Mbps to 27.6 Mbps on a 500 Mbps VLAN. If transmission delay were a significant component of total latency, then throughput degradation relative to the VLAN rate would be small; however, the meteorology file streams used in our study have small data products. Due to bandwidth borrowing between VLAN and IP-routed services, VLAN utilization is not important, and hence we recommend using the smallest rate at which sender-buffer waiting times are insignificant.
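The degradation figures in this abstract can be checked with a few lines of arithmetic. The sketch below assumes that "degradation" means the relative shortfall of achieved throughput against the VLAN rate, which is not spelled out in the abstract but reproduces the quoted 45% for 274.8 Mbps on a 500 Mbps VLAN. The second function is a deliberately simple, hypothetical latency model (transmission plus propagation plus buffer waiting time) meant only to illustrate why small data products see large degradation; it is not the authors' model, and the product size used is made up.

```python
def degradation(throughput_mbps: float, vlan_rate_mbps: float) -> float:
    """Relative throughput shortfall versus the VLAN rate (assumed definition)."""
    if vlan_rate_mbps <= 0:
        raise ValueError("VLAN rate must be positive")
    return 1.0 - throughput_mbps / vlan_rate_mbps


def product_throughput_mbps(size_bits: float, rate_mbps: float,
                            propagation_s: float = 0.0,
                            wait_s: float = 0.0) -> float:
    """Per-product throughput under a hypothetical additive latency model."""
    transmission_s = size_bits / (rate_mbps * 1e6)
    total_latency_s = transmission_s + propagation_s + wait_s
    return size_bits / total_latency_s / 1e6


# Throughput values quoted in the abstract for a 500 Mbps VLAN.
for rtt_ms, throughput in [(0.1, 274.8), (100.0, 27.6)]:
    print(f"RTT {rtt_ms} ms: degradation {degradation(throughput, 500.0):.1%}")

# Hypothetical 800,000-bit product with 50 ms one-way propagation delay:
# transmission takes only 1.6 ms, so propagation dominates the latency.
print(product_throughput_mbps(800_000, 500.0, propagation_s=0.05))
```

Under this model, when propagation and waiting delays are zero the per-product throughput equals the VLAN rate, and the shortfall grows as the transmission time shrinks relative to the other delays.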
Unidata's mission is to provide the data services, tools, and cyberinfrastructure leadership that advance Earth system science, enhance educational opportunities, and broaden participation. Several hundred institutions worldwide participate in the Unidata real-time data sharing network and many more institutions use Unidata tools and technologies in education, research, and operations.
Unidata NetCDF
NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.
Advances in Reliable File-Stream Multicasting over Multi-Domain Software Defined Networks (SDN)
2019 28th International Conference on Computer Communication and Networks (ICCCN), 2019
Unidata’s hallmark has been democratizing access to real-time meteorological data and related tools for higher education institutions. Data (both observations and operational forecast model output) are distributed in real-time to a worldwide community of users via Unidata’s Internet Data Distribution system. That network currently distributes nearly 130 GB/day of data via 22 push- and subscription-based data streams that are tailored to the receiving institution’s needs. Unidata-provided cyberinfrastructure has enriched university courses by facilitating educators’ efforts to incorporate applications of real-time data and state-of-the-art tools into student-centered learning experiences, enhanced productivity of students and researchers, and transformed the culture in atmospheric science departments. Unidata has experienced a gradual but natural evolution from a program focused primarily on synoptic scale meteorology to one that serves a broader geosciences community. The robustness...
2020 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS), 2020
A reliable message multicast transport protocol for virtual circuits
Motivated by the need to distribute large volumes of scientific data to large numbers of subscribers, we propose a reliable multicast transport protocol. Specifically, this protocol is developed for use on virtual circuits, since dynamic circuit services are now being offered by large providers, and virtual circuit networking is well suited to multicasting as it eliminates the data-plane congestion control problem of connectionless IP. The new protocol is called Virtual Circuit Multicast Transport Protocol (VCMTP). A key concept is to execute retransmissions (required due to flow control problems) at the end, i.e., after the message (file or memory data) is multicast. This leads to scalability, where the throughput for the receivers that can keep pace with the sending rate is independent of the number of receivers. A prototype was tested on Emulab, and measurements were obtained. Our findings are that for disk-to-disk file transfers at a sending rate of 600 Mbps, the Emulab hosts can support multicasting with less than 0.5% retransmission rates, but at an 800 Mbps sending rate, the average throughput is only 650 Mbps because the retransmission rate increases to 9%.
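The "retransmit at the end" idea described in this abstract can be illustrated with a toy sender schedule. The sketch below is not VCMTP itself: the function name, the tuple layout, and the choice to target each retransmission at the receiver that lost the block are all assumptions made for illustration. It only shows the ordering property the abstract describes, namely that every block of the message is multicast before any retransmission is sent.

```python
def multicast_then_retransmit(num_blocks, lost_by_receiver):
    """Build a sender schedule of (phase, block, target) tuples.

    lost_by_receiver maps a receiver name to the set of block indices it
    missed. Targeted retransmission per receiver is an assumption here.
    """
    # Phase 1: multicast the whole message without pausing for repairs.
    schedule = [("multicast", block, None) for block in range(num_blocks)]
    # Phase 2: only after the message is sent, repair each receiver's losses.
    for receiver in sorted(lost_by_receiver):
        for block in sorted(lost_by_receiver[receiver]):
            schedule.append(("retransmit", block, receiver))
    return schedule


def retransmission_rate(schedule, num_blocks):
    """Retransmitted blocks as a fraction of the message's block count."""
    retransmits = sum(1 for phase, _, _ in schedule if phase == "retransmit")
    return retransmits / num_blocks


# Hypothetical example: a 4-block message and two lossy receivers.
plan = multicast_then_retransmit(4, {"r1": {2}, "r2": {0, 3}})
for entry in plan:
    print(entry)
```

Because receivers that lose nothing never appear in the second phase, their delivery time in this toy model depends only on the multicast phase, which is the scalability property the abstract points to.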
NetCDF User's Guide for FORTRAN
[...] this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes.
Unidata LDM-7: A Hybrid Multicast/Unicast System for Highly Efficient and Reliable Real-Time Data Distribution
AGU Fall Meeting Abstracts, Dec 16, 2015
Accreditation and Quality Assurance, Oct 11, 2022
This short article discusses the units of rate constants as used in chemical kinetics and, in particular, the aspect of non-integral powers of base units, which some might find unusual for units in the SI system. In many ways the fact that the units of the rate constants as usually defined convey information about the order of the reaction or reactions involved is very useful, but in other ways having the same (or at least very similar) quantity that has different units under different conditions is not so desirable. Furthermore, just as with chemical equilibrium constants, taking functions of the rate constant (such as the logarithm when representing the Arrhenius equation in the form ln k vs. 1/T) needs special attention. Here we examine a possible alternative definition of rate constants in terms of an explicit ratio to the concentration standard state, and although we acknowledge that this approach is unlikely to be adopted by the community, it serves as a basis to discuss the meaning of rate constants.
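The dependence of a rate constant's units on reaction order can be made concrete with a small sketch. It assumes a single-reactant rate law of the form rate = k * c**n, with concentration in mol/L and time in seconds, so that k carries units of (mol/L)^(1-n) s^-1. The second function shows one possible reading of a "ratio to the standard state" definition, rate = k_std * (c / c_std)**n, under which k_std always has units of mol L^-1 s^-1; this reading, and both function names, are assumptions and may differ from the formulation in the article.

```python
from fractions import Fraction


def rate_constant_units(order) -> str:
    """Units of k in rate = k * c**order (c in mol/L, time in s)."""
    exponent = 1 - Fraction(order)
    if exponent == 0:
        return "s^-1"
    return f"(mol/L)^({exponent}) s^-1"


def standardized_rate_constant(k: float, order: float, c_std: float = 1.0) -> float:
    """Convert k to k_std for the assumed form rate = k_std * (c / c_std)**order.

    Since k * c**order == (k * c_std**order) * (c / c_std)**order,
    k_std = k * c_std**order, with units of mol/L/s for any order.
    """
    return k * c_std ** order


for n in (0, 1, "3/2", 2):
    print(f"order {n}: {rate_constant_units(n)}")
```

A fractional order such as 3/2 yields a half-integer power of the concentration unit, which is the kind of non-integral power the abstract refers to.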
Cluster Computing, Feb 4, 2022
A continuing trend in many scientific disciplines is the growth in the volume of data collected by scientific instruments and the desire to rapidly and efficiently distribute this data to the scientific community. As both the data volume and number of subscribers grow, a reliable network multicast is a promising approach to alleviate the demand for the bandwidth needed to support efficient data distribution to multiple, geographically distributed research communities. In prior work, we identified the need for a reliable network multicast: scientists engaged in atmospheric research subscribing to meteorological file-streams. An application called Local Data Manager (LDM) is used to disseminate meteorological data to hundreds of subscribers. This paper presents a high-performance, reliable network multicast solution, Dynamic Reliable File-Stream Multicast Service (DRFSM), and describes a trial deployment comprising eight university campuses connected via Research-and-Education Networks (RENs) and Internet2 and a DRFSM-enabled LDM (LDM7). Using this deployment, we evaluated the DRFSM architecture, which uses network multicast with a reliable transport protocol, and leverages Layer-2 (L2) multipoint Virtual LAN (VLAN/MPLS). A performance monitoring system was developed to collect the real-time performance of LDM7. The measurements showed that our proof-of-concept prototype worked significantly better than the current production LDM (LDM6) in two ways. First, LDM7 distributes data faster than LDM6. With six subscribers and a 100 Mbps bandwidth limit setting, an almost 22-fold improvement in delivery time was observed with LDM7. Second, LDM7 significantly reduces the bandwidth requirement needed to deliver data to subscribers. LDM7 needed 90% less bandwidth than LDM6 to achieve a 20 Mbps average throughput across four subscribers.
Keywords: File-stream distribution · Software-defined network · Multicast · Control-plane protocol
1 Introduction
A continuing trend in many scientific disciplines is the growth in the volume of data collected by scientific instruments and the desire to rapidly and efficiently distribute this data to the scientific community. Transferring these large data sets to a geographically distributed research community consumes significant network resources. For example, in Unidata's Internet Data Distribution (IDD) system [1], the University Corporation for Atmospheric Research (UCAR) uses an application, called Local Data Manager (LDM) [2], to distribute 30 different types [3] of meteorological data (e.g., surface observations, radar data, satellite imagery, wind profiler data, lightning data, and high-resolution computer-model output) to over 570 sites in 217 domains [4]. Approximately 420,000 data products comprising 50 gigabytes (GB) are generated each hour. The volume of data and number of subscribers have both been increasing. For example, the weather satellites of the GOES-R series, such as GOES-16 and GOES-17, which came online in recent years, have 14 times higher...
Stop squandering data: make units of measurement machine-readable
Nature
File-Stream Distribution Application on Software-Defined Networks (SDN)
2015 IEEE 39th Annual Computer Software and Applications Conference, 2015
In a meteorology data-distribution application, streams of files are served to hundreds of receivers every day on unicast TCP connections. Software Defined Networking (SDN) offers a more scalable solution in which a rate-guaranteed Layer-2 multipoint virtual topology can be provisioned to have switches perform Ethernet-frame multicasting to support this application. A characterization of the file streams shows that file sizes and file inter-arrival times are both right skewed. The objective of this work is to design an algorithm for determining the rate of the Layer-2 multipoint virtual topology, and the size of the sending-host buffer, based on traffic characteristics of the file streams and performance requirements. Furthermore, the traffic characteristics are not exactly the same from day-to-day. An empirical method is proposed to determine the ideal rate and buffer size based on a day's traffic, which are then used along with the current rate and buffer size in an Exponential Weighted Moving Average (EWMA) scheme to determine the rate and buffer size for the next day. Our method was evaluated using metadata obtained for the top five file-streams of this meteorological data distribution, and found to be effective.
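The EWMA step mentioned in this abstract can be sketched as follows. This is a generic exponentially weighted moving average, not the authors' exact scheme: the smoothing weight `alpha`, the function name, and the sample daily values are all hypothetical. Each day, the empirically ideal rate and buffer size for that day's traffic are blended with the current settings to give the next day's settings.

```python
def ewma_update(current: float, ideal: float, alpha: float = 0.25) -> float:
    """Blend today's ideal value into the current setting.

    alpha is a hypothetical smoothing weight; larger values follow the
    most recent day's traffic more closely.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return alpha * ideal + (1.0 - alpha) * current


# Hypothetical starting settings and per-day ideal values.
rate_mbps, buffer_mb = 100.0, 64.0
daily_ideals = [(200.0, 32.0), (180.0, 48.0), (220.0, 40.0)]

for day, (ideal_rate, ideal_buffer) in enumerate(daily_ideals, start=1):
    rate_mbps = ewma_update(rate_mbps, ideal_rate)
    buffer_mb = ewma_update(buffer_mb, ideal_buffer)
    print(f"day {day}: rate {rate_mbps:.1f} Mbps, buffer {buffer_mb:.1f} MB")
```

Smoothing of this kind keeps a single unusual day of traffic from swinging the provisioned rate or buffer size too far, at the cost of reacting more slowly to a sustained change.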
Analysis and selection of a network service for a scientific data distribution project
The volume of scientific data collected by instruments, from experimental studies, and from simulations executed on high-performance computing platforms is growing rapidly. Scientists need to move these data files from the sources where they are generated to their laboratory compute clusters for further analysis. Such scientific data transfers are “heavy-hitter” flows that consume an unfairly large portion of network resources, and thus adversely affect general-purpose flows on IP-routed networks. The development and deployment of new types of network services offer additional options for these data transfers. The problem is to develop a methodology for identifying the most suitable type of network service for each such scientific project. This paper presents a case study for this problem, in which we describe our analysis of the real-time meteorology data distributed by UCAR in the Internet Data Distribution (IDD) project, and select a suitable network service for this project. Based on this analysis, our conclusion is to experiment with reliable multicast virtual circuit service for the IDD application.
The VisAD Java Class Library for Scientific Data and Visualization
Scientific Visualization Conference (dagstuhl '97), 1997
VisAD is a Java class library for interactive and collaborative visualization and analysis of numerical data. It is designed to support distributed computing and data sharing on the Internet through the use of distributed objects and a very general numerical data model. The data model integrates metadata for data organization, units, coordinate systems, sampling geometries and topologies, missing data indicators, and error estimates. When data are combined in computations or visualizations, unit conversion, coordinate transforms and resampling are done implicitly as needed.
Java distributed objects for numerical visualization in VisAD
Communications of the ACM, 2002
The scientific world is evolving to require more collaboration among different institutions and disciplines. Understanding long-term changes in the Earth environment, for example, requires models that integrate disciplines such as meteorology, oceanography, hydrology (rivers and groundwater), soil science and geology. During the past 15 years, scientists have started sharing data using FTP and software on the Internet, but collaborative work and more routine data sharing require a new kind of scientific software.
Java distributed components for numerical visualization in VisAD
Communications of the ACM, 2005
Combining a flexible data model and distributed objects, they support the sharing of data, visualizations, and user interfaces among multiple data sources, computers, and scientific disciplines.
NetCDF User's Guide for C
Permission is granted to make and distribute verbatim copies of this manual provided that the copyright notice and these paragraphs are preserved on all copies. The software and any accompanying written materials are provided "as is" without warranty of any kind. UCAR expressly disclaims all warranties of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Mention of any commercial company or product in this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes. Foreword: Unidata (http://www.unidata.ucar.edu) is a National Science Foundation-sponsored program em...
NetCDF User's Guide - An Interface for Data Access Version
Permission is granted to make and distribute verbatim copies of this manual provided that the copyright notice and these paragraphs are preserved on all copies. The software and any accompanying written materials are provided "as is" without warranty of any kind. UCAR expressly disclaims all warranties of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Mention of any commercial company or product in this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes. Foreword: Unidata is a National Science Foundation-sponsored program empowering U.S. universities, thro...
Advances in Reliable File-Stream Multicasting over Multi-Domain Software Defined Networks (SDN)
In prior work, we proposed a cross-layer architecture called Multicast-Push Unicast-Pull (MPUP) f... more In prior work, we proposed a cross-layer architecture called Multicast-Push Unicast-Pull (MPUP) for Software Defined Networks (SDN) to support a reliable file-stream multicast application. In this work, we improved the algorithms used to set parameters: transport-layer sender retransmission timer, VLAN rate (which is also the sending rate) and sender-buffer size. Experimental evaluation using feeds with metadata collected from real meteorology file streams was conducted. A significant finding is that the throughput achieved is smaller than the VLAN/sending rate even though file blocks are multicast continuously in UDP datagrams. Sender-buffer waiting times and propagation delays are the main reasons for the degraded throughput. For example, increasing the VLAN rate from 20 Mbps to 500 Mbps, reduced the degradation from 90% to 45%. However, the degradation increased from 45% to 58% when the VLAN rate was increased from 500 Mbps to 1 Gbps. We found an increase in the number of block retransmissions at the higher rates, which explains this increased degradation. Increasing RTT from 0.1 ms to 100 ms caused throughput to drop from 274.8 Mbps to 27.6 Mbps on a 500 Mbps VLAN. If transmission delay was a significant component in total latency, then throughput degradation relative to VLAN rate would be small; however, the meteorology file-streams used in our study have small-sized data products. Due to bandwidth borrowing between VLAN and IP-routed services, VLAN utilization is not important, and hence we recommend using the smallest rate at which sender-buffer waiting times are insignificant.
Unidata's mission is to provide the data services, tools, and cyberinfrastructure leadership that... more Unidata's mission is to provide the data services, tools, and cyberinfrastructure leadership that advance Earth system science, enhance educational opportunities, and broaden participation. Several hundred institutions worldwide participate in the Unidata real-time data sharing network and many more institutions use Unidata tools and technologies in education, research, and operations.
Unidata NetCDF
NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a fre... more NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data.
Advances in Reliable File-Stream Multicasting over Multi-Domain Software Defined Networks (SDN)
2019 28th International Conference on Computer Communication and Networks (ICCCN), 2019
In prior work, we proposed a cross-layer architecture called Multicast-Push Unicast-Pull (MPUP) f... more In prior work, we proposed a cross-layer architecture called Multicast-Push Unicast-Pull (MPUP) for Software Defined Networks (SDN) to support a reliable file-stream multicast application. In this work, we improved the algorithms used to set parameters: transport-layer sender retransmission timer, VLAN rate (which is also the sending rate) and sender-buffer size. Experimental evaluation using feeds with metadata collected from real meteorology file streams was conducted. A significant finding is that the throughput achieved is smaller than the VLAN/sending rate even though file blocks are multicast continuously in UDP datagrams. Sender-buffer waiting times and propagation delays are the main reasons for the degraded throughput. For example, increasing the VLAN rate from 20 Mbps to 500 Mbps, reduced the degradation from 90% to 45%. However, the degradation increased from 45% to 58% when the VLAN rate was increased from 500 Mbps to 1 Gbps. We found an increase in the number of block retransmissions at the higher rates, which explains this increased degradation. Increasing RTT from 0.1 ms to 100 ms caused throughput to drop from 274.8 Mbps to 27.6 Mbps on a 500 Mbps VLAN. If transmission delay was a significant component in total latency, then throughput degradation relative to VLAN rate would be small; however, the meteorology file-streams used in our study have small-sized data products. Due to bandwidth borrowing between VLAN and IP-routed services, VLAN utilization is not important, and hence we recommend using the smallest rate at which sender-buffer waiting times are insignificant.
Unidata’s hallmark has been democratizing access to real-time meteorological data and related too... more Unidata’s hallmark has been democratizing access to real-time meteorological data and related tools for higher education institutions. Data (both observations and operational forecast model output) are distributed in real-time to a worldwide community of users via Unidata’s Internet Data Distribution system. That network currently distributes nearly 130 GB/day of data via 22 push- and subscription-based data streams that are tailored to the receiving institution’s needs. Unidata-provided cyberinfrastructure has enriched university courses by facilitating educators’ efforts to incorporate applications of real-time data and state-of-the-art tools into student-centered learning experiences, enhanced productivity of students and researchers, and transformed the culture in atmospheric science departments. Unidata has experienced a gradual but natural evolution from a program focused primarily on synoptic scale meteorology to one that serves a broader geosciences community. The robustness...
2020 IEEE/ACM Innovating the Network for Data-Intensive Science (INDIS), 2020
A reliable message multicast transport protocol for virtual circuits
Motivated by the need to distribute large volumes of scientific data to large numbers of subscrib... more Motivated by the need to distribute large volumes of scientific data to large numbers of subscribers, we propose a reliable multicast transport protocol. Specifically, this protocol is developed for use on virtual circuits, since dynamic circuit services are now being offered by large providers, and virtual circuit networking is well suited to multicasting as it eliminates the data-plane congestion control problem of connectionless IP. The new protocol is called Virtual Circuit Multicast Transport Protocol (VCMTP). A key concept is to execute retransmissions (required due to flow control problems) at the end, i.e., after the message (file or memory data) is multicast. This leads to scalability, where the throughput for the receivers that can keep pace with the sending rate is independent of the number of receivers. A prototype was tested on Emulab, and measurements obtained. Our findings are that for disk-to-disk file transfers at a sending rate of 600 Mbps, the Emulab hosts can support multicasting with less than 0.5% retransmission rates, but at a 800 Mbps sending rate, the average throughput is only 650 Mbps because the retransmission rate increases to 9%.
NetCDF User's Guide for FORTRAN
ABSTRACT this document does not constitute an endorsement by the Unidata Program Center. Unidata ... more ABSTRACT this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes. Chapter : 1 NetCDF User's Guide for FORTRAN
Unidata LDM-7: A Hybrid Multicast/Unicast System for Highly Efficient and Reliable Real-Time Data Distribution
AGU Fall Meeting Abstracts, Dec 16, 2015
Accreditation and Quality Assurance, Oct 11, 2022
This short article discusses the units of rate constants as used in chemical kinetics and, in par... more This short article discusses the units of rate constants as used in chemical kinetics and, in particular, the aspect of non-integral powers of base units, which some might find unusual for units in the SI system. In many ways the fact that the units of the rate constants as usually defined convey information about the order of the reaction or reactions involved is very useful, but in other ways having the same (or at least very similar) quantity that has different units under different conditions is not so desirable. Furthermore, just as with chemical equilibrium constants, taking functions of the rate constant (such as the logarithm when representing the Arrhenius equation in the form ln k vs. 1∕T) needs special attention. Here we examine a possible alternative definition of rate constants in terms of an explicit ratio to the concentration standard state and although we acknowledge that this approach unlikely to be adopted by the community, it serves as a basis to discuss the meaning of rate constants.
Cluster Computing, Feb 4, 2022
A continuing trend in many scientific disciplines is the growth in the volume of data collected b... more A continuing trend in many scientific disciplines is the growth in the volume of data collected by scientific instruments and the desire to rapidly and efficiently distribute this data to the scientific community. As both the data volume and number of subscribers grows, a reliable network multicast is a promising approach to alleviate the demand for the bandwidth needed to support efficient data distribution to multiple, geographically-distributed, research communities. In prior work, we identified the need for a reliable network multicast: scientists engaged in atmospheric research subscribing to meteorological file-streams. An application called Local Data Manager (LDM) is used to disseminate meteorological data to hundreds of subscribers. This paper presents a high-performance, reliable network multicast solution, Dynamic Reliable File-Stream Multicast Service (DRFSM), and describes a trial deployment comprising eight university campuses connected via Research-and-Education Networks (RENs) and Internet2 and a DRFSM-enabled LDM (LDM7). Using this deployment, we evaluated the DRFSM architecture, which uses network multicast with a reliable transport protocol, and leverages Layer-2 (L2) multipoint Virtual LAN (VLAN/MPLS). A performance monitoring system was developed to collect the realtime performance of LDM7. The measurements showed that our proof-of-concept prototype worked significantly better than the current production LDM (LDM6) in two ways. First, LDM7 distributes data faster than LDM6. With six subscribers and a 100 Mbps bandwidth limit setting, an almost 22-fold improvement in delivery time was observed with LDM7. Second, LDM7 significantly reduces the bandwidth requirement needed to deliver data to subscribers. LDM7 needed 90% less bandwidth than LDM6 to achieve a 20 Mbps average throughput across four subscribers. 
Keywords File-stream distribution Á Software-defined network Á Multicast Á Control-plane protocol 1 Introduction A continuing trend in many scientific disciplines is the growth in the volume of data collected by scientific instruments and the desire to rapidly and efficiently distribute this data to the scientific community. Transferring these large data sets to a geographically distributed research community consumes significant network resources. For example, in Unidata's Internet Data Distribution (IDD) system [1], the University Corporation for Atmospheric Research (UCAR) uses an application, called Local Data Manager (LDM) [2], to distribute 30 different types [3] of meteorological data (e.g., surface observations, radar data, satellite imagery, wind profiler data, lightning data, and high-resolution computer-model output) to over 570 sites in 217 domains [4]. Approximately 420,000 data products 1 comprising 50 gigabytes (GB) are generated each hour. The volume of data and number of subscribers have both been increasing. For example, the weather satellites of the GOES-R series, such as, GOES-16 and GOES-17, which came online in recent years has 14 times higher & Yuanlong Tan
Stop squandering data: make units of measurement machine-readable
Nature
File-Stream Distribution Application on Software-Defined Networks (SDN)
2015 IEEE 39th Annual Computer Software and Applications Conference, 2015
In a meteorology data-distribution application, streams of files are served to hundreds of receiv... more In a meteorology data-distribution application, streams of files are served to hundreds of receivers every day on unicast TCP connections. Software Defined Networking (SDN) offers a more scalable solution in which a rate-guaranteed Layer-2 multipoint virtual topology can be provisioned to have switches perform Ethernet-frame multicasting to support this application. A characterization of the file streams shows that file sizes and file inter-arrival times are both right skewed. The objective of this work is to design an algorithm for determining the rate of the Layer-2 multipoint virtual topology, and the size of the sending-host buffer, based on traffic characteristics of the file streams and performance requirements. Furthermore, the traffic characteristics are not exactly the same from day-to-day. An empirical method is proposed to determine the ideal rate and buffer size based on a day's traffic, which are then used along with the current rate and buffer size in an Exponential Weighted Moving Average (EWMA) scheme to determine the rate and buffer size for the next day. Our method was evaluated using metadata obtained for the top five file-streams of this meteorological data distribution, and found to be effective.
Analysis and selection of a network service for a scientific data distribution project
The volume of scientific data collected by instruments, from experimental studies, and from simulations executed on high-performance computing platforms is growing rapidly. Scientists need to move these data files from the sources where they are generated to their laboratory compute clusters for further analysis. Such scientific data transfers are “heavy-hitter” flows that consume an unfairly large portion of network resources, and thus adversely affect general-purpose flows on IP-routed networks. The development and deployment of new types of network services offer additional options for these data transfers. The problem is to develop a methodology for identifying the most suitable type of network service for each such scientific project. This paper presents a case study for this problem, in which we describe our analysis of the real-time meteorology data distributed by UCAR in the Internet Data Distribution (IDD) project, and select a suitable network service for this project. Based on this analysis, our conclusion is to experiment with reliable multicast virtual circuit service for the IDD application.
The VisAD Java Class Library for Scientific Data and Visualization
Scientific Visualization Conference (dagstuhl '97), 1997
VisAD is a Java class library for interactive and collaborative visualization and analysis of numerical data. It is designed to support distributed computing and data sharing on the Internet through the use of distributed objects and a very general numerical data model. The data model integrates metadata for data organization, units, coordinate systems, sampling geometries and topologies, missing data indicators, and error estimates. When data are combined in computations or visualizations, unit conversion, coordinate transforms and resampling are done implicitly as needed.
Java distributed objects for numerical visualization in VisAD
Communications of the ACM, 2002
The scientific world is evolving to require more collaboration among different institutions and disciplines. Understanding long-term changes in the Earth environment, for example, requires models that integrate disciplines such as meteorology, oceanography, hydrology (rivers and groundwater), soil science and geology. During the past 15 years, scientists have started sharing data using FTP and software on the Internet, but collaborative work and more routine data sharing require a new kind of scientific software.
Java distributed components for numerical visualization in VisAD
Communications of the ACM, 2005
Combining a flexible data model and distributed objects, they support the sharing of data, visualizations, and user interfaces among multiple data sources, computers, and scientific disciplines.
NetCDF User's Guide for C
Permission is granted to make and distribute verbatim copies of this manual provided that the copyright notice and these paragraphs are preserved on all copies. The software and any accompanying written materials are provided "as is" without warranty of any kind. UCAR expressly disclaims all warranties of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Mention of any commercial company or product in this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes. Foreword Unidata (http://www.unidata.ucar.edu) is a National Science Foundation-sponsored program em...
NetCDF User's Guide - An Interface for Data Access Version
Permission is granted to make and distribute verbatim copies of this manual provided that the copyright notice and these paragraphs are preserved on all copies. The software and any accompanying written materials are provided "as is" without warranty of any kind. UCAR expressly disclaims all warranties of any kind, either expressed or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose. Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the author(s) and do not necessarily reflect the views of the National Science Foundation. Mention of any commercial company or product in this document does not constitute an endorsement by the Unidata Program Center. Unidata does not authorize any use of information from this publication for advertising or publicity purposes. Foreword Unidata is a National Science Foundation-sponsored program empowering U.S. universities, thro...
A Virtual Circuit Multicast Transport Protocol (VCMTP) for Scientific Data Distribution
Internet2 Member Meeting, Arlington, MA, April 22-25, 2013
Distributing weather data via multipoint layer-2 paths using DYNES
Internet2 Global Summit, April 7, 2014
Distributing weather data via multipoint layer-2 paths using DYNES
A Cross-Layer Multicast-Push Unicast-Pull (MPUP) Architecture for Reliable File-Stream Distribution
Poster/demo at Network Innovators Community Event (GENI NICE), Dec. 12, 2016, Irvine, CA
Analysis and selection of a network service for a scientific data distribution project
4th International Conference on Communications, Mobility, and Computing (CMC 2012), Guilin, China, 21-23 May 2012, pp. 124-127.
File-stream distribution application on Software-Defined Networks (SDN)
IEEE COMPSAC 2015, July 1-5, 2015, Taipei, TW