Auroop Ganguly | Northeastern University
Papers by Auroop Ganguly
Abstract The benefits of short-term (1-6 h), distributed quantitative precipitation forecasts (DQPFs) are well known. However, this area is acknowledged to be one of the most challenging in hydrometeorology. Previous studies suggest that the “state of the art” methods can be enhanced by exploiting relevant information from radar and numerical weather prediction (NWP) models, using process physics and data-dictated tools where each fits best.
Abstract The changing economic and labor conditions have motivated firms to outsource professional services activities to skilled personnel in less expensive labor markets. This offshoring phenomenon is studied from a political, economic, technological and strategic perspective. Next, an analytical model is developed for achieving strategic advantage from offshoring based on global partnerships.
Abstract: In this paper, we propose information-theoretic approaches for comparing and evaluating complex agent-based models. In information-theoretic terms, entropy and mutual information are two measures of system complexity. We used entropy as a measure of the regularity of the number of agents in a social class, and mutual information as a measure of the information shared by two social classes.
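As a rough illustration of the two measures named in this abstract, the sketch below estimates Shannon entropy and mutual information from empirical frequencies. It is not the paper's implementation: the function names, the plug-in estimator, and the two agent-count series are assumptions made only for this example.

```python
import math
from collections import Counter


def entropy(samples):
    """Shannon entropy (in bits) of the empirical distribution of `samples`."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) = H(X) + H(Y) - H(X,Y) from paired samples."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    joint = list(zip(xs, ys))
    return entropy(xs) + entropy(ys) - entropy(joint)


# Hypothetical data: number of agents in two social classes at each time step.
class_a = [10, 12, 10, 12, 10, 12, 10, 12]
class_b = [30, 28, 30, 28, 30, 28, 30, 28]

print(entropy(class_a))                      # regularity of class A's size
print(mutual_information(class_a, class_b))  # information shared by A and B
```

A low entropy indicates that a class size visits few distinct values, while a high mutual information indicates that knowing one class size tells us much about the other.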
ABSTRACT This paper proposes a new approach to minimise inventory levels and their associated costs within large geographically dispersed organisations. For such organisations, attaining a high degree of agility is becoming increasingly important. Linear regression-based tools have traditionally been employed to assist human experts in inventory optimisation endeavours. Recently, Neural Network (NN) techniques have been proposed for this domain.
In the context of networks, the process of data partitioning or clustering is called community detection due to its origins in social network analysis (Wasserman and Faust 1994); we refer to the partitions as clusters, which represent regions of homogeneous climatic variability. The selection of an appropriate algorithm for use on the climate networks was guided by three criteria, which are motivated largely by the network properties: ability to incorporate edge weights, suitability for dense networks, and overall computational efficiency.
ABSTRACT The inability to predict precipitation extremes under nonstationary climate remains a crucial science gap. Precipitation is not a state-variable within climate models, exhibits space-time heterogeneities, and is subject to thresholds and intermittences. Atmospheric variables in the spatiotemporal neighborhood, like temperature, humidity and updraft velocity, are often better predicted than precipitation from these models, and may have information relevant for precipitation extremes.
Information by itself is no longer perceived as an asset. Billions of business transactions are recorded in enterprise scale data warehouses every day. Acquisition, storage and management of business information are commonplace and often automated. Recent advances in (remote or other) sensor technologies have led to the development of scientific data repositories.
Abstract Wide-area sensor infrastructures, remote sensors, RFIDs, and wireless sensor networks yield massive volumes of disparate, dynamic, and geographically distributed data. As such sensors are becoming ubiquitous, a set of broad requirements is beginning to emerge across high-priority applications including adaptability to climate change, electric grid monitoring, disaster preparedness and management, national or homeland security, and the management of critical infrastructures.
Abstract The analysis of climate data has relied heavily on hypothesis-driven statistical methods, while projections of future climate are based primarily on physics-based computational models. However, in recent years a wealth of new datasets has become available. Therefore, we take a more data-centric approach and propose a unified framework for studying climate, with an aim towards characterizing observed phenomena as well as discovering new knowledge in the climate domain.
ABSTRACT The formation of secure transportation corridors, where cargoes and shipments from points of entry can be dispatched safely to highly sensitive and secure locations, is a high national priority. One of the key tasks of the program is the detection of anomalous cargo based on sensor readings in truck weigh stations. Due to the high variability, dimensionality, and/or noise content of sensor data in transportation corridors,
(Accepted: Jan 15, 2011) Abstract: This article presents a comprehensive review of the literature that deals with ventilation and cooling technologies applied to agricultural greenhouses. The representative application of each technology as well as its advantages and limitations are discussed. Advanced systems employing heat storage in phase change materials, earth-to-air heat exchangers and aquifer-coupled cavity flow heat exchangers have also been discussed.
Abstract The design of statistical predictive models for climate data gives rise to some unique challenges due to the high dimensionality and spatio-temporal nature of the datasets, which dictate that models should exhibit parsimony in variable selection. Recently, a class of methods that promote structured sparsity in the model has been developed, and it is suitable for this task. In this paper, we prove theoretical statistical consistency of estimators with tree-structured norm regularizers.
Various clustering methods have been applied to climate, ecological, and other environmental datasets, for example to define climate zones, automate land-use classification, and similar tasks. Measuring the “goodness” of such clusters is generally application-dependent and highly subjective, often requiring domain expertise and/or validation with field data (which can be costly or even impossible to acquire).
Abstract While data mining aims to identify hidden knowledge from massive and high dimensional datasets, the importance of dependence structure among time, space, and between different variables is less emphasized. Analogous to the use of probability density functions in modeling individual variables, it is now possible to characterize the complete dependence space mathematically through the application of copulas.
Abstract Spatial and temporal variability of precipitation extremes is investigated by utilizing daily observations available at 2.5° gridded fields in South America for the period 1940–2004. All 65 a of data from 1940–2004 are analyzed for spatial variability. The temporal variability is investigated at each spatial grid by utilizing 25-a moving windows from 1965–2004 and visualized through plots of the slope of the regression line in addition to its quality measure (R²).
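The moving-window trend analysis mentioned in this abstract can be sketched as follows, assuming an ordinary least-squares line is fitted within each 25-year window and reported together with its R². The function names and the synthetic series are hypothetical, and the study's actual index of extremes and windowing convention are not given here, so this is only an illustration of the general technique.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and R^2 for paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # For simple linear regression, R^2 equals the squared correlation.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, r_squared


def moving_window_trends(years, values, window=25):
    """Fit a line in every `window`-year span; label each by its last year."""
    trends = []
    for start in range(len(years) - window + 1):
        xs = years[start:start + window]
        ys = values[start:start + window]
        slope, r_squared = linear_fit(xs, ys)
        trends.append((xs[-1], slope, r_squared))
    return trends


# Hypothetical annual index at one grid cell (a perfectly linear series).
years = list(range(1940, 2005))
values = [2.0 * year + 1.0 for year in years]

for end_year, slope, r_squared in moving_window_trends(years, values)[:3]:
    print(end_year, slope, r_squared)
```

Plotting the slope and R² per window, as the abstract describes, would then show how both the trend and its reliability change over time at each grid cell.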
Abstract: Time series data in climate are often characterized by a delayed relationship between two variables; for example, precipitation and temperature anomalies occurring at one place might also occur at another place after some time. These lagged relations generally signify the time lag between the cause and the effect, or the spread of a common cause, and are important to study and understand because they can aid in prediction.
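One simple way to look for such a lagged relation is to scan candidate lags and keep the one with the strongest Pearson correlation. The sketch below does exactly that; it is a generic baseline and not necessarily the method proposed in the paper. The function names and the toy series are assumptions for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # correlation is undefined for a constant series
    return sxy / math.sqrt(sxx * syy)


def best_lag(x, y, max_lag):
    """Return (lag, r) maximising |corr(x[t], y[t + lag])| for 0 <= lag <= max_lag."""
    best = (0, pearson(x, y))
    for lag in range(1, max_lag + 1):
        r = pearson(x[:-lag], y[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical anomalies: series y repeats series x two time steps later.
x = [0, 1, 0, 2, 5, 1, 0, 3, 1, 0, 4, 2]
y = [0, 0] + x[:-2]

lag, r = best_lag(x, y, max_lag=4)
print(lag, r)
```

A strong correlation at a nonzero lag is only suggestive: it cannot by itself distinguish cause and effect from the spread of a common cause, which is the distinction the abstract draws.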
ABSTRACT With the advent of the World-Wide-Web, there is an over-abundance of textual information. Information present in digital documents can be utilized better if it can be extracted automatically and scalably from text and visualized using visualization tools. In this paper, we present an automated information extraction and visualization tool for human sensor data. Our system consists of three main components: FactXtractor, GeoTagger, and FEMARepViz. Named entities and entity relations are extracted using FactXtractor.
Recent studies disagree on how rainfall extremes over India have changed in space and time over the past half century [1, 2, 3, 4], as well as on whether the changes observed are due to global warming [5, 6] or regional urbanization [7]. Although a uniform and consistent decrease in moderate rainfall has been reported [1, 3], a lack of agreement about trends in heavy rainfall may be due in part to differences in the characterization and spatial averaging of extremes.
Information by itself is no longer perceived as an asset. Billions of business transactions are recorded in enterprise scale data warehouses every day. The acquisition, storage and management of business information are commonplace and often automated. Recent advances in (remote or other) sensor technologies have led to the development of scientific data repositories.
Abstract. The detection of unusual profiles or anomalous behavioral characteristics from sensor data is especially complicated in security applications where the threat indicators may or may not be known in advance. Predictive modeling of massive volumes of historical data can yield insights on usual or baseline profiles, which in turn can be utilized to isolate unusual profiles when new data are observed in real-time. Thus, an incremental anomaly detection approach is proposed.