Sparse Data Research Papers - Academia.edu

2025, Proceedings of 10th World Congress on Computational Mechanics

CO2 sequestration in the underground is a valid alternative approach for mitigating the greenhouse effect. Nevertheless, very little is known about the effectiveness of CO2 storage over very long periods. In this work we introduce a methodology to model the gas flow and monitor the storage. For this purpose, we integrate numerical simulators of CO2-brine flow and seismic wave propagation. The simultaneous flow of brine and CO2 is modeled with the Black-Oil formulation for two-phase flow in porous media, using PVT data as a simplified thermodynamic model. Wave propagation is based on an equivalent viscoelastic model that accounts for dispersion and attenuation effects. Densities and bulk and shear moduli are assumed to depend on pressure and saturation. The spatial pressure and CO2 saturation distributions computed with the flow simulator are used to determine the phase velocities and attenuation coefficients of the P and S waves from White's model. Numerical examples of CO2 injection and time-lapse seismograms are analyzed. The proposed methodology is able to identify the spatio-temporal distribution of CO2 after its injection, and constitutes an important tool for monitoring the CO2 plume and analyzing storage integrity, providing an early warning should any leakage occur.

2025, AIAA Aviation Forum

The aeronautical industry has primarily relied on the p-k method for aeroelastic damping estimation. The p-k method can be numerically unstable in regions where modes are close to one another, and the repeated eigenproblem iterations it requires make the approach expensive. In this paper, an entirely new approach for aeroelastic damping estimation is proposed. It uses the Dynamic Eigen Decomposition to find flutter points adjusted by an artificially imposed structural damping. Based on the aeroelastic equations of motion modified with this artificial structural damping, neutrally stable solutions are found, and the amount of negative structural damping added is interpreted as the true aeroelastic damping at the subcritical point. This analysis is carried out using the Nyquist stability criterion applied to the system perturbed from a nominal stable condition. For demonstration, the Goland wing model with six structural modes at Mach 0.7 is examined. It is shown that the proposed method can estimate the aeroelastic damping of critical (i.e., lightly damped) modes accurately, closely matching the results of p-k iterations without the mode-tracking issues.

2025

The nonuniform fast Fourier transform (FFT) on a line has been of interest to a number of scientists for its practical applications. However, not much has been written on Fourier transforming sparse spatial data where the Fourier transform is needed at only sparse data points in the Fourier space in 2D or 3D. This problem finds applications in remote sensing, inverse problems, and synthetic aperture radar, where the scattered field is related to the Fourier transform of the scatterers. We outline an algorithm to perform this transform in O(N log N) operations, where N is the number of spatial data points available, and we assume that the number of Fourier data points desired is also of O(N). The algorithm described here is motivated by the multilevel fast multipole algorithm (MLFMA), but is different from that described by Brandt (1991). In MLFMA, an embedded fast Fourier transform algorithm is inherent, where the spatial data are arbitrarily distributed but the Fourier data are required on the Ewald sphere. In t...
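
A brute-force type-3 nonuniform discrete Fourier transform makes concrete the O(N·M) cost that the proposed O(N log N) algorithm avoids. This is only a reference sketch in Python (names and sizes illustrative), not the MLFMA-motivated method itself:

```python
import numpy as np

def sparse_nudft_2d(points, values, freqs):
    """Direct nonuniform DFT in 2D: O(N*M) reference implementation.

    points: (N, 2) arbitrary spatial sample locations
    values: (N,)  complex samples at those locations
    freqs:  (M, 2) sparse Fourier-space evaluation points
    """
    phase = np.exp(-1j * freqs @ points.T)  # (M, N) matrix of e^{-i k.x}
    return phase @ values                   # (M,) Fourier-domain values

rng = np.random.default_rng(0)
N = 500
pts = rng.uniform(-np.pi, np.pi, (N, 2))
vals = rng.standard_normal(N) + 1j * rng.standard_normal(N)
ks = rng.uniform(-5.0, 5.0, (N, 2))         # O(N) sparse Fourier targets
F = sparse_nudft_2d(pts, vals, ks)
```

The fast algorithm replaces this dense matrix-vector product with a multilevel decomposition, in the spirit of MLFMA's embedded FFTs.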

2025

Characterization of aquifer hydraulic parameters is a difficult task that requires field information. Most of the time the hydrogeologist relies on a group of values coming from different tests to interpret the hydrogeological setting and, possibly, generate a model. However, getting the best from this information can be challenging. In this thesis, three cases are explored. First, hydraulic conductivities associated with a measurement scale of the order of 10⁻¹ m, collected during an extensive field campaign near Tübingen, Germany, are analyzed. Estimates are provided at coinciding locations in the system using: the empirical Kozeny-Carman formulation, providing conductivity values based on particle size distribution, and borehole impeller-type flowmeter tests, which infer conductivity from measurements of vertical flows within a borehole. Correlation between the two sets of estimates is virtually absent. However, statistics of the natural logarithm of both sets at the site are si...

2025, Journal of Water Management Modeling

Urban flash flooding is a serious problem in large, highly populated areas such as the Dallas-Fort Worth metroplex (DFW). Being able to monitor and predict flash flooding at high spatiotemporal resolution is critical to mitigating its threats and to cost-effective emergency management. In this work, the prototype high-resolution flash flood warning system under development for DFW is described, and a case study of the flash flooding event of 2014-06-24 in Fort Worth is presented. The high-resolution (500 m, 1 min) precipitation input comes from the DFW Demonstration Network of the Collaborative Adaptive Sensing of the Atmosphere (CASA) X-band radars. The hydrologic model used is the National Weather Service Hydrology Laboratory's Research Distributed Hydrologic Model (HL-RDHM) operating at a 500 m resolution. The model simulation results are assessed using the flooding reports received from residents throughout the event by the City of Fort Worth.

2025, Revista Cubana de Medicina Militar

In order to compare the clinical effects of thiopental and propofol in electroconvulsive therapy, 50 patients were studied at the Hospital Militar Central "Dr. Carlos J. Finlay". Thiopental 3 mg/kg or propofol 1.5 mg/kg was given intravenously, together with succinylcholine 0.5 mg/kg, and a biparietal stimulus (120-150 volts) was applied. The variables analyzed were mean arterial pressure, heart rate and rhythm, seizure duration, and recovery time. There were no statistically significant differences between the two groups in blood pressure, although clinically the thiopental group showed a greater increase once the seizure ended; cardiac arrhythmias were also more frequent in this group (70%) than in the propofol group (14%). Seizure duration was 29.84 s in the propofol group and 37.24 s in the thiopental group, with mean recovery times of 6.85 and 8.16 min, respectively. Propofol proved to be the better hypnotic for electroconvulsive therapy. The duration of motor ictal activity averaged 37.34 s in the thiopental group and 29.84 s in the propofol group; these differences were clinically and statistically significant (p < 0.05). The mean recovery time was shorter in the propofol group (6.86 min) than when thiopental was used (8.16 min).

2025

A pixel array has been proposed which features a completely data-driven architecture. A pixel cell optimized for this readout has been designed. It retains the features of preceding designs which allow low-noise operation, time stamping, analog signal processing, XY address recording, ghost elimination, and sparse data transmission. The pixel design eliminates a number of problems

2025

Water erosion is a natural process of soil surface disturbance by rainfall and surface runoff. Phosphorus transported by surface runoff leads to eutrophication of water bodies and water quality issues. The problem grows with climate change and increasing climatic extremes. Protection of agricultural soil, infrastructure, and water quality has to be ensured by suitable legislative measures. The efficiency of these measures can be demonstrated by suitable mathematical modeling of soil erosion and of nutrient transport to watercourses and water bodies. Research conducted by the Department of Irrigation, Drainage and Landscape Engineering, FCE CTU, is focused on water erosion modeling, including nutrient transport. This research comprises both experimental measurement of rainfall-runoff and erosion events and the use of mathematical models to calculate runoff and erosion intensity in small and larger basins. The long-term erosion intensity on an area of 32 thousand square kilometers has bee...

2025

A great challenge today, arising in many fields of science, is the proper mapping of datasets to explore their structure and gain information that would otherwise remain concealed due to high dimensionality. This task is impossible without appropriate tools helping the experts to understand the data. A promising way to support the experts in their work is the topographic mapping of the datasets to a low-dimensional space where the structure of the data can be visualized and understood. This thesis focuses on Neural Gas and Self-Organizing Maps as particularly successful methods for prototype-based topographic maps. The aim of the thesis is to extend these methods so that they can deal with real-life datasets which may be very large and complex, and thus may be neither treatable in main memory nor embeddable in Euclidean space. As a foundation, we propose and investigate a fast batch scheme for topographic mapping which features quadratic convergence. This formulation allows to...

2025, Advances in Web Mining and Web Usage Analysis

With the amount of available information on the Web growing rapidly each day, the need to automatically filter the information in order to ensure greater user efficiency has emerged. Within the fields of user profiling and Web personalization several popular content filtering techniques have been developed. In this chapter we present one such technique: collaborative filtering. Apart from giving an overview of collaborative filtering approaches, we present the experimental results of comparing the k-Nearest Neighbor (kNN) algorithm with the Support Vector Machine (SVM) in the collaborative filtering framework, using datasets with different properties. While the k-Nearest Neighbor algorithm is usually used for collaborative filtering tasks, the Support Vector Machine is considered a state-of-the-art classification algorithm. Since collaborative filtering can also be interpreted as a classification/regression task, virtually any supervised learning algorithm (such as SVM) can be applied. Experiments were performed on two standard, publicly available datasets and, in addition, on a real-life corporate dataset that does not fit the profile of ideal data for collaborative filtering. We conclude that the quality of collaborative filtering recommendations is highly dependent on the sparsity of the available data. Furthermore, we show that kNN is dominant on datasets with relatively low sparsity, while SVM-based approaches may perform better on highly sparse data.
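
A minimal sketch of the user-based kNN baseline discussed above (cosine similarity over co-rated items; not the chapter's exact experimental setup, and all parameters are illustrative):

```python
import numpy as np

def knn_predict(R, user, item, k=20):
    """User-based kNN rating prediction on a ratings matrix R (0 = missing)."""
    candidates = []
    for v in range(R.shape[0]):
        if v == user or R[v, item] == 0:
            continue
        common = (R[user] > 0) & (R[v] > 0)       # co-rated items
        if not common.any():
            continue
        a, b = R[user, common], R[v, common]
        sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        candidates.append((sim, R[v, item]))
    candidates.sort(key=lambda t: t[0], reverse=True)
    top = candidates[:k]
    if not top:                                   # cold start: user mean
        return R[user][R[user] > 0].mean()
    w = np.array([s for s, _ in top])
    r = np.array([x for _, x in top])
    return float(w @ r / (np.abs(w).sum() + 1e-12))
```

An SVM-based variant would instead treat the target item's known ratings as labels and each user's other ratings as features, which is the classification reading of the task mentioned above.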

2025, Lecture Notes in Computer Science

The need for efficient decentralized recommender systems has been appreciated for some time, both for the intrinsic advantages of decentralization and the necessity of integrating recommender systems into P2P applications. On the other hand, the accuracy of recommender systems is often hurt by data sparsity. In this paper, we compare different decentralized user-based and item-based Collaborative Filtering (CF) algorithms with each other, and propose a new user-based random walk approach customized for decentralized systems, specifically designed to handle sparse data. We show how the application of random walks to decentralized environments differs from the centralized version. We examine the performance of our random walk approach in different settings by varying the sparsity, the similarity measure, and the neighborhood size. In addition, we identify the popularizing disadvantage of the significance-weighting term traditionally used to increase the precision of similarity measures, and elaborate on how it can affect the performance of the random walk algorithm. The simulations on the MovieLens 10,000,000-ratings dataset demonstrate that over a wide range of sparsity, our algorithm outperforms other decentralized CF schemes. Moreover, our results show that decentralized user-based approaches perform better than their item-based counterparts in P2P recommender applications.
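
A compact, centralized stand-in for the random-walk idea: scoring peers by a short walk with restart over a user-user similarity matrix. The decentralized version differs in how the matrix is stored and accessed across peers; this sketch only illustrates the weighting principle:

```python
import numpy as np

def random_walk_scores(S, user, steps=3, restart=0.15):
    """Score peers of `user` by a short random walk with restart over a
    user-user similarity matrix S (nonnegative, zero diagonal)."""
    P = S / np.maximum(S.sum(axis=1, keepdims=True), 1e-12)  # row-stochastic
    p = np.zeros(len(S))
    p[user] = 1.0
    e = p.copy()
    for _ in range(steps):
        p = (1 - restart) * (P.T @ p) + restart * e
    return p  # walk probabilities, usable as neighbor weights in CF
```

Because the walk aggregates multi-hop similarity, it can assign useful weights even when two users share few or no co-rated items, which is the sparse regime targeted above.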

2025, Proceedings of the 2003 ACM symposium on Software visualization

We present a novel information visualization technique for the graphical representation of causal relations, based on the metaphor of color pools spreading over time on a piece of paper. Messages between processes in the system affect the colors of their respective pool, making it possible to quickly see the influences each process has received. This technique, called Growing Squares, has been evaluated in a comparative user study and shown to be significantly faster and more efficient for sparse data sets than the traditional Hasse diagram visualization. Growing Squares were also more efficient for large data sets, but not significantly so. Test subjects clearly favored Growing Squares over old methods, describing the new technique as easier, more efficient, and much more enjoyable to use.

2025, Animal Biodiversity and Conservation

Reintroduction is a powerful tool in our conservation toolbox. However, the necessary follow-up, i.e. long-term monitoring, is not commonplace and if instituted may lack rigor. We contend that valid monitoring is possible, even with sparse data. We present a means to monitor based on demographic data and a projection model using the Wyoming toad (Bufo baxteri) as an example. Using an iterative process, existing data is built upon gradually such that demographic estimates and subsequent inferences increase in reliability. Reintroduction and defensible monitoring may become increasingly relevant as the outlook for amphibians, especially in tropical regions, continues to deteriorate and emergency collection, captive breeding, and reintroduction become necessary. Rigorous use of appropriate modeling and an adaptive approach can validate the use of reintroduction and substantially increase its value to recovery programs.

2025

Since the appearance of Web 2.0, several new concepts have emerged, like social networks, which generate big data that is difficult to treat using traditional administration tools. However, they represent a rich resource of information that we can use as the basis for decisions, helping managers who constantly complain about the time taken to get answers to their questions and about the quality of the resulting decisions. In the decision-making task, the most important thing besides treating the data, especially big data, is to analyze and interpret the results of the treatment in a way that maximizes profit in terms of business management logic. So, we propose in this paper a process-oriented business architecture, founded on users' interests, that analyzes tweets as an example of big data for decision-making purposes, dealing with the practical functions of text mining. The business orientation of the analysis helps us to get conscious decisions whic...

2025, Acta Ornithologica

Availability of nest survival estimates over large spatial and temporal scales is necessary for the complex modelling of population dynamics. However, there may be no standardized nest monitoring schemes, as a primary source of data, for many species, locations or years. Although other potential datasets often do exist, their applicability for analysing large-scale temporal patterns in nest survival is not well established. We used an alternative dataset of ringing records of 3 091 nests of the Red-backed Shrike Lanius collurio, representing five time series (6 to 42 years) from different sites within the Czech Republic, to analyse long-term variability in nest survival. We modelled trends in daily nest survival rates (DSR) over the years, either assuming a constant DSR, or accounting for unequal nest search efforts during the breeding season by assuming that DSR varies as a function of nest age and seasonal date. We found that even sparse nesting data may produce realistic estimates of nest survival. DSR varied greatly among sites, from 0.975 to 0.984, corresponding to a nest success from 48% to 62%. Both modelling approaches yielded almost identical estimates of DSR trends over the years. In this study, nest survival has either declined at all three agricultural sites or remained stable at one suburban site since the late 1980s. We conclude that sparse datasets with unequal searching effort during the nesting cycle and/or nesting season can be used to estimate long-term trends in nest survival, but this approach is warranted only if the analyses, based on different assumptions, yield consistent estimates.
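
Back-solving the figures above, nest success equals DSR raised to the exposure period; an exposure of roughly 29 days (an inference from the reported numbers, not a value stated in the abstract) reproduces the 48-62% range:

```python
# Nest success = DSR ** exposure_days; ~29 days fits the reported figures.
for dsr in (0.975, 0.984):
    print(dsr, round(dsr ** 29, 2))   # -> 0.48 and 0.63 (reported: 48-62%)
```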

2025, 8th Workshop on Compilers for Parallel …

We propose to develop and evaluate software support for improving locality for advanced scientific applications. We will investigate compiler and run-time techniques needed to achieve high performance on both sequential and parallel machines. We will focus on two areas. ...

2025

Consider a large social network with possibly severe degree heterogeneity and mixed memberships. We are interested in testing whether the network has only one community or more than one. The problem is known to be non-trivial, partly due to the presence of severe degree heterogeneity. We construct a class of test statistics using the numbers of short paths and short cycles, and the key to our approach is a general framework for canceling the effects of degree heterogeneity. The tests compare favorably with existing methods. We support our methods with careful analysis and a numerical study with simulated data and a real data example.
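
A toy illustration of the raw ingredients, counts of short closed walks from powers of the adjacency matrix; the paper's actual statistics additionally correct for degree heterogeneity, which this sketch does not:

```python
import numpy as np

def closed_walks(A, k):
    """Number of closed walks of length k: trace(A^k).
    For k = 3, the triangle count is trace(A^3) / 6."""
    return np.trace(np.linalg.matrix_power(A, k))

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])
print(closed_walks(A, 3) / 6)   # -> 1.0 (one triangle: nodes 0, 1, 2)
```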

2025, Academia Quantum

We introduce a novel hybrid quantum-analog algorithm to perform graph clustering that exploits connections between the evolution of dynamical systems on graphs and the underlying graph spectra. This approach constitutes a new class of algorithms that combine emerging quantum and analog platforms to accelerate computations. Our hybrid algorithm is equivalent to spectral clustering and significantly reduces the computational complexity from O(N³) to O(N), where N is the number of nodes in the graph. We achieve this speedup by circumventing the need for explicit eigendecomposition of the normalized graph Laplacian matrix, which dominates the classical complexity, and instead leveraging quantum evolution of the Schrödinger equation followed by efficient analog computation for the dynamic mode decomposition (DMD) step. Specifically, while classical spectral clustering requires O(N³) operations to perform eigendecomposition, our method exploits the natural quantum evolution of states according to the graph Laplacian Hamiltonian in linear time, combined with the linear scaling of DMD, which leverages efficient matrix-vector multiplications on analog hardware. We prove and demonstrate that this hybrid approach can extract the eigenvalues and scaled eigenvectors of the normalized graph Laplacian by evolving Schrödinger dynamics on quantum computers followed by DMD computations on analog devices, providing a significant computational advantage for large-scale graph clustering problems. Our demonstrations can be reproduced using our code, which has been released on GitHub.
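
A classical emulation of the pipeline on a toy graph, using the combinatorial rather than normalized Laplacian for simplicity: evolve Schrödinger dynamics (the role played by the quantum device), then recover the spectrum with exact DMD (the analog step). All sizes and parameters are illustrative:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
n, m, dt = 8, 40, 0.1
A = np.triu((rng.random((n, n)) < 0.4).astype(float), 1)
A = A + A.T                                   # random undirected graph
L = np.diag(A.sum(1)) - A                     # graph Laplacian
U = expm(-1j * L * dt)                        # one-step propagator
X = np.empty((n, m), complex)
X[:, 0] = rng.standard_normal(n)
for t in range(1, m):
    X[:, t] = U @ X[:, t - 1]                 # Schrödinger snapshots

# Exact DMD: eigenvalues of the best-fit linear map between snapshot pairs.
X0, X1 = X[:, :-1], X[:, 1:]
Uu, s, Vh = np.linalg.svd(X0, full_matrices=False)
Atilde = Uu.conj().T @ X1 @ Vh.conj().T @ np.diag(1.0 / s)
mu = np.linalg.eigvals(Atilde)                # mu_k = exp(-i * lam_k * dt)
lam = (1j * np.log(mu) / dt).real             # recovered spectrum of L
print(np.sort(lam.round(3)))
```

The recovered eigenvalues match the Laplacian spectrum as long as lambda * dt stays within the principal branch of the logarithm, which is the reason for the small time step.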

2025

The majority of database benchmarks currently in use in the industry were designed for relational databases. A different class of benchmarks became required for object oriented databases once they appeared on the market. None of the currently existing benchmarks were designed to adequately exploit the distinctive features native to the semantic databases. A new semantic benchmark is proposed which allows evaluation of the performance of the features characteristic of semantic database applications. An application used in the benchmark represents a class of problems requiring databases with sparse data, complex inheritances and many-to-many relations. Such databases can be naturally accommodated by semantic databases. A predefined implementation is not enforced allowing a designer to choose the most efficient structures available in the DBMS tested. The second part of this paper compares the performance of Sem-ODB binary semantic database vs. one of the leading relational databases. The results of the benchmark are analyzed.

2025

Monitoring systems are prone to record huge amounts of data, only a minor part of which may be of interest. Maintaining a giant database from the monitoring system of a large structure equipped with dozens of sensors is a costly challenge. One way of solving this problem lies in discarding everything that does not enter the scope of what was considered an "interesting event" at the design stage, with a big risk of missing an unanticipated but essential event. Another way could be to store less data, making a kind of "survey" instead of recording continuously, with the possibility of rebuilding events from the discontinuous records. This approach is suitable only for phenomena lasting minutes; a sudden and brief phenomenon could be missed in such a configuration. This method was used in the case of vortex-shedding excitation of the deck of a large cable-stayed bridge, in order to rebuild the history of past excitations and evalua...

2025, 2011 XXXth URSI General Assembly and Scientific Symposium

In order to investigate the dynamics of ionospheric phenomena, performing 3-D ionospheric tomography is effective. However, it is an ill-posed inverse problem, and reconstruction is difficult because of the small number of data. The Residual Minimization Training Neural Network (RMTNN) tomographic approach proposed by Ma et al. has an advantage in reconstruction with sparse data. They have demonstrated a few results under quiet ionospheric conditions in Japan. In this paper, therefore, we validate the reconstruction performance for disturbed periods and quite sparse data using simulations and/or real data.

2025, Radio Science

Three‐dimensional ionospheric tomography is effective for investigations of the dynamics of ionospheric phenomena. However, it is an ill‐posed problem in the context of sparse data, and accurate electron density reconstruction is difficult. The Residual Minimization Training Neural Network (RMTNN) tomographic approach, a multilayer neural network trained by minimizing an objective function, allows reconstruction of sparse data. In this study, we validate the reconstruction performance of RMTNN using numerical simulations based on both sufficiently sampled and sparse data. First, we use a simple plasma‐bubble model representing the disturbed ionosphere and evaluate the reconstruction performance based on 40 GPS receivers in Japan. We subsequently apply our approach to a sparse data set obtained from 24 receivers in Indonesia. The reconstructed images from the disturbed and sparse data are consistent with the model data, except below 200 km altitude. To improve this performance and li...

2025

Reduced order models are needed for reliable, accurate and efficient prediction of aerodynamic forces to analyze fluid-structure interaction problems in turbomachinery including prop fans.

2025, Population Ecology

As part of a national strategy for recovering tiger populations, the Myanmar Government recently proposed its first and the world's largest tiger reserve in the Hukaung Valley, Kachin State. During November 2002-June 2004, camera-traps were used to record tigers, identify individuals, and, using capture-recapture approaches, estimate density in the reserve. Despite extensive (203 trap locations, 275-558 km² sample plots) and intensive (>4,500 trap nights, 9 months of sampling) survey efforts, only 12 independent detections of six individual tigers were made across three study sites. Due to the sparse data, estimates of tiger abundance generated by Program CAPTURE could not be made for all survey sites. Other approaches to estimating density, based on numbers of tigers caught, or derived from borrowed estimates of detection probability, offer an alternative to capture-recapture analysis. Tiger densities fall in the range of 0.2-2.2 tigers/100 km², with 7-71 tigers inside a 3,250 km² area of prime tiger habitat, where efforts to protect tigers are currently focused. Tiger numbers might be stabilized if strict measures are taken to protect tigers and their prey from seasonal hunting and to suppress illegal trade in wildlife. Efforts to monitor abundance trends in the tiger population will be expensive given the difficulty with which tiger data can be obtained and the lack of available surrogate indices of tiger density. Monitoring occupancy patterns, the subject of a separate ongoing study, may be more efficient.
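
A quick arithmetic check of the reported figures; the published 7-71 range corresponds to these endpoints after rounding:

```python
# Abundance bounds = density x focal area.
area_km2 = 3250
for dens in (0.2, 2.2):                  # tigers per 100 km^2
    print(dens * area_km2 / 100)         # -> 6.5 and 71.5 (published: 7-71)
```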

2025, 15th Conference on Applied Climatology/13th Symposium on Meteorological Observations and Instrumentation

Daily information on precipitation is employed year-round for purposes of water management. Real-time gage data from the NWS cooperative network are often used, but because of spatial density limitations they may not be adequate to estimate gradients in precipitation, and thus may be inadequate to estimate precipitation over scales finer than 30-40 km. This study compares precipitation from gage, radar, and multi-sensor estimates to evaluate differences between gage and remotely sensed precipitation estimates year-round over two small regions for the period February 2002 to September 2004. Daily precipitation estimates are based upon 1) NWS quality-controlled cooperative gage data (QC_coop), 2) gage data from a dense (10 km spacing) network of weighing-bucket gages in Cook County, IL (CCPN), 3) gridded (4 x 4 km) Stage II radar estimates (RDR), mosaicked and distributed by the National Centers for Environmental Prediction (NCEP), and 4) gridded (4 x 4 km) Stage III/IV multi-sensor estimates (MPE), produced at the River Forecast Centers and mosaicked into a national product at NCEP.

2025

A time-lapse analysis was carried out to investigate the theoretical detectability of CO2 for the Shell Quest project. Quest is a Carbon Capture and Storage (CCS) project in Alberta conducted by Shell Canada Energy, Chevron Canada Limited, and Marathon Oil Canada Corporation. The target formation for injection is the Basal Cambrian Sandstone (BCS), a deep saline aquifer at an approximate depth of 2000 meters below surface. The purpose of this study was to simulate the seismic response of the BCS after injecting 1.2 million tonnes of CO2 during a one-year period of injection. This was done using Gassmann fluid substitution and seismic forward modeling. A geological model for the baseline scenario was generated based on data from well SCL-8-19-59-20W4. For the monitor case, Gassmann fluid substitution modeling was undertaken to model a CO2 plume within the BCS. Numerical stack sections for both scenarios were obtained and subtracted to study the change in the seismic response after injecting CO2. The difference section shows the location and the spatial distribution of the plume. Based on these results, the CO2 plume could be detected in the seismic data after a year of injection.
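
A minimal sketch of the Gassmann fluid-substitution step described above, with a Reuss (uniform-saturation) average for the brine-CO2 mixture; the moduli and porosity below are illustrative placeholders, not values from the SCL-8-19-59-20W4 log:

```python
def gassmann_ksat(kdry, kmin, kfl, phi):
    """Gassmann's equation: saturated bulk modulus from the dry frame."""
    b = (1 - kdry / kmin) ** 2
    return kdry + b / (phi / kfl + (1 - phi) / kmin - kdry / kmin ** 2)

def brine_co2_mix(kbrine, kco2, s_co2):
    """Reuss average for the effective pore-fluid modulus."""
    return 1.0 / ((1 - s_co2) / kbrine + s_co2 / kco2)

# Illustrative values in GPa (kdry, kmin, brine, CO2) and porosity.
kdry, kmin, phi = 12.0, 37.0, 0.17
for s in (0.0, 0.3):
    kfl = brine_co2_mix(2.3, 0.08, s)
    print(f"S_CO2={s:.1f}  K_sat={gassmann_ksat(kdry, kmin, kfl, phi):.2f} GPa")
```

The sharp drop in saturated modulus (and hence P-velocity) with even modest CO2 saturation is what makes the plume visible on the difference section.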

2025, The Astrophysical Journal

Spectroscopic analyses of 14 Her, HD 187123, and HD 210277, recently reported to harbor planets, reveal that these stars are metal rich. We find [Fe/H] = 0.50 ± 0.05, 0.16 ± 0.05, and 0.24 ± 0.05 for 14 Her, HD 187123, and HD 210277, respectively. This is the first spectroscopic analysis of HD 187123; our results for 14 Her and HD 210277 are in agreement with published studies. It is shown that 14 Her and ρ¹ Cnc are nearly identical in their bulk physical characteristics. This result, combined with their extreme metallicities, suggests that their physical parameters have been affected by the process that formed their planets. These two stars join a group of about half a dozen stars in the solar neighborhood with [Fe/H] ≥ 0.4. It is also shown that 51 Peg and HD 187123, which have companions with similar orbital periods and masses, are nearly identical. We find v sin i ≈ 2.0 km s⁻¹ for HD 210277 from a high-resolution spectrum.

2025, Lecture Notes in Computer Science

We focus on the combinatorial analysis of physical mapping with repeated probes. We present computational complexity results, and we describe and analyze an algorithmic strategy. We are following the research avenue proposed by Karp [9] of modeling the problem as a combinatorial problem, the Hypergraph Superstring Problem, intimately related to the Lander-Waterman stochastic model. We show that a sparse version of the problem is MAXSNP-complete, a result that carries over to the general case. We show that the minimum Sperner decomposition of a set collection, a problem related to the Hypergraph Superstring Problem, is NP-complete. Finally we show that the Generalized Hypergraph Superstring Problem is also MAXSNP-hard. We present an efficient algorithm for retrieving the PQ-tree of optimal zero-repetition solutions, which provides a constant approximation to the optimal solution on sparse data. We provide experimental results on simulated data.

2024, Provided by the Department of Hydrology and Water Resources.

Extensive flooding occurred throughout the northeastern United States during January of 1996. The flood event cost the lives of 33 people and over a billion dollars in flood damage. Following the 'Blizzard of '96', a warm front moved into the Mid-Atlantic region bringing extensive rainfall and causing significant melting and flooding to occur. Flood forecasting is a vital part of the National Weather Service (NWS) hydrologic responsibilities. Currently, the NWS River Forecast Centers use either the Antecedent Precipitation Index (API) or the Sacramento Soil-Moisture Accounting Model (SAC-SMA). This study evaluates the API and SAC-SMA models for their effectiveness in flood forecasting during this rain-on-snow event. The SAC-SMA, in conjunction with the SNOW-17 model, is calibrated for five basins in the Mid-Atlantic region using the Shuffled Complex Evolution (SCE-UA) automatic algorithm developed at the University of Arizona. Nash-Sutcliffe forecasting efficiencies (Ef) for the calibration period range from 0.79 to 0.87, with verification values from 0.42 to 0.95. Flood simulations were performed on the five basins using the API and calibrated SAC-SMA model. The SAC-SMA model does a better job of estimating observed flood discharge on three of the five study basins, while two of the basins experience flood simulation problems with both models. Study results indicate the SAC-SMA has the potential for better flood forecasting during complex rain-on-snow events such as the January 1996 floods in the Northeast.

hydrologic services and was established in the 1940s as part of the Department of Commerce (Ingram, 1996). The first River Forecast Center (RFC) was established in the 1940s in the Ohio River Basin for the unique purpose of flood forecasting. By the 1960s, 13 River Forecast Centers were established across the United States with distinct areas of hydrologic responsibility (Figure 1.1). Today's National Weather Service, under the direction of the National Oceanic and Atmospheric Administration (NOAA), is charged with "the responsibility of providing accurate and timely hydrologic information and forecasts for watersheds and rivers throughout the United States" (Brazil and Hudlow, 1981). The NWS RFCs now issue forecasts for over 4,000 river locations across the United States. Along with flood forecasting, additional RFC responsibilities include daily operational forecasts for water supply, irrigation, reservoir operation, energy production, water quality, navigation, and recreation. RFCs in the northern climates of the U.S. also issue seasonal snowmelt forecasts for the spring melt (Brazil and Hudlow, 1981). The National Weather Service's 13 River Forecast Centers have access to an assembled system of forecasting techniques, including rainfall-runoff models, snow accumulation and ablation models, routing techniques, calibration methods, and other hydrologic procedures (Burnash, 1995). Twelve of the 13 RFCs use this centralized system known as the National Weather Service River Forecast System (NWSRFS), a highly
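
For reference, the Nash-Sutcliffe efficiency quoted above (0.79-0.87 in calibration) is a standard skill score; a minimal sketch:

```python
import numpy as np

def nash_sutcliffe(obs, sim):
    """Nash-Sutcliffe efficiency Ef = 1 - SSE / Var(obs).
    Ef = 1 is a perfect fit; Ef <= 0 is no better than the observed mean."""
    obs = np.asarray(obs, float)
    sim = np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)
```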

2024

We would like to comment on this article by William DuMouchel, as it gives an interesting application of logistic regression to clinical safety data. Without understating the scope of the multivariate Bayesian logistic regression (MBLR) model, the use of numerical integration is arguably its most important feature. Avoiding Markov chain Monte Carlo (MCMC) sampling techniques in other data-mining tools, such as the Multi-item Gamma Poisson Shrinker (DuMouchel, 1999), has proven successful for Dr. DuMouchel in securing their acceptance among non-statisticians. MBLR should be no exception. As most statisticians lack the clinical insight required to specify the appropriate MBLR model inputs, MBLR is an ideal tool for use by clinicians. However, targeted users may not appreciate some subtleties of MBLR, which we present below. We also present findings from our empirical evaluation of the MBLR algorithm. This commentary provides some perspective that we have gained through multiple interactions with Dr. DuMouchel

2024

Reduced order models are needed for reliable, accurate and efficient prediction of aerodynamic forces to analyze fluid-structure interaction problems in turbomachinery including prop fans.

2024, arXiv (Cornell University)

Meson photoproduction off nucleons in the chiral quark model is described. The role of the S-wave resonances in the second resonance region is discussed; it is particularly important for the kaon, η, and η′ photoproductions.

2024, ICES Journal of Marine Science

Age-length key (ALK) methods generally perform well when length samples and age samples are representative of the underlying population. It is unclear how well these methods perform when lengths are representative but age samples are sparse (i.e. age samples are small or missing in many years, and some length groups do not have any age observations). With western Atlantic bluefin tuna, the available age data are sparse and have been, for the most part, collected opportunistically. We evaluated two methods capable of accommodating sparse age data: a novel hybrid ALK (combining forward ALKs and cohort slicing) and the combined forward-inverse ALK. Our goal was to determine if the methods performed better than cohort slicing, which has traditionally been used to obtain catch-at-age for Atlantic bluefin tuna, given the data limitations outlined above. Simulation results indicated that the combined forward-inverse ALK performed much better than the other methods. When applied to western Atlantic bluefin tuna data, the combined forward-inverse ALK approach was able to track cohorts and identified an inconsistency in the ageing of some samples.
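
A minimal sketch of the forward age-length key step that underlies the methods compared above (the ALK below is a made-up 3x3 example, not bluefin data):

```python
import numpy as np

def catch_at_age(catch_at_length, alk):
    """Forward ALK: distribute catch-at-length over ages.

    catch_at_length: (L,) numbers caught per length bin
    alk: (L, A) rows = P(age | length), each row summing to 1
    """
    return catch_at_length @ alk

alk = np.array([[0.8, 0.2, 0.0],     # small fish: mostly youngest age
                [0.3, 0.6, 0.1],
                [0.0, 0.4, 0.6]])    # large fish: mostly oldest age
print(catch_at_age(np.array([100., 50., 20.]), alk))  # -> [95. 58. 17.]
```

The sparse-data problem arises when rows of the ALK have no age observations at all; the hybrid and combined forward-inverse methods are ways of filling or borrowing that missing information.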

2024, Progress in Oceanography

Data on the occurrence of sardine (Sardina pilchardus) eggs from 42 national ichthyoplankton surveys along the European Atlantic coast were collated in order to describe the spawning habitat and spawning distribution of sardine in recent decades (1985-2005). A modification of existing spawning habitat characterisation techniques and a newly developed method to compare the probability of egg presence across surveys carried out with different sampling gears were used. Results showed that sardine spawning off the Atlantic European coast is mainly restricted to the shelf area, with the main geographical range being between the Strait of Gibraltar (the southern limit of data available for this analysis) and the middle part of the Armorican shelf (latitude around 47.5°North), and along a temperature range of 12-17°C. Spawning grounds within these limits show a nearly continuous geographical distribution, covering a large proportion of the shelf of the Iberian peninsula and adjacent waters, except for: (1) a persistent gap at the northwest corner of the Iberian peninsula, (2) a small secondary break at the Spanish-French border in the inner part of the Bay of Biscay and (3) at the southwest corner of the peninsula where there is a narrowing of the shelf width. These discontinuities were used to separate spawning into four nuclei and to describe the changes in spawning distribution in the time series. The relative importance of each nucleus and the degree of separation between adjacent nuclei varies between years, with the exception of the permanent gap at the northwest corner of the Iberian peninsula, which is persistent throughout the time series. Year to year changes in the proportion of the potential spawning habitat in which spawning actually occurred, changing from around 60% before the mid 1990s to around 40% thereafter, and

2024, Geophysical Prospecting

We present a new approach to enhancing weak prestack reflection signals without sacrificing higher frequencies. As a first step, we employ known multidimensional local stacking to obtain an approximate 'model of the signal'. Guided by phase spectra from this model, we can detect very weak signals and make them visible and coherent by 'repairing' the corrupted phase of the original data. Both presented approaches, phase substitution and phase sign corrections, show good performance on complex synthetic and field data suffering from severe near-surface scattering where conventional processing methods are rendered ineffective. The methods are mathematically formulated as a special case of time-frequency masking (common in speech processing) combined with the signal model from local stacking. This powerful combination opens the avenue for a completely new family of approaches for multi-channel seismic processing that can address seismic processing of land data with nodes and single se...
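
A single-trace sketch of the phase-substitution idea in the time-frequency-masking view described above, assuming SciPy's STFT; the window and sampling parameters are illustrative, and the actual method operates on prestack gathers with a locally stacked model:

```python
import numpy as np
from scipy.signal import stft, istft

def phase_substitution(trace, model, fs=500.0, nperseg=64):
    """Keep the amplitude spectrum of the noisy trace, but borrow the
    phase of the locally stacked signal model in each time-frequency
    cell, then invert back to the time domain."""
    _, _, Z = stft(trace, fs=fs, nperseg=nperseg)
    _, _, Zm = stft(model, fs=fs, nperseg=nperseg)
    Z_fixed = np.abs(Z) * np.exp(1j * np.angle(Zm))
    _, repaired = istft(Z_fixed, fs=fs, nperseg=nperseg)
    return repaired[:len(trace)]
```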

2024

This paper develops a framework for fitting functions with domains in the Euclidean space, when data are sparse but a slow variation allows for a useful fit. We measure the variation by Lipschitz Bound (LB)-functions which admit smaller LB are considered to vary more slowly. Since most functions in practice are wiggly and do not admit a small LB, we extend this framework by approximating a wiggly function, f , by ones which admit a smaller LB and do not deviate from f by more than a specified Bound Deviation (BD). In fact for any positive LB, one can find such a BD, thus defining a trade-off function (LB-BD function) between the variation measure (LB) and the deviation measure (BD). We show that the LB-BD function satisfies nice properties: it is non-increasing and convex. We also present a method to obtain it using convex optimization. For a function with given LB and BD, we find the optimal fit and present deterministic bounds for the prediction error of various methods. Given the LB-BD function, we discuss picking an appropriate LB-BD pair for fitting and calculating the prediction errors. The developed methods can naturally accommodate an extra assumption of periodicity to obtain better prediction errors. Finally we present the application of this framework to air pollution data with sparse observations over time.
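
For a given Lipschitz Bound L, the optimal worst-case estimate at a new point is the midpoint of the tightest upper and lower L-Lipschitz envelopes through the data. A sketch of that fit (ignoring the BD slack, which the paper adds to handle wiggly functions):

```python
import numpy as np

def lipschitz_fit(X, y, L, x_new):
    """Central estimate for an L-Lipschitz function through (X, y):
    midpoint of the tightest upper and lower envelopes at x_new."""
    d = np.linalg.norm(np.asarray(X) - np.asarray(x_new), axis=1)
    upper = np.min(y + L * d)   # smallest admissible value from above
    lower = np.max(y - L * d)   # largest admissible value from below
    return 0.5 * (upper + lower)
```

Half the gap between the envelopes also gives a deterministic bound on the prediction error, which is the style of guarantee the abstract refers to.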

2024

Modern document collections are too large to annotate and curate manually. As increasingly large amounts of data become available, historians, librarians and other scholars increasingly need to rely on automated systems to efficiently and accurately analyze the contents of their collections and to find new and interesting patterns therein. Modern techniques in Bayesian text analytics are becoming widespread and have the potential to revolutionize the way that research is conducted. Much work has been done in the document modeling community towards this end, though most of it is focused on modern, relatively clean text data. We present research for improved modeling of document collections that may contain textual noise or that may include real-valued metadata associated with the documents. This class of documents includes many historical document collections. Indeed, our specific motivation for this work is to help improve the modeling of historical documents, which are often nois...

2024, International Journal of Electrical and Computer Engineering (IJECE)

The recommendation system is a filtering system. It filters a collection of items based on the historical behavior of a user, tries to make predictions based on user preferences, and makes recommendations that interest customers. While incredibly useful, such systems can face various challenges affecting their performance and utility. For example, as the number of users and items grows, the computational complexity of generating recommendations increases, which can degrade the accuracy and precision of recommendations. For this purpose, and to improve recommendation system results, we propose a recommendation system combining the demographic approach with collaborative filtering. Our approach is based on users' demographic information such as gender, age, zip code, and occupation, together with the historical ratings of the users. We cluster the users based on their demographic data using the k-means algorithm and then apply collaborative filtering to the specific user cluster for recommendations. The proposed approach improves the results of the collaborative filtering recommendation system in terms of precision and recommends diverse items to users.
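
A compact sketch of the proposed pipeline, assuming scikit-learn's KMeans and an already-numeric demographic encoding (both illustrative); the in-cluster aggregation is deliberately the simplest possible stand-in for the CF step:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_then_cf(demo, R, user, item, n_clusters=5):
    """Cluster users on demographic features (k-means), then predict the
    rating from the target user's cluster only."""
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=0).fit_predict(demo)
    peers = np.where((labels == labels[user]) & (R[:, item] > 0))[0]
    peers = peers[peers != user]
    if peers.size == 0:
        return float("nan")         # no in-cluster ratings for this item
    return R[peers, item].mean()    # plug any CF predictor in here
```

Restricting CF to a demographic cluster shrinks the neighbor search space, which is the computational point made above.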

2024

Wing flutter poses disturbances to aircraft structures. A three-dimensional elastic wing having an active disturbance rejection control (ADRC) mechanism was investigated during transonic flow. This ADRC was modeled as a system in order to improve future designs by probing it mathematically. This mathematical model, based on stochastic differential equations, was used to model the dynamic behavior of the signals generated during a transonic flow condition. The stochastic displacement, signal acceleration, and the Ito generator for the stochastically driven process were simulated in MATLAB, and it was observed that the acceleration of the signal had peak values at 700 mm/s², a maximum frequency of f_s = 1 Hz, and an Ito generator having a non-stationary signal with peak values of 4.5 × 10⁻¹⁹ J/m³ and a maximum energy distribution density of 2 × 10⁶ J/m³. For a dynamic pressure of p = 35 kPa, based on values from the wind tunnel experiments in the studies cited in this work, it was observed that the curl of the displacement vector increased towards the wing tip but decreased as the Mach number increased from 0.8 to 1.05. The deviatoric stresses were also observed to be uniformly distributed when transitioning from subsonic flow to supersonic flow.

2024, Neural Networks

In this paper a correspondence is derived between regularization operators used in regularization networks and support vector kernels. We prove that the Green's functions associated with regularization operators are suitable support vector kernels with equivalent regularization properties. Moreover, the paper provides an analysis of currently used support vector kernels in view of regularization theory, and of the corresponding operators associated with the classes of both polynomial kernels and translation-invariant kernels. The latter are also analyzed on periodical domains. As a by-product we show that a large number of radial basis functions, namely conditionally positive definite functions, may be used as support vector kernels.
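
The core correspondence can be stated compactly. In standard notation (assumed here, not quoted from the paper), with P the regularization operator and G the Green's function of P*P:

```latex
% If G is the Green's function of P*P, i.e.
\[
  (P^{*}P\,G)(x, \cdot) = \delta_x ,
\]
% then k(x, y) := G(x, y) is an admissible support vector kernel whose
% RKHS norm reproduces the regularization term:
\[
  \langle Pf,\, Pf \rangle = \|f\|_{k}^{2} .
\]
```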

2024

We review recent methods for learning with positive definite kernels. All these methods formulate learning and estimation problems as linear tasks in a reproducing kernel Hilbert space (RKHS) associated with a kernel. We cover a wide range of methods, ranging from simple classifiers to sophisticated methods for estimation with structured data.

2024

Effective methods of capacity control via uniform convergence bounds for function expansions have been largely limited to Support Vector machines, where good bounds are obtainable by the entropy number approach. We extend these methods to systems with expansions in terms of arbitrary (parametrized) basis functions and a wide range of regularization methods covering the whole range of general linear additive models. This is achieved by a data-dependent analysis of the eigenvalues of the corresponding design matrix. The entropy number approach, which applies to Support Vector machines with Mercer kernels, does not hold in the general case where f is expanded in terms of more or less arbitrary basis functions.

2024

A binary matrix A is said to have the "Consecutive Ones Property" (C1P) if its columns can be permuted so that in each row, the ones appear in one run (i.e., all ones are adjacent). The Consecutive Ones Submatrix (COS) problem is, given a binary matrix A and a positive integer m₀, to find m₀ columns of A that form a submatrix with the C1P. The matrix reordering problem is to find a matrix A′, obtained by permuting the columns of A, that minimizes C_r(A′), the number of sequences of consecutive ones in A′. In this paper, by using two quadratic forms, we calculate the number C_r(A). We apply the obtained results to orthogonal matrices and Hamming matrices; in addition, the two above problems can be solved for these matrices.
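
Counting C_r(A), the number of runs of consecutive ones, is straightforward; a small sketch (the paper's quadratic-form machinery is not reproduced here):

```python
import numpy as np

def consecutive_one_runs(A):
    """C_r(A): total number of maximal runs of ones across the rows.
    Each row contributes one run per 0 -> 1 transition, counting an
    implicit leading zero."""
    A = np.asarray(A)
    padded = np.hstack([np.zeros((A.shape[0], 1), dtype=A.dtype), A])
    return int(np.sum((padded[:, 1:] == 1) & (padded[:, :-1] == 0)))

A = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0]])
print(consecutive_one_runs(A))   # -> 3 (two runs in row 1, one in row 2)
```

A matrix with the C1P admits a column permutation for which every row contributes exactly one run, so C_r equals the number of nonzero rows.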

2024

A key sticking point of Bayesian analysis is the choice of prior distribution, and there is a vast literature on potential defaults including uniform priors, Jeffreys' priors, reference priors, maximum entropy priors, and weakly informative priors. These methods, however, often manifest a key conceptual tension in prior modeling: a model encoding true prior information should be chosen without reference to the model of the measurement process, but almost all common prior modeling techniques are implicitly motivated by a reference likelihood. In this paper we resolve this apparent paradox by placing the choice of prior into the context of the entire Bayesian analysis, from inference to prediction to model evaluation.

1. The role of the prior distribution in a Bayesian analysis

Both in theory and in practice, the prior distribution can play many roles in a Bayesian analysis. Perhaps most formally the prior serves to encode information germane to the problem being analyzed, but in prac...