Prediction error Research Papers - Academia.edu
Our aim was to identify which ultrasound parameters can be most accurately measured and best predict ovine fetal weight in late gestation. Singleton pregnancies were established using embryo transfer in 32 adolescent ewes, which were subsequently overnourished to produce fetuses of variable size (1720-6260 g). Ultrasound measurements at 126-133 days gestation were compared with fetal weight/biometry at late-gestation necropsy (n = 19) or term delivery (n = 13). Abdominal circumference (AC) and renal volume (RV) correlated best with physical measurements (r = 0.78-0.83) and necropsy/birth weight (r = 0.79-0.84). Combination of AC + RV produced an estimated fetal weight equation [Log EFW = 2.115 + 0.003 AC + 0.12 RV - 0.005 RV²] with the highest adjusted R² (0.72) and lowest mean absolute/percentage prediction error (396-550 g / 11.1%-13.2%).
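For readers who want to apply the reported equation, a minimal sketch is given below; the log base (assumed log10) and the measurement units for AC and RV are assumptions, since the abstract does not state them.

```python
def estimated_fetal_weight(ac, rv):
    """Estimated fetal weight from abdominal circumference (AC) and renal
    volume (RV) using the published equation
    log10(EFW) = 2.115 + 0.003*AC + 0.12*RV - 0.005*RV**2.
    Units for AC and RV are assumed to match the original calibration
    (not stated in the abstract)."""
    log_efw = 2.115 + 0.003 * ac + 0.12 * rv - 0.005 * rv ** 2
    return 10 ** log_efw

# Illustrative values only, chosen to fall inside the reported weight range.
print(round(estimated_fetal_weight(ac=300.0, rv=12.0), 1))
```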
- Ultrasound, Biometry, Birth Weight, Pregnancy
Thailand experiences severe floods and droughts that affect agriculture. New techniques, such as Data-Based Mechanistic modelling, are being developed to study rainfall and river flow to improve flood and drought alleviation policies and practices. Dynamic Harmonic Regression models are used to analyze rainfall and discharge time series across Thailand to define seasonality and trends and to forecast rainfall and discharge and their spatial distribution. Statistical patterns in the frequency of extreme rainfall and flow periods are identified with a view to improving predictions of medium- and longer-term rainfall and river flow patterns. The results show temporal and spatial variation within the annual rainfall pattern in the study catchments. For example, the seasonality of the rainfall in the south is less pronounced (more equatorial). The discharge seasonal pattern shows stronger semi-annual cycles, with the weakest pattern in the south of the country, whereas the strongest discharge sea...
Accurate wind and severe-weather forecasts are crucial for wind-energy production and grid-load management. Most of the prevailing wind power forecast methods rely heavily on statistical approaches that typically do not deal directly with weather processes. Currently employed numerical weather prediction (NWP) models are deemed insufficiently accurate by many industry stakeholders for wind power prediction, even though they have been used for such applications. The reason is partly because the NWP products used for power forecasting are typically produced by the coarse-resolution models at the major operational weather centers. Although a few high-resolution models are run by some wind energy companies, most of these models do not contain the advanced data assimilation capabilities that are required to initialize the model prediction with the important high-resolution weather information.
BACKGROUND: The aim of this study was to provide a model-based analysis of the pharmacokinetics of remifentanil in infants and children undergoing cardiac surgery with cardiopulmonary bypass (CPB). METHODS: We studied nine patients aged 0.5 to 4 years who received a continuous remifentanil infusion via a computer-controlled infusion pump during cardiac surgery with mildly hypothermic CPB. Arterial blood
The two-dimensional (2-D) parabolic equation (PE) is widely used for making radiowave propagation predictions in the troposphere. The effects of transverse terrain gradients, propagation around the sides of obstacles, and scattering from large obstacles to the side of the great circle path are not modeled, leading to prediction errors in many situations. In this paper, these errors are addressed by extending the 2-D PE to three dimensions. This changes the matrix form of the PE, making it difficult to solve. A novel iterative solver technique, which is highly efficient and guaranteed to converge, is presented. In order to confine the domain of computation, a three-dimensional (
The fast update rate and good performance of new generation electronic sector scanning sonars is now allowing practicable use of temporal information for signal processing tasks such as object classification and motion estimation. Problems remain, however, as objects change appearance, merge, maneuver, move in and out of the field of view, and split due to poor segmentation. This paper presents an approach to the segmentation, two-dimensional motion estimation, and subsequent tracking of multiple objects in sequences of sector scan sonar images. Applications such as ROV obstacle avoidance, visual servoing, and underwater surveillance are relevant. Initially, static and moving objects are distinguished in the sonar image sequence using frequency-domain filtering. Optical flow calculations are then performed on moving objects with significant size to obtain magnitude and direction motion estimates. Matches of these motion estimates, and the future positions they predict, are then used as a basis for identifying corresponding objects in adjacent scans. To enhance robustness, a tracking tree is constructed storing multiple possible correspondences and cumulative confidence values obtained from successive compatibility measures. Deferred decision making is then employed to enable best estimates of object tracks to be updated as subsequent scans produce new information. The method is shown to work well, with good tracking performance when objects merge, split, and change shape. The optical flow is demonstrated to give position prediction errors of between 10 and 50 cm (1%-5% of scan range), with no violation of smoothness assumptions using sample rates between 4 and 1 frames/s.
Physically-based groundwater models (PBMs), such as MODFLOW, contain numerous parameters which are usually estimated using statistically-based methods, which assume that the underlying error is white noise. However, because of the practical difficulties of representing all the natural subsurface complexity, numerical simulations are often prone to large uncertainties that can result in both random and systematic model error. The systematic errors can be attributed to conceptual, parameter, and measurement uncertainty, and most often it can be difficult to determine their physical cause. In this paper, we have developed a framework to handle systematic error in physically-based groundwater flow model applications that uses error-correcting data-driven models (DDMs) in a complementary fashion. The data-driven models are separately developed to predict the MODFLOW head prediction errors, which were subsequently used to update the head predictions at existing and proposed observation wells. The framework is evaluated using a hypothetical case study developed based on a phytoremediation site at the Argonne National Laboratory. This case study includes structural, parameter, and measurement uncertainties. In terms of bias and prediction uncertainty range, the complementary modeling framework has shown substantial improvements (up to 64% reduction in RMSE and prediction error ranges) over the original MODFLOW model, in both the calibration and the verification periods. Moreover, the spatial and temporal correlations of the prediction errors are significantly reduced, thus resulting in reduced local biases and structures in the model prediction errors.
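The complementary framework amounts to training a data-driven model on the physical model's residuals and adding the predicted residual back to the physical prediction. A minimal sketch with scikit-learn follows; the random-forest regressor and the synthetic arrays are illustrative stand-ins, not the paper's actual DDM or MODFLOW outputs.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Stand-ins: physical-model (e.g. MODFLOW) head predictions, observed heads,
# and auxiliary predictors available at the observation wells.
X_aux = rng.normal(size=(200, 4))           # e.g. location, time, stresses
h_pbm = rng.normal(10.0, 1.0, size=200)     # physical-model heads
h_obs = h_pbm + 0.3 * X_aux[:, 0] + rng.normal(0, 0.05, size=200)  # "truth"

# 1) Data-driven model learns the systematic part of the PBM error.
residual = h_obs - h_pbm
ddm = RandomForestRegressor(n_estimators=200, random_state=0)
ddm.fit(np.column_stack([h_pbm, X_aux]), residual)

# 2) Complementary prediction: physical model + predicted error.
#    (A real study would evaluate on held-out wells and periods.)
h_corrected = h_pbm + ddm.predict(np.column_stack([h_pbm, X_aux]))

rmse = lambda a, b: float(np.sqrt(np.mean((a - b) ** 2)))
print("RMSE, PBM only     :", rmse(h_obs, h_pbm))
print("RMSE, complementary:", rmse(h_obs, h_corrected))
```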
We analyze the "query by committee" algorithm, a method for filtering informative queries from a random stream of inputs. We show that if the two-member committee algorithm achieves information gain with positive lower bound, then the prediction error decreases exponentially with the number of queries. We show that, in particular, this exponential decrease holds for query learning of perceptrons.
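A minimal sketch of the two-member committee filter analysed here: two perceptrons are maintained, and a streamed input is queried only when they disagree. The perceptron update rule and the sampling details below are illustrative, not the paper's exact setting.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10
w_true = rng.normal(size=d)                  # target perceptron

def label(x):                                # noiseless teacher
    return np.sign(x @ w_true)

# Two committee members with different random initialisations.
w = [rng.normal(size=d), rng.normal(size=d)]
queries = 0

for _ in range(5000):                        # stream of random inputs
    x = rng.normal(size=d)
    votes = [np.sign(x @ wi) for wi in w]
    if votes[0] != votes[1]:                 # disagreement -> informative query
        y = label(x)
        queries += 1
        for i in range(2):                   # perceptron update on the query
            if np.sign(x @ w[i]) != y:
                w[i] += y * x

# Generalisation (prediction) error of one member, estimated by Monte Carlo.
X_test = rng.normal(size=(20000, d))
err = np.mean(np.sign(X_test @ w[0]) != label(X_test))
print(f"{queries} queries, prediction error ~ {err:.4f}")
```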
Artificial Neural Network (ANN) and regression models were developed using watershed-scale geomorphologic parameters to predict surface runoff and sediment losses of the St. Esprit watershed, Quebec, Canada. Geomorphological parameters describing the land surface drainage characteristics and surface water flow behaviour were empirically associated with measured rainfall and runoff data and used as input to a three-layered back-propagation
In predictive 3-D mesh geometry coding, the position of each vertex is predicted from the previously coded neighboring vertices and the resultant prediction error vectors are coded. In this work, the prediction error vectors are represented in a local coordinate system in order to cluster them around a subset of a 2-D planar subspace and thereby increase block coding efficiency. Alphabet entropy constrained vector quantization (AECVQ) of Rao and Pearlman is preferred to the previously employed minimum distortion vector quantization (MDVQ) for block coding the prediction error vectors with high coding efficiency and low implementation complexity. Estimation and compensation of the bias in the parallelogram prediction rule and partial adaptation of the AECVQ codebook to the encoded vector source by normalization using source statistics, are the other salient features of the proposed coding system. Experimental results verify the advantage of the use of the local coordinate system over the global one. The visual error of the proposed coding system is lower than the predictive coding method of Touma and Gotsman especially at low rates, and lower than the spectral coding method of Karni and Gotsman at medium-to-high rates.
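A minimal sketch of the prediction step this builds on: the parallelogram rule predicts the new vertex from the three previously coded vertices, and the residual is expressed in a local frame attached to the triangle. The bias compensation and AECVQ stages are not shown, and the vertex values are illustrative.

```python
import numpy as np

def parallelogram_predict(v1, v2, v3):
    """Predict the new vertex across edge (v1, v2) of triangle (v1, v2, v3)."""
    return v1 + v2 - v3

def local_frame(v1, v2, v3):
    """Orthonormal frame attached to the triangle: edge direction,
    in-plane normal to the edge, and triangle normal (rows of the matrix)."""
    e1 = v2 - v1
    e1 /= np.linalg.norm(e1)
    n = np.cross(v2 - v1, v3 - v1)
    n /= np.linalg.norm(n)
    e2 = np.cross(n, e1)
    return np.stack([e1, e2, n])

# Illustrative vertices.
v1, v2, v3 = map(np.array, ([0.0, 0, 0], [1.0, 0, 0], [0.5, 1.0, 0]))
v_true = np.array([0.6, -0.9, 0.1])

pred = parallelogram_predict(v1, v2, v3)
residual_global = v_true - pred
residual_local = local_frame(v1, v2, v3) @ residual_global  # vector to be coded
print(pred, residual_local)
```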
Guidance systems designed for neurosurgery, hip surgery, and spine surgery, and for approaches to other anatomy that is relatively rigid can use rigid-body transformations to accomplish image registration. These systems often rely on point-based registration to determine the transformation, and many such systems use attached fiducial markers to establish accurate fiducial points for the registration, the points being established by some fiducial localization process. Accuracy is important to these systems, as is knowledge of the level of that accuracy. An advantage of marker-based systems, particularly those in which the markers are bone-implanted, is that registration error depends only on the fiducial localization error (FLE) and is thus to a large extent independent of the particular object being registered. Thus, it should be possible to predict the clinical accuracy of marker-based systems on the basis of experimental measurements made with phantoms or previous patients. This paper presents two new expressions for estimating registration accuracy of such systems and points out a danger in using a traditional measure of registration accuracy.
Nonlinear effects in fMRI BOLD data may substantially influence estimates of task-related activations, particularly in rapid event-related designs. If the BOLD response to each stimulus is assumed to be independent of the stimulation history, nonlinear interactions create a prediction error that may reduce sensitivity. When stimulus density differs among conditions, nonlinear effects can cause artifactual differences in activation. This situation can occur in rapid event-related designs or when comparing blocks of unequal lengths. We present data showing substantial nonlinear history effects for stimuli 1 s apart and use estimates of nonlinearities in response magnitude, onset time, and time to peak to form a low-dimensional parameterization of these nonlinear effects. Our estimates of nonlinearity appear relatively consistent throughout the brain, and these estimates can be used to form adjusted linear predictors for future rapid event-related fMRI studies. Adjusting the linear model for these known nonlinear effects results in a substantially better model fit. The biggest advantages to using predictors adjusted for known nonlinear effects are (1) higher sensitivity at the individual subject level of analysis, (2) better control of confounds related to nonlinear effects, and (3) more accurate estimates of design efficiency in experimental fMRI design.
An adaptive notch filter is derived by using a general prediction error framework. The proposed infinite impulse response filter has a special structure that guarantees the desired transfer characteristics. The filter coefficients are updated by a version of the recursive maximum likelihood algorithm. The convergence properties of the algorithm and its asymptotic behavior are discussed, and its performance is evaluated by simulation results.
This paper describes two different but complementary approaches that can be used to perform SEU-like fault injection sessions in order to predict error rates of digital processors. The Code Emulated Upset (CEU) approach allows fault injection in processor memories (caches and register files), while the FPGA Autonomous Emulation approach allows fault injection in processor flip-flops. Results obtained for a case study, the LEON processor, illustrate the complementary aspects of the proposed strategies.
In the regression context, boosting and bagging are techniques to build a committee of regressors that may be superior to a single regressor. We use regression trees as fundamental building blocks in bagging committee machines and boosting committee machines. Performance is analyzed on three non-linear functions and the Boston housing database. In all cases, boosting is at least equivalent, and
The potential of noninvasive determination of glucose, lactic acid, and nisin in Lactococcus lactis subsp. lactis biofilm fermentation was investigated through Fourier transform mid-infrared (FTIR) spectroscopy. Samples obtained from a biofilm bioreactor were analyzed with traditional methods and FTIR spectroscopy. The FTIR spectra were interpreted by using suitable spectral wavenumber regions through multivariate statistical techniques such as partial least squares (PLS) and principal component regression (PCR). The standard errors of calibration for the PLS first-derivative calibration models for glucose, lactic acid, and nisin were 3.87 g/l, 2.62 g/l, and 189.6 IU/ml, respectively. Prediction errors were low for glucose and lactic acid, whereas nisin could be reliably quantified only when its concentration was higher than 800 IU/ml. Results indicated that FTIR spectroscopy could be used for rapid detection of glucose and lactic acid concentrations, and nisin activity, in nisin fermentation.
Children seem to acquire new know-how in a continuous and open-ended manner. In this paper, we hypothesize that an intrinsic motivation to progress in learning is at the origins of the remarkable structure of children's developmental trajectories. In this view, children engage in exploratory and playful activities for their own sake, not as steps toward other extrinsic goals. The central hypothesis of this paper is that intrinsically motivating activities correspond to expected decrease in prediction error. This motivation system pushes the infant to avoid both predictable and unpredictable situations in order to focus on the ones that are expected to maximize progress in learning. Based on a computational model and a series of robotic experiments, we show how this principle can lead to organized sequences of behavior of increasing complexity characteristic of several behavioral and developmental patterns observed in humans. We then discuss the putative circuitry underlying such an intrinsic motivation system in the brain and formulate two novel hypotheses. The first one is that tonic dopamine acts as a learning progress signal. The second is that this progress signal is directly computed through a hierarchy of microcortical circuits that act both as prediction and metaprediction systems.
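A minimal sketch of the core mechanism: the learner tracks prediction error per activity and selects the activity whose error has recently decreased the most (expected learning progress), rather than the one with the lowest or highest error. The two toy activities and the smoothing window are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
window = 20

class Activity:
    """Toy activity: 'learnable' errors decay with practice, 'noise' ones don't."""
    def __init__(self, learnable):
        self.learnable = learnable
        self.trials = 0
        self.errors = []

    def practice(self):
        self.trials += 1
        if self.learnable:
            err = np.exp(-self.trials / 50) + 0.05 * rng.random()
        else:
            err = rng.random()              # unpredictable: no progress possible
        self.errors.append(err)

def learning_progress(act):
    """Expected decrease in prediction error: recent drop over the window."""
    e = act.errors
    if len(e) < 2 * window:
        return float("inf")                 # force initial exploration
    return np.mean(e[-2 * window:-window]) - np.mean(e[-window:])

activities = [Activity(learnable=True), Activity(learnable=False)]
choices = []
for _ in range(600):
    i = int(np.argmax([learning_progress(a) for a in activities]))
    activities[i].practice()
    choices.append(i)

print("share of time on the learnable activity:", np.mean(np.array(choices) == 0))
```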
Some values of ẑ perfectly reveal d: if ẑ = μ - 3, for example, it would be clear that d = μ - 1, φ = -1, and ε = -1. This feature of the model is accidental. I have verified the propositions for other error specifications which do not have the perfect-revelation property, such as, for example, when measurement errors are not cumulative and the measured damage is constrained to lie within [μ - 1, μ + 1], the case in the working paper version of this article, Rasmusen (1992b).
The aim of this research was to develop and validate a mathematical model coupled with an optimization technique for thermal processing of conduction-heated foods in retortable pouches in order to: (a) search for variable retort temperature profiles to minimize process time, and (b) search for variable retort temperature profiles to minimize quality gradient (thiamine) within the product.
Introduction Various experimental manipulations, usually involving drug administration, have been used to produce symptoms of psychosis in healthy volunteers. Different drugs produce both common and distinct symptoms. A challenge is to understand how apparently different manipulations can produce overlapping symptoms. We suggest that current Bayesian formulations of information processing in the brain provide a framework that maps onto neural circuitry and gives us a context within which we can relate the symptoms of psychosis to their underlying causes. This helps us to understand the similarities and differences across the common models of psychosis. Materials and methods The Bayesian approach emphasises processing of information in terms of both prior expectancies and current inputs. A mismatch between these leads us to update inferences about the world and to generate new predictions for the future. According to this model, what we experience shapes what we learn, and what we learn modifies how we experience things. Discussion This simple idea gives us a powerful and flexible way of understanding the symptoms of psychosis, where perception, learning and inference are deranged. We examine the predictions of the cognitive model in light of what we understand about the neuropharmacology of psychotomimetic drugs and thereby attempt to account for the common and the distinctive effects of NMDA receptor antagonists, serotonergic hallucinogens, cannabinoids and dopamine agonists. Conclusion By acknowledging the importance of perception and perceptual aberration in mediating the positive symptoms of psychosis, the model also provides a useful setting in which to consider an under-researched model of psychosis: sensory deprivation.
The effect of dialysis on patients is conventionally predicted using a formal mathematical model. This approach requires many assumptions about the processes involved, and validation of these may be difficult. The validity of dialysis urea modeling using a formal mathematical model has been challenged. Artificial intelligence using neural networks (NNs) has been used to solve complex problems without needing a mathematical model or an understanding of the mechanisms involved. In this study, we applied an NN model to study and predict concentrations of urea during a hemodialysis session. We measured blood concentrations of urea, patient weight, and total urea removal by direct dialysate quantification (DDQ) at 30-minute intervals during the session (in 15 chronic hemodialysis patients). The NN model was trained to recognize the evolution of measured urea concentrations and was subsequently able to predict the hemodialysis session time needed to reach a target solute removal index (SRI) in patients not previously studied by the NN model (in another 15 chronic hemodialysis patients). Comparing results of the NN model with the DDQ model, the prediction error was 10.9%, with no significant difference between predicted total urea nitrogen (UN) removal and measured UN removal by DDQ. NN model predictions of time showed no significant difference from the actual intervals needed to reach the same SRI level under the same patient conditions, except for the prediction of SRI at the first 30-minute interval, which showed a significant difference (P = 0.001). This indicates the sensitivity of the NN model to what is called patient clearance time; the prediction error was 8.3%. From our results, we conclude that artificial intelligence applications in urea kinetics can give an idea of intradialysis profiling according to individual clinical needs. In theory, this approach can be extended easily to other solutes, making the NN model a step forward toward achieving artificial-intelligent dialysis control.
Fault simulation is commonly used in the development or evaluation of test vectors for integrated circuit designs. The computational requirements, however, often discourage, or even prohibit, complete fault simulation of circuit designs having greater than 20,000 single stuck-at faults. To circumvent this problem, statistical sampling methods have been proposed [1], [2] that provide fault coverage values within a small, predictable error range by simulating only a fraction of the circuit's total faults and using the resulting fault coverage value as an estimate of the fault coverage for the total circuit. Since only a fraction of the stuck-at faults are used in the simulation, it requires but a fraction of the time required for complete fault simulation. As an introduction to the application of sampling methods to fault simulation of integrated circuits, the statistical theory behind these sampling methods and proposed augmentations of these methods for improving the precision of the sample fault coverage is presented. Various proposed sampling schemes are applied to example circuit designs, and the results are analyzed.
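A minimal sketch of the sampling idea: simulate a random fraction of the fault list and report the sample fault coverage with a confidence interval for the true coverage. The fault universe, sample size, and detection probabilities are illustrative.

```python
import math
import random

random.seed(3)

# Illustrative fault universe: True = fault would be detected by the test set.
N = 20000
fault_detected = [random.random() < 0.87 for _ in range(N)]   # unknown in practice

n = 2000                                      # faults actually simulated
sample = random.sample(range(N), n)
hits = sum(fault_detected[i] for i in sample)
coverage = hits / n

# Normal-approximation 95% interval with finite-population correction.
se = math.sqrt(coverage * (1 - coverage) / n) * math.sqrt((N - n) / (N - 1))
lo, hi = coverage - 1.96 * se, coverage + 1.96 * se
print(f"sample coverage = {coverage:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}]")
print(f"true coverage   = {sum(fault_detected) / N:.3f}")
```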
The aim of the present study was to investigate the feasibility of near infrared (NIR) spectroscopy for rapid assessment of the Bloom value, viscosity, pH and moisture content of commercial edible gelatines. NIR spectra from both dry gelatine samples and freshly made gels (matured for 30 min) were calibrated against the four parameters. The lowest prediction errors for Bloom value and pH were obtained from the gel spectra, while dry gelatine spectra provided the lowest prediction errors for viscosity and moisture content. However, all four parameters could be predicted from dry gelatine spectra with cross-validated correlation coefficients between 0.80 and 0.86. All calibration models gave correlation coefficients in the region 0.80-0.92.
This research examined whether people can accurately predict the risk preferences of others. Three experiments featuring different designs revealed a systematic bias: that participants predicted others to be more risk seeking than themselves in risky choices, regardless of whether the choices were between options with negative outcomes or with positive outcomes. This self-others discrepancy persisted even if a monetary incentive was offered for accurate prediction. However, this discrepancy occurred only if the target of prediction was abstract and vanished if the target was vivid. A risk-as-feelings hypothesis was introduced to explain these findings.
The battery management system (BMS) is an integral part of an automobile. It protects the battery from damage, predicts battery life, and maintains the battery in an operational condition. The BMS performs these tasks by integrating one or more of the functions, such as protecting the cell, thermal management, controlling the charge-discharge, determining the state of charge (SOC), state of health (SOH), and remaining useful life (RUL) of the battery, cell balancing, data acquisition, communication with on-board and off-board modules, as well as monitoring and storing historical data. In this paper, we propose a BMS that estimates the critical characteristics of the battery (such as SOC, SOH, and RUL) using a data-driven approach. Our estimation procedure is based on a modified Randles circuit model consisting of resistors, a capacitor, the Warburg impedance for electrochemical impedance spectroscopy test data, and a lumped parameter model for hybrid pulse power characterization test data. The resistors in a Randles circuit model usually characterize the self-discharge and internal resistance of the battery, the capacitor generally represents the charge stored in the battery, and the Warburg impedance represents the diffusion phenomenon. The Randles circuit parameters are estimated using a frequency-selective nonlinear least squares estimation technique, while the lumped parameter model parameters are estimated by the prediction error minimization method. We investigate the use of support vector machines (SVMs) to predict the capacity fade and power fade, which characterize the SOH of a battery, as well as to estimate the SOC of the battery. An alternate procedure for estimating the power fade and energy fade from low-current Hybrid Pulse Power characterization (L-HPPC) test data using the lumped parameter battery model has been proposed. Predictions of RUL of the battery are obtained by support vector regression of the power fade and capacity fade estimates. Survival function estimates for reliability analysis of the battery are obtained using a hidden Markov model (HMM) trained using time-dependent estimates of capacity fade and power fade as observations. The proposed framework provides a systematic way for estimating relevant battery characteristics with a high degree of accuracy.
Synchrony has been found to increase trust, prosociality and interpersonal cohesion, possibly via neurocognitive self-other blurring. Researchers have thus highlighted synchrony as an engine of collective identity and cooperation, particularly in religious ritual. However, many aspects of group life require coordination, not merely prosocial cooperation. In coordination, interpersonal relations and motor sequences are complementary, and leadership hierarchies often streamline functioning. It is therefore unclear whether synchrony would benefit collaborative tasks requiring complex interdependent coordination. In a two-condition paradigm, we tested synchrony's effects on a three-person, complex verbal coordination task. Groups in the synchrony condition performed more poorly on the task, reporting more conflict, less cohesion, and less similarity. These results indicate boundary conditions on prosocial synchrony: in settings that require complex, interdependent social coordination, self-other blurring may disrupt complementary functioning. These findings dovetail with the anthropological observation that real-world ritual often generates and maintains social distinctions rather than social unison.
In the last two decades, interest in species distribution models (SDMs) of plants and animals has grown dramatically. Recent advances in SDMs allow us to potentially forecast anthropogenic effects on patterns of biodiversity at different spatial scales. However, some limitations still preclude the use of SDMs in many theoretical and practical applications. Here, we provide an overview of recent advances in this field, discuss the ecological principles and assumptions underpinning SDMs, and highlight critical limitations and decisions inherent in the construction and evaluation of SDMs. Particular emphasis is given to the use of SDMs for the assessment of climate change impacts and conservation management issues. We suggest new avenues for incorporating species migration, population dynamics, biotic interactions and community ecology into SDMs at multiple spatial scales. Addressing all these issues requires a better integration of SDMs with ecological theory.
The Earth orientation parameters (EOP) are determined by space geodetic techniques with very high accuracy. However, the accuracy of their prediction, even for a few days in the future, is several times lower and still unsatisfactory for practical use. The main problem of each prediction technique is to predict simultaneously long and short period oscillations of the EOP. It has been shown that the combination of the prediction methods which are different for deterministic and stochastic components of the EOP can provide the best accuracy of prediction. Several prediction techniques, e.g. combination of the least-squares with autoregressive and autocovariance methods as well as combination of wavelet transform decomposition and autocovariance prediction, were used to predict x, y pole coordinates in the polar coordinate system and UT1 − UTC or length of day (Δ) data. Different prediction algorithms were compared considering the difference between the EOP data and their predictions at different starting prediction epochs as well as by comparing the mean EOP prediction errors.
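One of the combinations mentioned (a least-squares fit of the deterministic part extrapolated forward, plus an autoregressive model of the residuals) can be sketched as follows; the harmonic periods, AR order, and synthetic series are illustrative, not the actual EOP data or the authors' exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic daily series with bias, trend, annual and semi-annual terms,
# plus AR(1) noise standing in for the stochastic component.
t = np.arange(2000.0)
periods = (365.25, 182.625)
x = 0.1 + 2e-4 * t + 5 * np.sin(2 * np.pi * t / periods[0]) \
    + 2 * np.cos(2 * np.pi * t / periods[1])
noise = np.zeros_like(t)
for k in range(1, len(t)):
    noise[k] = 0.8 * noise[k - 1] + rng.normal(0, 0.3)
x = x + noise

fit_len, horizon = 1500, 90
tf, xf = t[:fit_len], x[:fit_len]

# 1) Least-squares fit of bias + trend + harmonics, extrapolated ahead.
def design(tt):
    cols = [np.ones_like(tt), tt]
    for p in periods:
        cols += [np.sin(2 * np.pi * tt / p), np.cos(2 * np.pi * tt / p)]
    return np.column_stack(cols)

beta, *_ = np.linalg.lstsq(design(tf), xf, rcond=None)
resid = xf - design(tf) @ beta

# 2) AR(1) model of the residuals, estimated from the lag-1 autocorrelation.
phi = np.dot(resid[1:], resid[:-1]) / np.dot(resid[:-1], resid[:-1])
ar_forecast = resid[-1] * phi ** np.arange(1, horizon + 1)

t_pred = t[fit_len:fit_len + horizon]
prediction = design(t_pred) @ beta + ar_forecast
print("mean absolute prediction error:",
      float(np.mean(np.abs(prediction - x[fit_len:fit_len + horizon]))))
```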
This paper introduces the idea of predicting 'designer error' by evaluating devices using Human Error Identification (HEI) techniques. This is demonstrated using the Systematic Human Error Reduction and Prediction Approach (SHERPA) and Task Analysis for Error Identification (TAFEI) to evaluate a vending machine. Appraisal criteria which rely upon user opinion, face validity and utilisation are questioned. Instead a quantitative approach, based upon signal detection theory, is recommended. The performance of people using SHERPA and TAFEI is compared with heuristic judgement and with each other. The results of these studies show that both SHERPA and TAFEI are better at predicting errors than the heuristic technique. The performance of SHERPA and TAFEI is comparable, giving some confidence in the use of these approaches. It is suggested that using HEI techniques as part of the design and evaluation process could help to make devices easier to use.
The bootstrap, extensively studied during the last decade, has become a powerful tool in different areas of Statistical Inference. In this work, we present the main ideas of bootstrap methodology in several contexts, citing the most relevant contributions and illustrating with examples and simulation studies some interesting aspects.
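As one concrete instance of the methodology, the bootstrap can be used to estimate the prediction error of a regression model from out-of-bag resamples; this sketch on synthetic data is our illustration, not an example taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic regression data.
n = 100
X = rng.normal(size=(n, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(0, 1.0, size=n)

def fit_ols(X, y):
    Xb = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return beta

def predict(beta, X):
    return np.column_stack([np.ones(len(X)), X]) @ beta

# Out-of-bag bootstrap estimate of squared prediction error.
B, oob_errors = 500, []
for _ in range(B):
    idx = rng.integers(0, n, size=n)          # bootstrap resample
    oob = np.setdiff1d(np.arange(n), idx)     # cases not drawn this round
    if oob.size == 0:
        continue
    beta = fit_ols(X[idx], y[idx])
    oob_errors.append(np.mean((y[oob] - predict(beta, X[oob])) ** 2))

apparent = np.mean((y - predict(fit_ols(X, y), X)) ** 2)
print("apparent (resubstitution) error:", round(float(apparent), 3))
print("bootstrap out-of-bag error     :", round(float(np.mean(oob_errors)), 3))
```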
A problem of supervised learning from the multivariate time series (MTS) data where the target variable is potentially a highly complex function of MTS features is considered. This paper focuses on finding a compressed representation of MTS while preserving its predictive potential. Each time sequence is decomposed into Chebyshev polynomials, and the decomposition coefficients are used as predictors in a statistical learning model. The feature selection method capable of handling true multivariate effects is then applied to identify relevant Chebyshev features. MTS compression is achieved by keeping only those predictors that are pertinent to the response. The paper considers a problem of multivariate time series compression. We start with a dataset where each sample consists of several time series and a single response value. Individual series from the same sample correspond to the different variables with different physical characteristics generated by a process. Our g...
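A minimal sketch of the compression step described above: each series is fitted with a low-order Chebyshev expansion and only the coefficients are kept as candidate predictors. The series, degree, and downstream learner are illustrative.

```python
import numpy as np
from numpy.polynomial import chebyshev as C

rng = np.random.default_rng(6)

# One MTS sample: 3 variables observed at 200 time points.
t = np.linspace(-1.0, 1.0, 200)               # Chebyshev fits expect [-1, 1]
sample = np.vstack([
    np.sin(3 * t) + 0.1 * rng.normal(size=t.size),
    np.exp(-t ** 2) + 0.1 * rng.normal(size=t.size),
    t ** 3 - t + 0.1 * rng.normal(size=t.size),
])

degree = 6
features = []
for series in sample:
    coefs = C.chebfit(t, series, deg=degree)   # least-squares Chebyshev fit
    features.extend(coefs)                     # compressed representation

features = np.array(features)                  # 3 * (degree + 1) = 21 predictors
print(features.shape, "coefficients instead of", sample.size, "raw values")

# Reconstruction check for the first series.
recon = C.chebval(t, features[:degree + 1])
print("reconstruction RMSE:", float(np.sqrt(np.mean((recon - sample[0]) ** 2))))
```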
Modeling thermal signatures of complex objects in natural environments involves computation of the object surface temperatures. Variations in air/ground temperatures, wind, solar load, etc. contribute to making the thermal modeling problem complex. This paper concerns the temperature computation. We consider two thermal codes: RadThermIR (1) and OSMOSIS (2). OSMOSIS computes the temperatures in near real-time and handles in the thermal calculations the surface emissivity as a function of wavelength, temperature, and orientation of an optical beam incident upon the surface. By contrast, RadThermIR does not include these variations, leaving thermal calculations with two parameters (solar absorptivity and thermal emissivity). In this paper, results of both thermal codes are compared with each other and with experimental data collected on the L-shaped object CUBI (3). We show that RadThermIR is the most accurate. However, it requires a high computation time. OSMOSIS is less time-consuming, but ...
This work presents a new prediction-based portfolio optimization model that can capture short-term investment opportunities. We used neural network predictors to predict stocks' returns and derived a risk measure, based on the prediction errors, that has the same statistical foundation as the mean-variance model. The efficient diversification effect holds thanks to the selection of predictors with low and complementary pairwise error profiles.
Pragmatic, visually oriented methods for assessing and optimising bi-linear regression models are described, and applied to PLS Regression (PLSR) analysis of multi-response data from controlled experiments. The paper outlines some ways to stabilise the PLSR method to extend its range of applicability to the analysis of effects in designed experiments. Two ways of passifying unreliable variables are shown. A method for estimating the reliability of the cross-validated prediction error RMSEP is demonstrated. Some recently developed jack-knifing extensions are illustrated, for estimating the reliability of the linear and bi-linear model parameter estimates. The paper illustrates how the obtained PLSR 'significance' probabilities are similar to those from conventional factorial ANOVA, but the PLSR is shown to give important additional overview plots of the main relevant structures in the multi-response data.
Landslides cause damage to property and unfortunately pose a threat even to human lives. Good landslide susceptibility, hazard, and risk models could help mitigate or even avoid the unwanted consequences resulting from such hillslope mass movements. For the purpose of landslide susceptibility assessment, the study area in central Slovenia was divided into 78 365 slope units, for which 24 statistical variables were calculated. For the land-use and vegetation data, multi-spectral high-resolution images were merged using the Principal Component Analysis method and classified with an unsupervised classification. Using multivariate statistical analysis (factor analysis), the interactions between factors and landslide distribution were tested, and the importance of individual factors for landslide occurrence was defined. The results show that the slope, the lithology, the terrain roughness, and the cover type play important roles in landslide susceptibility. The importance of other spatial factors varies depending on the landslide type. Based on the statistical results, several landslide susceptibility models were developed using the Analytical Hierarchy Process method. These models gave very different results, with a prediction error ranging from 4.3% to 73%. As a final result of the research, the weights of important spatial factors from the best models were derived with the AHP method. Using probability measures, potentially hazardous areas were located in relation to population and road distribution, and hazard classes were assessed.
Fourier transform Raman spectroscopy and chemometric tools have been used for exploratory analysis of pure corn and cassava starch samples and mixtures of both starches, as well as for the quantification of amylose content in corn and cassava starch samples. The exploratory analysis using principal component analysis shows that two natural groups of similar samples can be obtained, according to the amylose content, and consequently the botanical origins. The Raman band at 480 cm⁻¹, assigned to the ring vibration of starches, has the major contribution to the separation of the corn and cassava starch samples. This region was used as a marker to identify the presence of starch in different samples, as well as to characterize amylose and amylopectin. Two calibration models were developed based on partial least squares regression involving pure corn and cassava, and a third model with both starch samples was also built; the results were compared with the results of the standard colorimetric method. The samples were separated into two groups of calibration and validation by employing the Kennard-Stone algorithm and the optimum number of latent variables was chosen by the root mean square error of cross-validation obtained from the calibration set by internal validation (leave one out). The performance of each model was evaluated by the root mean square errors of calibration and prediction, and the results obtained indicate that Fourier transform Raman spectroscopy can be used for rapid determination of apparent amylose in starch samples with prediction errors similar to those of the standard method.
1. We conducted a statistical reassessment of data previously reported in the lake total phosphorus (TP) input/output literature (n = 305) to determine which lake characteristics are most strongly associated with lake phosphorus concentration and retention. We tested five different hypotheses for predicting lake TP concentrations and phosphorus retention. 2. The Vollenweider phosphorus mass loading model can be expressed as: TP_out = TP_in / (1 + r·τ_w), where TP_in is the flow-weighted input TP concentration, τ_w is the lake hydraulic retention time and r is a first-order rate constant for phosphorus loss. 3. The inflow-weighted TP input concentration is a moderately strong predictor (r² = 0.71) of lake phosphorus concentrations when using log-log transformed data. Lake TP retention is negatively correlated with lake hydraulic retention time (r² = 0.35). 4. Of the approaches tested, the best fit to observed data was obtained by estimating r as an inverse function of the lake's hydraulic retention time. Although this mass balance approach explained 84% of the variability in log-log transformed data, the prediction error for individual lakes was quite high. 5. Estimating r as the ratio of a putative particle settling velocity to the mean lake depth yielded poorer predictions of lake TP (r² = 0.77) than the approach described above, and in fact did not improve model performance compared with simply assuming that r is a constant for all lakes. 6. Our results also demonstrate that changing the flow-weighted input concentration should always have a directly proportionate impact on lake phosphorus concentrations, provided the type of phosphorus loaded (e.g. dissolved or particulate) does not vary.
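The mass-balance model in point 2 is straightforward to state in code; the sketch below also shows the point-4 variant in which r is an inverse function of the hydraulic retention time. The square-root form, the constant k, and the example inputs are assumptions for illustration, not the fitted values from the study.

```python
def tp_out(tp_in, tau_w, r):
    """Vollenweider mass balance: TP_out = TP_in / (1 + r * tau_w)."""
    return tp_in / (1.0 + r * tau_w)

def r_inverse_tau(tau_w, k=1.0):
    """Point 4: loss-rate constant as an inverse function of retention time.
    The square-root form and k = 1.0 are assumptions for illustration."""
    return k / tau_w ** 0.5

def retention(tp_in, tp_lake):
    """Fraction of the phosphorus load retained by the lake."""
    return 1.0 - tp_lake / tp_in

# Illustrative lake: inflow TP 50 ug/L, hydraulic retention time 2 years.
tp_in, tau_w = 50.0, 2.0
tp_lake = tp_out(tp_in, tau_w, r_inverse_tau(tau_w))
print(f"predicted lake TP = {tp_lake:.1f} ug/L, "
      f"retention = {retention(tp_in, tp_lake):.2f}")
```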
Fine-pitch ball grid array (BGA) and underfills have been used in benign office environments and wireless applications for a number of years; however, their reliability in the automotive underhood environment is not well understood. In this work, the reliability of fine-pitch plastic ball grid array (PBGA) packages has been evaluated in the automotive underhood environment. Experimental studies indicate that the coefficient of thermal expansion (CTE) as measured by a thermomechanical analyzer (TMA) typically starts to change at a temperature 10-15°C lower than the Tg specified by differential scanning calorimetry (DSC), potentially extending the change in CTE well into the accelerated test envelope in the neighborhood of 125°C. High-Tg substrates with glass-transition temperatures much higher than the 125°C high-temperature limit are therefore not subject to the effect of high coefficient of thermal expansion close to the high temperature of the accelerated test. Darveaux's damage relationships were derived on ceramic ball grid array (CBGA) assemblies, with predominantly solder mask defined (SMD) pads and 62Sn36Pb2Ag solder. In addition to significant differences in the crack propagation paths for the two pad constructions, SMD pads fail significantly faster than the non solder mask defined (NSMD) pads in thermal fatigue. The thermal mismatch on CBGAs is much larger than on PBGA assemblies. Crack propagation in CBGAs is often observed predominantly on the package side as opposed to both the package and board side for PBGAs. In the present study, crack propagation data has been acquired on assemblies with 15, 17, and 23 mm plastic BGAs with NSMD pads and 63Sn37Pb solder on high-Tg printed circuit boards. The data has been benchmarked against Darveaux's data on CBGA assemblies. The experimental matrix also encompasses the effect of bis-maleimide triazine (BT) substrate thickness on reliability. Damage constants have been developed and compared against the existing Darveaux constants. Prediction error has been quantified for both sets of constants.
Three methods of nonlinear time series analysis, Lempel–Ziv complexity, prediction error and covariance complexity were employed to distinguish between the electroencephalograms (EEGs) of normal children, children with mild autism, and children with severe autism. Five EEG tracings per cluster of children aged three to seven medically diagnosed with mild, severe and no autism were used in the analysis. A general trend seen was that the EEGs of children with mild autism were significantly different from those with severe or no autism. No significant difference was observed between normal children and children with severe autism. Among the three methods used, the method that was best able to distinguish between EEG tracings of children with mild and severe autism was found to be the prediction error, with a t-Test confidence level of above 98%.
- by Perry Esguerra and +2
- Autism, EEG, Time series analysis, Prediction error
A mechanistic model that predicts nutrient requirements and biological values of feeds for sheep (Cornell Net Carbohydrate and Protein System; CNCPS-S) was expanded to include goats and the name was changed to the Small Ruminant Nutrition System (SRNS). The SRNS uses animal and environmental factors to predict metabolizable energy (ME) and protein, and Ca and P requirements. Requirements for goats in the SRNS are predicted based on the equations developed for CNCPS-S, modified to account for specific requirements of goats, including maintenance, lactation, and pregnancy requirements, and body reserves. Feed biological values are predicted based on carbohydrate and protein fractions and their ruminal fermentation rates, forage, concentrate and liquid passage rates, and microbial growth. The evaluation of the SRNS for sheep using published papers (19 treatment means) indicated no mean bias (MB; 1.1 g/100 g) and low root mean square prediction error (RMSPE; 3.6 g/100 g) when predicting dietary organic matter digestibility for diets not deficient in ruminal nitrogen. The SRNS accurately predicted gains and losses of shrunk body weight (SBW) of adult sheep (15 treatment means; MB = 5.8 g/d and RMSPE = 30 g/d) when diets were not deficient in ruminal nitrogen. The SRNS for sheep had MB varying from -34 to 1 g/d and RMSPE varying from 37 to 56 g/d when predicting average daily gain (ADG) of growing lambs (42 treatment means). The evaluation of the SRNS for goats based on literature data showed accurate predictions for ADG of kids (31 treatment means; RMSPE = 32.5 g/d; r² = 0.85; concordance correlation coefficient, CCC = 0.91), daily ME intake (21 treatment means; RMSPE = 0.24 Mcal/d; r² = 0.99; CCC = 0.99), and energy balance (21 treatment means; RMSPE = 0.20 Mcal/d; r² = 0.87; CCC = 0.90) of goats. In conclusion, the SRNS for sheep can accurately predict dietary organic matter digestibility, ADG of growing lambs
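The evaluation statistics quoted here (mean bias, root mean square prediction error, r², and the concordance correlation coefficient) can be computed as follows; the observed and predicted vectors are illustrative placeholders.

```python
import numpy as np

def evaluation_stats(observed, predicted):
    """Mean bias (predicted - observed), RMSPE, r^2 and Lin's concordance
    correlation coefficient (CCC), using population moments."""
    obs, pred = np.asarray(observed, float), np.asarray(predicted, float)
    mb = float(np.mean(pred - obs))
    rmspe = float(np.sqrt(np.mean((pred - obs) ** 2)))
    r = float(np.corrcoef(obs, pred)[0, 1])
    ccc = (2 * r * obs.std() * pred.std()
           / (obs.var() + pred.var() + (obs.mean() - pred.mean()) ** 2))
    return {"MB": mb, "RMSPE": rmspe, "r2": r ** 2, "CCC": float(ccc)}

# Illustrative treatment means (e.g. observed vs model-predicted ADG, g/d).
observed = [120, 150, 180, 210, 240, 260]
predicted = [115, 160, 175, 220, 230, 255]
print(evaluation_stats(observed, predicted))
```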
Low cost gas multisensor devices can represent an efficient solution for densifying the sparse urban air pollution monitoring mesh. In a previous work, we proposed and evaluated the calibration of such a device using short term on-field recorded data for benzene pollution quantification. In this work, we present and discuss the results obtained for CO, NO2 and total NOx pollutant concentration estimation with the same set-up. A conventional air pollution monitoring station is used to provide reference data.
Classification and regression trees are prediction models constructed by recursively partitioning a data set and fitting a simple model to each partition. Their name derives from the usual practice of describing the partitioning process by a decision tree. This article reviews some widely available algorithms and compares their capabilities, strengths and weaknesses in two examples.
SUMMARY In the present study we evaluate the performance of different machine learning methods to predict the unreported data on Caretta caretta by-catch by the Uruguayan longline fishery in the Southwestern Atlantic Ocean. The methods evaluated were Classification And Regression Trees, Random Forest, CForest and Support Vector Machines, and the model with the lowest predictive error rate was selected. We used on-board observer data to predict logbook-unreported loggerhead by-catch during 1998 to 2007 using different explanatory variables. Random Forest and CForest were the methods selected because they presented the lowest predictive error rates. The Random Forest approach predicted a total capture of 13 065 loggerhead turtles during the study period and CForest 12 892. We also evaluated the variable importance in the prediction for both methods. The year, type of fishing gear and month are the most important variables in the by-catch of loggerhead sea turtles. Machine Learning methods appea...
Wind energy is having an increasing influence on the energy supply in many countries, but in contrast to conventional power plants, it is a fluctuating energy source. For its integration into the electricity supply structure, it is necessary to predict the wind power hours or days ahead. There are models based on physical, statistical and artificial intelligence approaches for the prediction of wind power. This paper introduces a new short-term prediction method based on the application of evolutionary optimization algorithms for the automated specification of two well-known time series prediction models, i.e., neural networks and the nearest neighbour search. Two optimization algorithms are applied and compared, namely particle swarm optimization and differential evolution. To predict the power output of a certain wind farm, this method uses predicted weather data and historic power data of that wind farm, as well as historic power data of other wind farms far from the location of the wind farm considered. Using these optimization algorithms, we get a reduction of the prediction error compared to the model based on neural networks with standard manually selected variables. An additional reduction in error can be obtained by using the mean model output of the neural network model and of the nearest neighbour search based prediction approach.
THE SURPRISING HISTORY OF THE "HRmax=220-age" EQUATION. Robert A. Robergs, Roberto Landwehr. JEPonline. 2002;5(2):1-10. The estimation of maximal heart rate (HRmax) has been a feature of exercise physiology and related applied sciences since the late 1930s. The estimation of HRmax has been largely based on the formula HRmax = 220 - age. This equation is often presented in textbooks without explanation or citation to original research. In addition, the formula and related concepts are included in most certification exams within sports medicine, exercise physiology, and fitness. Despite the acceptance of this formula, research spanning more than two decades reveals the large error inherent in the estimation of HRmax (Sxy = 7-11 b/min). Ironically, inquiry into the history of this formula reveals that it was not developed from original research, but resulted from observation based on data from approximately 11 references consisting of published research or unpublished scientific compilations. Consequently, the formula HRmax = 220 - age has no scientific merit for use in exercise physiology and related fields. A brief review of alternate HRmax prediction formulae reveals that the majority of age-based univariate prediction equations also have large prediction errors (>10 b/min). Clearly, more research on HRmax needs to be done using a multivariate model, and equations may need to be developed that are population (fitness, health status, age, exercise mode) specific.
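A small sketch contrasting the classical formula and its quoted error band with one widely cited age-based alternative (Tanaka et al.'s HRmax = 208 - 0.7 × age); the ±10 b/min band simply visualises the Sxy range reported above.

```python
def hrmax_classic(age):
    """HRmax = 220 - age (the formula whose history the paper reviews)."""
    return 220 - age

def hrmax_tanaka(age):
    """One widely cited age-based alternative: HRmax = 208 - 0.7 * age."""
    return 208 - 0.7 * age

sxy = 10  # approximate standard error of estimate quoted above (7-11 b/min)

for age in (20, 40, 60):
    classic = hrmax_classic(age)
    print(f"age {age}: 220-age = {classic} "
          f"(roughly {classic - sxy}-{classic + sxy} b/min), "
          f"Tanaka = {hrmax_tanaka(age):.0f}")
```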
Empirical QSAR models are only valid in the domain in which they were trained and validated. Application of the model to substances outside the domain of the model can lead to grossly erroneous predictions. Partial least squares (PLS) regression provides tools for prediction diagnostics that can be used to decide whether or not a substance is within the model domain, i.e. whether the model prediction can be trusted. QSAR models for four different environmental end-points are used to demonstrate the importance of appropriate training set selection and how the reliability of QSAR predictions can be increased by outlier diagnostics. All models showed consistent results; test set prediction errors were very similar in magnitude to training set estimation errors when prediction outlier diagnostics were used to detect and remove outliers in the prediction data. Test set prediction errors for substances classified as outliers were much larger. The difference in the number of outliers between models with a randomly and a systematically selected training set illustrates well the need for representative training data.
This paper aims to offer an account of affective experiences within Predictive Processing, a novel framework that considers the brain to be a dynamical, hierarchical, Bayesian hypothesis-testing mechanism. We begin by outlining a set of common features of affective experiences (or feelings) that a PP-theory should aim to explain: feelings are conscious, they have valence, they motivate behaviour, and they are intentional states with particular and formal objects. We then review existing theories of affective experiences within Predictive Processing and delineate two families of theories: Interoceptive Inference Theories (which state that feelings are determined by interoceptive predictions) and Error Dynamics Theories (which state that feelings are determined by properties of error dynamics). We highlight the strengths and shortcomings of each family of theories and develop a synthesis: the Affective Inference Theory. Affective Inference Theory claims that valence corresponds to the expected rate of prediction error reduction. In turn, the particular object of a feeling is the object predicted to be the most likely cause of expected changes in prediction error rate, and the formal object of a feeling is a predictive model of the expected changes in prediction error rate caused by a given particular object. Finally, our theory shows how affective experiences bias action selection, directing the organism towards allostasis and towards optimal levels of uncertainty in order to minimise prediction error over time.
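A minimal sketch of the central claim: valence is read off from the expected rate at which prediction error is falling, here approximated by a smoothed negative slope of a toy prediction-error trace. All signals and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)

# Toy prediction-error trace: error falls while learning goes well,
# then rises when the environment changes.
pe = np.concatenate([
    np.linspace(1.0, 0.2, 100),       # error being reduced -> positive valence
    np.linspace(0.2, 0.9, 50),        # error increasing     -> negative valence
]) + 0.02 * rng.normal(size=150)

def valence(pe_trace, window=10):
    """Affective Inference sketch: valence ~ expected rate of prediction-error
    reduction, approximated by a moving-average negative slope."""
    slope = np.gradient(pe_trace)
    kernel = np.ones(window) / window
    expected_slope = np.convolve(slope, kernel, mode="same")
    return -expected_slope            # falling error -> positive valence

v = valence(pe)
print("mean valence while improving:", float(v[:100].mean()))
print("mean valence while worsening:", float(v[100:].mean()))
```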
- by Slawa Loev and +1
- Emotion, Dynamical Systems, Prediction, Affect/Emotion