Bayesian Modeling Research Papers - Academia.edu
Remote Sensing images exhibit an enormous amount of information. In order to extract this information in a robust way and to make it available as efficient indices for query by image content, we present a scheme of hierarchical stochastic description. The different levels in this hierarchy are derived from the different levels of abstraction: image data (0), image features (1), meta features (2), image classification (3), geometric features (4), and user-specific semantics (5). We describe this hierarchical scheme and the processes of Bayesian inference between these levels and present a case study using synthetic aperture radar (SAR) data.
In an ecosystem, there is a need to establish the quantity and quality of resources and their suitability for a certain range of land uses in order to assure future productivity and the sustainability of biodiversity. Parametric methods are widely used for land suitability evaluation. A new parametric equation for land suitability evaluation has been proposed to improve evaluation results. A land suitability assessment for wheat production was conducted in order to compare the results of the suggested method with those of classical parametric methods. Organic matter, CaCO3, pH, slope, texture, drainage, depth, EC and altitude were recognized as factors affecting land suitability for wheat production in the study area. A comparison of the three parametric methods showed that the proposed equation gave higher suitability index values than the classical methods. Strong correlation was found between the results of the three methods. Organic matter, topography and pH were found to be the limiting factors for wheat production in the study area. Overall, the proposed equation may improve the land suitability assessment process and give more realistic results.
Linear regression models where the response variable is censored are often considered in statistical analysis. A parametric relationship between the response variable and covariates and normality of random errors are assumptions typically considered in modeling censored responses. In this context, the aim of this paper is to extend the normal censored regression model by considering, on the one hand, that the response variable is linearly dependent on some covariates whereas its relation to other variables is characterized by nonparametric functions, and, on the other hand, that the error terms of the regression model belong to a class of symmetric heavy-tailed distributions capable of accommodating outliers and/or influential observations better than the normal distribution. We achieve a fully Bayesian inference using pth-degree spline smooth functions to approximate the nonparametric functions. The likelihood function is utilized not only to compute some Bayesian model selection measures but also to develop Bayesian case-deletion influence diagnostics based on the q-divergence measures. The newly developed procedures are illustrated with an application and simulated data.
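As a concrete illustration of the likelihood such a model is built on, the sketch below computes the log-likelihood of left-censored responses under Student-t errors, one member of the symmetric heavy-tailed class the paper considers. All data and parameter values are hypothetical, and the paper's full model additionally includes spline terms for the nonparametric components, which are omitted here.

```python
import numpy as np
from scipy import stats

def censored_t_loglik(y, mu, sigma, nu, cens_point, is_censored):
    """Log-likelihood for left-censored responses with Student-t errors.

    Uncensored observations contribute a (scaled) density term;
    censored observations contribute the CDF mass below the
    censoring point. mu would come from the regression predictor
    in the full model; here it is a single hypothetical value.
    """
    z = (y - mu) / sigma
    dens = stats.t.logpdf(z, df=nu) - np.log(sigma)
    cens = stats.t.logcdf((cens_point - mu) / sigma, df=nu)
    return np.where(is_censored, cens, dens).sum()

# Hypothetical data: responses below 0 are recorded as censored at 0.
y = np.array([0.0, 1.3, 0.0, 2.1, 0.7])
is_cens = np.array([True, False, True, False, False])
ll = censored_t_loglik(y, mu=1.0, sigma=1.0, nu=4.0,
                       cens_point=0.0, is_censored=is_cens)
```

Dropping nu (the degrees of freedom) toward small values fattens the tails, which is how such models downweight outliers relative to the normal model.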
Improving road safety through proper pavement maintenance is one of the goals of pavement management. Many studies have found that pavement conditions significantly influence traffic safety. Although several studies have explored the relationship between pavement conditions and crash occurrence, the effect of poor pavement conditions on crash severity levels has not been investigated, especially by using a discrete model that can handle ordered data. This paper focuses on the development of the relationship between poor pavement conditions and crash severity levels using a series of Bayesian ordered logistic models for low/medium/high speed roads and single/multiple collision cases. The Bayesian ordered logistic regression models indicated that the poor pavement condition decreases the severity of single-vehicle collisions on low-speed roads whereas it increases their severity on high-speed roads. On the other hand, the poor pavement condition increases the severity of multiple-vehic...
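The ordered logistic machinery behind such models can be sketched as follows: a latent severity score (a linear predictor built from covariates such as pavement condition and speed limit) is compared against ordered cutpoints, and the logistic CDF turns the comparison into category probabilities. The coefficients and cutpoints below are illustrative assumptions, not estimates from the paper.

```python
import numpy as np

def ordered_logit_probs(eta, cutpoints):
    """Category probabilities for an ordered logistic model.

    eta: latent linear predictor for one crash.
    cutpoints: increasing thresholds c_1 < ... < c_{K-1}.
    Returns a length-K vector of probabilities over the K
    ordered severity levels.
    """
    c = np.asarray(cutpoints, dtype=float)
    # Cumulative probabilities P(Y <= k) via the logistic CDF.
    cum = 1.0 / (1.0 + np.exp(-(c - eta)))
    cum = np.concatenate(([0.0], cum, [1.0]))
    return np.diff(cum)

# Hypothetical example with three severity levels: a higher eta
# (e.g. poor pavement on a high-speed road) shifts mass upward.
probs_good = ordered_logit_probs(eta=-0.5, cutpoints=[0.0, 2.0])
probs_poor = ordered_logit_probs(eta=0.8, cutpoints=[0.0, 2.0])
```

In a Bayesian fit, priors are placed on the coefficients and cutpoints and the posterior is explored by MCMC; the function above is just the likelihood's category-probability step.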
Visual servoing, or the control of motion on the basis of image analysis in a closed loop, is more and more recognized as an important tool in modern robotics. Here, we present a new model-driven approach to derive a description of the motion of a target object. This method can be subdivided into an illumination-invariant target detection stage and a servoing process which uses an adaptive Kalman filter to update the model of the non-linear system. This technique can be applied to any pan-tilt-zoom camera mounted on a mobile vehicle as well as to a static camera tracking moving environmental features.
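A minimal linear Kalman filter predict/update cycle, of the kind the servoing process builds on (the paper's adaptive filter for the non-linear system is more elaborate), might look like the sketch below; the target dynamics and noise levels are assumed for illustration.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a linear Kalman filter.

    x, P: prior state estimate and covariance
    z:    new measurement (e.g. target position from image analysis)
    F, H: state-transition and observation matrices
    Q, R: process and measurement noise covariances
    """
    # Predict
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new

# Hypothetical 1-D constant-velocity target: state = [position, velocity].
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
Q = 0.01 * np.eye(2)
R = np.array([[0.25]])
x, P = np.zeros(2), np.eye(2)
for z in [1.0, 2.1, 2.9, 4.2]:  # noisy positions drifting ~1 unit/step
    x, P = kalman_step(x, P, np.array([z]), F, H, Q, R)
```

An adaptive variant would additionally re-estimate Q and R (or relinearize F and H) online as the innovation statistics change.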
In this paper we use Bayesian modeling of site-based and regional 14C calendar chronologies to provide calendar age estimates for events related to the transition between the Neolithic and the Eneolithic period in the late 5th millennium cal BC in Slovenia and Croatia. We discuss the limitations of the established relative chronologies, their underlying assumptions and explanations. We suggest a more complex temporality of change and continuity and present some implications of our results for the calendar chronology of Southeastern Europe.
- by Marko Sraka
- Neolithic, Eneolithic, Croatia, Slovenia
Human intentional communication is marked by its flexibility and context sensitivity. Hypothesized brain mechanisms can provide convincing and complete explanations of the human capacity for intentional communication only insofar as they can match the computational power required for displaying that capacity. It is thus of importance for cognitive neuroscience to know how computationally complex intentional communication actually is. Though the subject of considerable debate, the computational complexity of communication remains so far unknown. In this paper we defend the position that the computational complexity of communication is not a constant, as some views of communication seem to hold, but rather a function of situational factors. We present a methodology for studying and characterizing the computational complexity of communication under different situational constraints. We illustrate our methodology for a model of the problems solved by receivers and senders during a communicative exchange. This approach opens the way to a principled identification of putative model parameters that control cognitive processes supporting intentional communication.
This doctoral dissertation was written between 2011 and 2016, during my employment at the Department of Archaeology under the Young Researchers programme, funded by the Slovenian Research Agency within the Ministry of Education, Science, Culture and Sport of the Republic of Slovenia. Also encouraging was the work with the project group within the research programme "Arheologija (P6-0247-058)" ("Archaeology") and especially the research project "Arheologije lovcev, poljedelcev in metalurgov: kulture, populacije, paleogospodarstva in okolje (J6-4085-0581)" ("Archaeologies of hunters, farmers and metallurgists: cultures, populations, palaeoeconomies and environment").
We propose a new reconstruction procedure for X-ray computed tomography (CT) based on Bayesian modeling. We utilize the knowledge that the human body is composed of only a limited number of materials whose CT values are roughly known in advance. Although the exact Bayesian inference of our model is intractable, we propose an efficient algorithm based on the variational Bayes technique. Experiments show that the proposed method performs better than the existing methods in severe situations where samples are limited or metal is inserted into the body.
Previous psychophysical experiments have demonstrated that various factors can exert a considerable influence on the apparent velocity of visual stimuli. Here, we investigated the effects of superimposing static luminance texture on the apparent speed of a drifting grating. In Experiment 1, we demonstrate that superimposing static luminance texture on a drifting luminance modulated grating can produce an increase in perceived speed. This supports the hypothesis that texture changes perceived speed by providing landmarks to assess relative motion. In Experiment 2, we showed that contrary to static luminance texture, dynamic luminance texture did not increase perceived speed. This demonstrates that texture must provide reliable spatial landmarks in order to generate an increase in perceived speed. The results of Experiment 3 demonstrate that perceived speed depends on the size of the area covered by texture. This suggests that luminance texture and the motion stimulus interacted with each other over a limited spatial scale and that these local responses are then pooled to determine the speed of the motion stimulus. In Experiment 4, we showed that static texture contrast could produce a greater effect than motion stimulus contrast on perceived speed and that these effects could still be observed at brief presentation times. We discuss these findings in the context of models proposed to account for phenomena in the perception of speed.
With the broad reach of the Internet, online users frequently resort to various word-of-mouth (WOM) sources, such as online user reviews and professional reviews, during online decision making. Although prior studies generally agree on the importance of online WOM, we have little knowledge of the interplay between online user reviews and professional reviews. This paper empirically investigates a mediation model in which online user reviews mediate the impact of professional reviews on online user decisions. Using software download data, we show that a higher professional rating not only directly promotes software downloads but also results in more active user-generated WOM interactions, which indirectly lead to more downloads. The indirect impact of professional reviews can be as large as 20% of the corresponding total impact. These findings deepen our understanding of the online WOM effect, and provide managerial suggestions about WOM marketing and the prediction of online user choices.
- by wenqi zhou and +1
- Word of Mouth, Mediation Models, Bayesian Modeling, Online User Reviews
Language comprehension is expectation-based (e.g. Venhuizen et al. 2019). Statistical regularities in the linguistic input set up expectations that are utilized during incremental interpretation. A central part of language comprehension involves assigning grammatical functions (GFs) to NPs, thereby determining how participants are related to events or states. In many languages, speakers have many ways to encode GFs morphosyntactically (e.g. word order, case), and their encoding preferences depend on an interplay between NP properties (e.g., animacy) and verb semantic properties (e.g., volitionality) (Hörberg 2016). This creates complex statistical patterns in the distribution of these GF information types that can be utilized during on-line GF processing. I will present evidence indicating that GF assignment in transitive sentences in written Swedish is expectation-based, drawing upon such statistical patterns. I present a corpus-based probabilistic model of incremental GF assignmen...
Comparing the inductive biases of simple neural networks and Bayesian models. Thomas L. Griffiths, Joseph L. Austerweil, Vincent G. Berthiaume. Department of Psychology, University of California, Berkeley, CA 94720 USA. Abstract: Understanding the relationship between connectionist and probabilistic models is important for evaluating the compatibility of these approaches. We use mathematical analyses and computer simulations to show that a linear neural network can approximate the generalization performance of a probabilistic model of property induction, and that training this network by gradient descent with early stopping results in similar performance to Bayesian inference with a particular prior. However, this prior differs from distributions defined using discrete structure, suggesting that neural networks have inductive biases that can be differentiated from probabilistic models with stru...
Groundwater flow and mass transport predictions are always subject to uncertainty due to the scarcity of data with which models are built. Only a few measurements of aquifer parameters, such as hydraulic conductivity or porosity, are used to construct a model, and a few measurements on the aquifer state, such as piezometric heads or solute concentrations, are employed to verify/calibrate the goodness of the model. Yet, at unsampled locations, neither the parameter values nor the aquifer state can be predicted (in space and/or time) without uncertainty. We demonstrate the applicability of a new blocking Markov chain Monte Carlo (BMcMC) algorithm for uncertainty assessment using, as a reference, a synthetic aquifer in which all parameter values and state variables are known. We also analyze the worth of different types of data for the characterization of the aquifer and for reduction of uncertainty in parameters and variables. The BMcMC method allows the generation of multiple plausible representations of the aquifer parameters, and their corresponding aquifer state, honoring all available information on both parameters and state variables. The realizations are also coherent with an a priori statistical model for the spatial variability of the aquifer parameters. BMcMC is capable of direct conditioning (on model parameter data) and inverse conditioning (on state variable data). We demonstrate the flexibility of BMcMC to inverse condition on piezometric head data as well as on travel time data, which permits identification of the impact that each data type has on the uncertainty about hydraulic conductivity, piezometric head, and travel time.
While many constraints on learning must be relatively experience-independent, past experience provides a rich source of guidance for subsequent learning. Discovering structure in some domain can inform a learner's future hypotheses about that domain. If a general property accounts for particular sub-patterns, a rational learner should not stipulate separate explanations for each detail without additional evidence, as the general structure has "explained away" the original evidence. In a grammar-learning experiment using tone sequences, manipulating learners' prior exposure to a tone environment affects their sensitivity to the grammar-defining feature, in this case consecutive repeated tones. Grammar-learning performance is worse if context melodies are "smooth" (small intervals occur more often than large ones), as Smoothness is a general property accounting for a high rate of repetition. We present an idealized Bayesian model as a "best case" benchmark for learning repetition grammars. When context melodies are Smooth, the model places greater weight on the small-interval constraint, and does not learn the repetition rule as well as when context melodies are not Smooth, paralleling the human learners. These findings support an account of abstract grammar-induction in which learners rationally assess the statistical evidence for underlying structure based on a generative model of the environment.
Whereas breathalysers have been shown to provide blood alcohol concentration (BAC) measurements comparable to those obtained by gas chromatography, such evidence has not been reported in low and middle income countries where measures for preventing alcohol-related injuries are virtually non-existent. Before promoting any method of blood alcohol evaluation, as a routine procedure for monitoring the association of alcohol with different types of injuries in Kenya, we sought to assess the reliability and validity of blood alcohol results obtained by a breathalyser, using gas chromatography analysis values as the reference, in a sample of 179 trauma-affected adults presenting to casualty departments. No differences in proportions of subjects with high levels of blood alcohol (equal to or greater than 50 mg%) were detected by breath and blood test procedures (58.7 vs 60.3%). Breathalyser readings yielded high levels of sensitivity and specificity (97.2 and 100%, respectively) with optimal positive and negative predictive values (100 and 95.9%, respectively) at higher BACs (≥ 50 mg%). The study thus reaffirms that breathalyser tests are of value in detecting high blood alcohol levels and can be used to rapidly identify intoxicated subjects. The procedure is easy to perform and can be used for monitoring the association between blood alcohol level and driving in low-income developing countries.
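The reported accuracy measures follow from a 2x2 comparison of the breathalyser against the gas chromatography reference. The sketch below computes them from hypothetical counts chosen only to reproduce the reported percentages; the study's actual cross-tabulation is not given in the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp/fp/fn/tn: true/false positives/negatives of the index test
    (breathalyser) against the reference standard (gas chromatography).
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives correctly detected
        "specificity": tn / (tn + fp),  # negatives correctly detected
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }

# Hypothetical counts (not taken from the study): 105 true positives,
# 3 missed cases, 71 true negatives, no false positives.
m = diagnostic_metrics(tp=105, fp=0, fn=3, tn=71)
```

Note that PPV and NPV, unlike sensitivity and specificity, shift with the prevalence of high BAC in the tested sample, which is why the abstract reports them "at higher BACs".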
Most agricultural models do not adequately represent real-life development decisions, not least because they fail to consider the impact of the full range of biophysical, socioeconomic, political and cultural factors that affect decision outcomes. Many modelling exercises restrict their scope to system aspects that can be characterised with precision, but this can lead to biased recommendations. For instance, only considering annual crop yields while neglecting ecosystem services provided by trees systematically undervalues agroforestry systems. Similarly, crop models that only consider abiotic factors but leave out pests, weeds and diseases may favour crop varieties that fit poorly in smallholder farming systems. To produce more holistic assessments that respond to the needs of development decision-makers, agricultural modelling needs new strategies that enhance its ability to deal with real-world complexities and allow capturing system aspects that defy precise quantification. Decision analysis, a decision-support approach from the private sector, aims to make recommendations for specific decisions based on currently available knowledge. To gain a holistic perspective, it starts with an assessment, often involving decision-makers, stakeholders and experts, of all decision-relevant aspects and their interconnections. Results from this assessment are translated into causal decision models, in which all factors are considered in quantitative terms, represented using probability distributions. For each input variable, all available sources of information, including hard data and expert opinion, are used to construct the distributions. Simulations produce probability distributions expressing the range of plausible decision outcomes. These outputs are often sufficient for identifying preferable decision options.
If not, tools such as Value of Information analysis are used to highlight critical knowledge gaps where further information is needed to reduce uncertainty and clarify the best decision alternative. Decision analysis approaches are new to agricultural research for development, but several successful applications across Africa, e.g. forecasting the impacts of agricultural interventions in Kenya or prioritising among strategies for reservoir protection in Burkina Faso, have underscored their potential. Experiences so far indicate that decision analysis could emerge as a new paradigm for holistic, decision-focused agricultural modelling.
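The core of such a causal decision model can be sketched as a Monte Carlo simulation: each uncertain input is a probability distribution, and repeated sampling yields a distribution of plausible outcomes. All distributions and numbers below are illustrative assumptions, not values from any real assessment.

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_decision(n=10_000):
    """Monte Carlo sketch of a causal decision model for a hypothetical
    agricultural intervention. Every input is a distribution encoding
    data and expert opinion; the output is a distribution of net
    benefits per hectare.
    """
    yield_gain = rng.normal(0.5, 0.2, n)                 # t/ha, expert estimate
    price = rng.lognormal(mean=5.0, sigma=0.3, size=n)   # currency/t
    cost = rng.uniform(50, 150, n)                       # intervention cost/ha
    return yield_gain * price - cost                     # net benefit/ha

outcomes = simulate_decision()
prob_positive = (outcomes > 0).mean()  # chance the intervention pays off
```

Value of Information analysis would then ask which input distribution, if narrowed by further measurement, would most change the preferred decision.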
- by Cory Whitney and +1
- Decision Analysis, Bayesian Modeling
Epidemic data often possess certain characteristics, such as the presence of many zeros, the spatial nature of the disease spread mechanism, environmental noise, serial correlation and dependence on time-varying factors. This paper addresses these issues via suitable Bayesian modelling. In doing so, we utilise a general class of stochastic regression models appropriate for spatio-temporal count data with an excess number of zeros. The developed regression framework incorporates serial correlation and time-varying covariates through an Ornstein-Uhlenbeck process formulation. In addition, we explore the effect of different priors, including default options and variations of mixtures of g-priors. The effect of different distance kernels for the epidemic model component is investigated. We proceed by developing branching process-based methods for testing scenarios for disease control, thus linking traditional epidemiological models with stochastic epidemic processes, useful in policy-focused decision making. The approach is illustrated with an application to a sheep pox dataset from the Evros region, Greece.
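The serial correlation in such a framework enters through an Ornstein-Uhlenbeck process. A minimal exact-discretization simulator shows the mean-reverting, serially correlated behaviour; the parameter values are illustrative, not taken from the paper.

```python
import numpy as np

def simulate_ou(theta, mu, sigma, x0, dt, n_steps, rng):
    """Exact discretization of an Ornstein-Uhlenbeck process
    dX_t = theta*(mu - X_t) dt + sigma dW_t.

    The process mean-reverts to mu, and consecutive steps have
    autocorrelation exp(-theta*dt), which is how serial correlation
    can be induced in a latent regression term.
    """
    x = np.empty(n_steps + 1)
    x[0] = x0
    a = np.exp(-theta * dt)                       # one-step autocorrelation
    sd = sigma * np.sqrt((1 - a**2) / (2 * theta))  # conditional std. dev.
    for t in range(n_steps):
        x[t + 1] = mu + a * (x[t] - mu) + sd * rng.standard_normal()
    return x

rng = np.random.default_rng(0)
path = simulate_ou(theta=0.5, mu=0.0, sigma=1.0, x0=5.0, dt=0.1,
                   n_steps=500, rng=rng)
```

Starting far from mu (here at 5.0), the path decays toward the long-run mean while fluctuating with stationary variance sigma^2/(2*theta).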
We present a Bayesian variable selection procedure that is applicable to genomewide studies involving a combination of clinical, gene expression and genotype information. We use the Mode Oriented Stochastic Search (MOSS) algorithm of Dobra and Massam (2010) to explore regions of high posterior probability for regression models involving discrete covariates and to perform hierarchical log-linear model search to identify the most relevant associations among the resulting subsets of regressors. We illustrate our methodology with simulated data, expression data and SNP data.
How children go about learning the general regularities that govern language, as well as keeping track of the exceptions to them, remains one of the challenging open questions in the cognitive science of language. Computational modeling is an important methodology in research aimed at addressing this issue. We must determine appropriate learning mechanisms that can grasp generalizations from examples of specific usages, and that exhibit patterns of behavior over the course of learning similar to those in children. Early learning of verb argument structure is an area of language acquisition that provides an interesting testbed for such approaches due to the complexity of verb usages. A range of linguistic factors interact in determining the felicitous use of a verb in various constructions: associations between syntactic forms and properties of meaning that form the basis for a number of linguistic and psycholinguistic theories of language. This article presents a computational model for the representation, acquisition, and use of verbs and constructions. The Bayesian framework is founded on a novel view of constructions as a probabilistic association between syntactic and semantic features. The computational experiments reported here demonstrate the feasibility of learning general constructions, and their exceptions, from individual usages of verbs. The behavior of the model over the time course of acquisition mimics, in relevant aspects, the stages of learning exhibited by children. Therefore, this proposal sheds light on the possible mechanisms at work in forming linguistic generalizations and maintaining knowledge of exceptions.
We compared the performance of tuberculin skin test (TST), Quantiferon-TB Gold in-tube (QFT-GIT), and T-SPOT.TB in diagnosing latent tuberculosis (LTBI) among childhood TB contacts in a TB endemic setting with high BCG coverage. We evaluated the performance of interferon gamma release assays (IGRAs) and TST when combined in an algorithm. Childhood contacts of newly diagnosed TB patients were tested with TST, QFT-GIT, and T-SPOT. The level of exposure in contacts was categorized according to whether they slept in the same room, same house, or a different house as the index case. For the evaluation of combined test performance, prior estimates for prevalence of latent TB were used in Bayesian models that assumed conditional dependence between tests. A total of 285 children were recruited. Overall, 26.5%, 33.0%, and 33.5% were positive for TST, T-SPOT, or QFT-GIT, respectively. All 3 tests responded to the gradient of sleeping proximity to the index case. Neither TST nor IGRA results were confounded by BCG vaccination. There was moderate agreement (kappa = 0.40-0.68) between all 3 tests. Combination of either IGRA with TST increased sensitivity (by 9.3%-9.6%) especially in contacts in the highest exposure category but was associated with loss of specificity (9.9%-11.3%). IGRAs and TST are similar in their diagnostic performance for LTBI. An approximate 10% sensitivity benefit for using the TST and an IGRA in combination is associated with a slightly greater specificity loss. Testing strategies combining an IGRA and TST with an "or" statement may be useful only in situations where there is a high pretest probability of latent infection.
A large number of crashes occur on curves even though they account for only a small percentage of a system's mileage. Excessive speed has been identified as a primary factor in both lane departure and curve-related crashes. A number of countermeasures have been proposed to reduce driver speeds on curves, which ideally result in successful curve negotiation and fewer crashes. Dynamic speed feedback sign (DSFS) systems are traffic control devices that have been used to reduce vehicle speeds successfully and, subsequently, crashes in applications such as traffic calming on urban roads. DSFS systems show promise, but they have not been fully evaluated for rural curves. To better understand the effectiveness of DSFS systems in reducing crashes on curves, a national field evaluation of DSFS systems on curves on rural two-lane roadways was conducted. Two different DSFS systems were selected and placed at 22 sites in seven states. Control sites were also identified. A full Bayes modeling methodology was utilized to develop crash modification factors (CMFs) for several scenarios including total crashes for both directions, total crashes in the direction of the sign, total single-vehicle crashes, and single-vehicle crashes in the direction of the sign. Using quarterly crash frequency as the response variable, crash modification factors were developed and results showed that crashes were 5% to 7% lower after installation of the signs, depending on the model.
The current USEPA cancer risk assessment for dichloromethane (DCM) is based on deterministic physiologically based pharmacokinetic (PBPK) modeling involving comparative metabolism of DCM by the GST pathway in the lung and liver of humans and mice. Recent advances ...
This article describes an extension of classical χ² goodness-of-fit tests to Bayesian model assessment. The extension, which essentially involves evaluating Pearson's goodness-of-fit statistic at a parameter value drawn from its posterior distribution, has the important property that it is asymptotically distributed as a χ² random variable on K − 1 degrees of freedom, independently of the dimension of the underlying parameter vector. By examining the posterior distribution of this statistic, global goodness-of-fit diagnostics are obtained. Advantages of these diagnostics include ease of interpretation, computational convenience and favorable power properties. The proposed diagnostics can be used to assess the adequacy of a broad class of Bayesian models, essentially requiring only a finite-dimensional parameter vector and conditionally independent observations.
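The statistic described here can be sketched numerically. The snippet below is an illustration under a deliberately simple model (normal data with known variance and a flat prior, so the posterior is available in closed form); the choice of K = 5 equal-probability bins, the sample size, and the data-generating values are ours, not the article's:

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(0)
n, K = 200, 5
y = rng.normal(1.0, 1.0, size=n)  # data; working model: N(mu, 1), flat prior on mu

# Under this conjugate setup the posterior for mu is N(ybar, 1/n).
post_mu = rng.normal(y.mean(), 1 / np.sqrt(n), size=1000)

def pearson_stat(y, mu, K):
    # K equal-probability bins under the model, evaluated at a posterior draw of mu.
    qs = [NormalDist(mu, 1.0).inv_cdf(j / K) for j in range(1, K)]
    edges = np.concatenate(([-np.inf], qs, [np.inf]))
    counts, _ = np.histogram(y, bins=edges)
    expected = len(y) / K
    return float(np.sum((counts - expected) ** 2 / expected))

draws = np.array([pearson_stat(y, mu, K) for mu in post_mu])
# Asymptotically chi-squared on K - 1 = 4 df regardless of parameter dimension;
# 9.488 is the 95th percentile of chi2(4), so exceedance should be rare here.
print(np.mean(draws > 9.488))
```

Examining the whole posterior distribution of `draws`, rather than a single value, is what yields the global diagnostic the abstract describes.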
Zero-inflated versions of standard distributions for count data are often required in order to account for excess zeros when modeling the abundance of organisms. Such distributions typically have as parameters k, the mean of the count distribution, and p, the probability of an excess zero. Implementations of zero-inflated models in ecology typically model k using a set of predictor variables, and p is fit either as a constant or with its own separate model. Neither of these approaches makes use of any relationship that might exist between p and k. However, for many species, the rate of occupancy is closely and positively related to its average abundance. Here, this relationship was incorporated into the model for zero inflation by functionally linking p to k, and was demonstrated in a study of snapper (Pagrus auratus) in and around a marine reserve. This approach has several potential practical advantages, including better computational performance and more straightforward model interpretation. It is concluded that, where appropriate, directly linking p to k can produce more ecologically accurate and parsimonious statistical models of species abundance data.
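The functional link between p and k can be written down concretely. Below is a hedged sketch of a zero-inflated Poisson log-likelihood in which logit(p) is a linear function of log(lam); the parameter names (a, b), the negative slope (occupancy rising with abundance implies excess zeros falling with the mean), and the single shared mean are illustrative choices, and a real model would also regress the mean on predictor variables:

```python
import numpy as np

def zip_loglik(y, lam, a, b):
    """Zero-inflated Poisson log-likelihood where the excess-zero
    probability p is functionally linked to the mean: logit(p) = a + b*log(lam)."""
    p = 1.0 / (1.0 + np.exp(-(a + b * np.log(lam))))
    # log Poisson pmf without scipy: y*log(lam) - lam - log(y!)
    log_fact = np.array([np.sum(np.log(np.arange(1, k + 1))) for k in y])
    logpois = y * np.log(lam) - lam - log_fact
    ll = np.where(y == 0,
                  np.log(p + (1 - p) * np.exp(-lam)),  # structural or sampling zero
                  np.log(1 - p) + logpois)             # positive count
    return float(ll.sum())

y = np.array([0, 0, 0, 1, 2, 0, 3, 0, 1, 5])  # hypothetical abundance counts
print(zip_loglik(y, lam=1.2, a=-0.5, b=-1.0))
```

Because p is now a deterministic function of lam, the model has fewer free parameters than fitting p separately, which is the source of the parsimony and computational benefits the abstract mentions.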
Ensuring adequate use of the computing resources for highly fluctuating availability in multi-user computational environments requires effective prediction models, which play a key role in achieving application performance for large-scale distributed applications. Predicting the processor availability for scheduling a new process or task in a distributed environment is a basic problem that arises in many important contexts. The present paper aims at developing a model for single-step-ahead CPU load prediction that can be used to predict the future CPU load in a dynamic environment. Our prediction model is based on the control of multiple Local Adaptive Network-based Fuzzy Inference System Predictors (LAPs) via Naïve Bayesian network inference between cluster states of CPU load time points obtained by the C-means clustering process. Experimental results show that our model performs better and has less overhead than other approaches reported in the literature.
The concept of Quality by Design (QbD) as published in ICH-Q8 is currently one of the most recurrent topics in the pharmaceutical literature. This guideline recommends the use of information and prior knowledge gathered during pharmaceutical development studies to provide a scientific rationale for the manufacturing process of a product and a guarantee of future quality. This poses several challenges from a statistical standpoint and requires a shift in paradigm from traditional statistical practices. First, to provide "assurance of quality" of future lots implies the need to make predictions regarding quality given past evidence and data. Second, the Quality Attributes described in the Q8 guidelines are not always a set of unique, independent measurements. In many cases, these criteria are complicated longitudinal data with successive acceptance criteria over a defined period of time. A common example is a dissolution profile for a modified or extended release solid dosage form that must fall within acceptance limits at several time points. A Bayesian approach for longitudinal data obtained in various conditions of a Design of Experiment is provided to elegantly address the ICH-Q8 recommendation to provide assurance of quality and derive a scientifically sound Design Space.
The target article provides important theoretical contributions to psychology and Bayesian modeling. Despite the article's excellent points, we suggest that it succumbs to a few misconceptions about evolutionary psychology (EP). These include a mischaracterization of evolutionary psychology's approach to optimality; failure to appreciate the centrality of mechanism in EP; and an incorrect depiction of hypothesis testing. An accurate characterization of EP offers more promise for successful integration with Bayesian modeling.
Human intentional communication is marked by its flexibility and context sensitivity. Hypothesized brain mechanisms can provide convincing and complete explanations of the human capacity for intentional communication only insofar as they can match the computational power required for displaying that capacity. It is thus of importance for cognitive neuroscience to know how computationally complex intentional communication actually is. Though the subject of considerable debate, the computational complexity of communication remains so far unknown. In this paper we defend the position that the computational complexity of communication is not a constant, as some views of communication seem to hold, but rather a function of situational factors. We present a methodology for studying and characterizing the computational complexity of communication under different situational constraints. We illustrate our methodology for a model of the problems solved by receivers and senders during a communicative exchange. This approach opens the way to a principled identification of putative model parameters that control cognitive processes supporting intentional communication.
Geomagnetic paleointensities have been determined from a single archaeological site in Lübeck, Germany, where a sequence of 25 bread oven floors has been preserved in a bakery from medieval times until today. Age dating confines the time interval from about 1300 A.D. to about 1750 A.D. Paleomagnetic directions have been published from each oven floor and are updated here. The specimens have very stable directions and no or only weak secondary components. The oven floor material was characterized rock magnetically using Thellier viscosity indices, median destructive field values, Curie point determinations, and hysteresis measurements. Magnetic carriers are mixtures of SD, PSD, and minor MD magnetite and/or maghemite together with small amounts of hematite. Paleointensity was measured from selected specimens with the double-heating Thellier method including pTRM checks and determination of TRM anisotropy tensors. Corrections for anisotropy as well as for cooling rate turned out to be unnecessary. Ninety-two percent of the Thellier experiments passed the assigned acceptance criteria and provided four to six reliable paleointensity estimates per oven floor. Mean paleointensity values derived from 22 oven floors show maxima in the 15th and early 17th centuries A.D., followed by a decrease of paleointensity of about 20% until 1750 A.D. Together with the directions the record represents about 450 years of full vector secular variation. The results compare well with historical models of the Earth's magnetic field as well as with a selected high-quality paleointensity data set for western and central Europe.
This paper demonstrates the relative strengths and weaknesses of SEM and Bayesian approaches to combining different sources of data when estimating latent variables. Data on party left-right positioning collected from party manifestos and surveys of party experts, MPs and voters are used to illustrate the two techniques. Although widely used and accepted, the SEM approach is less useful than the Bayesian approach, particularly when using the latent variable in subsequent predictive estimations.
We analyzed effects of three land management alternatives on 31 terrestrial vertebrates of conservation concern within the interior Columbia river basin study area. The three alternatives were proposed in a Supplemental Draft Environmental Impact Statement (SDEIS) that was developed for lands in the study area administered by the US Department of Agriculture (USDA) Forest Service (FS) and US Department of the Interior (USDI) Bureau of Land Management (BLM). To evaluate effects of these alternatives, we developed Bayesian belief network (BBN) models, which allowed empirical and hypothesized relations to be combined in probability-based projections of conditions. We used the BBN models to project abundance and distribution of habitat to support potential populations (population outcomes) for each species across the entire study area. Population outcomes were defined in five classes, referred to as outcomes A-E. Under outcome A, populations are abundant and well distributed, with little or no likelihood of extirpation. By contrast, populations under outcome E are scarce and patchy, with a high likelihood of local or regional extirpation. Outcomes B-D represent gradients of conditions between the extremes of classes A and E. Most species (65%, or 20 of 31) were associated with outcome A historically and with outcomes D or E currently (55%, or 17 of 31). Population outcomes projected 100 years into the future were similar for all three alternatives but substantially different from historical and current outcomes. For species dependent on old-forest conditions, population outcomes typically improved one outcome class -usually from E or D to D or C -from current to the future under the alternatives. By contrast, population outcomes for rangeland species generally did not improve under the alternatives, with most species remaining in outcomes C, D, or E.
Our results suggest that all three management alternatives will substantially improve conditions for most forest-associated species but provide few improvements for rangeland-associated vertebrates. Continued displacement of native vegetation by exotic plants, as facilitated by a variety of human-associated disturbances, will be an on-going challenge to the improvement of future conditions for rangeland species.
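The projection machinery can be illustrated with a toy Bayesian belief network. The sketch below uses exact enumeration over a single made-up habitat node and a collapsed outcome node (A/C/E only, for brevity); none of the probabilities come from the study's models:

```python
# Tiny discrete Bayesian belief network: habitat condition -> population outcome.
# All probabilities are invented for illustration.
p_habitat = {"good": 0.3, "fair": 0.5, "poor": 0.2}

# Conditional probability table P(outcome | habitat), outcomes collapsed to A/C/E.
cpt = {
    "good": {"A": 0.60, "C": 0.30, "E": 0.10},
    "fair": {"A": 0.20, "C": 0.50, "E": 0.30},
    "poor": {"A": 0.05, "C": 0.25, "E": 0.70},
}

# Marginal P(outcome) by exact enumeration over habitat states.
p_outcome = {o: sum(p_habitat[h] * cpt[h][o] for h in p_habitat) for o in "ACE"}
print(p_outcome)
```

Projecting an alternative amounts to shifting the habitat node's distribution (e.g., raising P(good)) and re-running the enumeration, which is why such models can combine empirical and hypothesized relations in one probabilistic projection.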
Recent work on causal learning has investigated the possible role of generic priors in guiding human judgments of causal strength. One proposal has been that people have a preference for causes that are sparse and strong, i.e., few in number and individually strong. Sparse-and-strong priors predict that competition can be observed between candidate causes of the same polarity (i.e., generative or else preventive) even if they occur independently. For instance, the strength of a moderately strong cause should be underestimated when a strong cause is also present, relative to when a weaker cause is present. In previous work we found such competition effects for causal setups involving multiple generative causes. Here we investigate whether analogous competition is found for strength judgments about multiple preventive causes. An experiment revealed that a cue competition effect is indeed observed for preventive causes; moreover, the effect appears to be more persistent (as the number of observations increases) than the corresponding effect observed for generative causes. These findings, which are consistent with predictions of a Bayesian learning model with sparse-and-strong priors, provide further evidence that a preference for parsimony guides inferences about causal strength.
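One way to see how a sparse-and-strong prior shapes strength estimates is a grid-approximated posterior for a single generative cause under a noisy-OR likelihood. The prior's functional form, the alpha value, and the data counts below are illustrative stand-ins, not the cited model's exact specification:

```python
import numpy as np

# Grid over background strength w0 and causal strength w1, both in (0, 1).
alpha = 5.0
w = np.linspace(0.01, 0.99, 99)
W0, W1 = np.meshgrid(w, w, indexing="ij")

# Illustrative "sparse-and-strong"-style prior: favors a weak background
# cause and a strong focal cause.  (Not the exact published prior.)
prior = np.exp(-alpha * W0 + alpha * W1)

# Noisy-OR likelihood for a generative cause:
#   P(effect | cause present) = w0 + w1 - w0*w1;  P(effect | cause absent) = w0.
# Hypothetical data: effect on 14/20 cause-present trials, 2/20 cause-absent trials.
p_c = W0 + W1 - W0 * W1
lik = p_c**14 * (1 - p_c)**6 * W0**2 * (1 - W0)**18

post = prior * lik
post /= post.sum()
print((post * W1).sum())  # posterior mean causal strength
```

Adding a second strong candidate cause to such a setup forces the sparse prior to "explain away" some of the focal cause's strength, which is the competition effect the abstract describes.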
The Bronze Age site of Ķivutkalns with its massive amount of archaeological artifacts and human remains is considered the largest bronze-working center in Latvia. The site is a unique combination of cemetery and hillfort believed to be built on top of each other. This work presents new radiocarbon dates on human and animal bone collagen that somewhat challenge this interpretation. Based on analyses using a Bayesian modeling framework, the present data suggest overlapping calendar year distributions for the contexts within the 1st millennium BC. The carbon and nitrogen isotopic ratios indicate mainly terrestrial dietary habits of the studied individuals and nuclear family remains buried in one of the graves. The older charcoal data may be subject to the old-wood effect, and the results are partly constrained by the limited amount of data and the 14C calibration curve plateau of the 1st millennium BC. Therefore, the ultimate conclusions on contemporaneity of the cemetery and hillfort must await further analyses of the massive amounts of bone material.
Our ability to understand and predict the response of ecosystems to a changing environment depends on quantifying vegetation functional diversity. However, representing this diversity at the global scale is challenging. Typically, in Earth system models, characterization of plant diversity has been limited to grouping related species into plant functional types (PFTs), with all trait variation in a PFT collapsed into a single mean value that is applied globally. Using the largest global plant trait database and state-of-the-art Bayesian modeling, we created fine-grained global maps of plant trait distributions that can be applied to Earth system models. Focusing on a set of plant traits closely coupled to photosynthesis and foliar respiration, specific leaf area (SLA) and dry mass-based concentrations of leaf nitrogen and phosphorus, we characterize how traits vary within and among over 50,000 grid cells across the entire ...
Coherent Attributions with Co-occurring and Interacting Causes
Firms exhibit or “manifest” three types of branding strategies: corporate branding, house of brands, or mixed branding. These strategies differ in their essential structure and in their potential costs and benefits to the firm. Prior research has failed to understand how these branding strategies are related to the intangible value of the firm. The authors investigate this relationship using five-year data for a sample of 113 U.S. firms. They find that corporate branding strategy is associated with higher values of Tobin's q, and mixed branding strategy is associated with lower values of Tobin's q, after controlling for the effects of several important and relevant factors. The relationships of the control variables are consistent with prior expectations. In addition, most of the firms would have been able to improve their Tobin's q had they adopted a branding strategy different from the one their brand portfolios revealed. The authors also discuss implications and futur...
In an ecosystem, there is need to establish the quantity and quality of resources and their suitability for a certain range of land uses in order to assure its future productivity and sustainability of biodiversity. Parametric methods are widely used for land suitability evaluation. A new parametric concept "equation" of land suitability evaluation has been proposed to improve results of land suitability evaluation. Land suitability assessment for wheat production was conducted in order to compare results of the suggested method with classical parametric methods. Organic matter, CaCO3, pH, slope, texture, drainage, depth, EC and altitude were recognized as factors affecting land suitability for wheat production in the study area. Comparing results of the three parametric methods used showed that the proposed equation gave higher suitability index values than classical methods. A strong correlation was found between results of the three methods. Organic matter, topology and pH were found to be the limiting factors for wheat production in the study area. Generally, the proposed equation may improve the land suitability assessment process and gives more realistic results.
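For context, two classical parametric indices of the kind being compared can be sketched as follows. The factor ratings are hypothetical, and the article's own proposed equation is not reproduced here:

```python
import math

def storie_index(ratings):
    """Storie-type parametric index: product of factor ratings, each on a
    0-100 scale, multiplied out and rescaled back to 0-100."""
    si = 100.0
    for r in ratings:
        si *= r / 100.0
    return si

def square_root_index(ratings):
    """Square-root method: the minimum (most limiting) rating times the
    square root of the product of the remaining ratings scaled to 0-1."""
    rmin = min(ratings)
    rest = list(ratings)
    rest.remove(rmin)  # drop one instance of the limiting factor
    return rmin * math.sqrt(math.prod(r / 100.0 for r in rest))

# Hypothetical ratings (e.g., organic matter, pH, slope, texture, drainage).
ratings = [85, 90, 70, 95, 80]
print(storie_index(ratings), square_root_index(ratings))
```

The square-root form damps the multiplicative penalty of the pure product, which is why different parametric equations yield systematically different suitability index values for the same land unit, the comparison at the heart of this study.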
An updated PBPK model of methylene chloride (DCM, dichloromethane) carcinogenicity in mice was recently published using Bayesian statistical methods (Marino et al., 2006). In this work, this model was applied to humans, as recommended by Sweeney et al. (2004). Physiological parameters for input into the MCMC analysis were selected from multiple sources reflecting, in each case, the source that was considered to represent the most current scientific evidence for each parameter. Metabolic data for individual subjects from five human studies were combined into a single data set and population values derived using MCSim. These population values were used for calibration of the human model. The PBPK model using the calibrated metabolic parameters was used to perform a cancer risk assessment for DCM, using the same tumor incidence and exposure concentration data relied upon in the current EPA (1991) IRIS entry. Unit risks, i.e., the risk of cancer from exposure to 1 µg/m³ over a lifetime, for DCM were estimated using the calibrated human model. The results indicate skewed distributions for liver and lung tumor risks, alone or in combination, with a mean unit risk (per µg/m³) of 1.05 × 10⁻⁹, considering both liver and lung tumors. Adding the distribution of genetic polymorphisms for metabolism to the ultimate carcinogen, the unit risks range from 0 (which is expected given that approximately 20% of the US population is estimated to be nonconjugators) up to a unit risk of 2.70 × 10⁻⁹ at the 95th percentile. The median, or 50th percentile, is 9.33 × 10⁻¹⁰, which is approximately a factor of 500 lower than the current EPA unit risk of 4.7 × 10⁻⁷ using a previous PBPK model. These values represent the best estimates to date for DCM cancer risk because all available human data sets were used, and a probabilistic methodology was followed.
The ability to model cognitive agents depends crucially on being able to encode and infer with contextual information at many levels (such as situational, psychological, social, organizational, political levels). We present initial results from a novel computational framework, Coordinated Probabilistic Relational Models (CPRM), that can potentially model the combined impact of multiple contextual information sources for analysis and prediction.
Classical Bayesian spatial interpolation methods are based on the Gaussian assumption and therefore lead to unreliable results when applied to extreme valued data. Specifically, they give wrong estimates of the prediction uncertainty. Copulas have recently attracted much attention in spatial statistics and are used as a flexible alternative to traditional methods for non-Gaussian spatial modeling and interpolation. We adopt this methodology and show how it can be incorporated in a Bayesian framework by assigning priors to all model parameters. In the absence of simple analytical expressions for the joint posterior distribution we propose a Metropolis-Hastings algorithm to obtain posterior samples. The posterior predictive density is approximated by averaging the plug-in predictive densities. Furthermore, we discuss the deficiencies of the existing spatial copula models with regard to modeling extreme events. It is shown that the non-Gaussian χ²-copula model suffers from the same lack of tail dependence as the Gaussian copula and thus offers no advantage over the latter with respect to modeling extremes. We illustrate the proposed methodology by analyzing a dataset here referred to as the Helicopter dataset, which includes strongly skewed radioactivity measurements in the city of Oranienburg, Germany.
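The Metropolis-Hastings step mentioned above can be illustrated in miniature. The target below is a standard normal known only up to a constant, standing in for the copula model's intractable posterior; the proposal scale and chain length are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_post(theta):
    # Illustrative unnormalized log-posterior: standard normal up to a constant.
    return -0.5 * theta**2

# Random-walk Metropolis-Hastings with a symmetric Gaussian proposal.
theta, chain = 0.0, []
for _ in range(20000):
    prop = theta + rng.normal(scale=1.0)
    # Accept with probability min(1, post(prop)/post(theta)).
    if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
        theta = prop
    chain.append(theta)

samples = np.array(chain[5000:])  # discard burn-in
print(samples.mean(), samples.std())
```

In the copula setting the same loop runs over the full parameter vector, and the retained draws feed the plug-in predictive densities whose average approximates the posterior predictive density.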
Large-scale inference for random spatial surfaces over a region using spatial process models has been well studied. Under such models, local analysis of the surface (e.g., gradients at given points) has received recent attention. A more ambitious objective is to move from points to curves, to attempt to assign a meaningful gradient to a curve. For a point, if the gradient in a particular direction is large (positive or negative), then the surface is rapidly increasing or decreasing in that direction. For a curve, if the gradients in the direction orthogonal to the curve tend to be large, then the curve tracks a path through the region where the surface is rapidly changing. In the literature, learning about where the surface exhibits rapid change is called wombling, and a curve such as we have described is called a wombling boundary. Existing wombling methods have focused mostly on identifying points and then connecting these points using an ad hoc algorithm to create curvilinear wombling boundaries. Such methods are not easily incorporated into a statistical modeling setting. The contribution of this article is to formalize the notion of a curvilinear wombling boundary in a vector analytic framework using parametric curves and to develop a comprehensive statistical framework for curvilinear boundary analysis based on spatial process models for point-referenced data. For a given curve that may represent a natural feature (e.g., a mountain, a river, or a political boundary), we address the issue of testing or assessing whether it is a wombling boundary. Our approach is applicable to both spatial response surfaces and, often more appropriately, spatial residual surfaces. We illustrate our methodology with a simulation study, a weather dataset for the state of Colorado, and a species presence/absence dataset from Connecticut.
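The notion of a curvilinear wombling measure, averaging the surface gradient in the direction orthogonal to a candidate boundary curve, can be sketched numerically. The surface, the curve, and the finite-difference step below are illustrative, not the article's dataset or estimator:

```python
import numpy as np

def f(x, y):
    # Surface with a sharp change near x = 0.5 (a natural "boundary").
    return np.tanh(5 * (x - 0.5))

def grad_f(x, y, h=1e-5):
    # Central finite-difference gradient of the surface.
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

# Parametric candidate curve: the vertical line x = 0.5 for t in [0, 1].
t = np.linspace(0, 1, 200)
x, y = np.full_like(t, 0.5), t
dx, dy = np.zeros_like(t), np.ones_like(t)      # unit tangent along the curve
fx, fy = grad_f(x, y)
# Gradient projected onto the normal (dy, -dx), averaged along the curve.
wombling_measure = np.mean(fx * dy + fy * (-dx))
print(wombling_measure)  # large magnitude => curve tracks rapid surface change
```

A curve running parallel to the ridge of rapid change (e.g., the line y = 0.5 here) would give a near-zero measure, which is the contrast a formal wombling-boundary test exploits; the article embeds this idea in a spatial process model so the measure carries full posterior uncertainty.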