Expectation Maximization Research Papers - Academia.edu

Our previous work suggests that fractal texture features are useful for detecting pediatric brain tumors in multimodal MRI. In this study, we systematically investigate the efficacy of several image features, such as intensity, fractal texture, and level-set shape, for the segmentation of posterior-fossa (PF) tumors in pediatric patients. We explore the effectiveness of four feature-selection and three segmentation techniques for discriminating tumor regions from normal tissue in multimodal brain MRI, and we further study the selective fusion of these features for improved PF tumor segmentation. Our results suggest that the Kullback-Leibler divergence measure for feature ranking and selection, combined with the expectation-maximization algorithm for feature fusion and tumor segmentation, offers the best results for the patient data in this study. We show that for the T1 and fluid-attenuated inversion recovery (FLAIR) MRI modalities, the best PF tumor segmentation is obtained using a texture feature such as multifractional Brownian motion (mBm), whereas for T2 MRI it is obtained by fusing level-set shape with intensity features. In multimodality-fused MRI (T1, T2, and FLAIR), the mBm feature offers the best PF tumor segmentation performance. We use different similarity metrics to evaluate the quality and robustness of these selected features for PF tumor segmentation in MRI for ten pediatric patients.
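
To make the feature-ranking and EM-based segmentation steps concrete, here is a minimal Python sketch, not the authors' pipeline: the feature names, the 64-bin histograms, and the two-component higher-mean-is-tumor convention are illustrative assumptions.

```python
# A minimal sketch, not the authors' pipeline: rank candidate feature maps by a
# symmetric KL divergence between tumor and normal histograms, then segment the
# top-ranked feature with a two-component Gaussian mixture fitted by EM.
import numpy as np
from sklearn.mixture import GaussianMixture

def symmetric_kl(p, q, eps=1e-12):
    p = p / p.sum() + eps
    q = q / q.sum() + eps
    return float(np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p)))

def rank_features(feature_maps, tumor_mask, bins=64):
    """feature_maps: dict name -> 2D array; tumor_mask: boolean 2D array."""
    scores = {}
    for name, fmap in feature_maps.items():
        edges = np.histogram_bin_edges(fmap, bins=bins)
        p, _ = np.histogram(fmap[tumor_mask], bins=edges)
        q, _ = np.histogram(fmap[~tumor_mask], bins=edges)
        scores[name] = symmetric_kl(p.astype(float), q.astype(float))
    return sorted(scores, key=scores.get, reverse=True)   # best feature first

def em_segment(fmap):
    gmm = GaussianMixture(n_components=2, random_state=0)
    labels = gmm.fit_predict(fmap.reshape(-1, 1))
    tumor_component = int(np.argmax(gmm.means_.ravel()))  # illustrative convention
    return (labels == tumor_component).reshape(fmap.shape)
```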

Traditionally, multivariate normal distributions have been the staple of data modeling in most domains. For some domains, the model they provide is either inadequate or incorrect because it disregards the directional components of the data. We present a ...

We describe a model for strings of characters that is loosely based on the Lempel-Ziv model, with the addition that a repeated substring can be an approximate match to the original substring; this is close to the situation in DNA, for example. Typically there are many explanations for a given string under the model, some optimal and many suboptimal. Rather than commit to one optimal explanation, we sum the probabilities over all explanations under the model, because this gives the probability of the data under the model. The model has a small number of parameters, and these can be estimated from the given string by an expectation-maximization (EM) algorithm. Each iteration of the EM algorithm takes O(n²) time, and a few iterations are typically sufficient. O(n²) complexity is impractical for strings of more than a few tens of thousands of characters, so a faster approximation algorithm is also given. The model is further extended to include approximate reverse complementary repeats when...
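
The paper's string model is not reproduced here, but the flavour of summing over all explanations in an O(n²) E-step can be illustrated with a toy "literal vs. approximate copy" model; the model, its parameters theta and m, and the example string are all invented for illustration.

```python
# A toy "literal vs. approximate copy" model, invented for illustration: each
# character is either a literal drawn uniformly from the alphabet, or an
# approximate copy of some earlier character. The E-step sums responsibilities
# over all candidate source positions for every character (O(n^2) per
# iteration); the M-step re-estimates the copy probability theta and the
# per-character match probability m.
import numpy as np

def em_toy_repeat_model(s, alphabet, n_iter=20, theta=0.5, m=0.9):
    n, A = len(s), len(alphabet)
    for _ in range(n_iter):
        exp_copies = 0.0   # expected number of copied characters
        exp_matches = 0.0  # expected number of copied characters that match
        for i in range(1, n):
            match_mask = np.array([s[j] == s[i] for j in range(i)])
            sources = np.where(match_mask, m, (1 - m) / (A - 1))
            p_literal = (1 - theta) / A
            p_copy_each = theta * sources / i          # uniform choice of source
            denom = p_literal + p_copy_each.sum()
            exp_copies += p_copy_each.sum() / denom
            exp_matches += p_copy_each[match_mask].sum() / denom
        theta = exp_copies / (n - 1)
        m = exp_matches / max(exp_copies, 1e-12)
    return theta, m

print(em_toy_repeat_model("abababababcabababab", "abc"))
```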

Collaborative filtering (CF) is a popular algorithm for recommender systems: items are recommended to users based on a survey of their communities. CF is promising because it can overcome the limitations of recommendation by discovering potential items hidden within communities. Such items are likely to suit users and should be recommended to them. There are two main approaches to CF: memory-based and model-based. Memory-based algorithms load the entire database into system memory and make recommendations directly from this in-memory database. This is simple but scales poorly to very large datasets. Model-based algorithms compress the database into a model and perform the recommendation task by applying an inference mechanism to this model, so model-based CF can respond to a user's request instantly. This paper surveys common techniques for implementing model-based algorithms. We also propose a new idea for the model-based approach that aims to achieve high accuracy and address the sparse-matrix problem by applying evidence-based inference techniques.
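
A minimal sketch of the memory-based approach described above (user-based prediction with cosine similarity); the small ratings matrix and the zero-means-unrated convention are illustrative assumptions.

```python
# A minimal sketch of memory-based CF: predict a rating as a similarity-weighted
# average of the ratings given by other users; 0 means "not rated".
import numpy as np

def predict_rating(R, user, item):
    rated = R[:, item] > 0                  # other users who rated this item
    rated[user] = False
    if not rated.any():
        return 0.0
    u = R[user]
    neighbours = np.where(rated)[0]
    sims = np.array([np.dot(u, R[v]) /
                     (np.linalg.norm(u) * np.linalg.norm(R[v]) + 1e-12)
                     for v in neighbours])
    return float(np.dot(sims, R[neighbours, item]) / (np.abs(sims).sum() + 1e-12))

R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
print(predict_rating(R, user=0, item=2))    # predicted rating for an unseen item
```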

As recent advances in calcium-sensing technologies facilitate the simultaneous imaging of action potentials in neuronal populations, complementary analytical tools must also be developed to maximize the utility of this experimental paradigm. Although the observations here are fluorescence movies, the signals of interest (spike trains and/or time-varying intracellular calcium concentrations) are hidden. Inferring these hidden signals is often problematic due to noise, nonlinearities, slow imaging rates, and unknown biophysical parameters. We overcome these difficulties by developing sequential Monte Carlo methods (particle filters) based on biophysical models of spiking, calcium dynamics, and fluorescence. We show that even in simple cases, the particle filters outperform the optimal linear (i.e., Wiener) filter, both by obtaining better estimates and by providing error bars. We then relax a number of our model assumptions to incorporate nonlinear saturation of the fluorescence signal, as well as external stimulus and spike-history dependence (e.g., refractoriness) of the spike trains. Using both simulations and in vitro fluorescence observations, we demonstrate temporal superresolution by inferring when within a frame each spike occurs. Furthermore, the model parameters may be estimated using expectation maximization with only a very limited amount of data (e.g., ∼5–10 s or 5–40 spikes), without requiring any simultaneous electrophysiology or imaging experiments.
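
A minimal bootstrap particle filter sketch for a heavily simplified version of this generative model (Bernoulli spikes driving a decaying calcium variable observed through linear Gaussian fluorescence); all parameter values are illustrative, not the paper's.

```python
# A bootstrap particle filter for a simplified model: Bernoulli spikes drive a
# decaying calcium variable observed through noisy, linear fluorescence.
import numpy as np

rng = np.random.default_rng(0)
T, dt = 200, 0.033
gamma, amp, rate, sigma_F = 0.95, 1.0, 2.0, 0.2

# simulate ground truth and fluorescence observations
spikes = rng.random(T) < rate * dt
calcium = np.zeros(T)
for t in range(1, T):
    calcium[t] = gamma * calcium[t - 1] + amp * spikes[t]
F = calcium + sigma_F * rng.standard_normal(T)

# particle filter: propose spikes from the prior, weight by the likelihood
N = 500
particles = np.zeros(N)
est = np.zeros(T)
for t in range(T):
    n = rng.random(N) < rate * dt
    particles = gamma * particles + amp * n
    logw = -0.5 * ((F[t] - particles) / sigma_F) ** 2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    est[t] = np.dot(w, particles)                  # posterior-mean calcium
    particles = particles[rng.choice(N, N, p=w)]   # multinomial resampling

print(np.corrcoef(est, calcium)[0, 1])             # agreement with ground truth
```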

A class of adaptive sampling methods is introduced for efficient posterior and predictive simulation. The proposed methods are robust in the sense that they can handle target distributions that exhibit non-elliptical shapes such as multimodality and skewness. ...

This paper presents the Visual Simultaneous Localization and Mapping (vSLAM™) algorithm, a novel algorithm for simultaneous localization and mapping (SLAM). The algorithm is vision- and odometry-based and enables low-cost navigation in cluttered and populated environments. No initial map is required, and it satisfactorily handles dynamic changes in the environment, for example lighting changes, moving objects, and/or people. Typically, vSLAM recovers quickly from dramatic disturbances, such as "kidnapping".

This study focuses on comparing the performance of certain clustering algorithms in the presence of missing data in our working sample. Several studies have examined how to take missing values in the database into account in order to perform better clustering. Our study is essentially based on writing and implementing the EM algorithm in the missing-data case. We adopted a very specific method to control the missing-data generating process via well-known distribution functions. The results of implementing this algorithm show a continuous increase in the log-likelihood as the number of iterations grows. We conclude that the theoretical formulations of EM that account for MCAR and MAR missing data are satisfactory and can serve as a basis for comparison with other clustering algorithms that also handle missing data.
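
A minimal sketch of EM for the mean and covariance of a multivariate normal with MCAR missingness, on simulated data rather than the study's sample; the observed-data log-likelihood computed in the loop should increase monotonically across iterations, mirroring the behaviour reported above.

```python
# EM for a multivariate normal with MCAR missing entries, on simulated data.
# Each E-step fills in conditional expectations (plus the conditional covariance
# of the missing block); the M-step updates the mean and covariance.
import numpy as np

def em_mvn_missing(X, n_iter=50):
    n, d = X.shape
    miss = np.isnan(X)
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))
    for _ in range(n_iter):
        ex = np.zeros_like(X)
        exx = np.zeros((d, d))
        loglik = 0.0
        for i in range(n):
            o, m = ~miss[i], miss[i]
            xi = np.where(miss[i], 0.0, X[i])
            Soo = sigma[np.ix_(o, o)]
            Soo_inv = np.linalg.inv(Soo)
            if m.any():
                Smo = sigma[np.ix_(m, o)]
                cond_cov = sigma[np.ix_(m, m)] - Smo @ Soo_inv @ Smo.T
                xi[m] = mu[m] + Smo @ Soo_inv @ (X[i, o] - mu[o])
            ex[i] = xi
            outer = np.outer(xi, xi)
            if m.any():
                outer[np.ix_(m, m)] += cond_cov
            exx += outer
            diff = X[i, o] - mu[o]
            loglik += -0.5 * (np.log(np.linalg.det(2 * np.pi * Soo))
                              + diff @ Soo_inv @ diff)
        mu = ex.mean(axis=0)
        sigma = exx / n - np.outer(mu, mu)
    return mu, sigma, loglik

rng = np.random.default_rng(0)
cov = np.array([[1.0, 0.5, 0.2], [0.5, 1.0, 0.3], [0.2, 0.3, 1.0]])
Z = rng.multivariate_normal([0.0, 1.0, 2.0], cov, size=300)
Z[rng.random(Z.shape) < 0.15] = np.nan          # ~15% MCAR missingness
print(em_mvn_missing(Z)[0])                     # estimated mean vector
```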

In this study, we investigate the clustering capability of two unsupervised learning methods: K-means and Expectation Maximization (EM). We train the methods on soccer match data from the Spanish competition La Liga, which contains matches from 2004 to 2019. We relate the clusters found by both methods to soccer player positions and visualize the correlation between player positions using Principal Component Analysis (PCA). In these visualizations, we use 4 and 11 clusters, corresponding to player positions on the field. To evaluate K-means and EM, we use purity and the silhouette score. Results show that K-means classifies the data better than EM. Using the feature-selection methods Laplacian score and correlation mean, we increase the performance of K-means by 37%. We find that 8 clusters give the best separability, which suggests that there are 8 different types of soccer players on the field during a match.
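
A hedged sketch of this kind of comparison on synthetic data (the La Liga features are not reproduced here): K-means versus a Gaussian mixture fitted by EM, with the silhouette score for evaluation and PCA for a 2-D view.

```python
# Compare K-means and an EM-fitted Gaussian mixture on synthetic data at several
# cluster counts, scoring each with the silhouette coefficient.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=500, centers=8, n_features=20, random_state=0)

for k in (4, 8, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    em = GaussianMixture(n_components=k, random_state=0).fit_predict(X)
    print(k, round(silhouette_score(X, km), 3), round(silhouette_score(X, em), 3))

X2 = PCA(n_components=2).fit_transform(X)   # coordinates for a 2-D scatter plot
```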

The effects of a job training program, Job Corps, on both employment and wages are evaluated using data from a randomized study. Principal stratification is used to address, simultaneously, the complications of noncompliance, wages that are only partially defined because of nonemployment, and unintended missing outcomes. The first two complications are of substantive interest, whereas the third is a nuisance. The objective is to find a parsimonious model that can be used to inform public policy. We conduct a likelihood-based analysis using finite mixture models estimated by the expectation-maximization (EM) algorithm. We maintain an exclusion restriction assumption for the effect of assignment on employment and wages for noncompliers, but not on missingness. We provide estimates under the “missing at random” assumption, and assess the robustness of our results to deviations from it. The plausibility of meaningful restrictions is investigated by means of scaled log-likelihood ratio statistics. Substantive conclusions include the following. For compliers, the effect on employment is negative in the short term; it becomes positive in the long term, but these effects are small at best. For always employed compliers, that is, compliers who are employed whether trained or not trained, positive effects on wages are found at all time periods. Our analysis reveals that background characteristics of individuals differ markedly across the principal strata. We found evidence that the program should have been better targeted, in the sense of being designed differently for different groups of people, and specific suggestions are offered. Previous analyses of this dataset, which did not address all complications in a principled manner, led to less nuanced conclusions about Job Corps.
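
The principal stratification model itself is far richer, but the finite-mixture EM machinery it relies on can be illustrated with a hand-written two-component Gaussian mixture on simulated data; nothing here reproduces the Job Corps analysis.

```python
# A two-component Gaussian mixture fitted by EM on simulated data; this only
# illustrates the EM machinery, not the paper's principal stratification model.
import numpy as np

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0.0, 1.0, 300), rng.normal(4.0, 1.5, 200)])

pi, mu, sd = 0.5, np.array([-1.0, 1.0]), np.array([1.0, 1.0])
for _ in range(100):
    # E-step: responsibility of component 1 for each observation
    p0 = (1 - pi) * np.exp(-0.5 * ((x - mu[0]) / sd[0]) ** 2) / sd[0]
    p1 = pi * np.exp(-0.5 * ((x - mu[1]) / sd[1]) ** 2) / sd[1]
    r = p1 / (p0 + p1)
    # M-step: weighted updates of the mixing weight, means, and std deviations
    pi = r.mean()
    mu = np.array([np.average(x, weights=1 - r), np.average(x, weights=r)])
    sd = np.sqrt([np.average((x - mu[0]) ** 2, weights=1 - r),
                  np.average((x - mu[1]) ** 2, weights=r)])
print(pi, mu, sd)
```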

Purpose: PETbox is a low-cost benchtop preclinical PET scanner dedicated to pharmacokinetic and pharmacodynamic mouse studies. A prototype system was developed at our institute, and this manuscript characterizes the performance of the prototype system. Procedures: The PETbox detector consists of a 20 × 44 bismuth germanate crystal array with a thickness of 5 mm and a cross-section of 2.05 × 2.05 mm. Two such detectors are placed facing each other at a spacing of 5 cm, forming a dual-head geometry optimized for imaging mice. The detectors are kept stationary during the scan, making PETbox a limited-angle tomography system. 3D images are reconstructed using a maximum-likelihood expectation-maximization (ML-EM) method. Results: In-plane image spatial resolution was measured to be an average of 1.53 mm full width at half maximum for coronal images and 2.65 mm for the anterior–posterior direction. The volumetric reconstructed resolution was below 8 mm³ at most locations in the field of view (FOV). The sensitivity, scatter fraction, and noise equivalent count rate (NECR) were measured for different energy windows. With an energy window of 150–650 keV and a timing window of 20 ns optimized for mouse imaging, the peak absolute sensitivity was 3.99% at the center of the FOV, and a peak NECR of 20 kcps was achieved for a total activity of 3.2 MBq (86.8 μCi). Phantom and in vivo imaging studies were performed and demonstrated the utility of the system at low activity levels. The quantitation capabilities of the system were also characterized, showing that despite the limited-angle tomography, reasonably good quantification accuracy was achieved over a large dynamic range of activity levels. Conclusions: The presented results demonstrate the potential of this new tomograph for small-animal imaging.
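
A minimal sketch of the ML-EM update used for reconstruction, on a tiny dense toy system matrix; the actual PETbox system model (detector geometry, normalization, attenuation) is far more detailed and is not reproduced here.

```python
# Toy ML-EM reconstruction: x <- x / (A^T 1) * A^T(y / (A x)) on a small dense
# system matrix with simulated Poisson count data.
import numpy as np

def mlem(A, y, n_iter=50):
    """A: (n_lors, n_voxels) system matrix; y: measured counts per line of response."""
    x = np.ones(A.shape[1])                     # non-negative initial image
    sens = A.sum(axis=0)                        # sensitivity image, A^T 1
    for _ in range(n_iter):
        proj = A @ x                            # forward projection
        ratio = y / np.maximum(proj, 1e-12)     # measured / estimated counts
        x = x / np.maximum(sens, 1e-12) * (A.T @ ratio)   # multiplicative update
    return x

rng = np.random.default_rng(0)
A = rng.random((200, 50))                       # toy system matrix
x_true = rng.random(50)
y = rng.poisson(A @ x_true)                     # Poisson count data
print(np.round(mlem(A, y)[:5], 2))
```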

In this paper, a distributed multi-target tracking (MTT) algorithm suitable for implementation in wireless sensor networks is proposed. For this purpose, a Monte Carlo (MC) implementation of the joint probabilistic data-association filter (JPDAF) is applied to the well-known problem of multi-target tracking in a cluttered area. Also, to make the tracking algorithm scalable and usable for sensor networks of many nodes,

The imbalances between the in-phase (I) and quadrature-phase (Q) branches represent a significant source of impairment in orthogonal frequency division multiplexing (OFDM) systems. Recently, it has been shown that the unwanted IQ imbalances can actually be exploited to achieve a diversity gain. In this contribution, by taking into account the diversity gain resulting from the IQ imbalances, we develop

A model is a representation of a real-world phenomenon. Probability theory gives us a framework to quantify uncertainty in a mathematically rigorous manner [4]. A random variable is a function on the elements of a sample space; it takes on values, and this behaviour can be described by a probability distribution. Conceptualizing entities as random variables and representing the joint probability distribution over these entities is essential to statistical modelling and supervised machine learning. A graph is a data structure consisting of a collection of nodes and edges; graphs as mathematical structures are studied in graph theory [9]. Using graphs to describe probability distributions over random variables gives us a potent way to map the flow of influence, interdependencies, and independencies between the random variables. Graph-based manipulations give us enhanced computational efficiency and add to the descriptive power and performance of our model. In this paper, we attempt to understand the essential concepts of probabilistic graphical models. We cover representation using Bayesian belief networks and Markov networks, as well as techniques for inference and learning in probabilistic graphical models.
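
A minimal sketch of the representation idea: a three-node Bayesian network whose joint distribution factorizes over the graph, queried by exhaustive enumeration. The network and its probabilities are illustrative only.

```python
# A three-node Bayesian network (Rain -> Sprinkler, Rain -> WetGrass <- Sprinkler)
# whose joint distribution factorizes over the graph, queried by enumeration.
from itertools import product

P_rain = {True: 0.2, False: 0.8}
P_sprinkler = {True: {True: 0.01, False: 0.99},    # P(Sprinkler | Rain)
               False: {True: 0.4, False: 0.6}}
P_wet = {(True, True): 0.99, (True, False): 0.8,   # P(WetGrass=True | Rain, Sprinkler)
         (False, True): 0.9, (False, False): 0.0}

def joint(r, s, w):
    pw = P_wet[(r, s)]
    return P_rain[r] * P_sprinkler[r][s] * (pw if w else 1 - pw)

# P(Rain = True | WetGrass = True): sum out the hidden Sprinkler variable
num = sum(joint(True, s, True) for s in (True, False))
den = sum(joint(r, s, True) for r, s in product((True, False), repeat=2))
print(num / den)
```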