Mikhail Langovoy - Academia.edu

Papers by Mikhail Langovoy

Unsupervised nonparametric detection of unknown objects in noisy images based on percolation theory

arXiv (Cornell University), Feb 24, 2011

We develop an unsupervised, nonparametric, and scalable statistical learning method for detection of unknown objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes and sizes in the presence of nonparametric noise of unknown level. The noise density is assumed to be unknown and can be very irregular. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove strong consistency and scalability of our method in this setup with minimal assumptions.

Unsupervised robust nonparametric learning of hidden community properties

arXiv (Cornell University), Jul 11, 2017

We consider learning of fundamental properties of communities in large noisy networks, in the prototypical situation where the nodes or users are split into two classes according to a binary property, e.g., according to their opinions or preferences on a topic. For learning these properties, we propose a nonparametric, unsupervised, and scalable graph scan procedure that is, in addition, robust against a class of powerful adversaries. In our setup, one of the communities can fall under the influence of a knowledgeable adversarial leader, who knows the full network structure, has unlimited computational resources and can completely foresee our planned actions on the network. We prove strong consistency of our results in this setup with minimal assumptions. In particular, the learning procedure estimates the baseline activity of normal users asymptotically correctly with probability 1; the only assumption being the existence of a single implicit community of asymptotically negligible logarithmic size. We provide experiments on real and synthetic data to illustrate the performance of our method, including examples with adversaries.

and Olaf Wittich

We propose a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object’s interior is required. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. In this paper, we develop further the mathematical formalism of our method and explore important connections to the mathematical theory of percolation and statistical physics. We prove results on consistency and algorithmic complexity of our testing procedure. In addition, we address not only an asymptotic behavior of the method, but also a finite sample performance of our test.

Prediction of cybersickness in virtual environments using topological data analysis and machine learning

Frontiers in Virtual Reality

Recent significant progress in Virtual Reality (VR) applications and environments raised several challenges. They proved to have side effects on specific users, thus reducing the usability of the VR technology in some critical domains, such as flight and car simulators. One of the common side effects is cybersickness. Some significant commonly reported symptoms are nausea, oculomotor discomfort, and disorientation. To mitigate these symptoms and consequently improve the usability of VR systems, it is necessary to predict the incidence of cybersickness. This paper proposes a machine learning approach to VR’s cybersickness prediction based on physiological and subjective data. We investigated combinations of topological data analysis with a range of classifier algorithms and assessed classification performance. The highest performance of Topological Data Analysis (TDA) based methods was achieved in combination with SVMs with Gaussian RBF kernel, indicating that Gaussian RBF kernels pr...

Computationally efficient algorithms for statistical image processing. Implementation in R

In the series of our earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that allowed to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

A Unified Approach to Data Analysis and Modeling of the Appearance of Materials for Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. In this paper, we outline a unified perception-based approach to modeling of the appearance of materials for computer graphics and reflectometry. We discuss the differences and the common points of data analysis and modeling for BRDFs in both physical and in virtual application domains. We outline a mathematical framework that captures important problems in both types of application domains, and allows for application and performance comparisons of statistical and machine learning methods. For comparisons between methods, we use criteria that are relevant to both statistics and machine learning, as well as to both virtual and physical application domains. Additionally, we propose a class of multiple testing procedures to test a hypothesis that a material has diffuse reflection in a generalized sense. We treat a general case where the number of hy...

xD-Reflect: "Multidimensional Reflectometry for Industry", a research project of the European Metrology Research Program (EMRP)

Andreas Hope et al.; 12th International Conference, Otaniemi, Espoo, Helsinki (Finland), 24-27 June, 2014; http://newrad2014.aalto.fi/

Detection of objects in noisy images and site percolation on square lattices

ArXiv, 2011

We propose a novel probabilistic method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of random noise. Our procedure substantially differs from wavelets-based algorithms. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.

Statistical Analysis of BRDF Data for Computer Graphics and Metrology

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function. BRDF is one of the fundamental concepts in such diverse fields as multidimensional reflectometry, computer graphics and computer vision. In this paper, we treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifold. We argue that statistical data analysis of BRDF measurements has to account both for nonlinear structure of the data as well as for ill-behaved noise. Standard statistical methods can not be safely directly applied to BRDF data. Our study clarifies certain pitfalls in analysis of BRDF data, and helps to develop more refined estimates and unsupervised learning procedures for BRDF models. We also apply the notion of Pitman closeness to compare different estimators...

Randomized algorithms for statistical image analysis based on percolation theory

A hybrid RNN model for medium- to long-term forecasting of electricity demand taking weather influences into account (original title: Ein hybrides RNN-Modell für die mittel- bis langfristige Vorhersage des Strombedarfs unter Berücksichtigung von Wettereinflüssen)

at - Automatisierungstechnik, 2021

In day-to-day city operations, the power supply should be uninterrupted, which poses challenges for modern energy management. Forecasting energy demand can optimize the energy management strategy and improve energy efficiency. The traditional LSTM model, which is based on an encoder-decoder structure, encodes all historical information as a fixed-length vector, which leads to information loss when the predicted value depends on features that lie far in the past. This is common in energy forecasting because of the periodicity of energy consumption. To solve this problem and to fully exploit the potential of power plant operating data for energy forecasting, this article proposes an energy forecasting model based on the attention mechanism. Starting from the traditional encoder-decoder architecture, the spatial and temporal...

Detection of objects in noisy images based on percolation theory

arXiv: Statistics Theory, 2011

P. L. Davies, M. Langovoy and O. Wittich. We propose a novel statistical method for detection of objects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of unknown shapes in the presence of nonparametric noise of unknown level. The noise density is assumed to be unknown and can be very irregular. Our procedure substantially differs from wavelets-based algorithms. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure. Keywords and phrases: Image analysis, signal detection, image reconstruction, percolation, noisy image, unknown noise.

Efficient tests for the deconvolution hypothesis

Computationally efficient algorithms for statistical image processing. Implementation in R

ArXiv, 2011

In the series of our earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that allowed to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

Machine Learning and Statistical Analysis for BRDF Data from Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function. BRDF is one of the fundamental concepts in such diverse fields as multidimensional reflectometry, computer graphics and computer vision. In this paper, we treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifolds. We argue that any realistic statistical analysis of BRDF measurements, or any parameter or manifold learning procedure applied to BRDF measurements has to account both for nonlinear structure of the data as well as for a very ill-behaved noise. Standard statistical and machine learning methods can not be safely directly applied to BRDF data. We discuss the differences and the common points of data analysis and modelling for BRDFs in both physical as well as in ...

Multiple testing, uncertainty and realistic pictures

arXiv: Statistics Theory, 2011

We study statistical detection of grayscale objects in noisy images. The object of interest is of unknown shape and has an unknown intensity, that can be varying over the object and can be negative. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object's interior is required. We propose an algorithm that can be used to detect grayscale objects of unknown shapes in the presence of nonparametric noise of unknown level. Our algorithm is based on a nonparametric multiple testing procedure. We establish the limit of applicability of our method via an explicit, closed-form, non-asymptotic and nonparametric consistency bound. This bound is valid for a wide class of nonparametric noise distributions. We achieve this by proving an uncertainty principle for percolation on finite lattices.

Data-driven goodness-of-fit tests

arXiv: Statistics Theory, 2007

We propose and study a general method for construction of consistent statistical tests on the basis of possibly indirect, corrupted, or partially available observations. The class of tests devised in the paper contains Neyman's smooth tests, data-driven score tests, and some types of multi-sample tests as basic examples. Our tests are data-driven and are additionally incorporated with model selection rules. The method allows to use a wide class of model selection rules that are based on the penalization idea. In particular, many of the optimal penalties, derived in statistical literature, can be used in our tests. We establish the behavior of model selection rules and data-driven tests under both the null hypothesis and the alternative hypothesis, derive an explicit detectability rule for alternative hypotheses, and prove a master consistency theorem for the tests from the class. The paper shows that the tests are applicable to a wide range of problems, including hypothesis test...

Statistical estimation for optimization problems on graphs

ArXiv, 2011

Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is that of obtaining certain summary statistics, e.g., the expected length of a shortest path between two nodes, or the expected weight of a minimum spanning tree of the graph. These statistics provide insight into the structure of a graph, and they can help predict global properties of a graph. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable; so for concreteness we describe a more specific statistical estimation problem based on spanning trees. We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.

Generalized active learning and design of statistical experiments for manifold-valued data

ArXiv, 2019

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function (BRDF). We treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifolds. BRDF manifolds form an infinite-dimensional space, but typically the available measurements are very scarce for complicated problems such as BRDF estimation. Therefore, an efficient learning strategy is crucial when performing the measurements. In this paper, we build the foundation of a mathematical framework that allows to develop and apply new techniques within statistical design of experiments and generalized proactive learning, in order to establish more efficient sampling and measurement strategies for BRDF data manifolds.

Approach to Data Analysis and Modeling of the Appearance of Materials for Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. In this paper, we outline a unified perception-based approach to modeling of the appearance of materials for computer graphics and reflectometry. We discuss the differences and the common points of data analysis and modeling for BRDFs in both physical and in virtual application domains. We outline a mathematical framework that captures important problems in both types of application domains, and allows for application and performance comparisons of statistical and machine learning methods. For comparisons between methods, we use criteria that are relevant to both statistics and machine learning, as well as to both virtual and physical application domains. Additionally, we propose a class of multiple testing procedures to test a hypothesis that a material has diffuse reflection in a generalized sense. We treat a general case where the number of hy...

Research paper thumbnail of Unsupervised nonparametric detection of unknown objects in noisy images based on percolation theory

arXiv (Cornell University), Feb 24, 2011

We develop an unsupervised, nonparametric, and scalable statistical learning method for detection... more We develop an unsupervised, nonparametric, and scalable statistical learning method for detection of unknown objects in noisy images. The method uses results from percolation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes and sizes in the presence of nonparametric noise of unknown level. The noise density is assumed to be unknown and can be very irregular. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. We prove strong consistency and scalability of our method in this setup with minimal assumptions.

Research paper thumbnail of Unsupervised robust nonparametric learning of hidden community properties

arXiv (Cornell University), Jul 11, 2017

We consider learning of fundamental properties of communities in large noisy networks, in the pro... more We consider learning of fundamental properties of communities in large noisy networks, in the prototypical situation where the nodes or users are split into two classes according to a binary property, e.g., according to their opinions or preferences on a topic. For learning these properties, we propose a nonparametric, unsupervised, and scalable graph scan procedure that is, in addition, robust against a class of powerful adversaries. In our setup, one of the communities can fall under the influence of a knowledgeable adversarial leader, who knows the full network structure, has unlimited computational resources and can completely foresee our planned actions on the network. We prove strong consistency of our results in this setup with minimal assumptions. In particular, the learning procedure estimates the baseline activity of normal users asymptotically correctly with probability 1; the only assumption being the existence of a single implicit community of asymptotically negligible logarithmic size. We provide experiments on real and synthetic data to illustrate the performance of our method, including examples with adversaries.

Research paper thumbnail of and Olaf Wittich

Abstract: We propose a novel statistical hypothesis testing method for detection of objects in no... more Abstract: We propose a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from per-colation theory and random graph theory. We present an algorithm that allows to detect objects of unknown shapes in the presence of nonpara-metric noise of unknown level and of unknown distribution. No boundary shape constraints are imposed on the object, only a weak bulk condition for the object’s interior is required. The algorithm has linear complexity and exponential accuracy and is appropriate for real-time systems. In this paper, we develop further the mathematical formalism of our method and explore important connections to the mathematical theory of percolation and statistical physics. We prove results on consistency and algorithmic complexity of our testing procedure. In addition, we address not only an asymptotic behavior of the method, but also a finite sample performance of our test.

Research paper thumbnail of Prediction of cybersickness in virtual environments using topological data analysis and machine learning

Frontiers in Virtual Reality

Recent significant progress in Virtual Reality (VR) applications and environments raised several ... more Recent significant progress in Virtual Reality (VR) applications and environments raised several challenges. They proved to have side effects on specific users, thus reducing the usability of the VR technology in some critical domains, such as flight and car simulators. One of the common side effects is cybersickness. Some significant commonly reported symptoms are nausea, oculomotor discomfort, and disorientation. To mitigate these symptoms and consequently improve the usability of VR systems, it is necessary to predict the incidence of cybersickness. This paper proposes a machine learning approach to VR’s cybersickness prediction based on physiological and subjective data. We investigated combinations of topological data analysis with a range of classifier algorithms and assessed classification performance. The highest performance of Topological Data Analysis (TDA) based methods was achieved in combination with SVMs with Gaussian RBF kernel, indicating that Gaussian RBF kernels pr...

Research paper thumbnail of Computationally efficient algorithms for statistical image processing. implementation in r

Abstract: In the series of our earlier papers on the subject, we proposed a novel statistical hyp... more Abstract: In the series of our earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that allowed to detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects, only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.

Research paper thumbnail of A Unified Approach to Data Analysis and Modeling of the Appearance of Materials for Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional... more Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional , computer vision and computer graphics. In this paper, we outline a unified perception-based approach to modeling of the appearance of materials for computer graphics and reflectometry. We discuss the differences and the common points of data analysis and modeling for BRDFs in both physical and in virtual application domains. We outline a mathematical framework that captures important problems in both types of application domains, and allows for application and performance comparisons of statistical and machine learning methods. For comparisons between methods, we use criteria that are relevant to both statistics and machine learning, as well as to both virtual and physical application domains. Additionally, we propose a class of multiple testing procedures to test a hypothesis that a material has diffuse reflection in a generalized sense. We treat a general case where the number of hy...

Research paper thumbnail of xD-Reflect - >Multidimensional Reflectometry for Industry> a research project of the European Metrology Research Program (EMRP)

Andreas Hope et al.; 12th International Conference, Otaniemi, Espoo, Helsinki (Finland), 24-27 Ju... more Andreas Hope et al.; 12th International Conference, Otaniemi, Espoo, Helsinki (Finland), 24-27 June, 2014; http://newrad2014.aalto.fi/

Research paper thumbnail of Detection of objects in noisy images and site percolation on square lattices

ArXiv, 2011

We propose a novel probabilistic method for detection of ob- jects in noisy images. The method us... more We propose a novel probabilistic method for detection of ob- jects in noisy images. The method uses results from percolation and random graph theories. We present an algorithm that allows to detect objects of un- known shapes in the presence of random noise. Our procedure substantially diers from wavelets-based algorithms. The algorithm has linear complex- ity and exponential accuracy and is appropriate for real-time systems. We prove results on consistency and algorithmic complexity of our procedure.

Research paper thumbnail of Statistical Analysis of BRDF Data for Computer Graphics and Metrology

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional... more Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function. BRDF is one of the fundamental concepts in such diverse fields as multidimensional reflectometry, computer graphics and computer vision. In this paper, we treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifold. We argue that statistical data analysis of BRDF measurements has to account both for nonlinear structure of the data as well as for ill-behaved noise. Standard statistical methods can not be safely directly applied to BRDF data. Our study clarifies certain pitfalls in analysis of BRDF data, and helps to develop more refined estimates and unsupervised learning procedures for BRDF models. We also apply the notion of Pitman closeness to compare different estimators...

Research paper thumbnail of Randomized algorithms for statistical image analysis based on percolation theory

Research paper thumbnail of Ein hybrides RNN-Modell für die mittel- bis langfristige Vorhersage des Strombedarfs unter Berücksichtigung von Wettereinflüssen

at - Automatisierungstechnik, 2021

Zusammenfassung Im täglichen Stadtbetrieb sollte die Stromversorgung unterbrechungsfrei sein, was... more Zusammenfassung Im täglichen Stadtbetrieb sollte die Stromversorgung unterbrechungsfrei sein, was das moderne Energiemanagement vor Herausforderungen stellt. Die Prognose des Energiebedarfs kann die Strategie des Energiemanagements optimieren und die Energieeffizienz verbessern. Das traditionelle LSTM-Modell, das auf einer Codierungs-Decodierungs-Struktur basiert, codiert alle historischen Informationen als Vektor fester Länge, was zum Informationsverlust führt, wenn der vorhergesagte Wert von den Merkmalen abhängt die weit in der Vergangenheit liegen. Dies ist bei Energieprognosen aufgrund der Periodizität des Energieverbrauchs üblich. Um das oben genannte Problem zu lösen und das Potenzial der Betriebsdaten von Kraftwerken für Energievorhersagen vollständig auszuschöpfen, wird in diesem Artikel ein Energievorhersagemodell vorgeschlagen, das auf dem Aufmerksamkeitsmechanismus basiert. Ausgehend von der traditionellen Codierungs-Decodierungs-Architektur wird der räumliche und zeitli...

Research paper thumbnail of Detection of objects in noisy images based on percolation theory

arXiv: Statistics Theory, 2011

P. L. Davies, M. Langovoy and O. Wittich/Detection in noisy images and percolation.2Abstract: We ... more P. L. Davies, M. Langovoy and O. Wittich/Detection in noisy images and percolation.2Abstract: We propose a novel statistical method for detection of objectsin noisy images. The method uses results from percolation and randomgraph theories. We present an algorithm that allows to detect objectsof unknown shapes in the presence of nonparametric noise of unknownlevel. The noise density is assumed to be unknown and can be veryirregular. Our procedure substantially di ers from wavelets-based algo-rithms. The algorithm has linear complexity and exponential accuracyand is appropriate for real-time systems. We prove results on consistencyand algorithmic complexity of our procedure.Keywords and phrases: Image analysis, signal detection, image re-construction, percolation, noisy image, unknown noise.

Research paper thumbnail of Efficient tests for the deconvolution hypothesis

Research paper thumbnail of Computationally efficient algorithms for statistical image processing. Implementation in R

ArXiv, 2011

In a series of earlier papers on the subject, we proposed a novel statistical hypothesis testing method for detection of objects in noisy images. The method uses results from percolation theory and random graph theory. We developed algorithms that detect objects of unknown shapes in the presence of nonparametric noise of unknown level and of unknown distribution. No boundary shape constraints were imposed on the objects; only a weak bulk condition for the object's interior was required. Our algorithms have linear complexity and exponential accuracy. In the present paper, we describe an implementation of our nonparametric hypothesis testing method. We provide a program that can be used for statistical experiments in image processing. This program is written in the statistical programming language R.
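The general idea behind percolation-based detection can be illustrated with a minimal Python sketch: threshold the pixels, then check whether the largest connected cluster of above-threshold pixels is unusually large. This is not the authors' R program. The function names, the 4-neighbour connectivity, and the parameters `threshold` and `min_cluster` are assumptions made for illustration; the papers derive their own choices for these quantities.

```python
from collections import deque


def threshold_image(image, threshold):
    """Turn a grayscale image (list of rows) into a 0/1 image."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def largest_cluster_size(binary):
    """Size of the largest 4-connected cluster of 1-pixels.

    Breadth-first search marks every pixel at most once, so the running
    time is linear in the number of pixels.
    """
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if binary[r][c] != 1 or seen[r][c]:
                continue
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            best = max(best, size)
    return best


def detect_object(image, threshold, min_cluster):
    """Report an object if some above-threshold cluster has at least
    min_cluster pixels (both parameters are illustrative assumptions)."""
    return largest_cluster_size(threshold_image(image, threshold)) >= min_cluster
```

Calling `detect_object(image, threshold, min_cluster)` on a list of pixel rows returns `True` when a sufficiently large cluster survives thresholding. Isolated noisy pixels form only small clusters, which is the intuition the sketch tries to capture.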

Research paper thumbnail of Machine Learning and Statistical Analysis for BRDF Data from Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function (BRDF), one of the fundamental concepts in all three fields. In this paper, we treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifolds. We argue that any realistic statistical analysis of BRDF measurements, or any parameter or manifold learning procedure applied to BRDF measurements, has to account both for the nonlinear structure of the data and for very ill-behaved noise. Standard statistical and machine learning methods cannot safely be applied directly to BRDF data. We discuss the differences and the common points of data analysis and modelling for BRDFs in both physical as well as in ...

Research paper thumbnail of Multiple testing, uncertainty and realistic pictures

arXiv: Statistics Theory, 2011

We study statistical detection of grayscale objects in noisy images. The object of interest is of unknown shape and has an unknown intensity that can vary over the object and can be negative. No boundary shape constraints are imposed on the object; only a weak bulk condition for the object's interior is required. We propose an algorithm that can be used to detect grayscale objects of unknown shapes in the presence of nonparametric noise of unknown level. Our algorithm is based on a nonparametric multiple testing procedure. We establish the limit of applicability of our method via an explicit, closed-form, non-asymptotic and nonparametric consistency bound. This bound is valid for a wide class of nonparametric noise distributions. We achieve this by proving an uncertainty principle for percolation on finite lattices.

Research paper thumbnail of Data-driven goodness-of-fit tests

arXiv: Statistics Theory, 2007

We propose and study a general method for construction of consistent statistical tests on the basis of possibly indirect, corrupted, or partially available observations. The class of tests devised in the paper contains Neyman's smooth tests, data-driven score tests, and some types of multi-sample tests as basic examples. Our tests are data-driven and additionally incorporate model selection rules. The method allows the use of a wide class of model selection rules that are based on the penalization idea. In particular, many of the optimal penalties derived in the statistical literature can be used in our tests. We establish the behavior of model selection rules and data-driven tests under both the null hypothesis and the alternative hypothesis, derive an explicit detectability rule for alternative hypotheses, and prove a master consistency theorem for the tests from the class. The paper shows that the tests are applicable to a wide range of problems, including hypothesis test...
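As a rough illustration of what a data-driven smooth test looks like, the Python sketch below implements the classical data-driven Neyman smooth test of uniformity on [0, 1], with the dimension chosen by a Schwarz-type penalty `k * log(n)`. It is only a simple special case and not the general construction of the paper, which also covers indirect and corrupted observations. The function names and the parameter `max_dim` are assumptions made for this example.

```python
import math


def legendre_basis(u, k):
    """Values phi_1(u), ..., phi_k(u) of the orthonormal shifted Legendre
    polynomials on [0, 1], computed with the three-term recurrence."""
    x = 2.0 * u - 1.0
    p_prev, p_cur = 1.0, x  # P_0(x) and P_1(x)
    values = []
    for j in range(1, k + 1):
        values.append(math.sqrt(2 * j + 1) * p_cur)
        # (j + 1) P_{j+1}(x) = (2j + 1) x P_j(x) - j P_{j-1}(x)
        p_prev, p_cur = p_cur, ((2 * j + 1) * x * p_cur - j * p_prev) / (j + 1)
    return values


def data_driven_smooth_test(sample, max_dim):
    """Return (selected dimension, test statistic) for the null hypothesis
    that the sample is uniform on [0, 1].

    N_k is the sum over j <= k of (n^{-1/2} * sum_i phi_j(U_i))^2, and the
    selected dimension is the smallest k maximizing N_k - k * log(n).
    """
    n = len(sample)
    sums = [0.0] * max_dim
    for u in sample:
        for j, value in enumerate(legendre_basis(u, max_dim)):
            sums[j] += value
    stats = []
    total = 0.0
    for s in sums:
        total += s * s / n
        stats.append(total)  # stats[k - 1] holds N_k
    penalized = [stats[k] - (k + 1) * math.log(n) for k in range(max_dim)]
    # max() returns the first maximizer, i.e. the smallest dimension.
    best = max(range(max_dim), key=lambda k: penalized[k])
    return best + 1, stats[best]
```

Large values of the returned statistic speak against uniformity. In practice the statistic would be compared with a critical value, which this sketch does not compute.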

Research paper thumbnail of Statistical estimation for optimization problems on graphs

ArXiv, 2011

Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is that of obtaining certain summary statistics, e.g., the expected length of a shortest path between two nodes, or the expected weight of a minimum spanning tree of the graph. These statistics provide insight into the structure of a graph, and they can help predict global properties of a graph. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable, so for concreteness we describe a more specific statistical estimation problem based on spanning trees. We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.

Research paper thumbnail of Generalized active learning and design of statistical experiments for manifold-valued data

ArXiv, 2019

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. For many applications, appearance is sufficiently well characterized by the bidirectional reflectance distribution function (BRDF). We treat BRDF measurements as samples of points from high-dimensional non-linear non-convex manifolds. BRDF manifolds form an infinite-dimensional space, but typically the available measurements are very scarce for complicated problems such as BRDF estimation. Therefore, an efficient learning strategy is crucial when performing the measurements. In this paper, we build the foundation of a mathematical framework that allows one to develop and apply new techniques within statistical design of experiments and generalized proactive learning, in order to establish more efficient sampling and measurement strategies for BRDF data manifolds.

Research paper thumbnail of Approach to Data Analysis and Modeling of the Appearance of Materials for Computer Graphics and Multidimensional Reflectometry

Characterizing the appearance of real-world surfaces is a fundamental problem in multidimensional reflectometry, computer vision and computer graphics. In this paper, we outline a unified perception-based approach to modeling of the appearance of materials for computer graphics and reflectometry. We discuss the differences and the common points of data analysis and modeling for BRDFs in both physical and virtual application domains. We outline a mathematical framework that captures important problems in both types of application domains, and allows for application and performance comparisons of statistical and machine learning methods. For comparisons between methods, we use criteria that are relevant to both statistics and machine learning, as well as to both virtual and physical application domains. Additionally, we propose a class of multiple testing procedures to test a hypothesis that a material has diffuse reflection in a generalized sense. We treat a general case where the number of hy...