Sujit Ghosh | North Carolina State University

Papers by Sujit Ghosh

Probabilistic Detection and Estimation of Conic Sections from Noisy Data

arXiv (Cornell University), Oct 30, 2019

Distributional outcome regression via quantile functions and its application to modelling continuously monitored heart rate and physical activity

arXiv (Cornell University), Jan 26, 2023

Dual Efficient Forecasting Framework for Time Series Data

arXiv (Cornell University), Oct 27, 2022

Shape-constrained estimation in functional regression with Bernstein polynomials

Computational Statistics & Data Analysis

A Model for Overdispersion and Underdispersion using Latent Markov Processes

Dynamic correlation multivariate stochastic volatility with latent factors

Statistica Neerlandica, 2017

Modeling the correlation structure of returns is essential in many financial applications. Considerable evidence from empirical studies has shown that the correlation among asset returns is not stable over time. A recent development in the multivariate stochastic volatility literature is the application of inverse Wishart processes to characterize the evolution of return correlation matrices. Within the inverse Wishart multivariate stochastic volatility framework, we propose a flexible correlated latent factor model to achieve dimension reduction and capture the stylized fact of ‘correlation breakdown’ simultaneously. The parameter estimation is based on existing Markov chain Monte Carlo methods. We illustrate the proposed model with several empirical studies. In particular, we use high‐dimensional stock return data to compare our model with competing models based on multiple performance metrics and tests. The results show that the proposed model not only describes historic stylized...

Data transforming augmentation for heteroscedastic models

Journal of Computational and Graphical Statistics, 2020

A comparative study of the dose-response analysis with application to the target dose estimation

Journal of statistical theory and practice, Nov 18, 2016

Generalized Linear Models

CRC Press eBooks, May 25, 2000

Model Validation of a Single Degree-of-Freedom Oscillator: A Case Study

Stats

In this paper, we investigate a validation process in order to assess the predictive capabilities of a single degree-of-freedom oscillator. Model validation is understood here as the process of determining the accuracy with which a model can predict observed physical events or important features of the physical system. Therefore, assessment of the model needs to be performed with respect to the conditions under which the model is used in actual simulations of the system and to specific quantities of interest used for decision-making. Model validation also supposes that the model be trained and tested against experimental data. In this work, virtual data are produced from a non-linear single degree-of-freedom oscillator, the so-called oracle model, which is supposed to provide an accurate representation of reality. The mathematical model to be validated is derived from the oracle model by simply neglecting the non-linear term. The model parameters are identified via Bayesian updating...

EMFlow: Data Imputation in Latent Space via EM and Deep Flow Models

arXiv (Cornell University), Jun 9, 2021

How High the Hedge: Relationships between Prices and Yields in the Federal Crop Insurance Program

The theory of the natural hedge states that agricultural yields and prices are inversely related. Actuarial rules for U.S. crop revenue insurance assume that dependence between yield and price is constant across all counties within a state and that dependence can be adequately described by the Gaussian copula. We use nonlinear measures of association and a selection of bivariate copulas to empirically characterize spatially-varying dependence between prices and yields and examine premium rate sensitivity for all corn producing counties in the United States. A simulation analysis across copula types and parameter values exposes hypothetical impacts of actuarial changes.

Least Squares Method of Estimation Using Bernstein Polynomials for Density Estimation

Internucleotide Movements during Formation of 16 S rRNA–rRNA Photocrosslinks and their Connection to the 30 S Subunit Conformational Dynamics

Journal of Molecular Biology, Nov 1, 2005

Forecasting the Stock Exchange Rate of Thailand Index by Conditional Heteroscedastic Autoregressive Nonlinear Model with Autocorrelated Errors

Predicting exoplanet mass from radius and incident flux: a Bayesian mixture model

Monthly Notices of the Royal Astronomical Society, 2021

The relationship between mass and radius (M–R relation) is key to inferring planetary compositions and is thus valuable for studies of formation and migration models. However, the M–R relation alone is not enough for planetary characterization because it depends on other confounding variables. This paper provides a non-trivial extension of the M–R relation by including the incident flux as an additional variable. By using Bayesian hierarchical modelling (BHM) that leverages the flexibility of finite mixture models, a probabilistic mass–radius–flux relationship (M–R–F relation) is obtained based on a sample of 319 exoplanets. We find that the flux has a non-negligible impact on the M–R relation, and that this impact is strongest for hot Jupiters. On the population level, planets with higher levels of flux tend to be denser, and high flux could trigger significant mass loss for planets with radii larger than 13R⊕. As a result, failing to account for the flux in mass pr...

Bayesian Analysis of First-Order Markov Models for Autocorrelated Binary Responses

Journal of Statistical Theory and Practice

Multivariate Density Estimation with Missing Data

arXiv: Methodology, 2018

Multivariate density estimation is a popular technique in statistics with wide applications, including regression models allowing for heteroskedasticity in conditional variances. The estimation problems become more challenging when observations are missing in one or more variables of the multivariate vector. A flexible class of mixtures of tensor products of kernel densities is proposed, which allows for easy implementation of imputation methods using Gibbs sampling and is shown to have superior performance compared to some of the existing imputation methods currently available in the literature. Numerical illustrations are provided using several simulated data scenarios, and applications to a couple of case studies are also presented.

Generalized Linear Models: A Bayesian Perspective

Journal of the American Statistical Association, 2001

... The chapter by Basu and Mukhopadhyay presents a semiparametric method to model link functions for the binary response data. ... for considering our proposal. Our special thanks go to Debosri, Swagata and Mou for their encouragements in this project. ...

Nonparametric estimation of multivariate copula using empirical Bayes method

In fields such as finance, insurance, and system reliability, it is often of interest to measure the dependence among variables by modeling a multivariate distribution using a copula. Copula models with parametric assumptions are easy to estimate but can be highly biased when such assumptions are false, while empirical copulas are non-smooth and often not genuine copulas, making inference about dependence challenging in practice. As a compromise, the empirical Bernstein copula provides a smooth estimator, but the estimation of its tuning parameters remains elusive. In this paper, by using the so-called empirical checkerboard copula, we build a hierarchical empirical Bayes model that enables the estimation of a smooth copula function for arbitrary dimensions. The proposed estimator based on the multivariate Bernstein polynomials is itself a genuine copula, and the selection of its dimension-varying degrees is data-dependent. We also show that the proposed copula estimator prov...
