Sourish Das - Academia.edu

Papers by Sourish Das

A statistical machine learning approach to yield curve forecasting

2017 International Conference on Computational Intelligence in Data Science (ICCIDS), 2017

Big Data Classification Using Augmented Decision Trees

ArXiv, 2017

We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble methods, the models produced by the algorithm can be easily interpreted. The algorithm is based on a divide and conquer strategy and consists of two steps. The first step consists of using a decision tree to segment the large dataset. By construction, decision trees attempt to create homogeneous class distributions in their leaf nodes. However, non-homogeneous leaf nodes are usually produced. The second step of the algorithm consists of using a suitable classifier to determine the class labels for the non-homogeneous leaf nodes. The decision tree segment provides a coarse segment profile while the leaf level classifier can provide information about the attributes that affect the label within a segment.
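The two-step idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tree is reduced to a single Gini-based split (a stump), the leaf-level classifier is assumed to be a nearest-centroid rule (the abstract only says "a suitable classifier"), and all function names and the toy data are hypothetical. It also assumes numeric features with at least two distinct values in some column.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y):
    """Step 1 (simplified): one split (feature, threshold) minimising weighted Gini."""
    best = None
    for j in range(len(X[0])):
        # Skip the largest value so that both sides of the split are non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best[1], best[2]

def fit_centroids(X, y):
    """Step 2 (assumed classifier): per-class mean vectors within one leaf."""
    groups = {}
    for row, lab in zip(X, y):
        groups.setdefault(lab, []).append(row)
    return {lab: [sum(col) / len(rows) for col in zip(*rows)]
            for lab, rows in groups.items()}

def fit_augmented_stump(X, y):
    """Segment the data with one split, then fit a classifier inside each leaf."""
    j, t = best_split(X, y)
    leaves = {}
    for side in (True, False):
        Xs = [r for r in X if (r[j] <= t) == side]
        ys = [lab for r, lab in zip(X, y) if (r[j] <= t) == side]
        # A homogeneous leaf yields a single centroid, i.e. a constant label.
        leaves[side] = fit_centroids(Xs, ys)
    return j, t, leaves

def predict(model, row):
    """Route the row to its leaf, then return the label of the nearest centroid."""
    j, t, leaves = model
    centroids = leaves[row[j] <= t]
    return min(centroids,
               key=lambda lab: sum((a - b) ** 2 for a, b in zip(row, centroids[lab])))

# Hypothetical toy data: the left segment is pure; the right one depends on feature 1.
X = [[0, 0], [0, 1], [1, 0], [1, 1], [2, 0], [2, 1],
     [5, 0], [5, 1], [6, 0], [6, 1]]
y = ["a", "a", "a", "a", "a", "a", "b", "a", "b", "a"]
model = fit_augmented_stump(X, y)
predictions = [predict(model, row) for row in X]
```

On this toy data the split separates a pure segment from a mixed one, and the leaf-level rule resolves the mixed segment, which a plain majority vote in that leaf could not do.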

A Bayesian perspective of statistical machine learning for big data

Computational Statistics, 2020

Clustering Mixed Datasets Using Homogeneity Analysis with Applications to Big Data

Calcutta Statistical Association Bulletin, 2018

Datasets with a mixture of numerical and categorical attributes are routinely encountered in many application domains. Such datasets do not have a direct representation in Euclidean space. As a consequence, dissimilarity measures such as the Gower distance are used when partitioning clustering approaches are used with such datasets. Homogeneity analysis (HA) can be used to determine a Euclidean representation of mixed datasets. Such a representation can be analysed by leveraging the large body of tools and techniques for data with a Euclidean representation. The utility of the representation obtained from HA is not limited to clustering. This representation can be used to visualize mixed datasets and generate succinct numerical summaries. Such summaries can yield clues about associations between variables which may be difficult to discover otherwise. AMS Classification Code: 62-07
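For context, the Gower distance mentioned in this abstract can be sketched as follows. This illustrates only the baseline dissimilarity, not homogeneity analysis itself. The common unweighted form is assumed: numeric attributes contribute a range-normalised absolute difference, categorical attributes contribute 0 or 1. The function name, the `ranges` convention, and the sample records are hypothetical.

```python
def gower_distance(a, b, ranges):
    """Unweighted Gower dissimilarity between two mixed-type records.

    ranges[k] is the observed range (max - min) of numeric attribute k,
    or None when attribute k is categorical (an assumed convention).
    """
    total = 0.0
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical attribute: simple mismatch indicator.
            total += 0.0 if x == y else 1.0
        elif r > 0:
            # Numeric attribute: absolute difference scaled by its range.
            total += abs(x - y) / r
        # A numeric attribute with zero range contributes nothing.
    return total / len(ranges)

# Hypothetical records: (age, income, favourite colour).
rec_a = (30, 50000, "red")
rec_b = (40, 70000, "blue")
attr_ranges = (20, 40000, None)
d = gower_distance(rec_a, rec_b, attr_ranges)
```

Here each numeric attribute differs by half of its range and the categorical attribute mismatches, so the three per-attribute terms are 0.5, 0.5 and 1 before averaging.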

Fast Gaussian Process Regression for Big Data

Modeling Nelson–Siegel Yield Curve Using Bayesian Approach

New Economic Windows, 2019

Efficacy of endoscopic ultrasound (EUS) guided celiac plexus neurolysis (CPN) for managing abdominal pain associated with pancreas cancer: a meta-analysis

Gastrointestinal Endoscopy, 2009

Efficacy of Endoscopic Ultrasound-guided Celiac Plexus Block and Celiac Plexus Neurolysis for Managing Abdominal Pain Associated With Chronic Pancreatitis and Pancreatic Cancer

Journal of Clinical Gastroenterology, 2010

Endoscopic ultrasound (EUS)-guided celiac plexus block (CPB) and celiac plexus neurolysis (CPN) have become important interventions in the management of pain due to chronic pancreatitis and pancreatic cancer. However, only a few well-structured studies have been performed to evaluate their efficacy. Given limited data, their use remains controversial. Herein, we evaluate the efficacy of EUS-guided CPB and CPN in alleviating chronic abdominal pain due to chronic pancreatitis and pancreatic cancer, respectively. Using Medline, Pubmed, and Embase databases from January 1966 through December 2007, a thorough search of the English literature for studies evaluating the efficacy of EUS-guided CPB and CPN for the management of chronic abdominal pain due to chronic pancreatitis and pancreatic cancer was conducted, along with a hand search of reference lists. Studies that involved fewer than 10 patients were excluded. Data on pain relief were extracted, pooled, and analyzed. A total of 9 studies were included in the final analysis. For chronic pancreatitis, 6 relevant studies were identified, comprising a total of 221 patients. EUS-guided CPB was effective in alleviating abdominal pain in 51.46% of patients. For pancreatic cancer, 5 relevant studies were identified with a total of 119 patients. EUS-guided CPN was effective in alleviating abdominal pain in 72.54% of patients. EUS-guided CPB was 51.46% effective in managing chronic abdominal pain in patients with chronic pancreatitis, but warrants improvement in patient selection and refinement of technique, whereas EUS-guided CPN was 72.54% effective in managing pain due to pancreatic cancer and is a reasonable option for patients with tolerance to narcotic analgesics.

S1332 Efficacy of Endoscopic Ultrasound (EUS) Guided Celiac Plexus Block (CPB) for Managing Abdominal Pain Associated with Chronic Pancreatitis (CP): A Meta-Analysis

M2032 Efficacy of Endoscopic Ultrasound (EUS) Guided Celiac Plexus Neurolysis (CPN) for Managing Abdominal Pain Associated with Pancreas Cancer: A Meta-Analysis

On Bayesian Analysis of Generalized Linear Models: A New Perspective

On Bayesian inference for generalized multivariate gamma distribution

Statistics & Probability Letters, 2010

Variation in GABRA2 Predicts Drinking Behavior in Project MATCH Subjects

Alcoholism: Clinical and Experimental Research, 2007

Background: Previous studies demonstrated, and replicated, an association between single nucleotide polymorphisms (SNPs) within the GABRA2 gene and risk for alcohol dependence. The present study examines the association of a GABRA2 SNP with another definition of alcohol involvement and with the effects of psychosocial treatment. Methods: European-American subjects (n = 812, 73.4% male) provided DNA samples for the analysis. All were participants in Project Matching Alcoholism Treatment to Client Heterogeneity (MATCH), a multi-center randomized clinical trial evaluating the efficacy of 3 types of psychosocial treatment for alcoholism: Cognitive Behavioral Therapy (CBT), Motivational Enhancement Therapy (MET), or twelve-step facilitation (TSF). The daily probabilities of drinking and heavy drinking were estimated during the 12-week treatment and 12-month post-treatment periods. Results: Subjects homozygous for the allele associated with low risk for alcohol dependence in previous studies had lower daily probabilities of drinking and heavy drinking in the present study. This low-risk allele was also associated with a greater difference in drinking outcomes between the treatments. In addition, it enhanced the relative superiority of TSF over CBT and MET. Population stratification was excluded as a confound using ancestry informative marker analysis. Conclusions: The assessment of genetic vulnerability may be relevant to studies of the efficacy of psychosocial treatment: GABRA2 genotype modifies the variance in drinking and can therefore moderate power for resolving differences between treatments.

Analysis of extreme drinking in patients with alcohol dependence using Pareto regression

Statistics in Medicine, 2010

On Bayesian Analysis of Generalized Linear Models Using the Jacobian Technique

The American Statistician, 2006
