Sourish Das - Academia.edu
Papers by Sourish Das
2017 International Conference on Computational Intelligence in Data Science (ICCIDS), 2017
ArXiv, 2017
We present an algorithm for classification tasks on big data. Experiments conducted as part of this study indicate that the algorithm can be as accurate as ensemble methods such as random forests or gradient boosted trees. Unlike ensemble methods, the models produced by the algorithm can be easily interpreted. The algorithm is based on a divide and conquer strategy and consists of two steps. The first step consists of using a decision tree to segment the large dataset. By construction, decision trees attempt to create homogeneous class distributions in their leaf nodes. However, non-homogeneous leaf nodes are usually produced. The second step of the algorithm consists of using a suitable classifier to determine the class labels for the non-homogeneous leaf nodes. The decision tree segment provides a coarse segment profile while the leaf level classifier can provide information about the attributes that affect the label within a segment.
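The two-step idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the Gini-based split search, the depth limit `max_depth`, and the nearest-centroid rule used inside mixed leaves are all assumptions chosen to keep the example small, since the abstract only says "a suitable classifier". All function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values()) if n else 0.0

def best_split(X, y):
    """Return the (feature, threshold) that most reduces weighted Gini, or None."""
    best, best_score = None, gini(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between adjacent distinct values
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (j, t), score
    return best

def fit_leaf(X, y):
    """Pure leaf: store its label. Mixed leaf: store one centroid per class
    (assumed stand-in for the unspecified leaf-level classifier)."""
    if len(set(y)) == 1:
        return {"label": y[0]}
    centroids = {}
    for label in set(y):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return {"centroids": centroids}

def fit(X, y, max_depth=1, depth=0):
    """Step 1: segment with a shallow tree. Step 2: fit a classifier per leaf."""
    split = best_split(X, y) if depth < max_depth else None
    if split is None:
        return fit_leaf(X, y)
    j, t = split
    left = [i for i, row in enumerate(X) if row[j] <= t]
    right = [i for i, row in enumerate(X) if row[j] > t]
    return {
        "feature": j,
        "threshold": t,
        "left": fit([X[i] for i in left], [y[i] for i in left], max_depth, depth + 1),
        "right": fit([X[i] for i in right], [y[i] for i in right], max_depth, depth + 1),
    }

def predict(node, row):
    """Route a row to its leaf, then apply that leaf's rule."""
    while "feature" in node:
        go_left = row[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    if "label" in node:
        return node["label"]

    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))

    return min(node["centroids"], key=lambda lab: sq_dist(node["centroids"][lab]))

# Toy data: the first attribute separates class "a"; the remaining segment is mixed.
X = [(-2, 1), (-1, 8), (-3, 4), (1, 1), (2, 2), (1, 9), (2, 8)]
y = ["a", "a", "a", "b", "b", "c", "c"]
model = fit(X, y)
print(model["feature"], model["threshold"], predict(model, (3, 0)))
```

In this sketch the tree split gives the coarse segment profile, while the per-class centroids in a mixed leaf hint at which attributes drive the label within that segment.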
Computational Statistics, 2020
Calcutta Statistical Association Bulletin, 2018
Datasets with a mixture of numerical and categorical attributes are routinely encountered in many application domains. Such datasets do not have a direct representation in Euclidean space. As a consequence, dissimilarity measures such as the Gower distance are used when partitioning clustering approaches are used with such datasets. Homogeneity analysis (HA) can be used to determine a Euclidean representation of mixed datasets. Such a representation can be analysed by leveraging the large body of tools and techniques for data with a Euclidean representation. The utility of the representation obtained from HA is not limited to clustering. This representation can be used to visualize mixed datasets and generate succinct numerical summaries. Such summaries can yield clues about associations between variables which may be difficult to discover otherwise. AMS Classification Code: 62-07
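For readers unfamiliar with the Gower distance mentioned in the abstract above, the following is a minimal sketch of its common unweighted form, assuming no missing values: numerical attributes contribute a range-scaled absolute difference and categorical attributes contribute 0 or 1. The function name and the `numeric_ranges` argument are hypothetical, and the example records are made up for illustration.

```python
def gower_distance(a, b, numeric_ranges):
    """Average per-attribute dissimilarity between two mixed-type records.

    numeric_ranges maps the index of each numerical attribute to its range
    (max - min) over the dataset; all other indices are treated as categorical.
    """
    total = 0.0
    for i, (u, v) in enumerate(zip(a, b)):
        if i in numeric_ranges:
            r = numeric_ranges[i]
            # Range-scaled absolute difference, in [0, 1] for in-range values.
            total += abs(u - v) / r if r else 0.0
        else:
            # Simple matching for categorical attributes.
            total += 0.0 if u == v else 1.0
    return total / len(a)

# Hypothetical records: (age, colour, income)
first = (30, "red", 50000)
second = (40, "blue", 50000)
ranges = {0: 20, 2: 100000}  # assumed dataset ranges for age and income
print(gower_distance(first, second, ranges))  # (10/20 + 1 + 0) / 3 = 0.5
```

A dissimilarity of this kind works directly on mixed records, whereas the homogeneity analysis described in the abstract instead produces Euclidean coordinates to which standard tools can be applied.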
New Economic Windows, 2019
Gastrointestinal Endoscopy, 2009
Journal of Clinical Gastroenterology, 2010
Endoscopic ultrasound (EUS)-guided celiac plexus block (CPB) and celiac plexus neurolysis (CPN) have become important interventions in the management of pain due to chronic pancreatitis and pancreatic cancer. However, only a few well-structured studies have been performed to evaluate their efficacy. Given limited data, their use remains controversial. Herein, we evaluate the efficacy of EUS-guided CPB and CPN in alleviating chronic abdominal pain due to chronic pancreatitis and pancreatic cancer, respectively. Using Medline, PubMed, and Embase databases from January 1966 through December 2007, a thorough search of the English literature for studies evaluating the efficacy of EUS-guided CPB and CPN for the management of chronic abdominal pain due to chronic pancreatitis and pancreatic cancer was conducted, along with a hand search of reference lists. Studies that involved fewer than 10 patients were excluded. Data on pain relief were extracted, pooled, and analyzed. A total of 9 studies were included in the final analysis. For chronic pancreatitis, 6 relevant studies were identified, comprising a total of 221 patients. EUS-guided CPB was effective in alleviating abdominal pain in 51.46% of patients. For pancreatic cancer, 5 relevant studies were identified with a total of 119 patients. EUS-guided CPN was effective in alleviating abdominal pain in 72.54% of patients. EUS-guided CPB was 51.46% effective in managing chronic abdominal pain in patients with chronic pancreatitis, but warrants improvement in patient selection and refinement of technique, whereas EUS-guided CPN was 72.54% effective in managing pain due to pancreatic cancer and is a reasonable option for patients with tolerance to narcotic analgesics.
Statistics & Probability Letters, 2010
Alcoholism: Clinical and Experimental Research, 2007
Background: Previous studies demonstrated, and replicated, an association between single nucleotide polymorphisms (SNPs) within the GABRA2 gene and risk for alcohol dependence. The present study examines the association of a GABRA2 SNP with another definition of alcohol involvement and with the effects of psychosocial treatment. Methods: European-American subjects (n = 812, 73.4% male) provided DNA samples for the analysis. All were participants in Project Matching Alcoholism Treatment to Client Heterogeneity (MATCH), a multi-center randomized clinical trial evaluating the efficacy of 3 types of psychosocial treatment for alcoholism: Cognitive Behavioral Therapy (CBT), Motivational Enhancement Therapy (MET), or Twelve-Step Facilitation (TSF). The daily probabilities of drinking and heavy drinking were estimated during the 12-week treatment and 12-month post-treatment periods. Results: Subjects homozygous for the allele associated with low risk for alcohol dependence in previous studies had lower daily probabilities of drinking and heavy drinking in the present study. This low-risk allele was also associated with a greater difference in drinking outcomes between the treatments. In addition, it enhanced the relative superiority of TSF over CBT and MET. Population stratification was excluded as a confound using ancestry informative marker analysis. Conclusions: The assessment of genetic vulnerability may be relevant to studies of the efficacy of psychosocial treatment: GABRA2 genotype modifies the variance in drinking and can therefore moderate power for resolving differences between treatments.
Statistics in Medicine, 2010
American Statistician, 2006