Umang Gupta - Profile on Academia.edu

Papers by Umang Gupta

Lung cancer can present with unilateral atypical facial pain, a rare symptom due to vagus nerve involvement or paraneoplastic syndrome. This manifestation is usually missed, which delays diagnosis and affects prognosis. We discuss the case of a 45-year-old male who presented with right-sided hemifacial pain and normal neurological investigations.

Journal of Health Economics and Outcomes Research

Background: The US population includes 24 million to 29 million people with diagnosed and undiagnosed chronic obstructive pulmonary disease (COPD). Studies have demonstrated the safety and efficacy of single-inhaler triple therapy (SITT) in reducing COPD exacerbations. Long-term population implications of SITT use have not been quantified. Objectives: This simulation-based projection aimed to estimate the potential impact of widespread SITT use on the US COPD population. Methods: Exacerbation and all-cause mortality reductions reported in the Efficacy and Safety of Triple Therapy in Obstructive Lung Disease trial (ETHOS; NCT02465567) were used to project clinical outcomes in US patients meeting ETHOS trial eligibility criteria (ETHOS-Eligible) and patients meeting a practical definition of SITT eligibility (Expanded ETHOS-Eligible). The US COPD population was modeled with 1000 simulations of patient progression over 10 years. Agent characteristics were based on literature and claims...

arXiv (Cornell University), May 6, 2021

Ensuring the privacy of research participants is vital, even more so in healthcare environments. Deep learning approaches to neuroimaging require large datasets, and this often necessitates sharing data between multiple sites, which is antithetical to the privacy objectives. Federated learning is a commonly proposed solution to this problem. It circumvents the need for data sharing by sharing parameters during the training process. However, we demonstrate that allowing access to parameters may leak private information even if data is never directly shared. In particular, we show that it is possible to infer if a sample was used to train the model given only access to the model prediction (black-box) or access to the model itself (white-box) and some leaked samples from the training data distribution. Such attacks are commonly referred to as Membership Inference attacks. We show realistic Membership Inference attacks on deep learning models trained for 3D neuroimaging tasks in a centralized as well as decentralized setup. We demonstrate feasible attacks on brain age prediction models (deep learning models that predict a person's age from their brain MRI scan). We correctly identified whether an MRI scan was used in model training with a 60% to over 80% success rate depending on model complexity and security assumptions.

Nitroisobenzofuranone, a small molecule inhibitor of multidrug-resistant Staphylococcus aureus, targets peptidoglycan biosynthesis

Chemical Communications

To target antimicrobial resistance, 4-nitroisobenzofuran-1(3H)-one (IITK2020) is presented as an exclusive inhibitor of S. aureus, including drug-resistant S. aureus clinical strains, that prevents peptidoglycan biosynthesis.

Proceedings of the 2nd Workshop on Trustworthy Natural Language Processing (TrustNLP 2022)

The widespread use of Artificial Intelligence (AI) in consequential domains, such as healthcare and parole decision-making systems, has drawn intense scrutiny of the fairness of these methods. However, ensuring fairness is often insufficient, as the rationale for a contentious decision needs to be audited, understood, and defended. We propose that the attention mechanism can be used to ensure fair outcomes while simultaneously providing feature attributions to account for how a decision was made. Toward this goal, we design an attention-based model that can be leveraged as an attribution framework. It can identify features responsible for both performance and fairness of the model through attention interventions and attention weight manipulation. Using this attribution framework, we then design a post-processing bias mitigation strategy and compare it with a suite of baselines. We demonstrate the versatility of our approach by conducting experiments on two distinct data types, tabular and textual.

arXiv (Cornell University), Apr 26, 2022

To improve federated training of neural networks, we develop FedSparsify, a sparsification strategy based on progressive weight magnitude pruning. Our method has several benefits. First, since the network becomes progressively smaller, computation and communication costs during training are reduced. Second, the models are incrementally constrained to a smaller set of parameters, which facilitates alignment/merging of the local models and improves learning performance at high sparsification rates. Third, the final sparsified model is significantly smaller, which improves inference efficiency and optimizes operation latency during encrypted communication. We show experimentally that FedSparsify learns a subnetwork of both high sparsity and learning performance. Our sparse models can reach a tenth of the size of the original model with the same or better accuracy compared to existing pruning and non-pruning baselines.
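
As a rough illustration of progressive weight magnitude pruning, the sketch below zeroes out an increasing fraction of the smallest-magnitude weights over several rounds. It is not the FedSparsify implementation: the function names, the linear sparsity schedule, and the toy weight vector are assumptions made for this example, and the federated training and aggregation steps are omitted.

```python
def magnitude_prune(weights, sparsity):
    """Return a copy of `weights` with the smallest-magnitude fraction zeroed."""
    k = round(len(weights) * sparsity)  # number of weights to zero out
    # Indices sorted by absolute value. Weights that are already zero sort
    # first, so earlier pruning decisions are kept as sparsity grows.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:k])
    return [0.0 if i in pruned else w for i, w in enumerate(weights)]


def sparsity_at(round_idx, total_rounds, final_sparsity):
    """Hypothetical linear schedule that ramps up to `final_sparsity`."""
    return final_sparsity * (round_idx + 1) / total_rounds


# Toy weight vector (made up for illustration).
weights = [0.9, -0.05, 0.4, 0.02, -1.3, 0.3, -0.6, 0.1, 0.75, -0.2]
for r in range(3):
    # In a federated setting, local training and model aggregation
    # would happen here before each pruning step (omitted).
    weights = magnitude_prune(weights, sparsity_at(r, 3, 0.9))

print(weights)
```

With a final sparsity of 0.9, only one of the ten toy weights survives, which mirrors the "tenth of the size" regime mentioned in the abstract without making any claim about accuracy.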

Ordering policy based routing and bandwidth assignment algorithms in optical networks

2017 International Conference on Information, Communication, Instrumentation and Control (ICICIC), 2017

Today, Internet traffic is crowded with emerging bandwidth-hungry multimedia services. These services require dynamic bandwidth, which may be high, low, or moderate. To serve such connection demands, we propose a novel routing and bandwidth assignment (RBA) algorithm. To utilize network resources efficiently and reduce the connection blocking probability, we propose several RBA algorithms based on ordering policies. RBA must satisfy two constraints, wavelength continuity and wavelength contiguity, which makes it different from, and more complex than, the traditional routing and wavelength assignment (RWA) technique. In the ordering-policy-based RBA algorithms, the entire set of traffic demands is arranged by route length, by bandwidth requirement, or by both. We performed extensive simulation experiments in MATLAB and compared the proposed policies with the simple RBA algorithm. The results show that some of the proposed ordering policies outperform simple RBA in terms of blocking probability and resource utilization ratio.

Proceedings of the 10th International Joint Conference on Computational Intelligence, 2018

In our day-to-day life, we come across situations that are interpreted differently by different people. A given sentence may be offensive to some people but not to others. Similarly, a sentence can convey different emotions to different people. For instance, "Why you never text me!" can be interpreted as either a sad or an angry utterance. The lack of facial expressions and voice modulation makes detecting emotions in textual sentences a hard problem. Some textual sentences are inherently ambiguous, and their true emotion label is difficult to determine. In this paper, we study how to use crowdsourcing for the ambiguous task of determining emotion labels of textual sentences. Crowdsourcing has become one of the most popular means of obtaining large-scale labeled data for supervised learning tasks. However, for our task, due to the intrinsic ambiguity, human annotators differ in their opinions about the underlying emotion of certain sentences. In our work, we harness the multiple perspectives of annotators on ambiguous sentences to improve the performance of an emotion detection model. In particular, we compare our technique against the commonly used technique of majority vote to determine the label of a given sentence. Our results indicate that considering the diverse perspectives of annotators is helpful for the ambiguous task of emotion detection.
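
To make the contrast concrete, the sketch below compares majority-vote labeling with one simple way of retaining annotator disagreement, namely a soft label (the empirical distribution over emotion classes). The abstract does not say how the authors use the multiple perspectives, so the soft-label approach, the function names, and the annotations shown here are assumptions for illustration only.

```python
from collections import Counter


def majority_vote(labels):
    """Collapse annotations into the single most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def soft_label(labels, classes):
    """Keep disagreement as an empirical distribution over `classes`."""
    counts = Counter(labels)
    return {c: counts[c] / len(labels) for c in classes}


# Hypothetical annotations for one ambiguous sentence.
annotations = ["sad", "angry", "sad", "angry", "sad"]
classes = ["happy", "sad", "angry"]

print(majority_vote(annotations))        # the minority reading is discarded
print(soft_label(annotations, classes))  # both readings remain visible
```

A model trained on the soft label sees that a substantial share of annotators read the sentence as angry, information that the majority vote throws away.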

Optimizing medium access using integer sequences in wireless networks

2017 International Conference on Communication and Signal Processing (ICCSP), 2017

The IEEE 802.11 Distributed Coordination Function (DCF) treats the backoff algorithm as a core element of Media Access Control (MAC) protocols, because it minimizes collisions in wireless networks. This paper introduces an Integer Sequences based Backoff Algorithm (ISBA), which exploits exponential, cubic, Jacobsthal, and Catalan integer sequences to determine a proper Contention Window (CW) size. Based on backoff stages and the acknowledgment failure count, these integer sequences are used to obtain an adequate growth rate for the CW. This leads to safe medium access that curtails end-to-end delay and collisions among competing stations. The performance of the proposed ISBA algorithm is evaluated by computing throughput, packet loss, and delay using the NS2 simulator. Based on the simulation results, it is inferred that the proposed algorithm outperforms Pessimistic Fibonacci Backoff (PFB) and Binary Exponential Backoff (BEB), especially in the static grid and random environment topologies.
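
The four integer sequences named above grow at very different rates, which is what makes them candidates for controlling contention window growth. The sketch below generates each sequence and plugs it into a hypothetical contention window rule. The abstract does not give the ISBA mapping, so `contention_window`, its formula, and the `cw_min`/`cw_max` defaults are assumptions for illustration; with the exponential sequence the rule reduces to the familiar binary exponential backoff doubling.

```python
from math import comb


def exponential(n):
    return 2 ** n


def cubic(n):
    return n ** 3


def jacobsthal(n):
    """J(0) = 0, J(1) = 1, J(n) = J(n-1) + 2 * J(n-2)."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, b + 2 * a
    return a


def catalan(n):
    """C(n) = binomial(2n, n) / (n + 1)."""
    return comb(2 * n, n) // (n + 1)


def contention_window(seq, stage, cw_min=15, cw_max=1023):
    """Hypothetical rule (not the ISBA formula): scale the window by the
    sequence value at the current backoff stage and clamp it to cw_max."""
    growth = max(1, seq(stage))  # never shrink below the minimum window
    return min(cw_max, (cw_min + 1) * growth - 1)


for name, seq in [("exponential", exponential), ("cubic", cubic),
                  ("jacobsthal", jacobsthal), ("catalan", catalan)]:
    print(name, [contention_window(seq, s) for s in range(7)])
```

Printing the windows side by side shows how quickly each sequence reaches the cap, which is the kind of growth-rate trade-off the abstract describes.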

Automatic categorization of computer science research papers using just the abstracts is a hard problem to solve. This is due to the short text length of the abstracts. In addition, abstracts are a general discussion of the topic with few domain-specific terms. These factors make it hard to generate good representations of abstracts, which in turn leads to poor categorization performance. To address this challenge, external Knowledge Bases (KB) such as Wikipedia and Freebase can be used to enrich the representations of abstracts, which can aid the categorization task. In this work, we propose a novel method for enhancing the classification of research papers into ACM computer science categories using knowledge extracted from related Wikipedia articles and Freebase entities. We use state-of-the-art representation learning methods for feature representation of documents, followed by a learning-to-rank method for classification. Given the abstracts of research papers from the Citatio...

International Journal of Scientific Research in Science and Technology, 2021

The aim of this paper is to investigate the extent to which communication has contributed to India’s rural development. A further aim is to assess the successes and failures of the different tools of communication in the rural development journey, in which many are striving to improve the quality of rural life. In this article we analyze some communication projects intended for rural development. We also discuss the post-liberalization impact on media for rural and urban development. This paper traces the history of print and electronic media use for rural development. Initially it describes the different activities initiated by government and non-government organizations that are in use as part of developmental practice. It also investigates some of the issues that have affected the implementation of rural development programs and argues that different tools of mass media could overcome some of the flaws in implementation. The ...

In this project, we tackle the problem of extractive reading comprehension (RC) where answers may or may not be in the paragraph. We start with QANet (Yu et al., 2018) as our base model and, as baselines, implement some of the existing modifications for handling questions that cannot be answered. We summarize the results of our baselines and discuss further directions.

ArXiv, 2017

Emotions are physiological states generated in humans in reaction to internal or external events. They are complex and studied across numerous fields including computer science. As humans, on reading "Why don't you ever text me!" we can either interpret it as a sad or angry emotion and the same ambiguity exists for machines. Lack of facial expressions and voice modulations make detecting emotions from text a challenging problem. However, as humans increasingly communicate using text messaging applications, and digital agents gain popularity in our society, it is essential that these digital agents are emotion aware, and respond accordingly. In this paper, we propose a novel approach to detect emotions like happy, sad or angry in textual conversations using an LSTM based Deep Learning model. Our approach consists of semi-automated techniques to gather training data for our model. We exploit advantages of semantic and sentiment based embeddings and propose a solution com...

Lecture Notes in Computer Science, 2019

Finding Nash equilibrium in continuous action spaces is a challenging problem and has applications in domains such as protecting geographic areas from potential attackers. We present DeepFP, an approximate extension of fictitious play in continuous action spaces. DeepFP represents players' approximate best responses via generative neural networks which are highly expressive implicit density approximators. It additionally uses a game-model network which approximates the players' expected payoffs given their actions, and trains the networks end-to-end in a model-based learning regime. Further, DeepFP allows using domain-specific oracles if available and can hence exploit techniques such as mathematical programming to compute best responses for structured games. We demonstrate stable convergence to Nash equilibrium on several classic games and also apply DeepFP to a large forest security domain with a novel defender best response oracle. We show that DeepFP learns strategies robust to adversarial exploitation and scales well with a growing number of players' resources.

A New Neighborhood-Based Outlier Detection Technique

Lecture Notes in Electrical Engineering, 2019

Outlier detection is one of the most vital issues in data mining. We propose a new neighborhood-based technique to detect and analyze outliers. The weights of the neighbors of each data point, together with a unique parameter, OBN, are used to identify outliers. Our proposed algorithm is tested on real datasets and compared with an existing technique, and the results are presented.

Y STR haplotype diversity in central Indian population

Annals of Human Biology, 2015

Seventeen Y-STR loci (DYS19, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS385a/b, DYS437, DYS438, DYS439, DYS448, DYS456, DYS458, DYS635 and Y-GATA-H4) were analysed in 173 males belonging to the central Indian population with the aim of studying genetic diversity and adding to the population database. Multiplexed PCR amplifications of the 17 Y-STR loci were performed using the AmpFlSTR® Yfiler® Kit. Amplified products were genotyped using multi-capillary electrophoresis with POP-4 polymer in an ABI Prism 3100 Genetic Analyzer. Population genetic diversity and allele frequencies were calculated. The haplotype data obtained in the study were compared with the Y-STR haplotype reference database (YHRD, http://www.yhrd.org ) and with previously published population data using the AMOVA tool and visualised in two-dimensional multidimensional scaling (MDS) plots. A total of 147 haplotypes were observed, out of which 125 were unique. Haplotype diversity and discriminating capacity were found to be 0.9979 and 0.8497, respectively. The gene diversity at the loci ranged from 0.398 to 0.785. Genotype diversity at the locus DYS385a/b was found to be 0.869. The population of central India was found to be significantly different (p < 0.05) when compared with populations from other parts of the Indian sub-continent and the population data of other countries. The population data generated in this study are useful for forensic, anthropological and demographic studies.
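
The reported discriminating capacity can be checked directly from the counts in the abstract, since it is conventionally the number of distinct haplotypes divided by the number of individuals sampled. The sketch below does that arithmetic and also shows a commonly used estimator of haplotype diversity (Nei's formula). The abstract does not state which estimator was used, and the per-haplotype counts are not given, so the diversity function is demonstrated on made-up counts only.

```python
def discriminating_capacity(n_haplotypes, n_samples):
    """Distinct haplotypes observed divided by individuals sampled."""
    return n_haplotypes / n_samples


def haplotype_diversity(counts):
    """Nei's estimator: n / (n - 1) * (1 - sum of squared frequencies).
    `counts` holds the number of individuals carrying each haplotype."""
    n = sum(counts)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts))


# Counts taken from the abstract: 147 haplotypes among 173 males.
print(round(discriminating_capacity(147, 173), 4))

# Hypothetical toy sample: one haplotype seen twice, two seen once.
print(haplotype_diversity([2, 1, 1]))
```

The first value reproduces the 0.8497 reported above. Reproducing the 0.9979 haplotype diversity would require the full haplotype frequency table, which the abstract does not include.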

International Journal of Pediatric Otorhinolaryngology, 2015

Introduction: Socioeconomic differences have been a major cause of discrepancies in disease and behavioural patterns in society. With 360 million people (32 million children) in the world suffering from disabling hearing loss, it is imperative to gain insight into the impact of socioeconomic differences on children's ear health issues, their knowledge of ear ailments, and their attitude towards ear health, so as to suggest policies addressing ear health issues. Methods: The study was carried out in two different school types, namely government schools and private schools, which represent a wide difference in the socioeconomic status of the students studying there. A questionnaire was administered to students aged 10 to 13 years to assess current ear care practices, knowledge regarding ear ailments, attitude towards hearing, and adaptability to reform. Results: Children of higher socioeconomic status were found to have a lower incidence of ear diseases and ear abuse, more referrals for ear ailments, less indulgence in risky ear health behaviours, a better knowledge pool, and a positive attitude towards ear health and hearing, and were more adaptable to change for better hearing. Conclusion: Structures of social disparity are essential determinants of ear health, acting both independently and through their influence on behavioural determinants of health. Increasing awareness of ear health issues at the school level itself should be one of the goals of health care providers.

Procedia - Social and Behavioral Sciences, 2015

The healthcare supply chain has been a subject of interest for many years. The pressure of changes in the environment leads to changes in guiding principles, which produces difficult problems that are viewed as having no feasible solutions. The healthcare supply chain demands that researchers not view these problems as static; the different factors need to be viewed as dynamic. We attempt to highlight the benefits of adopting a factor interaction approach to hospital protocol. The interpretive structural modelling (ISM) approach is used to study the interaction of variables affecting the healthcare supply chain. This can provide healthcare sector professionals with programme guidelines with which they can investigate the dynamic effects of these variables. Case study analysis and interviews have been used to generate data for ISM in order to understand the frame of reference in which healthcare supply chains operate. ISM methodology has been successfully used to observe the boundaries between the different strata of various systems and to visualise the different echelons of the interacting variables of healthcare supply chain operations. This helps to embrace and integrate human thinking. This is an advantage over hard quantitative approaches, such as structural equation modelling or goal programming using historical data, in which experts are pushed to express their opinion as a number, usually in the form of weights on variables. The research paper aims to build a new frame of reference for studying and measuring the performance of healthcare supply chain operations.

Comparison of Self organizing maps and Sammon's mapping on agricultural datasets for precision agriculture

2015 International Conference on Innovations in Information, Embedded and Communication Systems (ICIIECS), 2015

Over time, technology has entered every field, including agriculture. Precision agriculture and visual data mining use technology to apply data-driven principles to questions such as when and how much fertilizer to use in a particular area of land. Data mining is the process of detecting patterns, for example in the form of clusters (chunks of related information), to obtain more precise and accurate information. It not only facilitates results and improves farmers' productivity, but has also helped improve overall quality of life by providing timely data inputs for decision making. Raw data collected through statistical analysis has helped in characterizing the data to its full extent. This paper presents techniques applied to real, up-to-date agricultural data to reduce high-dimensional input data to a much smaller size. We use self-organizing maps and a multidimensional scaling technique (Sammon's mapping) to reduce the data.
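
Sammon's mapping looks for a low-dimensional layout whose pairwise distances match those of the original data, and it scores a candidate layout with a stress value that weights errors on small original distances more heavily. The sketch below computes only that stress for a given projection; it does not perform the iterative optimization, and the function name and the toy points are assumptions for illustration rather than anything taken from the paper.

```python
from itertools import combinations
from math import dist


def sammon_stress(high, low):
    """Sammon stress between original points `high` and projection `low`.
    Assumes all points in `high` are distinct (no zero distances)."""
    total = 0.0   # sum of original pairwise distances
    error = 0.0   # distance mismatch, weighted by 1 / original distance
    for i, j in combinations(range(len(high)), 2):
        d_star = dist(high[i], high[j])  # distance in the original space
        d = dist(low[i], low[j])         # distance in the projection
        total += d_star
        error += (d_star - d) ** 2 / d_star
    return error / total


# Toy 3-D points and a naive projection that drops the last coordinate.
high = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
low = [p[:2] for p in high]

print(sammon_stress(high, high))  # a perfect layout has zero stress
print(sammon_stress(high, low))   # dropping a coordinate distorts distances
```

A Sammon's mapping routine would adjust the low-dimensional coordinates, typically by gradient-based updates, to drive this value down.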