Aditya Chattopadhyay - Academia.edu

Papers by Aditya Chattopadhyay

Quantifying Task Complexity Through Generalized Information Measures

How can we measure the "complexity" of a learning task so that we can compare one task to another? From classical information theory, we know that entropy is a useful measure of the complexity of a random variable and provides a lower bound on the minimum expected number of bits needed for transmitting its state. In this paper, we propose to measure the complexity of a learning task by the minimum expected number of questions that need to be answered to solve the task, for example, the minimum expected number of patches that need to be observed to classify FashionMNIST images. We prove several properties of the proposed complexity measure, including connections with classical entropy and sub-additivity for multiple tasks. As the computation of the minimum expected number of questions is generally intractable, we propose a greedy procedure called "information pursuit" (IP), which selects one question at a time depending on previous questions and their answers. This requires learning a probabilistic generative model relating data and questions to the task, for which we employ variational autoencoders and normalizing flows. We illustrate the usefulness of the proposed measure on various binary image classification tasks using image patches as the query set. Our results indicate that the complexity of a classification task increases as the signal-to-noise ratio decreases, and that classification of the KMNIST dataset is more complex than classification of the FashionMNIST dataset. As a byproduct of choosing patches as queries, our approach also provides a principled way of determining which pixels in an image are most informative for a task.
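The classical fact the abstract starts from can be checked on a toy example. The sketch below is only an illustration, not the paper's complexity measure: it assumes that any yes/no question about the outcome is allowed (the paper restricts questions to a fixed query set), in which case an optimal questioning strategy corresponds to a Huffman code. The function names and the example distributions are hypothetical.

```python
import heapq
import math


def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def huffman_expected_questions(probabilities):
    """Expected number of yes/no questions under an optimal (Huffman) strategy.

    Assumes unrestricted binary questions. Each merge of the two least likely
    groups adds one question for every outcome inside the merged group.
    """
    heap = list(probabilities)
    heapq.heapify(heap)
    expected = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        expected += a + b
        heapq.heappush(heap, a + b)
    return expected


# Dyadic probabilities: the entropy bound is met exactly.
p_dyadic = [0.5, 0.25, 0.125, 0.125]
H_dyadic = entropy_bits(p_dyadic)
Q_dyadic = huffman_expected_questions(p_dyadic)

# Non-dyadic probabilities: entropy is a strict lower bound.
p_other = [0.4, 0.3, 0.2, 0.1]
H_other = entropy_bits(p_other)
Q_other = huffman_expected_questions(p_other)
```

In both cases the expected number of questions lies between the entropy and the entropy plus one bit, which is the sense in which entropy bounds the cost of identifying the state of a random variable.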

Urea-water solvation of protein side chain models

Journal of Molecular Liquids, Aug 1, 2020

Aqueous urea stabilizes the unfolded states of proteins due to its ability to solvate both hydrophilic and hydrophobic residues favorably. The nature of the interactions that stabilize different types of amino acid side chains in their solvent-exposed state is still not understood. To gain insights into the molecular-level details of urea interactions with proteins in their unfolded states, we have performed atomistic molecular dynamics simulations and free energy calculations using the thermodynamic integration method on model systems representing side chains of all amino acids in different solvent environments (water and varying concentrations of aqueous urea). A systematic analysis of structural, energetic and dynamic parameters has been carried out to understand the detailed atomistic mechanism. The main aim of the current study is to unravel the nature of urea-amino acid interactions by emphasizing the chemical nature of the amino acid side chain models. The preferential interactions of urea over water with each side chain and backbone model system in various concentrations of aqueous urea were quantified using the two-domain model and validated by mean lifetime calculations. Interestingly, almost all amino acids showed a preference for urea over water. The order of preference depends on the chemical nature of the amino acids, with the aromatic groups exhibiting the highest preference, followed by hydrophobic groups, then amides and basic groups, and finally nucleophilic groups. The extensive energetic analysis revealed that these preferential interactions are enthalpically and entropically driven and are dominated by dispersion effects. Spatial density distributions and radial distribution analyses provide insight into the different interaction modes and the orientation of urea towards its preferred interaction sites, through which urea stabilizes proteins in their unfolded states by forming favorable interactions with exposed amino acid side chains.
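The thermodynamic integration method mentioned in the abstract estimates a free energy difference as the integral over a coupling parameter λ of the ensemble average ⟨∂U/∂λ⟩. The sketch below is a minimal illustration of that final quadrature step only, assuming the per-λ averages have already been obtained from simulations. The function name and the numbers are hypothetical and are not taken from the paper.

```python
def thermodynamic_integration(lambdas, dudl_means):
    """Trapezoidal estimate of the integral of <dU/dlambda> over lambda.

    lambdas:    increasing coupling-parameter values, typically from 0 to 1
    dudl_means: ensemble average of dU/dlambda at each lambda window
    """
    if len(lambdas) != len(dudl_means) or len(lambdas) < 2:
        raise ValueError("need at least two matching lambda windows")
    total = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        total += 0.5 * width * (dudl_means[i] + dudl_means[i + 1])
    return total


# Synthetic, made-up averages (arbitrary energy units) for five lambda windows.
example_lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
example_dudl = [0.0, 1.0, 2.0, 3.0, 4.0]
delta_g = thermodynamic_integration(example_lambdas, example_dudl)
```

A transfer free energy between two solvents would then be the difference between two such estimates, one per solvent environment, under the same assumptions.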

A Probabilistic Framework for Constructing Temporal Relations in Replica Exchange Molecular Trajectories

Journal of Chemical Theory and Computation, May 23, 2018

I consider myself extremely fortunate to have Harish Krishna, Siladitya Padhi, Kush Motwani, Pritam Verma and Jahnavi Gamalapati as friends. They have spent a considerable amount of their time brainstorming ideas, giving valuable feedback and proofreading this thesis. Their involvement greatly increased the quality and relevance of this thesis. Lastly, I would like to thank my parents for their limitless patience and unwavering faith. Without them, the completion of this thesis would not have been possible.

Role of Urea–Aromatic Stacking Interactions in Stabilizing the Aromatic Residues of the Protein in Urea-Induced Denatured State

Journal of the American Chemical Society, Oct 17, 2017

Energetic, Structural and Dynamic Properties of Nucleobase-Urea Interactions that Aid in Urea Assisted RNA Unfolding

Scientific Reports, Jun 19, 2019

Understanding the structure-function relationships of RNA has become increasingly important given the realization of its functional role in various cellular processes. Chemical denaturation of RNA by urea has been shown to be beneficial in investigating RNA stability and folding. Elucidation of the mechanism of unfolding of RNA by urea is important for understanding the folding pathways. In addition to studying denaturation of RNA in aqueous urea, it is important to understand the nature and strength of interactions of the building blocks of RNA. In this study, a systematic examination of the structural features and energetic factors involving interactions between nucleobases and urea is presented. Results from molecular dynamics (MD) simulations on each of the five DNA/RNA bases in water and eight different concentrations of aqueous urea, and free energy calculations using the thermodynamic integration method, are presented. The interaction energies between all the nucleobases and the solvent environment, as well as the transfer free energies, become more favorable as the concentration of urea increases. Preferential interactions of urea versus water molecules with all model systems, determined using Kirkwood-Buff integrals and two-domain models, indicate that nucleobases prefer urea to water. The modes of interaction between urea and the nucleobases were analyzed in detail. In addition to the previously identified hydrogen bonding and stacking interactions between urea and nucleobases that stabilize the unfolded states of RNA in aqueous solution, NH-π interactions are proposed to be important. Dynamic properties of each of these three modes of interaction are presented. The study provides fundamental insights into the nature of the interaction of urea molecules with nucleobases and how it disrupts nucleic acids.

Understanding the underlying mechanism of folding and unfolding of biological macromolecules such as proteins and nucleic acids is essential, given their roles in cellular functions. Stabilities of these macromolecules are altered in the presence of various osmolytes, enzymes, and other denaturants [1]. Early studies demonstrated the importance of osmolytes in helping living organisms cope with changes in osmotic pressure [2]. Furthermore, a number of experimental studies have shown that osmolytes act as denaturants or stabilizers depending on their nature [2-5]. These experiments have played a crucial role in the understanding of protein stability and folding pathways. Urea is a polar molecule with a large dipole moment and has a strong effect on protein stability [6]. The underlying molecular mechanism behind urea-induced protein unfolding has been discussed extensively in the literature [4,7-10]. Two different mechanisms by which protein denaturation is achieved by aqueous urea have been proposed, namely, the direct and the indirect mechanism. According to the direct mechanism, urea competes with the native intramolecular interactions within the protein structure [2,11,12]. The indirect mechanism suggests that urea alters the structure of water, which weakens the hydrophobic effect and destabilizes the native conformation [13]. Several experimental and computational studies have been performed to understand the structural, energetic, thermodynamic and mechanical aspects of the urea-assisted direct and indirect mechanisms behind protein folding/unfolding [2,14-22]. Urea has also been shown to denature RNA, and these experiments provide insights into RNA stability and folding [15,23-26]. Unlike the mechanism of urea-induced protein denaturation, the effect of urea on nucleic acid unfolding is less understood. Several studies reported that urea induces ...

Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology
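The abstract of this entry refers to Kirkwood-Buff integrals, which are defined from a radial distribution function g(r) as G = 4π ∫ (g(r) − 1) r² dr. The sketch below is a generic numerical illustration of that definition, assuming a tabulated g(r) truncated at a finite cutoff. It does not reproduce the paper's analysis, and the function name and the step-shaped g(r) are hypothetical.

```python
import math


def kirkwood_buff_integral(r_values, g_values):
    """Trapezoidal estimate of 4*pi * integral of (g(r) - 1) * r^2 dr.

    r_values: increasing radii; g_values: g(r) sampled at those radii.
    Truncating at the last radius is an approximation of the infinite integral.
    """
    if len(r_values) != len(g_values) or len(r_values) < 2:
        raise ValueError("need at least two matching samples")
    total = 0.0
    for i in range(len(r_values) - 1):
        f_left = (g_values[i] - 1.0) * r_values[i] ** 2
        f_right = (g_values[i + 1] - 1.0) * r_values[i + 1] ** 2
        total += 0.5 * (r_values[i + 1] - r_values[i]) * (f_left + f_right)
    return 4.0 * math.pi * total


# Toy g(r): complete exclusion inside r < 1, bulk-like (g = 1) beyond.
num_points = 2001
r_grid = [2.0 * i / (num_points - 1) for i in range(num_points)]
g_step = [0.0 if r < 1.0 else 1.0 for r in r_grid]
G_step = kirkwood_buff_integral(r_grid, g_step)

# A structureless fluid (g = 1 everywhere) gives a vanishing integral.
G_flat = kirkwood_buff_integral(r_grid, [1.0] * num_points)
```

For the excluded-volume toy case the result is close to minus the volume of a unit sphere, which matches the usual reading of a negative integral as net depletion around the central species.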

Neural Network Attributions: A Causal Perspective

International Conference on Machine Learning, Feb 6, 2019

We propose a new attribution method for neural networks developed using first principles of causality (to the best of our knowledge, the first such). The neural network architecture is viewed as a Structural Causal Model, and a methodology to compute the causal effect of each feature on the output is presented. With reasonable assumptions on the causal structure of the input data, we propose algorithms to efficiently compute the causal effects, as well as scale the approach to data with large dimensionality. We also show how this method can be used for recurrent neural networks. We report experimental results on both simulated and real datasets showcasing the promise and usefulness of the proposed algorithm.
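One way to picture "the causal effect of each feature on the output" is an interventional average: fix one input feature to a value, average the model output over the remaining inputs, and compare against a baseline. The sketch below is only an illustration under stated assumptions and is not the paper's algorithm: it assumes the input features do not causally influence one another (so an intervention is a plain overwrite), it uses a fixed sample set instead of a learned input distribution, and the baseline (the mean interventional output over a grid of values) is a choice made here. All names are hypothetical.

```python
def interventional_mean(model, samples, feature, value):
    """Average model output when `feature` is forced to `value` in every sample."""
    total = 0.0
    for x in samples:
        x_do = list(x)           # copy so the original sample is untouched
        x_do[feature] = value    # do(x_feature = value), assuming no inter-feature causation
        total += model(x_do)
    return total / len(samples)


def average_causal_effect(model, samples, feature, value, baseline_values):
    """Interventional mean at `value` minus its average over `baseline_values`."""
    baseline = sum(
        interventional_mean(model, samples, feature, v) for v in baseline_values
    ) / len(baseline_values)
    return interventional_mean(model, samples, feature, value) - baseline


def toy_model(x):
    """Stand-in for a trained network: a fixed linear function of two inputs."""
    return 2.0 * x[0] + 3.0 * x[1]


toy_samples = [[0.2, 0.0], [0.9, 1.0], [0.5, 2.0]]
value_grid = [0.0, 0.5, 1.0]
ace_high = average_causal_effect(toy_model, toy_samples, 0, 1.0, value_grid)
ace_low = average_causal_effect(toy_model, toy_samples, 0, 0.0, value_grid)
```

For this linear toy model the effect of the first feature comes out symmetric around the baseline, reflecting its coefficient of 2 and the grid midpoint of 0.5.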

Interpretable by Design: Learning Predictors by Composing Interpretable Queries

IEEE Transactions on Pattern Analysis and Machine Intelligence, Jun 1, 2023

There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive domains such as healthcare. We argue that machine learning algorithms should be interpretable by design and that the language in which these interpretations are expressed should be domain- and task-dependent. Consequently, we base our model's prediction on a family of user-defined and task-specific binary functions of the data, each having a clear interpretation to the end-user. We then minimize the expected number of queries needed for accurate prediction on any given input. As the solution is generally intractable, following prior work, we choose the queries sequentially based on information gain. However, in contrast to previous work, we need not assume the queries are conditionally independent. Instead, we leverage a stochastic generative model (VAE) and an MCMC algorithm (Unadjusted Langevin) to select the most informative query about the input based on previous query-answers. This enables the online determination of a query chain of whatever depth is required to resolve prediction ambiguities. Finally, experiments on vision and NLP tasks demonstrate the efficacy of our approach and its superiority over post-hoc explanations.
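Choosing queries sequentially by information gain can be sketched on a problem small enough to enumerate. The code below is a toy illustration only: it assumes a finite set of possible inputs with a known prior and computes posteriors exactly, whereas the abstract describes using a VAE and Unadjusted Langevin sampling for this step. The function names, the bit-valued queries and the labels are hypothetical.

```python
import math
from collections import defaultdict


def entropy(dist):
    """Shannon entropy in bits of a dict mapping outcome -> probability."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)


def label_posterior(weights, label_of):
    """Posterior over labels given unnormalized weights over candidate inputs."""
    total = sum(weights.values())
    post = defaultdict(float)
    for x, w in weights.items():
        post[label_of(x)] += w / total
    return post


def information_gain(weights, query, label_of):
    """Mutual information between the query answer and the label, given the history."""
    total = sum(weights.values())
    before = entropy(label_posterior(weights, label_of))
    by_answer = defaultdict(dict)
    for x, w in weights.items():
        by_answer[query(x)][x] = w
    after = sum(
        (sum(sub.values()) / total) * entropy(label_posterior(sub, label_of))
        for sub in by_answer.values()
    )
    return before - after


def information_pursuit(x_true, prior, queries, label_of, tol=1e-9):
    """Greedily ask the most informative query until the label is determined."""
    weights = dict(prior)
    remaining = dict(queries)
    history = []
    while remaining and entropy(label_posterior(weights, label_of)) > tol:
        name = max(
            remaining,
            key=lambda n: information_gain(weights, remaining[n], label_of),
        )
        query = remaining.pop(name)
        answer = query(x_true)
        history.append((name, answer))
        # Keep only the candidate inputs consistent with the observed answer.
        weights = {x: w for x, w in weights.items() if query(x) == answer}
    post = label_posterior(weights, label_of)
    return history, max(post, key=post.get)


# Toy task: inputs are the integers 0..7, queries read individual bits.
uniform_prior = {x: 1.0 / 8 for x in range(8)}
bit_queries = {f"bit{k}": (lambda x, k=k: (x >> k) & 1) for k in range(3)}

history_a, prediction_a = information_pursuit(
    5, uniform_prior, bit_queries, lambda x: int(x >= 4)
)
history_b, prediction_b = information_pursuit(
    6, uniform_prior, bit_queries, lambda x: int(x >= 6)
)
```

When the label depends only on the top bit, a single query suffices; when it depends on the top two bits, the chain grows to two queries and never asks about the uninformative lowest bit, so the query chain length adapts to the input and the task.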

Variational Information Pursuit for Interpretable Predictions

arXiv (Cornell University), Feb 6, 2023

There is a growing interest in the machine learning community in developing predictive algorithms that are "interpretable by design". Towards this end, recent work proposes to make interpretable decisions by sequentially asking interpretable queries about data until a prediction can be made with high confidence based on the answers obtained (the history). To promote short query-answer chains, a greedy procedure called Information Pursuit (IP) is used, which adaptively chooses queries in order of information gain. Generative models are employed to learn the distribution of query-answers and labels, which is in turn used to estimate the most informative query. However, learning and inference with a full generative model of the data is often intractable for complex tasks. In this work, we propose Variational Information Pursuit (V-IP), a variational characterization of IP which bypasses the need for learning generative models. V-IP is based on finding a query selection strategy and a classifier that minimize the expected cross-entropy between true and predicted labels. We then demonstrate that the IP strategy is the optimal solution to this problem. Therefore, instead of learning generative models, we can use our optimal strategy to directly pick the most informative query given any history. We then develop a practical algorithm by defining a finite-dimensional parameterization of our strategy and classifier using deep networks and train them end-to-end using our objective. Empirically, V-IP is 10-100x faster than IP on different vision and NLP tasks with competitive performance. Moreover, V-IP finds much shorter query chains when compared to reinforcement learning, which is typically used in sequential decision-making problems. Finally, we demonstrate the utility of V-IP on challenging tasks like medical diagnosis, where the performance is far superior to the generative modelling approach.

Learning Graph Variational Autoencoders with Constraints and Structured Priors for Conditional Indoor 3D Scene Generation

2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Neural Network Attributions: A Causal Perspective

arXiv (Cornell University), Feb 6, 2019

Structured Graph Variational Autoencoders for Indoor Furniture Layout Generation

arXiv (Cornell University), Apr 11, 2022

We present a graph variational autoencoder with a structured prior for generating the layout of indoor 3D scenes. Given the room type (e.g., living room or library) and the room layout (e.g., room elements such as floor and walls), our architecture generates a collection of objects (e.g., furniture items such as sofa, table and chairs) that is consistent with the room type and layout. This is a challenging problem because the generated scene should satisfy multiple constraints, e.g., each object should lie inside the room and two objects should not occupy the same volume. To address these challenges, we propose a deep generative model that encodes these relationships as soft constraints on an attributed graph (e.g., the nodes capture attributes of room and furniture elements, such as class, pose and size, and the edges capture geometric relationships such as relative orientation). The architecture consists of a graph encoder that maps the input graph to a structured latent space, and a graph decoder that generates a furniture graph, given a latent code and the room graph. The latent space is modeled with auto-regressive priors, which facilitates the generation of highly structured scenes. We also propose an efficient training procedure that combines matching and constrained learning. Experiments on the 3D-FRONT dataset show that our method produces scenes that are diverse and are adapted to the room layout.

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art. We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for ...

Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks

2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art. We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the corresponding class label. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ provides promising human-interpretable visual explanations for a given CNN architecture across multiple tasks including classification, image caption generation and 3D action recognition; as well as in new settings such as knowledge distillation.
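The weighting scheme described in the abstract can be sketched as follows: each feature map receives a weight equal to a weighted sum of the positive parts of its gradients, and the saliency map is the rectified weighted sum of the feature maps. This is a minimal pure-Python illustration, not the authors' implementation. The pixel-wise coefficients `alphas` are left as an input because the abstract does not spell them out, and the uniform default used here is an assumption. The function name and the toy numbers are hypothetical.

```python
def relu(value):
    """Keep positive values, map everything else to zero."""
    return value if value > 0 else 0.0


def weighted_positive_gradient_map(activations, gradients, alphas=None):
    """Saliency map from feature maps and class-score gradients.

    activations, gradients: lists of K feature maps, each an H x W nested list.
    alphas: optional pixel-wise coefficients with the same shape. The uniform
            default 1/(H*W) is an assumption of this sketch.
    """
    num_maps = len(activations)
    height = len(activations[0])
    width = len(activations[0][0])
    if alphas is None:
        uniform = 1.0 / (height * width)
        alphas = [[[uniform] * width for _ in range(height)] for _ in range(num_maps)]
    # One weight per feature map: weighted sum of its positive gradients.
    weights = [
        sum(
            alphas[k][i][j] * relu(gradients[k][i][j])
            for i in range(height)
            for j in range(width)
        )
        for k in range(num_maps)
    ]
    # Rectified linear combination of the feature maps.
    saliency = [
        [
            relu(sum(weights[k] * activations[k][i][j] for k in range(num_maps)))
            for j in range(width)
        ]
        for i in range(height)
    ]
    return weights, saliency


# Two made-up 2x2 feature maps with made-up gradients of a class score.
toy_activations = [[[1, 0], [0, 2]], [[0, 3], [1, 0]]]
toy_gradients = [[[4, -4], [0, 0]], [[-1, -1], [-1, -1]]]
map_weights, saliency_map = weighted_positive_gradient_map(toy_activations, toy_gradients)
```

Because only positive gradients contribute, the second feature map, whose gradients are all negative, gets zero weight and the saliency map follows the first feature map alone.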

A Probabilistic Framework for Constructing Temporal Relations in Replica Exchange Molecular Trajectories

Journal of Chemical Theory and Computation, Jan 10, 2018

Knowledge of the structure and dynamics of biomolecules is essential for elucidating the underlying mechanisms of biological processes. Given the stochastic nature of many biological processes, like protein unfolding, it is almost impossible that two independent simulations will generate the exact same sequence of events, which makes direct analysis of simulations difficult. Statistical models like Markov chains, transition networks, etc. help in shedding some light on the mechanistic nature of such processes by predicting long-time dynamics of these systems from short simulations. However, such methods fall short in analyzing trajectories with partial or no temporal information, for example, replica exchange molecular dynamics or Monte Carlo simulations. In this work, we propose a probabilistic algorithm, borrowing concepts from graph theory and machine learning, to extract reactive pathways from molecular trajectories in the absence of temporal data. A suitable vector representati...

Research paper thumbnail of Quantifying Task Complexity Through Generalized Information Measures

How can we measure the "complexity" of a learning task so that we can compare one task to another... more How can we measure the "complexity" of a learning task so that we can compare one task to another? From classical information theory, we know that entropy is a useful measure of the complexity of a random variable and provides a lower bound on the minimum expected number of bits needed for transmitting its state. In this paper, we propose to measure the complexity of a learning task by the minimum expected number of questions that need to be answered to solve the task. For example, the minimum expected number of patches that need to be observed to classify FashionMNIST images. We prove several properties of the proposed complexity measure, including connections with classical entropy and sub-additivity for multiple tasks. As the computation of the minimum expected number of questions is generally intractable, we propose a greedy procedure called "information pursuit" (IP), which selects one question at a time depending on previous questions and their answers. This requires learning a probabilistic generative model relating data and questions to the task, for which we employ variational autoencoders and normalizing flows. We illustrate the usefulness of the proposed measure on various binary image classification tasks using image patches as the query set. Our results indicate that the complexity of a classification task increases as signal-to-noise ratio decreases, and that classification of the KMNIST dataset is more complex than classification of the FashionMNIST dataset. As a byproduct of choosing patches as queries, our approach also provides a principled way of determining which pixels in an image are most informative for a task.

Research paper thumbnail of Urea-water solvation of protein side chain models

Journal of Molecular Liquids, Aug 1, 2020

Aqueous urea stabilizes the unfolded states of protein due to their ability to solvate both hydro... more Aqueous urea stabilizes the unfolded states of protein due to their ability to solvate both hydrophilic and hydrophobic residues favorably. The nature of interactions that stabilize different types of amino acid side chains in their solvent exposed state is still not understood. To gain insights into the molecular level details of urea interactions with proteins in their unfolded states, we have performed atomistic molecular dynamics simulations and free energy calculations using the thermodynamic integration method on model systems representing side chains of all amino acids in different solvent environments (water and varying concentrations of aqueous urea). A systematic analysis of structural, energetic and dynamic parameters has been done to understand the detailed atomistic mechanism. The main aim of the current study is to unravel the nature of urea-amino acid interactions by emphasizing on the chemical nature of amino acid side chain models. The preferential interactions of urea over water with each side chain and backbone model systems in various concentrations of aqueous urea were quantified using the two-domain model, and it is validated by mean lifetime calculations. Interestingly, almost all amino acids showed a preference for urea over water. The order of preferences depending on the chemical nature of the amino acids is obtained with the aromatic groups exhibiting the highest preferences followed by hydrophobic groups, followed by amides and basic groups, and the least by nucleophilic groups. The extensive energetic analysis revealed, these preferential interactions are enthalpically and entropically driven and are dominated by dispersion effects. 
Spatial density distributions and radial distribution analyses provide insights to understand the different modes and urea orientation towards preferred sites of interactions by which urea-protein interactions stabilize proteins in their unfolded states by forming favorable interactions with exposed amino acids side chains.

Research paper thumbnail of A Probabilistic Framework for Constructing Temporal Relations in Replica Exchange Molecular Trajectories

Journal of Chemical Theory and Computation, May 23, 2018

I consider myself extremely fortunate to have Harish Krishna, Siladitya Padhi, Kush Motwani, Prit... more I consider myself extremely fortunate to have Harish Krishna, Siladitya Padhi, Kush Motwani, Pritam Verma and Jahnavi Gamalapati as friends. They have spent a considerable amount of their time brainstorming ideas, giving valuable feedback and proofreading this thesis. Their involvement greatly increased the quality and relevance of this thesis. Lastly, I would like to thank my parents for their limitless patience and unwavering faith. Without them, the completion of this thesis would not have been possible.

Research paper thumbnail of Role of Urea–Aromatic Stacking Interactions in Stabilizing the Aromatic Residues of the Protein in Urea-Induced Denatured State

Journal of the American Chemical Society, Oct 17, 2017

Research paper thumbnail of Energetic, Structural and Dynamic Properties of Nucleobase-Urea Interactions that Aid in Urea Assisted RNA Unfolding

Scientific Reports, Jun 19, 2019

Understanding the structure-function relationships of RNA has become increasingly important given... more Understanding the structure-function relationships of RNA has become increasingly important given the realization of its functional role in various cellular processes. Chemical denaturation of RNA by urea has been shown to be beneficial in investigating RNA stability and folding. Elucidation of the mechanism of unfolding of RNA by urea is important for understanding the folding pathways. In addition to studying denaturation of RNA in aqueous urea, it is important to understand the nature and strength of interactions of the building blocks of RNA. In this study, a systematic examination of the structural features and energetic factors involving interactions between nucleobases and urea is presented. Results from molecular dynamics (MD) simulations on each of the five DNA/RNA bases in water and eight different concentrations of aqueous urea, and free energy calculations using the thermodynamic integration method are presented. the interaction energies between all the nucleobases with the solvent environment and the transfer free energies become more favorable with respect to increase in the concentration of urea. preferential interactions of urea versus water molecules with all model systems determined using Kirkwood-Buff integrals and two-domain models indicate preference of urea by nucleobases in comparison to water. the modes of interaction between urea and the nucleobases were analyzed in detail. In addition to the previously identified hydrogen bonding and stacking interactions between urea and nucleobases that stabilize the unfolded states of RNA in aqueous solution, NH-π interactions are proposed to be important. Dynamic properties of each of these three modes of interactions have been presented. the study provides fundamental insights into the nature of interaction of urea molecules with nucleobases and how it disrupts nucleic acids. 
Understanding the underlying mechanism of folding and unfolding of biological macromolecules such as proteins and nucleic acids is essential, given their roles in cellular functions. Stabilities of these macromolecules are altered in the presence of various osmolytes, enzymes, and other denaturants [1]. Early studies demonstrated the importance of osmolytes in living organisms to cope with the osmotic pressure change [2]. Furthermore, a number of experimental studies have shown the role of osmolytes as denaturants or stabilizers depending on the nature of the osmolytes [2-5]. These experiments have played a crucial role in the understanding of protein stability and folding pathways. Urea is a polar molecule with a large dipole moment and has a strong effect on protein stability [6]. The underlying molecular mechanism behind urea-induced protein unfolding has been discussed extensively in the literature [4,7-10]. Two different mechanisms by which protein denaturation is achieved by aqueous urea have been proposed, namely, the direct and indirect mechanisms. According to the direct mechanism, urea competes with the native intramolecular interactions within the protein structure [2,11,12]. The indirect mechanism suggests that urea alters the structure of water, which facilitates the weakening of the hydrophobic effect and destabilizes the native conformation [13]. Several experimental and computational studies have been performed to understand the structural, energetic, thermodynamic and mechanical aspects of the urea-assisted direct and indirect mechanisms behind protein folding/unfolding [2,14-22]. Urea has also been shown to denature RNA, and these experiments provide insights into RNA stability and folding [15,23-26]. Unlike the mechanism of urea-induced protein denaturation, the effect of urea on nucleic acid unfolding is less understood.
Several studies reported that urea induces ...
Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology,

Neural Network Attributions: A Causal Perspective

International Conference on Machine Learning, Feb 6, 2019

We propose a new attribution method for neural networks developed using first principles of causality (to the best of our knowledge, the first such). The neural network architecture is viewed as a Structural Causal Model, and a methodology to compute the causal effect of each feature on the output is presented. With reasonable assumptions on the causal structure of the input data, we propose algorithms to efficiently compute the causal effects, as well as scale the approach to data with large dimensionality. We also show how this method can be used for recurrent neural networks. We report experimental results on both simulated and real datasets showcasing the promise and usefulness of the proposed algorithm.

Interpretable by Design: Learning Predictors by Composing Interpretable Queries

IEEE Transactions on Pattern Analysis and Machine Intelligence, Jun 1, 2023

There is a growing concern about typically opaque decision-making with high-performance machine learning algorithms. Providing an explanation of the reasoning process in domain-specific terms can be crucial for adoption in risk-sensitive domains such as healthcare. We argue that machine learning algorithms should be interpretable by design and that the language in which these interpretations are expressed should be domain- and task-dependent. Consequently, we base our model's prediction on a family of user-defined and task-specific binary functions of the data, each having a clear interpretation to the end-user. We then minimize the expected number of queries needed for accurate prediction on any given input. As the solution is generally intractable, following prior work, we choose the queries sequentially based on information gain. However, in contrast to previous work, we need not assume the queries are conditionally independent. Instead, we leverage a stochastic generative model (VAE) and an MCMC algorithm (Unadjusted Langevin) to select the most informative query about the input based on previous query-answers. This enables the online determination of a query chain of whatever depth is required to resolve prediction ambiguities. Finally, experiments on vision and NLP tasks demonstrate the efficacy of our approach and its superiority over post-hoc explanations.
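
The sequential, information-gain-driven query selection described in this abstract can be illustrated with a small toy sketch. It is not the authors' implementation: instead of a VAE and Unadjusted Langevin sampling, it assumes a tiny finite dataset with a uniform prior so that the posterior over labels can be computed exactly by enumeration. The names `information_gain`, `information_pursuit`, and the bit-valued queries are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of the empirical distribution given by `counts`."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c > 0)


def information_gain(samples, query):
    """Mutual information between the query answer and the label,
    under the uniform distribution over `samples` (list of (x, y) pairs)."""
    h_y = entropy(Counter(y for _, y in samples))
    h_y_given_answer = 0.0
    for answer in {query(x) for x, _ in samples}:
        subset = [(x, y) for x, y in samples if query(x) == answer]
        weight = len(subset) / len(samples)
        h_y_given_answer += weight * entropy(Counter(y for _, y in subset))
    return h_y - h_y_given_answer


def information_pursuit(x_obs, samples, queries):
    """Greedily ask the most informative remaining query about `x_obs`
    until all samples consistent with the answers share one label.

    Assumption (toy setting): the posterior given the history is uniform
    over the samples that agree with every answer observed so far."""
    history = []
    consistent = list(samples)
    remaining = dict(queries)
    while remaining and len({y for _, y in consistent}) > 1:
        name = max(remaining, key=lambda n: information_gain(consistent, remaining[n]))
        query = remaining.pop(name)
        answer = query(x_obs)
        history.append((name, answer))
        consistent = [(x, y) for x, y in consistent if query(x) == answer]
    labels = Counter(y for _, y in consistent)
    prediction = labels.most_common(1)[0][0] if labels else None
    return prediction, history


if __name__ == "__main__":
    # Toy task: inputs are 3-bit tuples, the label is bit0 AND bit1; bit2 is irrelevant.
    samples = [((a, b, c), a & b) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
    queries = {f"bit{i}": (lambda x, i=i: x[i]) for i in range(3)}
    print(information_pursuit((1, 1, 0), samples, queries))
```

In this toy task the uninformative query (`bit2`) is never selected, and the chain stops as soon as the answers determine the label, which mirrors the idea of a query chain whose depth adapts to the input.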

Variational Information Pursuit for Interpretable Predictions

arXiv (Cornell University), Feb 6, 2023

There is a growing interest in the machine learning community in developing predictive algorithms that are "interpretable by design". Towards this end, recent work proposes to make interpretable decisions by sequentially asking interpretable queries about data until a prediction can be made with high confidence based on the answers obtained (the history). To promote short query-answer chains, a greedy procedure called Information Pursuit (IP) is used, which adaptively chooses queries in order of information gain. Generative models are employed to learn the distribution of query-answers and labels, which is in turn used to estimate the most informative query. However, learning and inference with a full generative model of the data is often intractable for complex tasks. In this work, we propose Variational Information Pursuit (V-IP), a variational characterization of IP which bypasses the need for learning generative models. V-IP is based on finding a query selection strategy and a classifier that minimize the expected cross-entropy between true and predicted labels. We then demonstrate that the IP strategy is the optimal solution to this problem. Therefore, instead of learning generative models, we can use our optimal strategy to directly pick the most informative query given any history. We then develop a practical algorithm by defining a finite-dimensional parameterization of our strategy and classifier using deep networks and train them end-to-end using our objective. Empirically, V-IP is 10-100x faster than IP on different Vision and NLP tasks with competitive performance. Moreover, V-IP finds much shorter query chains when compared to reinforcement learning which is typically used in sequential-decision-making problems. Finally, we demonstrate the utility of V-IP on challenging tasks like medical diagnosis where the performance is far superior to the generative modelling approach.

Learning Graph Variational Autoencoders with Constraints and Structured Priors for Conditional Indoor 3D Scene Generation

2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Neural Network Attributions: A Causal Perspective

arXiv (Cornell University), Feb 6, 2019

We propose a new attribution method for neural networks developed using first principles of causality (to the best of our knowledge, the first such). The neural network architecture is viewed as a Structural Causal Model, and a methodology to compute the causal effect of each feature on the output is presented. With reasonable assumptions on the causal structure of the input data, we propose algorithms to efficiently compute the causal effects, as well as scale the approach to data with large dimensionality. We also show how this method can be used for recurrent neural networks. We report experimental results on both simulated and real datasets showcasing the promise and usefulness of the proposed algorithm.

Structured Graph Variational Autoencoders for Indoor Furniture layout Generation

arXiv (Cornell University), Apr 11, 2022

We present a graph variational autoencoder with a structured prior for generating the layout of indoor 3D scenes. Given the room type (e.g., living room or library) and the room layout (e.g., room elements such as floor and walls), our architecture generates a collection of objects (e.g., furniture items such as sofa, table and chairs) that is consistent with the room type and layout. This is a challenging problem because the generated scene should satisfy multiple constraints, e.g., each object should lie inside the room and two objects should not occupy the same volume. To address these challenges, we propose a deep generative model that encodes these relationships as soft constraints on an attributed graph (e.g., the nodes capture attributes of room and furniture elements, such as class, pose and size, and the edges capture geometric relationships such as relative orientation). The architecture consists of a graph encoder that maps the input graph to a structured latent space, and a graph decoder that generates a furniture graph, given a latent code and the room graph. The latent space is modeled with auto-regressive priors, which facilitates the generation of highly structured scenes. We also propose an efficient training procedure that combines matching and constrained learning. Experiments on the 3D-FRONT dataset show that our method produces scenes that are diverse and are adapted to the room layout.

Grad-CAM++: Improved Visual Explanations for Deep Convolutional Networks

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art. We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for ...

Grad-CAM++: Generalized Gradient-Based Visual Explanations for Deep Convolutional Networks

2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018

Over the last decade, Convolutional Neural Network (CNN) models have been highly successful in solving complex vision problems. However, these deep models are perceived as "black box" methods considering the lack of understanding of their internal functioning. There has been a significant recent interest in developing explainable deep learning models, and this paper is an effort in this direction. Building on a recently proposed method called Grad-CAM, we propose a generalized method called Grad-CAM++ that can provide better visual explanations of CNN model predictions, in terms of better object localization as well as explaining occurrences of multiple object instances in a single image, when compared to state-of-the-art. We provide a mathematical derivation for the proposed method, which uses a weighted combination of the positive partial derivatives of the last convolutional layer feature maps with respect to a specific class score as weights to generate a visual explanation for the corresponding class label. Our extensive experiments and evaluations, both subjective and objective, on standard datasets showed that Grad-CAM++ provides promising human-interpretable visual explanations for a given CNN architecture across multiple tasks including classification, image caption generation and 3D action recognition; as well as in new settings such as knowledge distillation.

Urea-water solvation of protein side chain models

Journal of Molecular Liquids, 2020

Aqueous urea stabilizes the unfolded states of proteins due to its ability to solvate both hydrophilic and hydrophobic residues favorably. The nature of interactions that stabilize different types of amino acid side chains in their solvent exposed state is still not understood. To gain insights into the molecular level details of urea interactions with proteins in their unfolded states, we have performed atomistic molecular dynamics simulations and free energy calculations using the thermodynamic integration method on model systems representing side chains of all amino acids in different solvent environments (water and varying concentrations of aqueous urea). A systematic analysis of structural, energetic and dynamic parameters has been done to understand the detailed atomistic mechanism. The main aim of the current study is to unravel the nature of urea-amino acid interactions by emphasizing the chemical nature of amino acid side chain models. The preferential interactions of urea over water with each side chain and backbone model system in various concentrations of aqueous urea were quantified using the two-domain model and validated by mean lifetime calculations. Interestingly, almost all amino acids showed a preference for urea over water. The order of preferences depending on the chemical nature of the amino acids is obtained, with the aromatic groups exhibiting the highest preferences followed by hydrophobic groups, followed by amides and basic groups, and the least by nucleophilic groups. The extensive energetic analysis revealed that these preferential interactions are enthalpically and entropically driven and are dominated by dispersion effects. Spatial density distributions and radial distribution analyses provide insights into the different modes of interaction and the orientation of urea towards preferred interaction sites, by which urea stabilizes proteins in their unfolded states through favorable interactions with exposed amino acid side chains.
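
The thermodynamic integration method mentioned in this abstract estimates a free energy difference as the integral over a coupling parameter λ of the ensemble average of ∂U/∂λ. A minimal numerical sketch follows, assuming that per-window samples of ∂U/∂λ are already available from simulation and that a simple trapezoidal rule is adequate. The function name, the window spacing, and the toy data are hypothetical and are not taken from the paper.

```python
def thermodynamic_integration(lambdas, du_dlambda_samples):
    """Estimate a free energy difference by thermodynamic integration.

    lambdas: increasing coupling-parameter values, e.g. from 0.0 to 1.0.
    du_dlambda_samples: one list of sampled dU/dlambda values per lambda window.

    Assumption: the per-window ensemble average is approximated by the sample
    mean, and the integral over lambda by the trapezoidal rule.
    """
    if len(lambdas) != len(du_dlambda_samples):
        raise ValueError("need one set of samples per lambda value")
    means = [sum(window) / len(window) for window in du_dlambda_samples]
    delta_f = 0.0
    for i in range(len(lambdas) - 1):
        width = lambdas[i + 1] - lambdas[i]
        delta_f += 0.5 * (means[i] + means[i + 1]) * width
    return delta_f


if __name__ == "__main__":
    # Toy data: the mean of dU/dlambda equals 2*lambda in every window,
    # so the exact integral from 0 to 1 is 1 (in arbitrary energy units).
    lambdas = [0.0, 0.25, 0.5, 0.75, 1.0]
    samples = [[2 * lam - 0.1, 2 * lam + 0.1] for lam in lambdas]
    print(thermodynamic_integration(lambdas, samples))
```

A transfer free energy, for example from water to aqueous urea, would then be the difference between two such estimates computed in the two solvent environments.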

Energetic, Structural and Dynamic Properties of Nucleobase-Urea Interactions that Aid in Urea Assisted RNA Unfolding

Scientific Reports, 2019

Understanding the structure-function relationships of RNA has become increasingly important given the realization of its functional role in various cellular processes. Chemical denaturation of RNA by urea has been shown to be beneficial in investigating RNA stability and folding. Elucidation of the mechanism of unfolding of RNA by urea is important for understanding the folding pathways. In addition to studying denaturation of RNA in aqueous urea, it is important to understand the nature and strength of interactions of the building blocks of RNA. In this study, a systematic examination of the structural features and energetic factors involving interactions between nucleobases and urea is presented. Results from molecular dynamics (MD) simulations on each of the five DNA/RNA bases in water and eight different concentrations of aqueous urea, and free energy calculations using the thermodynamic integration method are presented. The interaction energies between all the nucleobases with ...

Role of Urea–Aromatic Stacking Interactions in Stabilizing the Aromatic Residues of the Protein in Urea-Induced Denatured State

Journal of the American Chemical Society, 2017

A Probabilistic Framework for Constructing Temporal Relations in Replica Exchange Molecular Trajectories

Journal of Chemical Theory and Computation, Jan 10, 2018

Knowledge of the structure and dynamics of biomolecules is essential for elucidating the underlying mechanisms of biological processes. Given the stochastic nature of many biological processes, like protein unfolding, it is almost impossible that two independent simulations will generate the exact same sequence of events, which makes direct analysis of simulations difficult. Statistical models like Markov chains, transition networks, etc. help in shedding some light on the mechanistic nature of such processes by predicting long-time dynamics of these systems from short simulations. However, such methods fall short in analyzing trajectories with partial or no temporal information, for example, replica exchange molecular dynamics or Monte Carlo simulations. In this work, we propose a probabilistic algorithm, borrowing concepts from graph theory and machine learning, to extract reactive pathways from molecular trajectories in the absence of temporal data. A suitable vector representati...