Introduction to MOLE DB-on-line Molecular Descriptors Database.
Related papers
alvaDesc: A Tool to Calculate and Analyze Molecular Descriptors and Fingerprints
Ecotoxicological QSARs, 2020
In this chapter we present alvaDesc, a software tool to calculate and analyze molecular descriptors and fingerprints. Molecular descriptors and fingerprints play an essential role in quantitative structure-activity relationships (QSAR): they are the mathematical representation of chemicals and serve as the input for the data analysis methods used to build QSAR models. The increasing number of newly proposed molecular descriptors and fingerprints and, more generally, the attention paid by the scientific community to the development of novel methodologies to represent chemical structures are evidence of the relevance of these representations in the prediction of chemical properties. Despite the complexity of dealing with a high number of variables, different types of molecular descriptors and fingerprints can highlight specific traits of molecular structures. These aspects, together with the increased availability of chemical data and methods for data analysis, are some of the challenges that researchers face in the development of QSAR models.
[COMMODE] a large-scale database of molecular descriptors using compounds from PubChem
Source Code for Biology and Medicine, 2013
Background: Molecular descriptors have been extensively used in the field of structure-oriented drug design and structural chemistry. They have been applied in QSPR and QSAR models to predict ADME-Tox properties, which specify essential features for drugs. Molecular descriptors capture chemical and structural information, but investigating their interpretation and meaning remains very challenging. Results: This paper introduces a large-scale database of molecular descriptors called COMMODE, containing more than 25 million compounds originating from PubChem. About 2500 DRAGON descriptors have been calculated for all compounds and integrated into this database, which is accessible through a web interface at http://commode.i-med.ac.at.
Structure-Activity Relationships on the Molecular Descriptors Family Project at the End
2007
(SAR), a promising approach to the investigation and quantification of the link between 2D and 3D structural information and activity, and its potential in the analysis of biologically active compounds is summarized. The approach attempts to correlate the molecular descriptors family generated and calculated for a set of biologically active compounds with their observed activity. The estimation and prediction abilities of the approach are presented. The resulting MDF SAR models can be used to predict the biological activity of unknown substrates in a series of compounds.
Dragon software: An easy approach to molecular descriptor calculations
MATCH / Communications In Mathematical & In Computer Chemistry, 2006
Because molecular descriptors are steadily gaining relevance in several scientific fields, software for the calculation of molecular descriptors has become a very important tool for scientists. In this paper, the main characteristics of the DRAGON software for the calculation of molecular descriptors are briefly illustrated.
Molecular Property Diagnostic Suite-Compound Library (MPDS-CL) is an open-source, Galaxy-based cheminformatics web portal that presents a structure-based classification of molecules. Nearly 150 million unique compounds, obtained from 42 publicly available databases, were curated for redundancy removal through 97 hierarchically well-defined atom composition-based portions. These were further subjected to a 56-bit fingerprint-based classification algorithm, which led to the formation of 56 structurally well-defined classes. The classes thus obtained were further divided into clusters based on their molecular weight. Thus, the entire set of molecules was placed in 56 different classes and 625 clusters. This led to the assignment of a unique ID, named the MPDS-Aadhar card, to each of these 149 169 443 molecules. The Aadhar card is akin to the unique number given to citizens in India (similar to the SSN in the US or the NINO in the UK). MPDS-CL unique features are:...
2010
In recent decades, much scientific research has focused on how to capture the information encoded in the molecular structure and convert it, by a theoretical pathway, into one or more numbers used to establish quantitative relationships between structures and properties, biological activities, or other experimental properties.
MolFind: A Software Package Enabling HPLC/MS-Based Identification of Unknown Chemical Structures
Analytical Chemistry, 2012
In this paper, we present MolFind, a highly multi-threaded, pipeline-type software package for use as an aid in identifying chemical structures in complex biofluids and mixtures. MolFind is specifically designed for high-performance liquid chromatography/mass spectrometry (HPLC/MS) data inputs typical of metabolomics studies where structure identification is the ultimate goal. MolFind enables compound identification by matching HPLC/MS-based experimental data obtained for an unknown compound with computationally derived HPLC/MS values for candidate compounds downloaded from chemical databases such as PubChem. The downloaded "bins" consist of all compounds matching the monoisotopic molecular weight of the unknown. The computational HPLC/MS values predicted include retention index (RI), ECOM50 (energy required to fragment 50% of a selected precursor ion), drift time and collision-induced dissociation (CID) spectrum. RI, ECOM50, and drift time models are used for filtering compounds downloaded from PubChem. The remaining candidates are then ranked based on CID spectra matching. Current RI and ECOM50 models allow for the removal of about 28% of compounds from PubChem bins. Our estimates suggest that this could be improved to as much as 87% with additional chemical structures included in the computational models. Quantitative structure-property relationship-based modeling of drift times showed a better correlation with experimentally determined drift times than did Mobcal cross-sectional areas. In 23/35 example cases, filtering PubChem bins with RI and ECOM50 predictive models resulted in improved ranking of the unknown compound compared to previous studies using CID spectra matching alone. In 19/35 examples, the correct candidate was ranked within the top 20 compounds in bins containing an average of 1635 compounds.
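The filter-then-rank workflow described in the MolFind abstract can be illustrated with a minimal Python sketch. This is not MolFind's code: the `Candidate` class, the `filter_and_rank` function, the fixed tolerance windows, and the single precomputed `cid_score` are all hypothetical names and simplifications chosen for illustration, since the abstract does not say how the filters or the spectral matching are actually implemented.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """One compound from a mass-matched bin (hypothetical structure)."""
    name: str
    predicted_ri: float      # predicted retention index
    predicted_ecom50: float  # predicted ECOM50
    predicted_drift: float   # predicted drift time
    cid_score: float         # assumed similarity of predicted vs. observed CID spectrum


def within(predicted: float, observed: float, tolerance: float) -> bool:
    """True if the prediction lies inside the assumed tolerance window."""
    return abs(predicted - observed) <= tolerance


def filter_and_rank(candidates, observed, tolerances):
    """Drop candidates whose predicted RI, ECOM50, or drift time falls outside
    the window around the observed value, then rank the survivors by CID
    score, best match first."""
    kept = [
        c for c in candidates
        if within(c.predicted_ri, observed["ri"], tolerances["ri"])
        and within(c.predicted_ecom50, observed["ecom50"], tolerances["ecom50"])
        and within(c.predicted_drift, observed["drift"], tolerances["drift"])
    ]
    return sorted(kept, key=lambda c: c.cid_score, reverse=True)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    bin_candidates = [
        Candidate("A", 500.0, 3.0, 20.0, 0.90),
        Candidate("B", 900.0, 3.1, 20.0, 0.95),
        Candidate("C", 510.0, 3.2, 21.0, 0.70),
    ]
    observed = {"ri": 505.0, "ecom50": 3.1, "drift": 20.5}
    tolerances = {"ri": 20.0, "ecom50": 0.5, "drift": 1.0}
    for c in filter_and_rank(bin_candidates, observed, tolerances):
        print(c.name, c.cid_score)
```

Filtering before ranking means a candidate with a high spectral score but an implausible retention index never reaches the ranked list, which is consistent with the abstract's report that filtering improved ranking in 23 of 35 cases.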