E-MSD: the European Bioinformatics Institute Macromolecular Structure Database

Nucleic Acids Res. 2003 Jan 1;31(1):458-62.

doi: 10.1093/nar/gkg065.

H Boutselakis, D Dimitropoulos, J Fillon, A Golovin, K Henrick, A Hussain, J Ionides, M John, P A Keller, E Krissinel, P McNeil, A Naim, R Newman, T Oldfield, J Pineda, A Rachedi, J Copeland, A Sitnov, S Sobhany, A Suarez-Uruena, J Swaminathan, M Tagari, J Tate, S Tromm, S Velankar, W Vranken

H Boutselakis et al. Nucleic Acids Res. 2003.

Abstract

The E-MSD macromolecular structure relational database (http://www.ebi.ac.uk/msd) is designed to be a single access point for protein and nucleic acid structures and related information. The database is derived from Protein Data Bank (PDB) entries. Relational database technologies are used in a comprehensive cleaning procedure to ensure data uniformity across the whole archive. The search database contains an extensive set of derived properties, goodness-of-fit indicators, and links to other EBI databases including InterPro, GO, and SWISS-PROT, together with links to SCOP, CATH, PFAM and PROSITE. A generic search interface is available, coupled with a fast secondary structure domain search tool.

Figures

Figure 1

The E-MSD database core entity relationships. Each level of the hierarchy can have associated properties, e.g. Bound molecules, Domain definitions, Site residues, Derived properties (e.g. Accessible Surface Area), Reference information (e.g. standard geometry).

Figure 2

Sample SMILES-based search using chempdb, starting from 3-chlorophenol: (a) selected search results using the ‘has substructure’ option, wherein the results contain the connected fragment, and (b) selected search results using the fingerprint option, where the matching ligands contain the chemical constituents of the query structure. The matched compounds shown are: TCL 5-chloro-2-(2,4-dichlorophenoxy)phenol, EAA [2,3-dichloro-4-(2-ethylacryloyl)phenoxy]acetic acid, CHB 3-chloro-4-hydroxybenzoic acid, and CFA 2,4-dichlorophenoxy acetic acid.
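
The difference between the two search options in Figure 2 can be illustrated with a toy sketch. This is not chempdb's implementation: the molecule encoding (heavy atoms only, bond orders and hydrogens ignored), the helper names `make_mol`, `ring_bonds`, `has_substructure` and `fingerprint`, and the choice of bonded element pairs as fingerprint features are all assumptions made for illustration. The substructure test requires the query to occur as a connected fragment, while the fingerprint test only checks that the query's features are present somewhere in the target.

```python
def ring_bonds(start):
    """Bonds of a six-membered ring over atoms start..start+5."""
    return [(start + i, start + (i + 1) % 6) for i in range(6)]


def make_mol(elements, bonds):
    """Toy molecule: a list of element symbols plus an adjacency map."""
    adj = {i: set() for i in range(len(elements))}
    for a, b in bonds:
        adj[a].add(b)
        adj[b].add(a)
    return elements, adj


def has_substructure(query, target):
    """Backtracking search for the query graph inside the target graph."""
    q_el, q_adj = query
    t_el, t_adj = target

    def extend(mapping):
        if len(mapping) == len(q_el):
            return True
        q = len(mapping)  # query atoms are placed in index order
        for t in range(len(t_el)):
            if t in mapping.values() or t_el[t] != q_el[q]:
                continue
            # Every already-placed neighbour of q must be bonded to t.
            if all(mapping[n] in t_adj[t] for n in q_adj[q] if n in mapping):
                mapping[q] = t
                if extend(mapping):
                    return True
                del mapping[q]
        return False

    return extend({})


def fingerprint(mol):
    """Assumed feature set: the bonded element pairs found in the molecule."""
    el, adj = mol
    return {"-".join(sorted((el[a], el[b]))) for a in adj for b in adj[a]}


# 3-chlorophenol: ring atoms 0-5, O on atom 0, Cl on atom 2 (meta).
query = make_mol(["C"] * 6 + ["O", "Cl"], ring_bonds(0) + [(0, 6), (2, 7)])

# TCL: two rings joined by an ether O; hydroxyl O on atom 0, Cl on atom 4.
tcl = make_mol(
    ["C"] * 12 + ["O", "O", "Cl", "Cl", "Cl"],
    ring_bonds(0) + ring_bonds(6)
    + [(0, 12), (1, 13), (13, 6), (4, 14), (7, 15), (9, 16)],
)

# CHB: carboxyl C on atom 0, Cl on atom 2, hydroxyl O on atom 3 (ortho to Cl).
chb = make_mol(
    ["C"] * 7 + ["O", "O", "Cl", "O"],
    ring_bonds(0) + [(0, 6), (6, 7), (6, 8), (2, 9), (3, 10)],
)

for name, mol in (("TCL", tcl), ("CHB", chb)):
    print(name,
          "substructure:", has_substructure(query, mol),
          "fingerprint:", fingerprint(query) <= fingerprint(mol))
```

In this toy encoding TCL passes both tests, whereas CHB passes only the fingerprint test: it contains the same kinds of bonds, but its chlorine sits next to the hydroxyl-bearing carbon rather than one position further round the ring. A real fingerprint screen would use a much richer feature set than the three bond types used here.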

Figure 3

The process is driven by a number of dictionaries describing the database model (Database Definition), the interface contents and layout (Search page definition, Result page definition), and the descriptions used to construct the SQL query (Search tools, Result tools). The system uses XML-XSL technology to generate HTML pages with the AxKit module.
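
The idea of building an SQL query from dictionaries can be sketched as follows. This is only an illustration under stated assumptions: the dictionary layouts, the table and column names, the `build_query` helper and the demo rows are all hypothetical and are not taken from the E-MSD schema or code. The XML-XSL rendering step is left out.

```python
import sqlite3

# Hypothetical "Database Definition": logical field name -> column name.
DATABASE_DEFINITION = {
    "table": "entry",
    "fields": {
        "entry_id": "entry_id",
        "title": "title",
        "resolution": "resolution",
    },
}

# Hypothetical "Search tools": search option name -> SQL operator.
SEARCH_TOOLS = {"equals": "=", "less_than": "<", "contains": "LIKE"}


def build_query(result_fields, criteria):
    """Assemble a parameterised SELECT from the dictionaries above.

    criteria is a list of (field, tool, value) tuples. Only names that
    appear in the dictionaries reach the SQL text; values are bound as
    parameters.
    """
    fields = DATABASE_DEFINITION["fields"]
    select = ", ".join(fields[f] for f in result_fields)
    clauses, params = [], []
    for field, tool, value in criteria:
        clauses.append(f"{fields[field]} {SEARCH_TOOLS[tool]} ?")
        params.append(f"%{value}%" if tool == "contains" else value)
    sql = f"SELECT {select} FROM {DATABASE_DEFINITION['table']}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Demo with made-up rows in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entry (entry_id TEXT, title TEXT, resolution REAL)")
conn.executemany(
    "INSERT INTO entry VALUES (?, ?, ?)",
    [("demo1", "example lysozyme", 1.8), ("demo2", "example kinase", 2.9)],
)

sql, params = build_query(["entry_id", "title"],
                          [("resolution", "less_than", 2.0)])
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)
```

Keeping field names and operators in dictionaries means a new search option can be added by editing data rather than query-building code, which is the general benefit of the dictionary-driven design the caption describes.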
