Updated resources for exploring experimentally-determined PDB structures and Computed Structure Models at the RCSB Protein Data Bank
Nucleic Acids Res. 2025 Jan 6;53(D1):D564-D574.
doi: 10.1093/nar/gkae1091.
Stephen K Burley 1 2 3 4 5 6, Charmi Bhikadiya 5, Chunxiao Bi 5, Alison Biester 1 2, Pratyoy Biswas 1 2, Sebastian Bittrich 5, Santiago Blaumann 1 2, Ronald Brown 1 2, Henry Chao 1 2, Vivek Reddy Chithari 1 2, Paul A Craig 7, Gregg V Crichlow 1 2, Jose M Duarte 5, Shuchismita Dutta 1 2 3, Zukang Feng 1 2, Justin W Flatt 1 2, Sutapa Ghosh 1 2, David S Goodsell 1 2 3 8, Rachel Kramer Green 1, Vladimir Guranovic 1 2, Jeremy Henry 5, Brian P Hudson 1 2, Michael Joy 1 2, Jason T Kaelber 1 2, Igor Khokhriakov 5, Jhih-Siang Lai 5, Catherine L Lawson 1 2, Yuhe Liang 1 2, Douglas Myers-Turnbull 5, Ezra Peisach 1 2, Irina Persikova 1 2, Dennis W Piehl 1 2, Aditya Pingale 1 2, Yana Rose 5, Jared Sagendorf 9, Andrej Sali 9, Joan Segura 5, Monica Sekharan 1 2, Chenghua Shao 1 2, James Smith 1 2, Michael Trumbull 1 2, Brinda Vallat 1 2, Maria Voigt 1 2, Ben Webb 9, Shamara Whetstone 1 2, Amy Wu-Wu 1 2, Tongji Xing 1 2, Jasmine Y Young 1 2, Arthur Zalevsky 9, Christine Zardecki 1 2
- PMID: 39607707
- PMCID: PMC11701563
- DOI: 10.1093/nar/gkae1091
Stephen K Burley et al. Nucleic Acids Res. 2025.
Abstract
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB, RCSB.org), the US Worldwide Protein Data Bank (wwPDB, wwPDB.org) data center for the global PDB archive, provides access to PDB data via its research-focused web portal, RCSB.org. We report substantial additions to the tools and visualization features available at RCSB.org, which now delivers more than 227,000 experimentally determined atomic-level three-dimensional (3D) biostructures stored in the global PDB archive alongside more than 1 million Computed Structure Models (CSMs) of proteins (including models for human, model organisms, select human pathogens, crop plants and organisms important for addressing climate change). In addition to providing support for 3D structure motif searches with user-provided coordinates, new features highlighted herein include query results organized by redundancy-reduced Groups and summary pages that facilitate exploration of groups of similar proteins. Newly released programmatic tools are also described, as are enhanced training opportunities.
© The Author(s) 2024. Published by Oxford University Press on behalf of Nucleic Acids Research.
Figures
Graphical Abstract
Figure 1.
(A) Advanced Search Query Builder, showcasing support for searches defined by File URL and File Upload for Structure Similarity and Structure Motif searches. (B) Mol* 3D visualization example of a Structure Motif search match with the serine protease catalytic triad (search motif derived from PDB ID: 4cha (21)). Shown here is one of the returned matches, the AlphaFold CSM for bovine Chymotrypsinogen B (UniProt ID: P00767).
Figure 2.
Alignment of structures to explore the NADP-binding sites of glyceraldehyde-3-phosphate dehydrogenase from Methanothermus fervidus (PDB ID: 1cf2 chain A, in orange, the reference structure (29)), Methanocaldococcus jannaschii DSM 2661 (PDB ID: 2yyy chain A (30), blue), Pyrococcus horikoshii OT3 (PDB ID: 2czc chain A (31), green), aspartate-semialdehyde dehydrogenase from Candida albicans (PDB ID: 3hsk chain A (32), salmon) and Blastomyces gilchristii SLH14081 (PDB ID: 6c8w chain A (33), red).
Figure 3.
GSP for a 30% sequence identity cluster of Ras family proteins. Histograms display the distribution of organisms among the Group members, with the currently filtered Homo sapiens displayed in blue. Gray bars represent the distribution of Group members belonging to other organisms.
Figure 4.
Group Sequence Alignments in 3D for Ras family proteins clustered at 30% amino acid sequence identity. On the left side, an MSA of the group members is displayed, including experimental structures and CSMs. The right panel shows the 3D structure of selected proteins from the alignment panel in Mol*.
Figure 5.
Tabular representation of M-CSA annotations for PDB ID: 1b73 (48). Links allow the user to (1) visualize the catalytic residues in Mol*; (2) launch a structure motif search based on the M-CSA motif definition; (3) launch a search for other structures in the PDB with the same EC number; and (4) launch a combined search for structures with similar arrangements of catalytic residues (via the structure motif) among PDB IDs or CSMs bearing the same EC number.
Figure 6.
Annotations available for a CSM of cholesterol side chain cleavage enzyme, mitochondrial (AlphaFold: AFP05108F1).
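The Figure 3 and Figure 4 captions refer to Group members clustered at 30% amino acid sequence identity. As a rough illustration of what such a threshold means, the sketch below computes percent identity between pre-aligned sequences and groups them greedily against a cluster representative. It is a toy example only: the function names, the greedy scheme, and the short sequences are hypothetical and are not taken from the paper or from RCSB.org, which does not describe its clustering procedure in this record.

```python
from __future__ import annotations


def percent_identity(a: str, b: str) -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue ('-' marks a gap).
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


def greedy_cluster(aligned: dict[str, str], threshold: float = 30.0) -> list[list[str]]:
    """Place each sequence in the first cluster whose representative
    (its first member) it matches at or above the identity threshold."""
    clusters: list[list[str]] = []
    for name, seq in aligned.items():
        for cluster in clusters:
            representative = aligned[cluster[0]]
            if percent_identity(seq, representative) >= threshold:
                cluster.append(name)
                break
        else:
            # No existing representative is similar enough: start a new cluster.
            clusters.append([name])
    return clusters


# Hypothetical, pre-aligned toy sequences (not real database entries).
toy_alignment = {
    "seq_a": "MTEYKLVVVG",
    "seq_b": "MTEYKLVVLG",
    "seq_c": "M-QWRPSTHD",
}

if __name__ == "__main__":
    print(greedy_cluster(toy_alignment))
```

In this sketch the result depends on input order, because each sequence is compared only with the first member of each cluster; production clustering tools use more careful alignment and representative selection.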
References
- Protein Data Bank. Crystallography: Protein Data Bank. Nat. New Biol. 1971; 233:223.
- Burley S.K., Bhikadiya C., Bi C., Bittrich S., Chao H., Chen L., Craig A.P., Crichlow G.V., Dalenberg K., Duarte J.M. et al. RCSB Protein Data Bank (RCSB.org): delivery of experimentally-determined PDB structures alongside one million Computed Structure Models of proteins from Artificial Intelligence/Machine Learning. Nucleic Acids Res. 2023; 51:D488–D508.
- Berman H.M., Henrick K., Nakamura H. Announcing the worldwide Protein Data Bank. Nat. Struct. Biol. 2003; 10:980.
Grants and funding
- R01GM157729/NH/NIH HHS/United States
- DE-SC0019749/U.S. Department of Energy
- R01 GM083960/GM/NIGMS NIH HHS/United States
- BB_/Biotechnology and Biological Sciences Research Council/United Kingdom
- R01GM083960/GM/NIGMS NIH HHS/United States
- CA/NCI NIH HHS/United States
- DBI-2129634/Mol* features
- DBI-2019297/Next Generation PDB
- P41 GM109824/GM/NIGMS NIH HHS/United States
- DBI-1756248/NSF
- DBI-2321666/U.S. National Science Foundation
- R01 GM157729/GM/NIGMS NIH HHS/United States