Woonghee Lee - Academia.edu

Papers by Woonghee Lee

Probabilistic validation of protein NMR chemical shift assignments

Journal of biomolecular NMR, 2016

Data validation plays an important role in ensuring the reliability and reproducibility of studies. NMR investigations of the functional properties, dynamics, chemical kinetics, and structures of proteins depend critically on the correctness of chemical shift assignments. We present a novel probabilistic method named ARECA for validating chemical shift assignments that relies on the nuclear Overhauser effect data. ARECA has been evaluated through its application to 26 case studies and has been shown to be complementary to, and usually more reliable than, approaches based on chemical shift databases. ARECA is available online at http://areca.nmrfam.wisc.edu/.

Integrative NMR for biomolecular research

Journal of biomolecular NMR, Apr 29, 2016

NMR spectroscopy is a powerful technique for determining structural and functional features of biomolecules in physiological solution as well as for observing their intermolecular interactions in real-time. However, complex steps associated with its practice have made the approach daunting for non-specialists. We introduce an NMR platform that makes biomolecular NMR spectroscopy much more accessible by integrating tools, databases, web services, and video tutorials that can be launched by simple installation of NMRFAM software packages or using a cross-platform virtual machine that can be run on any standard laptop or desktop computer. The software package can be downloaded freely from the NMRFAM software download page (http://pine.nmrfam.wisc.edu/download_packages.html), and detailed instructions are available from the Integrative NMR Video Tutorial page (http://pine.nmrfam.wisc.edu/integrative.html).

NMRmix: A Tool for the Optimization of Compound Mixtures in 1D ¹H NMR Ligand Affinity Screens

Journal of proteome research, Jan 11, 2016

NMR ligand affinity screening is a powerful technique that is routinely used in drug discovery or functional genomics to directly detect protein-ligand binding events. Binding events can be identified by monitoring differences in the one-dimensional ¹H NMR spectrum of a compound with and without protein. Although a single NMR spectrum can be collected within a short period (2-10 min per sample), one-by-one screening of a protein against a library of hundreds or thousands of compounds requires a large amount of spectrometer time and a large quantity of protein. To improve the efficiency of these screens in both time and material, compounds are usually evaluated in mixtures ranging in size from 3 to 20 compounds. Ideally, the NMR signals from individual compounds in the mixture should not overlap so that spectral changes can be associated with a particular compound. We have developed a software tool, NMRmix, to assist in creating ideal mixtures from a large panel of compounds with k...
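The non-overlap requirement described in this abstract can be illustrated with a minimal sketch. It is not NMRmix's algorithm, which the truncated abstract does not describe; the function name `overlapping_pairs`, the 0.02 ppm tolerance, and the peak lists are all assumptions made up for illustration.

```python
from itertools import combinations


def overlapping_pairs(mixture, tolerance_ppm=0.02):
    """Return pairs of compound names that have at least one pair of
    1H peaks closer together than tolerance_ppm (hypothetical criterion)."""
    clashes = []
    # Compare every unordered pair of compounds in the mixture.
    for (name_a, peaks_a), (name_b, peaks_b) in combinations(mixture.items(), 2):
        if any(abs(a - b) <= tolerance_ppm for a in peaks_a for b in peaks_b):
            clashes.append((name_a, name_b))
    return clashes


# Hypothetical peak positions in ppm; not taken from the paper.
mixture = {
    "compound_A": [1.20, 3.45, 7.10],
    "compound_B": [2.05, 3.46],
    "compound_C": [0.90, 5.30],
}

print(overlapping_pairs(mixture))
```

A mixture for which the function returns an empty list would satisfy the "signals should not overlap" condition under the assumed tolerance.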

ADAPT-NMR Enhancer

Bioinformatics, Feb 1, 2013

Purification and backbone assignment of the Hypothetical Protein MTH1821 from Methanobacterium thermoautotrophicum H

MTH1821 (UniProtKB/TrEMBL ID O27849) is a 96-residue hypothetical protein from the open reading frame of Methanobacterium thermoautotrophicum H, one of the target organisms of a structural genomics pilot project. Proteins with sequences conserved relative to MTH1821 have not yet been discovered, and functional and structural information for MTH1821 is not available. Here, we present the sequence-specific backbone resonance assignments obtained using multidimensional heteronuclear NMR spectroscopy and propose the secondary structure using GetSBY software. The backbone resonances of N, HN, Cα, Cβ, CO and Hα, which are necessary for the prediction of secondary structure by GetSBY, were about 98% assigned (557/568). The secondary structure of MTH1821 was confirmed to comprise four strand regions and two helical regions. This report will provide a valuable resource for calculating the solution structure of MTH1821 and for other hypothetical proteins targeted for structure-based functional discovery.

NMRFAM-SDF: a protein structure determination framework

Journal of Biomolecular NMR, 2015

The computationally demanding nature of automated NMR structure determination necessitates a delicate balancing of factors that include the time complexity of data collection, the computational complexity of chemical shift assignments, and selection of proper optimization steps. During the past two decades the computational and algorithmic aspects of several discrete steps of the process have been addressed. Although no single comprehensive solution has emerged, the incorporation of a validation protocol has gained recognition as a necessary step for a robust automated approach. The need for validation becomes even more pronounced in cases of proteins with higher structural complexity, where potentially larger errors generated at each step can propagate and accumulate in the process of structure calculation, thereby significantly degrading the efficacy of any software framework. This paper introduces a complete framework for protein structure determination with NMR, from data acquisition to structure determination. The aim is twofold: to simplify the structure determination process for non-NMR experts whenever feasible, while maintaining flexibility by providing a set of modules that validate each step, and to enable the assessment of error propagation. This framework, called NMRFAM-SDF (NMRFAM-Structure Determination Framework), and its various components are available for download from the NMRFAM website (http://nmrfam.wisc.edu/software.htm).

A network of assembly factors is involved in remodeling rRNA elements during preribosome maturation

The Journal of cell biology, Jan 24, 2014

Eukaryotic ribosome biogenesis involves ∼200 assembly factors, but how these contribute to ribosome maturation is poorly understood. Here, we identify a network of factors on the nascent 60S subunit that actively remodels preribosome structure. At its hub is Rsa4, a direct substrate of the force-generating ATPase Rea1. We show that Rsa4 is connected to the central protuberance by binding to Rpl5 and to ribosomal RNA (rRNA) helix 89 of the nascent peptidyl transferase center (PTC) through Nsa2. Importantly, Nsa2 binds to helix 89 before relocation of helix 89 to the PTC. Structure-based mutations of these factors reveal the functional importance of their interactions for ribosome assembly. Thus, Rsa4 is held tightly in the preribosome and can serve as a "distribution box," transmitting remodeling energy from Rea1 into the developing ribosome. We suggest that a relay-like factor network coupled to a mechano-enzyme is strategically positioned to relocate rRNA elements during ...

Assignments of RNase A by ADAPT-NMR and enhancer

Biomolecular NMR Assignments, 2014

We report here backbone ¹H and ¹⁵N assignments for ribonuclease A obtained by using ADAPT-NMR, a fully-automated approach for combined data collection, spectral analysis and resonance assignment. ADAPT-NMR was able to assign 98% of the resonances with 93% agreement with traditional data collection and assignment. Further refinement of the automated results with ADAPT-NMR enhancer led to complete (100%) assignments with 96% agreement with assignments by the traditional approach.

PONDEROSA-C/S: client–server based software package for automated protein 3D structure determination

Journal of Biomolecular NMR, 2014

Peak-picking Of Noe Data Enabled by Restriction Of Shift Assignments-Client Server (PONDEROSA-C/S) builds on the original PONDEROSA software (Lee et al. in

Uncovering symmetry-breaking vector and reliability order for assigning secondary structures of proteins from atomic NMR chemical shifts in amino acids

Journal of Biomolecular NMR, 2011

Unravelling the complex correlation between chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, ¹HN atoms in amino acids of proteins from NMR experiments and local structural environments of amino acids facilitates the assignment of secondary structures of proteins. This is an important impetus for both determining the three-dimensional structure and understanding the biological function of proteins. The previous empirical correlation scores which relate chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, ¹HN atoms to secondary structures led to progress toward assigning secondary structures of proteins. However, the physical-mathematical framework for these was elusive, partly due to both the limited and orthogonal exploration of higher-dimensional chemical shifts of hetero-nuclei and the lack of physical-mathematical understanding underlying those correlation scores. Here we present a simple multi-dimensional hetero-nuclear chemical shift score function (MDHN-CSSF) which systematically captures the salient features of such complex correlations without any reference to a random coil state of proteins. We uncover the symmetry-breaking vector and its reliability order not only for distinguishing different secondary structures of proteins but also for capturing the delicate interplay among chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, ¹HN atoms simultaneously, which then provides a straightforward framework toward assigning secondary structures of proteins. MDHN-CSSF could correctly assign secondary structures of training (validating) proteins with the favourable (comparable) Q3 scores in comparison with those from the previous correlation scores. MDHN-CSSF provides a simple and robust strategy for the systematic assignment of secondary structures of proteins and would facilitate the de novo determination of three-dimensional structures of proteins.

Structural proteomics by NMR spectroscopy

Expert Review of Proteomics, 2008

Structural proteomics is one of the powerful research areas in the postgenomic era, elucidating structure-function relationships of uncharacterized gene products based on the 3D protein structure. It proposes biochemical and cellular functions of unannotated proteins and thereby identifies potential drug design and protein engineering targets. Recently, a number of pioneering groups in structural proteomics research have achieved proof of structural proteomic theory by predicting the 3D structures of hypothetical proteins that successfully identified the biological functions of those proteins. The pioneering groups made use of a number of techniques, including NMR spectroscopy, which has been applied successfully to structural proteomics studies over the past 10 years. In addition, advances in hardware design, data acquisition methods, sample preparation and automation of data analysis have been developed and successfully applied to high-throughput structure determination techniques. These efforts ensure that NMR spectroscopy will become an important methodology for performing structural proteomics research on a genomic scale. NMR-based structural proteomics together with X-ray crystallography will provide a comprehensive structural database to predict the basic biological functions of hypothetical proteins identified by the genome projects.

ADAPT-NMR Enhancer: complete package for reduced dimensionality in protein NMR spectroscopy

Bioinformatics, 2013

ADAPT-nuclear magnetic resonance (ADAPT-NMR) offers an automated approach to the concurrent acquisition and processing of protein NMR data with the goal of complete backbone and side chain assignments. What the approach lacks is a useful graphical interface for reviewing results and for searching for missing peaks that may have prevented assignments or led to incorrect assignments. Because most of the data ADAPT-NMR collects are 2D tilted planes used to find peaks in 3D spectra, it would be helpful to have a tool that reconstructs the 3D spectra. The software package reported here, ADAPT-NMR Enhancer, supports the visualization of both 2D tilted planes and reconstructed 3D peaks on each tilted plane. ADAPT-NMR Enhancer can be used interactively with ADAPT-NMR to automatically assign selected peaks, or it can be used to produce PINE-SPARKY-like graphical dialogs that support atom-by-atom and peak-by-peak assignment strategies. Results can be exported in various formats, including XEASY proton file (.prot), PINE pre-assignment file (.str), PINE probabilistic output file, SPARKY peak list file (.list) and TALOS+ input file (.tab). As an example, we show how ADAPT-NMR Enhancer was used to extend the automated data collection and assignment results for the protein Aedes aegypti sterol carrier protein 2. Availability: The program, in the form of binary code along with tutorials and reference manuals, is available at http://pine.

PONDEROSA, an automated 3D-NOESY peak picking program, enables automated protein structure determination

Bioinformatics, 2011

PONDEROSA (Peak-picking Of Noe Data Enabled by Restriction of Shift Assignments) accepts input information consisting of a protein sequence, backbone and sidechain NMR resonance assignments, and 3D-NOESY (¹³C-edited and/or ¹⁵N-edited) spectra, and returns assignments of NOESY crosspeaks, distance and angle constraints, and a reliable NMR structure represented by a family of conformers. PONDEROSA incorporates and integrates external software packages (TALOS+, STRIDE and CYANA) to carry out different steps in the structure determination. PONDEROSA implements internal functions that identify and validate NOESY peak assignments and assess the quality of the calculated three-dimensional structure of the protein. The robustness of the analysis results from PONDEROSA's hierarchical processing steps that involve iterative interaction among the internal and external modules. PONDEROSA supports a variety of input formats: SPARKY assignment table (.shifts) and spectrum file formats (.ucsf), XEASY proton file format (.prot), and NMR-STAR format (.star). To demonstrate the utility of PONDEROSA, we used the package to determine 3D structures of two proteins: human ubiquitin and Escherichia coli iron-sulfur scaffold protein variant IscU(D39A). The automatically generated structural constraints and ensembles of conformers were as good as or better than those determined previously by much less automated means. Availability: The program, in the form of binary code along with tutorials and reference manuals, is available at

PINE-SPARKY: graphical interface for evaluating automated probabilistic peak assignments in protein NMR spectroscopy

Bioinformatics, 2009

PINE-SPARKY supports the rapid, user-friendly and efficient visualization of probabilistic assignments of NMR chemical shifts to specific atoms in the covalent structure of a protein in the context of experimental NMR spectra. PINE-SPARKY is based on the very popular SPARKY package for visualizing multidimensional NMR spectra (T.

Solution Structure of the 2A Protease from a Common Cold Agent, Human Rhinovirus C2, Strain W12

PLoS ONE, 2014

Human rhinovirus strains differ greatly in their virulence, and this has been correlated with the differing substrate specificity of the respective 2A protease (2Apro). Rhinoviruses use their 2Apro to cleave a spectrum of cellular proteins important to virus replication and anti-host activities. These enzymes share a chymotrypsin-like fold stabilized by a tetra-coordinated zinc ion. The catalytic triad consists of conserved Cys (C105), His (H34), and Asp (D18) residues. We used a semi-automated NMR protocol developed at NMRFAM to determine the solution structure of 2Apro (C105A variant) from an isolate of the clinically important rhinovirus C species (RV-C). The backbone of C2 2Apro superimposed closely (1.41-1.81 Å rmsd) with those of orthologs from RV-A2, coxsackie B4 (CB4), and enterovirus 71 (EV71) having sequence identities between 40% and 60%. Comparison of the structures suggests that the differential functional properties of C2 2Apro stem from its unique surface charge, high proportion of surface aromatics, and sequence surrounding the di-tyrosine flap. Citation: Lee W, Watters KE, Troupis AT, Reinen NM, Suchy FP, et al. (2014) Solution Structure of the 2A Protease from a Common Cold Agent, Human Rhinovirus C2, Strain W12. PLoS ONE 9(6): e97198.

PACSY database, a relational database management system for Protein structure and nuclear Magnetic Resonance chemical shift analysis

2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2012

We developed a new relational database management system called PACSY (protein structure And Chemical Shift NMR spectroscopY) by integrating information from the Protein Data Bank (PDB), the Biological Magnetic Resonance Data Bank (BMRB), and the Structural Classification of Proteins (SCOP) database. PACSY offers valuable information for structural investigations such as three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. The database can be installed on an RDBMS server such as MySQL or PostgreSQL for advanced search functions by supporting database queries. PACSY enables users to search for combinations of information from different database sources in support of their research. PACSY, along with two associated software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, is available from http://pacsy.nmrfam.wisc.edu.

NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy

Bioinformatics, 2014

SPARKY (Goddard and Kneller, SPARKY 3) remains the most popular software program for NMR data analysis, despite the fact that development of the package by its originators ceased in 2001. We have taken over the development of this package and describe NMRFAM-SPARKY, which implements new functions reflecting advances in the biomolecular NMR field. NMRFAM-SPARKY has been repackaged with current versions of Python and Tcl/Tk, which support new tools for NMR peak simulation and graphical assignment determination. These tools, along with chemical shift predictions from the PACSY database, greatly accelerate protein side chain assignments. NMRFAM-SPARKY supports automated data format interconversion for interfacing with a variety of web servers including PECAN, PINE, TALOS-N, CS-Rosetta, SHIFTX2 and PONDEROSA-C/S. Availability and implementation: The software package, along with binary and source codes, if desired, can be downloaded freely from http://pine.nmrfam.wisc.edu/download_packages.html. Instruction manuals and video tutorials can be found at http://www.nmrfam.wisc.edu/nmrfamsparky-distribution.htm.

Fast automated protein NMR data collection and assignment by ADAPT-NMR on Bruker spectrometers

Journal of Magnetic Resonance, 2013

ADAPT-NMR (Assignment-directed Data collection Algorithm utilizing a Probabilistic Toolkit in NMR) supports automated NMR data collection and backbone and side chain assignment for [U-¹³C, U-¹⁵N]-labeled proteins. Given the sequence of the protein and data for the orthogonal 2D ¹H-¹⁵N and ¹H-¹³C planes, the algorithm automatically directs the collection of tilted plane data from a variety of triple-resonance experiments so as to follow an efficient pathway toward the probabilistic assignment of ¹H, ¹³C, and ¹⁵N signals to specific atoms in the covalent structure of the protein. Data collection and assignment calculations continue until the addition of new data no longer improves the assignment score. ADAPT-NMR was first implemented on Varian (Agilent) spectrometers [PLoS One 7 (2012) e33173]. Because of broader interest in the approach, we present here a version of ADAPT-NMR for Bruker spectrometers. We have developed two AU console programs (ADAPT_ORTHO_run and ADAPT_NMR_run) that run under TOPSPIN Versions 3.0 and higher. To illustrate the performance of the algorithm on a Bruker spectrometer, we tested one protein, chlorella ubiquitin (76 amino acid residues), that had been used with the Varian version: the Bruker and Varian versions achieved the same level of assignment completeness (98% in 20 h). As a more rigorous evaluation of the Bruker version, we tested a larger protein, BRPF1 bromodomain (114 amino acid residues), which yielded an automated assignment completeness of 86% in 55 h. Both experiments were carried out on a 500 MHz Bruker AVANCE III spectrometer equipped with a z-gradient 5 mm TCI probe. ADAPT-NMR is available at http://pine.nmrfam.wisc.edu/ADAPT-NMR in the form of pulse programs, the two AU programs, and instructions for installation and use.
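The stopping rule stated in this abstract (collection continues until new data no longer improves the assignment score) can be sketched generically as follows. This is only an illustration of that rule, not ADAPT-NMR's code; the function name and the score values are hypothetical.

```python
def collect_until_no_improvement(scores_after_each_experiment):
    """Walk through assignment scores in acquisition order and stop at the
    first experiment whose score does not beat the best score so far.

    Returns (number of improving experiments, best score reached)."""
    best = float("-inf")
    improving_steps = 0
    for score in scores_after_each_experiment:
        if score <= best:
            # New data did not improve the score: stop collecting.
            break
        best = score
        improving_steps += 1
    return improving_steps, best


# Hypothetical scores after successive tilted-plane experiments.
scores = [0.41, 0.67, 0.85, 0.85, 0.90]
print(collect_until_no_improvement(scores))
```

Note that with this simple rule the later score of 0.90 is never reached, because collection stops at the first experiment that fails to improve the score.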

PACSY, a relational database management system for protein structure and chemical shift analysis

Journal of Biomolecular NMR, 2012

PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.
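The idea of tables linked by key identification numbers can be sketched with a small join query. The table names (`coord`, `shift`), column names, and data below are hypothetical and are not PACSY's actual schema; SQLite is used in place of MySQL or PostgreSQL only so that the example is self-contained.

```python
import sqlite3

# In-memory database with two hypothetical tables that share a key ID.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE coord (
        key_id  INTEGER PRIMARY KEY,
        residue TEXT, atom TEXT,
        x REAL, y REAL, z REAL
    );
    CREATE TABLE shift (
        key_id INTEGER REFERENCES coord(key_id),
        ppm    REAL
    );
    INSERT INTO coord VALUES (1, 'ALA', 'CA', 1.0, 2.0, 3.0);
    INSERT INTO coord VALUES (2, 'GLY', 'CA', 4.0, 5.0, 6.0);
    INSERT INTO shift VALUES (1, 52.3);
    INSERT INTO shift VALUES (2, 45.1);
    """
)


def shifts_for(residue):
    """Join coordinates and chemical shifts on the shared key ID."""
    query = (
        "SELECT c.atom, s.ppm FROM coord AS c "
        "JOIN shift AS s ON s.key_id = c.key_id "
        "WHERE c.residue = ? ORDER BY c.key_id"
    )
    return conn.execute(query, (residue,)).fetchall()


print(shifts_for("ALA"))
```

The same query shape would combine structural and chemical shift records in any RDBMS, provided both tables carry the shared key.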

Research paper thumbnail of Probabilistic validation of protein NMR chemical shift assignments

Journal of biomolecular NMR, 2016

Data validation plays an important role in ensuring the reliability and reproducibility of studie... more Data validation plays an important role in ensuring the reliability and reproducibility of studies. NMR investigations of the functional properties, dynamics, chemical kinetics, and structures of proteins depend critically on the correctness of chemical shift assignments. We present a novel probabilistic method named ARECA for validating chemical shift assignments that relies on the nuclear Overhauser effect data . ARECA has been evaluated through its application to 26 case studies and has been shown to be complementary to, and usually more reliable than, approaches based on chemical shift databases. ARECA is available online at http://areca.nmrfam.wisc.edu/ .

Research paper thumbnail of Integrative NMR for biomolecular research

Journal of biomolecular NMR, Apr 29, 2016

NMR spectroscopy is a powerful technique for determining structural and functional features of bi... more NMR spectroscopy is a powerful technique for determining structural and functional features of biomolecules in physiological solution as well as for observing their intermolecular interactions in real-time. However, complex steps associated with its practice have made the approach daunting for non-specialists. We introduce an NMR platform that makes biomolecular NMR spectroscopy much more accessible by integrating tools, databases, web services, and video tutorials that can be launched by simple installation of NMRFAM software packages or using a cross-platform virtual machine that can be run on any standard laptop or desktop computer. The software package can be downloaded freely from the NMRFAM software download page ( http://pine.nmrfam.wisc.edu/download_packages.html ), and detailed instructions are available from the Integrative NMR Video Tutorial page ( http://pine.nmrfam.wisc.edu/integrative.html ).

Research paper thumbnail of NMRmix: A Tool for the Optimization of Compound Mixtures in 1D (1)H NMR Ligand Affinity Screens

Journal of proteome research, Jan 11, 2016

NMR ligand affinity screening is a powerful technique that is routinely used in drug discovery or... more NMR ligand affinity screening is a powerful technique that is routinely used in drug discovery or functional genomics to directly detect protein-ligand binding events. Binding events can be identified by monitoring differences in the one-dimensional (1)H NMR spectrum of a compound with and without protein. Although a single NMR spectrum can be collected within a short period (2-10 min per sample), one-by-one screening of a protein against a library of hundreds or thousands of compounds requires a large amount of spectrometer time and a large quantity of protein. To improve the efficiency of these screens in both time and material, compounds are usually evaluated in mixtures ranging in size from 3 to 20 compounds. Ideally, the NMR signals from individual compounds in the mixture should not overlap so that spectral changes can be associated with a particular compound. We have developed a software tool, NMRmix, to assist in creating ideal mixtures from a large panel of compounds with k...

Research paper thumbnail of ADAPT-NMR Enhancer

Bioinformatics, Feb 1, 2013

Research paper thumbnail of Purification and backbone assignment of the Hypothetical Protein MTH1821 from Methanobacterium thermoautotrophicum H

MTH1821 (UniProtKB/TrEMBL ID O27849) is a 96-residue hypothetical protein from the open reading f... more MTH1821 (UniProtKB/TrEMBL ID O27849) is a 96-residue hypothetical protein from the open reading frame of Methanobacterium thermoautotrophicum H one of the target organisms of structural genomics pilot project. Proteins which contain conserved sequence compared with MTH1821 have not been discovered yet and the functional and structural information for MTH1821 is not available. Here, we present the sequence-specific backbone resonance using multidimensional heteronuclear NMR spectroscopy and propose the secondary structure using GetSBY software. The backbone resonances of N, HN, C α , C β , CO and H α which are necessary for a prediction of secondary structure by GetSBY were assigned about 98% (557/568). The secondary structure of MTH1821 confirmed that it is comprised of four strand regions and two helical regions. This report will provide a valuable resource for the calculation solution structure of MTH1821 and for the other hypothetical protein that is targeted for structural-based functional discovery.

Research paper thumbnail of NMRFAM-SDF: a protein structure determination framework

Journal of Biomolecular NMR, 2015

The computationally demanding nature of automated NMR structure determination necessitates a deli... more The computationally demanding nature of automated NMR structure determination necessitates a delicate balancing of factors that include the time complexity of data collection, the computational complexity of chemical shift assignments, and selection of proper optimization steps. During the past two decades the computational and algorithmic aspects of several discrete steps of the process have been addressed. Although no single comprehensive solution has emerged, the incorporation of a validation protocol has gained recognition as a necessary step for a robust automated approach. The need for validation becomes even more pronounced in cases of proteins with higher structural complexity, where potentially larger errors generated at each step can propagate and accumulate in the process of structure calculation, thereby significantly degrading the efficacy of any software framework. This paper introduces a complete framework for protein structure determination with NMR-from data acquisition to the structure determination. The aim is twofold: to simplify the structure determination process for non-NMR experts whenever feasible, while maintaining flexibility by providing a set of modules that validate each step, and to enable the assessment of error propagations. This framework, called NMRFAM-SDF (NMRFAM-Structure Determination Framework), and its various components are available for download from the NMRFAM website (http://nmrfam.wisc.edu/software.htm).

A network of assembly factors is involved in remodeling rRNA elements during preribosome maturation

The Journal of cell biology, Jan 24, 2014

Eukaryotic ribosome biogenesis involves ∼200 assembly factors, but how these contribute to ribosome maturation is poorly understood. Here, we identify a network of factors on the nascent 60S subunit that actively remodels preribosome structure. At its hub is Rsa4, a direct substrate of the force-generating ATPase Rea1. We show that Rsa4 is connected to the central protuberance by binding to Rpl5 and to ribosomal RNA (rRNA) helix 89 of the nascent peptidyl transferase center (PTC) through Nsa2. Importantly, Nsa2 binds to helix 89 before relocation of helix 89 to the PTC. Structure-based mutations of these factors reveal the functional importance of their interactions for ribosome assembly. Thus, Rsa4 is held tightly in the preribosome and can serve as a "distribution box," transmitting remodeling energy from Rea1 into the developing ribosome. We suggest that a relay-like factor network coupled to a mechano-enzyme is strategically positioned to relocate rRNA elements during ...

Assignments of RNase A by ADAPT-NMR and enhancer

Biomolecular NMR Assignments, 2014

We report here backbone ¹H and ¹⁵N assignments for ribonuclease A obtained by using ADAPT-NMR, a fully automated approach for combined data collection, spectral analysis and resonance assignment. ADAPT-NMR was able to assign 98% of the resonances with 93% agreement with traditional data collection and assignment. Further refinement of the automated results with ADAPT-NMR Enhancer led to complete (100%) assignments with 96% agreement with assignments by the traditional approach.
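The abstract reports two figures of merit, assignment completeness and agreement with a reference assignment, without stating how they were computed. The following is a minimal sketch of one plausible way to compute them, not the authors' procedure. The dictionary layout, the function names, and the ppm tolerances are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# (residue number, atom name), e.g. (12, "N"). This layout is an assumption.
Key = Tuple[int, str]

# Hypothetical matching tolerances in ppm; these values are not from the paper.
TOLERANCE_PPM = {"H": 0.05, "N": 0.5}


def completeness(assigned: Dict[Key, float], expected: Set[Key]) -> float:
    """Fraction of the expected resonances that received an assignment."""
    if not expected:
        raise ValueError("expected set of resonances is empty")
    found = sum(1 for key in expected if key in assigned)
    return found / len(expected)


def agreement(auto: Dict[Key, float], reference: Dict[Key, float]) -> float:
    """Fraction of resonances present in both sets whose shifts match within tolerance."""
    shared = [key for key in auto if key in reference]
    if not shared:
        raise ValueError("no resonances are assigned in both sets")
    matches = sum(
        1 for key in shared
        if abs(auto[key] - reference[key]) <= TOLERANCE_PPM[key[1]]
    )
    return matches / len(shared)
```

With this definition, agreement is measured only over resonances that both methods assigned, so a method can score high agreement at low completeness; reporting both numbers together, as the abstract does, avoids that ambiguity.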

PONDEROSA-C/S: client–server based software package for automated protein 3D structure determination

Journal of Biomolecular NMR, 2014

Peak-picking Of Noe Data Enabled by Restriction Of Shift Assignments-Client Server (PONDEROSA-C/S) builds on the original PONDEROSA software (Lee et al. in

Uncovering symmetry-breaking vector and reliability order for assigning secondary structures of proteins from atomic NMR chemical shifts in amino acids

Journal of Biomolecular NMR, 2011

Unravelling the complex correlation between the chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, and ¹HN atoms in amino acids of proteins from NMR experiments and the local structural environments of those amino acids facilitates the assignment of secondary structures of proteins. This is an important impetus for both determining the three-dimensional structure and understanding the biological function of proteins. The previous empirical correlation scores, which relate chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, and ¹HN atoms to secondary structures, led to progress toward assigning secondary structures of proteins. However, the physical-mathematical framework for these was elusive, partly due to both the limited and orthogonal exploration of higher-dimensional chemical shifts of hetero-nuclei and the lack of physical-mathematical understanding underlying those correlation scores. Here we present a simple multi-dimensional hetero-nuclear chemical shift score function (MDHN-CSSF) which systematically captures the salient features of such complex correlations without any reference to a random coil state of proteins. We uncover the symmetry-breaking vector and its reliability order, not only for distinguishing different secondary structures of proteins but also for capturing the delicate interplay among chemical shifts of ¹³Cα, ¹³Cβ, ¹³C', ¹Hα, ¹⁵N, and ¹HN atoms simultaneously, which then provides a straightforward framework for assigning secondary structures of proteins. MDHN-CSSF could correctly assign secondary structures of training (validating) proteins with favourable (comparable) Q3 scores in comparison with those from the previous correlation scores. MDHN-CSSF provides a simple and robust strategy for the systematic assignment of secondary structures of proteins and would facilitate the de novo determination of three-dimensional structures of proteins.
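The Q3 score used above to compare methods is conventionally the percentage of residues whose three-state secondary structure label (helix, strand, coil) is predicted correctly. The sketch below shows that standard definition only; it does not implement MDHN-CSSF itself, and the one-letter labels H, E and C and the function name are choices made for this illustration.

```python
VALID_STATES = frozenset("HEC")  # helix, strand, coil (labels chosen for illustration)


def q3_score(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    if not (set(predicted) <= VALID_STATES and set(observed) <= VALID_STATES):
        raise ValueError("labels must be one of H, E or C")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)


# Seven of eight residues agree in this made-up example.
print(q3_score("HHHEECCC", "HHHEEECC"))  # prints 87.5
```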

Structural proteomics by NMR spectroscopy

Expert Review of Proteomics, 2008

Structural proteomics is a powerful research area in the postgenomic era, elucidating structure-function relationships of uncharacterized gene products based on the 3D protein structure. It proposes biochemical and cellular functions of unannotated proteins and thereby identifies potential drug design and protein engineering targets. Recently, a number of pioneering groups in structural proteomics research have achieved proof of structural proteomic theory by predicting the 3D structures of hypothetical proteins that successfully identified the biological functions of those proteins. The pioneering groups made use of a number of techniques, including NMR spectroscopy, which has been applied successfully to structural proteomics studies over the past 10 years. In addition, advances in hardware design, data acquisition methods, sample preparation and automation of data analysis have been developed and successfully applied to high-throughput structure determination techniques. These efforts ensure that NMR spectroscopy will become an important methodology for performing structural proteomics research on a genomic scale. NMR-based structural proteomics together with X-ray crystallography will provide a comprehensive structural database to predict the basic biological functions of hypothetical proteins identified by the genome projects.

ADAPT-NMR Enhancer: complete package for reduced dimensionality in protein NMR spectroscopy

Bioinformatics, 2013

ADAPT-nuclear magnetic resonance (ADAPT-NMR) offers an automated approach to the concurrent acquisition and processing of protein NMR data with the goal of complete backbone and side chain assignments. What the approach lacks is a useful graphical interface for reviewing results and for searching for missing peaks that may have prevented assignments or led to incorrect assignments. Because most of the data ADAPT-NMR collects are 2D tilted planes used to find peaks in 3D spectra, it would be helpful to have a tool that reconstructs the 3D spectra. The software package reported here, ADAPT-NMR Enhancer, supports the visualization of both 2D tilted planes and reconstructed 3D peaks on each tilted plane. ADAPT-NMR Enhancer can be used interactively with ADAPT-NMR to automatically assign selected peaks, or it can be used to produce PINE-SPARKY-like graphical dialogs that support atom-by-atom and peak-by-peak assignment strategies. Results can be exported in various formats, including XEASY proton file (.prot), PINE pre-assignment file (.str), PINE probabilistic output file, SPARKY peak list file (.list) and TALOS+ input file (.tab). As an example, we show how ADAPT-NMR Enhancer was used to extend the automated data collection and assignment results for the protein Aedes aegypti sterol carrier protein 2. Availability: The program, in the form of binary code along with tutorials and reference manuals, is available at http://pine.

PONDEROSA, an automated 3D-NOESY peak picking program, enables automated protein structure determination

Bioinformatics, 2011

PONDEROSA (Peak-picking Of Noe Data Enabled by Restriction of Shift Assignments) accepts input information consisting of a protein sequence, backbone and sidechain NMR resonance assignments, and 3D-NOESY (¹³C-edited and/or ¹⁵N-edited) spectra, and returns assignments of NOESY crosspeaks, distance and angle constraints, and a reliable NMR structure represented by a family of conformers. PONDEROSA incorporates and integrates external software packages (TALOS+, STRIDE and CYANA) to carry out different steps in the structure determination. PONDEROSA implements internal functions that identify and validate NOESY peak assignments and assess the quality of the calculated three-dimensional structure of the protein. The robustness of the analysis results from PONDEROSA's hierarchical processing steps that involve iterative interaction among the internal and external modules. PONDEROSA supports a variety of input formats: SPARKY assignment table (.shifts) and spectrum file formats (.ucsf), XEASY proton file format (.prot), and NMR-STAR format (.star). To demonstrate the utility of PONDEROSA, we used the package to determine 3D structures of two proteins: human ubiquitin and Escherichia coli iron-sulfur scaffold protein variant IscU(D39A). The automatically generated structural constraints and ensembles of conformers were as good as or better than those determined previously by much less automated means. Availability: The program, in the form of binary code along with tutorials and reference manuals, is available at

PINE-SPARKY: graphical interface for evaluating automated probabilistic peak assignments in protein NMR spectroscopy

Bioinformatics, 2009

PINE-SPARKY supports the rapid, user-friendly and efficient visualization of probabilistic assignments of NMR chemical shifts to specific atoms in the covalent structure of a protein in the context of experimental NMR spectra. PINE-SPARKY is based on the very popular SPARKY package for visualizing multidimensional NMR spectra (T.

Solution Structure of the 2A Protease from a Common Cold Agent, Human Rhinovirus C2, Strain W12

PLoS ONE, 2014

Human rhinovirus strains differ greatly in their virulence, and this has been correlated with the differing substrate specificity of the respective 2A protease (2Apro). Rhinoviruses use their 2Apro to cleave a spectrum of cellular proteins important to virus replication and anti-host activities. These enzymes share a chymotrypsin-like fold stabilized by a tetra-coordinated zinc ion. The catalytic triad consists of conserved Cys (C105), His (H34), and Asp (D18) residues. We used a semi-automated NMR protocol developed at NMRFAM to determine the solution structure of 2Apro (C105A variant) from an isolate of the clinically important rhinovirus C species (RV-C). The backbone of C2 2Apro superimposed closely (1.41-1.81 Å rmsd) with those of orthologs from RV-A2, coxsackie B4 (CB4), and enterovirus 71 (EV71) having sequence identities between 40% and 60%. Comparison of the structures suggests that the differential functional properties of C2 2Apro stem from its unique surface charge, high proportion of surface aromatics, and sequence surrounding the di-tyrosine flap. Citation: Lee W, Watters KE, Troupis AT, Reinen NM, Suchy FP, et al. (2014) Solution Structure of the 2A Protease from a Common Cold Agent, Human Rhinovirus C2, Strain W12. PLoS ONE 9(6): e97198.

PACSY database, a relational database management system for Protein structure and nuclear Magnetic Resonance chemical shift analysis

2012 IEEE International Conference on Bioinformatics and Biomedicine Workshops, 2012

We developed a new relational database management system called PACSY (Protein structure And Chemical Shift NMR spectroscopY) by integrating information from the Protein Data Bank (PDB), the Biological Magnetic Resonance Data Bank (BMRB), and the Structural Classification of Proteins (SCOP) database. PACSY offers valuable information for structural investigations such as three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. The database can be installed on an RDBMS server such as MySQL or PostgreSQL, which supports advanced search functions through database queries. PACSY enables users to search for combinations of information from different database sources in support of their research. PACSY, along with two associated software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, is available from http://pacsy.nmrfam.wisc.edu.

NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy

Bioinformatics, 2014

SPARKY (Goddard and Kneller, SPARKY 3) remains the most popular software program for NMR data analysis, despite the fact that development of the package by its originators ceased in 2001. We have taken over the development of this package and describe NMRFAM-SPARKY, which implements new functions reflecting advances in the biomolecular NMR field. NMRFAM-SPARKY has been repackaged with current versions of Python and Tcl/Tk, which support new tools for NMR peak simulation and graphical assignment determination. These tools, along with chemical shift predictions from the PACSY database, greatly accelerate protein side chain assignments. NMRFAM-SPARKY supports automated data format interconversion for interfacing with a variety of web servers, including PECAN, PINE, TALOS-N, CS-Rosetta, SHIFTX2 and PONDEROSA-C/S. Availability and implementation: The software package, along with binary and source codes, if desired, can be downloaded freely from http://pine.nmrfam.wisc.edu/download_packages.html. Instruction manuals and video tutorials can be found at http://www.nmrfam.wisc.edu/nmrfamsparky-distribution.htm.

Fast automated protein NMR data collection and assignment by ADAPT-NMR on Bruker spectrometers

Journal of Magnetic Resonance, 2013

ADAPT-NMR (Assignment-directed Data collection Algorithm utilizing a Probabilistic Toolkit in NMR) supports automated NMR data collection and backbone and side chain assignment for [U-¹³C, U-¹⁵N]-labeled proteins. Given the sequence of the protein and data for the orthogonal 2D ¹H-¹⁵N and ¹H-¹³C planes, the algorithm automatically directs the collection of tilted plane data from a variety of triple-resonance experiments so as to follow an efficient pathway toward the probabilistic assignment of ¹H, ¹³C, and ¹⁵N signals to specific atoms in the covalent structure of the protein. Data collection and assignment calculations continue until the addition of new data no longer improves the assignment score. ADAPT-NMR was first implemented on Varian (Agilent) spectrometers [PLoS One 7 (2012) e33173]. Because of broader interest in the approach, we present here a version of ADAPT-NMR for Bruker spectrometers. We have developed two AU console programs (ADAPT_ORTHO_run and ADAPT_NMR_run) that run under TOPSPIN Versions 3.0 and higher. To illustrate the performance of the algorithm on a Bruker spectrometer, we tested one protein, chlorella ubiquitin (76 amino acid residues), that had been used with the Varian version: the Bruker and Varian versions achieved the same level of assignment completeness (98% in 20 h). As a more rigorous evaluation of the Bruker version, we tested a larger protein, BRPF1 bromodomain (114 amino acid residues), which yielded an automated assignment completeness of 86% in 55 h. Both experiments were carried out on a 500 MHz Bruker AVANCE III spectrometer equipped with a z-gradient 5 mm TCI probe. ADAPT-NMR is available at http://pine.nmrfam.wisc.edu/ADAPT-NMR in the form of pulse programs, the two AU programs, and instructions for installation and use.
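The stopping rule described above (keep collecting until new data no longer improves the assignment score) can be sketched as a simple loop. This is a schematic illustration only, not ADAPT-NMR's actual control logic: the function name, the `score` callback, the `min_gain` threshold, and the plane labels are all hypothetical, and the real algorithm also chooses which plane to collect next, which this sketch does not model.

```python
from typing import Callable, Iterable, List, Tuple


def collect_until_converged(
    planes: Iterable[str],
    score: Callable[[List[str]], float],
    min_gain: float = 0.0,
) -> Tuple[List[str], float]:
    """Acquire planes one at a time and stop once the score stops improving.

    `planes` is a hypothetical ordered queue of tilted-plane labels, and
    `score` is a hypothetical callback returning the assignment score for
    all data acquired so far.
    """
    acquired: List[str] = []
    best = float("-inf")
    for plane in planes:
        acquired.append(plane)          # stand-in for running the experiment
        current = score(acquired)       # stand-in for the assignment calculation
        if current - best <= min_gain:  # the new data did not improve the score
            break
        best = current
    return acquired, best
```

Note that the last plane acquired is the one that revealed convergence, so it remains in the returned list even though it did not raise the score.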

PACSY, a relational database management system for protein structure and chemical shift analysis

Journal of Biomolecular NMR, 2012

PACSY (Protein structure And Chemical Shift NMR spectroscopY) is a relational database management system that integrates information from the Protein Data Bank, the Biological Magnetic Resonance Data Bank, and the Structural Classification of Proteins database. PACSY provides three-dimensional coordinates and chemical shifts of atoms along with derived information such as torsion angles, solvent accessible surface areas, and hydrophobicity scales. PACSY consists of six relational table types linked to one another for coherence by key identification numbers. Database queries are enabled by advanced search functions supported by an RDBMS server such as MySQL or PostgreSQL. PACSY enables users to search for combinations of information from different database sources in support of their research. Two software packages, PACSY Maker for database creation and PACSY Analyzer for database analysis, are available from http://pacsy.nmrfam.wisc.edu.
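The idea of tables linked by key identification numbers can be illustrated with a small join query. The sketch below uses Python's built-in `sqlite3` module as a stand-in for a MySQL or PostgreSQL server. The table names, column names, and row values are invented for this example and do not reflect PACSY's actual schema or data.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with two hypothetical key-linked tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE structure (
            key_id  INTEGER PRIMARY KEY,  -- shared key linking the tables
            residue TEXT,
            phi     REAL,
            psi     REAL
        );
        CREATE TABLE chem_shift (
            key_id    INTEGER REFERENCES structure(key_id),
            atom      TEXT,
            shift_ppm REAL
        );
        """
    )
    # Placeholder rows made up for the example, not real PACSY entries.
    conn.executemany(
        "INSERT INTO structure VALUES (?, ?, ?, ?)",
        [(1, "ALA", -60.0, -45.0), (2, "VAL", -120.0, 130.0)],
    )
    conn.executemany(
        "INSERT INTO chem_shift VALUES (?, ?, ?)",
        [(1, "CA", 54.8), (1, "CB", 18.3), (2, "CA", 61.0)],
    )
    return conn


def ca_shifts_with_torsions(conn: sqlite3.Connection) -> list:
    """Combine torsion angles and CA chemical shifts through the shared key."""
    query = """
        SELECT s.residue, s.phi, s.psi, c.shift_ppm
        FROM structure AS s
        JOIN chem_shift AS c ON c.key_id = s.key_id
        WHERE c.atom = 'CA'
        ORDER BY s.key_id
    """
    return conn.execute(query).fetchall()
```

A query of this shape is what lets a user combine structural and chemical shift information from different source databases in a single search.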