Yang Shen - Academia.edu
Papers by Yang Shen
Proceedings of the National Academy of Sciences, 2008
Protein NMR chemical shifts are highly sensitive to local structure. A robust protocol is described that exploits this relation for de novo protein structure generation, using as input experimental parameters the 13Cα, 13Cβ, 13C′, 15N, 1Hα and 1HN NMR chemical shifts. These shifts are generally available at the early stage of the traditional NMR structure determination process, before the collection and analysis of structural restraints. The chemical-shift-based structure determination protocol uses an empirically optimized procedure to select protein fragments from the Protein Data Bank, in conjunction with the standard ROSETTA Monte Carlo assembly and relaxation methods. Evaluation of 16 proteins, varying in size from 56 to 129 residues, yielded full-atom models that have 0.7-1.8 Å root mean square deviations for the backbone atoms relative to the experimentally determined X-ray or NMR structures. The strategy also has been successfully applied in a blind manner to nine protein targets with molecular masses up to 15.4 kDa, whose conventional NMR structure determination was conducted in parallel by the Northeast Structural Genomics Consortium. This protocol potentially provides a new direction for high-throughput NMR structure determination.
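The backbone root mean square deviation quoted above can be illustrated with a minimal sketch. It assumes the model and reference coordinates are already superimposed and listed in the same atom order; the function name `backbone_rmsd` is hypothetical and is not taken from the paper or from ROSETTA.

```python
import math

def backbone_rmsd(model, reference):
    """RMSD in the input units (e.g. Å) between two equally long lists of
    (x, y, z) backbone atom coordinates.

    Assumption: both structures are already optimally superimposed;
    no fitting or alignment is performed here.
    """
    if not model or len(model) != len(reference):
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared distances between corresponding atoms.
    squared = sum(
        (a - b) ** 2
        for p, q in zip(model, reference)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(model))
```

In practice a superposition step (for example a least-squares fit) would precede this calculation; it is left out here to keep the sketch short.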
Proceedings of the National Academy of Sciences, 2005
A standardized protocol enabling rapid NMR data collection for high-quality protein structure determination is presented that allows one to capitalize on high spectrometer sensitivity: a set of five G-matrix Fourier transform NMR experiments for resonance assignment based on highly resolved 4D and 5D spectral information is acquired in conjunction with a single simultaneous 3D 15N, 13C aliphatic, 13C aromatic-resolved [1H,1H]-NOESY spectrum providing 1H-1H upper distance limit constraints. The protocol was integrated with methodology for semiautomated data analysis and used to solve eight NMR protein structures of the Northeast Structural Genomics Consortium pipeline. The molecular masses of the hypothetical target proteins ranged from 9 to 20 kDa with an average of ≈14 kDa. Between 1 and 9 days of instrument time were invested per structure, which is less than ≈10-25% of the measurement time routinely required to date with conventional approaches. The protocol presented here effectively removes data collection as a bottleneck for high-throughput solution structure determination of proteins up to at least ≈20 kDa, while concurrently providing spectra that are highly amenable to fast and robust analysis.
Journal of Biomolecular NMR, 2007
Chemical shifts of nuclei in or attached to a protein backbone are exquisitely sensitive to their local environment. A computer program, SPARTA, is described that uses this correlation with local structure to predict protein backbone chemical shifts, given an input three-dimensional structure, by searching a newly generated database for triplets of adjacent residues that provide the best match in ϕ/ψ/χ1 torsion angles and sequence similarity to the query triplet of interest. The database contains 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C′ chemical shifts for 200 proteins for which a high resolution X-ray (≤2.4 Å) structure is available. The relative importance of the weighting factors for the ϕ/ψ/χ1 angles and sequence similarity was optimized empirically. The weighted, average secondary shifts of the central residues in the 20 best-matching triplets, after inclusion of nearest neighbor, ring current, and hydrogen bonding effects, are used to predict chemical shifts for the protein of known structure. Validation shows good agreement between the SPARTA-predicted and experimental shifts, with standard deviations of 2.52, 0.51, 0.27, 0.98, 1.07 and 1.08 ppm for 15N, 1HN, 1Hα, 13Cα, 13Cβ and 13C′, respectively, including outliers.
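The triplet search and weighted averaging described above might be sketched as follows. This is an illustration only, not SPARTA's actual scoring function: the weights `w_angle` and `w_seq`, the mismatch penalty, the inverse-score weighting and the dictionary layout of the database are all assumptions, and the nearest neighbor, ring current and hydrogen bonding corrections are omitted.

```python
def angle_diff(a, b):
    """Smallest absolute difference between two angles in degrees (wraps at 360)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def triplet_score(query, candidate, w_angle=1.0, w_seq=50.0):
    """Mismatch score between two residue triplets; lower means more similar.

    Each triplet is a list of three (residue_type, (phi, psi, chi1)) tuples.
    w_angle and w_seq are hypothetical weights, not published values.
    """
    score = 0.0
    for (q_res, q_angles), (c_res, c_angles) in zip(query, candidate["triplet"]):
        # Squared torsion-angle deviation for phi, psi and chi1.
        score += w_angle * sum(angle_diff(qa, ca) ** 2 for qa, ca in zip(q_angles, c_angles))
        # Flat penalty when the residue types differ (stand-in for sequence similarity).
        if q_res != c_res:
            score += w_seq
    return score

def predict_secondary_shift(query, database, n_best=20):
    """Weighted average of the central-residue secondary shifts (ppm)
    over the n_best best-matching database triplets."""
    ranked = sorted(database, key=lambda c: triplet_score(query, c))[:n_best]
    # Assumed weighting: better matches (lower score) contribute more.
    weights = [1.0 / (1.0 + triplet_score(query, c)) for c in ranked]
    return sum(w * c["shift"] for w, c in zip(weights, ranked)) / sum(weights)
```

Each database entry here is a dictionary with a `"triplet"` key and a `"shift"` key holding the central residue's secondary shift for one nucleus; a fuller version would keep one such value per nucleus type.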
Journal of Biomolecular NMR, 2009
NMR chemical shifts in proteins depend strongly on local structure. The program TALOS establishes an empirical relation between 13C, 15N and 1H chemical shifts and backbone torsion angles φ and ψ (G. Cornilescu et al. J. Biomol. NMR. 13, 289-302, 1999). Extension of the original 20-protein database to 200 proteins increased the fraction of residues for which backbone angles could be predicted from 65 to 74%, while reducing the error rate from 3 to 2.5%. Addition of a two-layer neural network filter to the database fragment selection process forms the basis for a new program, TALOS+, which further enhances the prediction rate to 88.5%, without increasing the error rate. Excluding the 2.5% of residues for which TALOS makes predictions that strongly differ from those observed in the crystalline state, the accuracy of predicted φ and ψ angles equals ±13°. Large discrepancies between predictions and crystal structures are primarily limited to loop regions, and for the few cases where multiple X-ray structures are available such residues are often found in different states in the different structures. The TALOS+ output includes predictions for individual residues with missing chemical shifts, and the neural network component of the program also predicts secondary structure with good accuracy.