Proteomics of the Chloroplast: Systematic Identification and Targeting Analysis of Lumenal and Peripheral Thylakoid Proteins

Plant Cell. 2000 Mar; 12(3): 319–342.


Jean-Benoît Peltier

aDepartment of Biochemistry, Arrhenius Laboratories, Stockholm University, S-10691 Stockholm, Sweden

Giulia Friso

bDepartment of Cellular and Molecular Pharmacology, AstraZeneca Novum, S-14157 Huddinge, Sweden

Dário Eluan Kalume

cDepartment of Molecular Biology, Odense University, DK-5230 Odense M, Denmark

Peter Roepstorff

cDepartment of Molecular Biology, Odense University, DK-5230 Odense M, Denmark

Frederik Nilsson

dDepartment of Bioanalytical Chemistry, AstraZeneca R&D Mölndal, S-43183 Mölndal, Sweden

Iwona Adamska

aDepartment of Biochemistry, Arrhenius Laboratories, Stockholm University, S-10691 Stockholm, Sweden

Klaas J. van Wijk

aDepartment of Biochemistry, Arrhenius Laboratories, Stockholm University, S-10691 Stockholm, Sweden

1To whom correspondence should be addressed. E-mail klaas@biokemi.su.se; fax 46-8-153679

Received 1999 Oct 15; Accepted 1999 Dec 23.

Abstract

The soluble and peripheral proteins in the thylakoids of pea were systematically analyzed by using two-dimensional electrophoresis, mass spectrometry, and N-terminal Edman sequencing, followed by database searching. After correcting to eliminate possible isoforms and post-translational modifications, we estimated that there are at least 200 to 230 different lumenal and peripheral proteins. Sixty-one proteins were identified; for 33 of these proteins, a clear function or functional domain could be identified, whereas for 10 proteins, no function could be assigned. For 18 proteins, no expressed sequence tag or full-length gene could be identified in the databases, despite experimental determination of a significant amount of amino acid sequence. Nine previously unidentified proteins with lumenal transit peptides are presented along with their full-length genes; seven of these proteins possess the twin arginine motif that is characteristic for substrates of the TAT pathway. Logoplots were used to provide a detailed analysis of the lumenal targeting signals, and all nuclear-encoded proteins identified on the two-dimensional gels were used to test predictions for chloroplast localization and transit peptides made by the software programs ChloroP, PSORT, and SignalP. A combination of these three programs was found to provide a useful tool for evaluating chloroplast localization and transit peptides and also could reveal possible alternative processing sites and dual targeting. The potential of proteomics for plant biology and homology-based searching with mass spectrometry data is discussed.

INTRODUCTION

Chloroplasts in green algae and higher plants contain photosynthetic thylakoid membranes. Four multisubunit protein complexes, photosystem I (PSI), PSII, the ATP-synthase complex, and the cytochrome b6f complex, which together comprise 75 to 100 proteins, perform the photosynthetic reactions (see, e.g., Ort and Yocum, 1996). It can be postulated that the thylakoid membranes contain a large number of other proteins that are involved in the biogenesis and regulation of the four multisubunit complexes (Wollman et al., 1999). These additional proteins would be involved in processes such as biosynthesis and ligation of cofactors, insertion of proteins into membranes, and folding and degradation of proteins. Several such proteins have been identified in the thylakoid membrane. They include the proteases DegP and FtsH (reviewed in Adam, 1996); protein translocation components, such as SecY, SecE, SecA, Alb3, and Hcf106 (reviewed in Settles and Martienssen, 1998; Dalbey and Robinson, 1999; Keegstra and Cline, 1999); the PSII assembly factor Hcf136 (Meurer et al., 1998); and a lumenal isomerase TLP40 (Fulgosi et al., 1998).

To regulate biogenesis and several thylakoid functions, kinases (Snyders and Kohorn, 1999), phosphatases (Vener et al., 1998), and possibly other signal transducers are present in the thylakoid membrane. Several proteins without obvious function have been identified in the intrathylakoid space, the thylakoid lumen (Kieselbach et al., 1998). Based on preliminary information and postulated functions, at least 100 proteins involved in such processes probably exist, and many of these proteins are likely to be present in low abundance in chloroplasts.

Although best known for their role in photosynthesis, chloroplasts also synthesize many essential compounds, including plant hormones (Marin et al., 1996; Lange, 1998), fatty acids and lipids (Miquel and Browse, 1992; Essigmann et al., 1998), amino acids (Ho et al., 1999), vitamins (B1, K1, and E; Belanger et al., 1995), purine and pyrimidine nucleotides (Doremus, 1986; Smith et al., 1998), and secondary metabolites such as alkaloids and isoprenoids (Keller et al., 1998). In addition, chloroplasts also are required for nitrate and sulfur assimilation (Heldt, 1997). Several of the enzymes in these biosynthetic pathways have been identified by analyzing mutants of Arabidopsis and other plant species. However, more proteins and possible new pathways remain to be discovered, and it is not unlikely that several of these components are located in or at the surface of the thylakoid membrane.

The 120- to 160-kb circular chloroplast genome encodes only ∼120 proteins and RNA molecules that are involved in chloroplast transcription and translation or encode subunits of the four complexes involved in photosynthesis or the NADH dehydrogenase complex (Sugita and Sugiura, 1996). It is estimated, however, that the chloroplast contains between 2000 and 5000 different proteins; thus, the majority of chloroplast proteins are encoded by the nuclear genome. Those nuclear-encoded proteins are synthesized as precursors on cytosolic ribosomes and subsequently are targeted to the chloroplast via an N-terminal transit peptide, which is proteolytically removed after import into the chloroplast (reviewed in Keegstra and Cline, 1999). Once inside the chloroplast, at least four pathways operate to target the proteins to the thylakoid membrane or into the thylakoid lumen (Dalbey and Robinson, 1999; Keegstra and Cline, 1999). The presequences of these nuclear-encoded chloroplast proteins share common features, which can be used to predict localization with moderate confidence (Emanuelsson et al., 1999; Nakai and Horton, 1999). However, the degeneracy of these targeting sequences precludes a systematic polymerase chain reaction–based screening for all chloroplast-localized proteins.

The improvement of two-dimensional electrophoresis by the development of immobilized pH gradients (IPGs; Görg et al., 1988) together with improved solubilization techniques (Rabilloud et al., 1997; Molloy et al., 1998) now permits the reproducible separation of up to 2000 proteins on a single two-dimensional electrophoresis gel. Such gel-separated proteins can be identified rapidly by mass spectrometry (MS), and if genomic information is also available, such analyses permit the systematic identification of the protein complement of a genome, the proteome (Shevchenko et al., 1996; Dainese et al., 1997; Roepstorff, 1997; Yates, 1998). In addition, MS is a powerful tool for analysis of isoforms, secondary modifications of proteins (such as glycosylation, phosphorylation, or isoprenylation), and proteolysis requiring only low amounts (picomoles to attomoles) of proteins (Burlingame et al., 1998; Kuster and Mann, 1998; McLafferty et al., 1999; Wilkins et al., 1999). Such systematic analysis of protein populations is summarized by the term proteomics. Thus, proteomics bridges the gap between genomic sequence information and the actual protein population in a specific tissue, cell, or cellular compartment.

To identify novel components involved in thylakoid biogenesis, we have started to identify systematically the thylakoid proteins by two-dimensional electrophoresis, matrix-assisted laser desorption/ionization–time of flight (MALDI-TOF) MS, electrospray ionization tandem MS (ESI-MS/MS), and N-terminal Edman sequencing. MS has been used previously to analyze purified PSII complexes (Whitelegge et al., 1998; Zheleva et al., 1998), but no systematic analysis of chloroplast proteins has been conducted. In this study, we present detailed reproducible two-dimensional electrophoresis maps of the lumenal and peripheral proteins of the thylakoids from pea. Functional domain analysis was performed for the newly discovered proteins or translated open reading frames by different programs available on the Internet. Localization prediction and presequences were analyzed by using the (noncommercial) software programs PSORT (Nakai and Horton, 1999), ChloroP (Emanuelsson et al., 1999), and SignalP (Nielsen et al., 1997). After correction for spots resulting from post-translational modifications and proteolysis, we estimated that there are in total at least 200 proteins in the lumenal space and in the periphery of the thylakoid membrane.

RESULTS

Isolation of Protein Populations and Reproducibility of Preparations and Two-Dimensional Electrophoresis Gels

Two-dimensional electrophoresis maps were constructed for lumenal and peripheral proteins of the thylakoid membrane. To avoid cross-contamination with nonchloroplast proteins, we made all preparations from intact pea chloroplasts that had been purified on linear Percoll gradients. Thylakoids then were liberated from the intact chloroplasts by osmotic shock and carefully washed to remove stromal proteins (Figures 1A and 1B). The lumenal proteins subsequently were liberated from the thylakoids by sonication, and the peripheral proteins were extracted from the thylakoid membranes by incubation under high-salt conditions (Figure 1A). The enrichment for lumenal plastocyanin, two proteins of the oxygen evolving complex (OEC33 and OEC23), and the peripheral coupling factor protein CF1α, as well as the partitioning of the stromal ribulose bisphosphate carboxylase small subunit (RbcS) and the abundant integral membrane protein light-harvesting complex IIb (LhcIIb), was verified by protein gel blot analysis (Figure 1B). Clearly, LhcIIb and RbcS partitioned away from the lumen (Figure 1B, lane 6) and peripheral proteins (Figure 1B, lane 8), because no immunoresponse was found in either of those fractions (even after overexposure of the blots), indicating that contamination of the lumenal and peripheral fractions by these proteins was <1%.

Protein Gel Blot Analysis and Scheme of the Protein Purification Process.

(A) To purify thylakoid protein fractions enriched in lumenal and peripheral proteins, intact and purified chloroplasts (1) were lysed and separated into crude thylakoids (2) and soluble stromal proteins and envelope proteins (3). Subsequently, the thylakoids were washed extensively to remove stromal proteins and envelopes. These washed thylakoids (4) then were sonicated to liberate soluble lumenal proteins (6), and the sonicated thylakoid membranes (5) were collected by centrifugation. The sonicated thylakoid membranes were incubated in high salt to liberate the peripheral thylakoid proteins (8), and the remaining membranes, containing only integral membrane proteins (7), were removed by centrifugation. Fractions 6 and 8 were used for the proteomics analysis.

(B) The partitioning of a set of stromal, peripheral, integral, and lumenal proteins during the purification of the lumenal and peripheral protein-enriched fractions was followed by protein gel blotting. All samples were loaded on an equal volume basis. Polyclonal antisera were used against the ribulose bisphosphate carboxylase small subunit (RbcS), which is one of the most abundant stromal proteins, peripheral protein CF1α on the stromal side, the integral membrane protein LhcIIb, two extrinsic subunits of the oxygen-evolving complex (OEC33 and OEC23) of PSII on the lumenal side of the membrane, and the soluble lumenal protein plastocyanin. The lane numbers in the protein gel blots correspond to the numbers in the purification scheme shown in (A).

The peripheral protein CF1α was found in both the lumenal and the peripheral protein fractions. CF1α is a peripheral protein located at the stromal side of the thylakoid membrane; thus, sonication released a substantial fraction of this protein from the membrane into the lumenal fraction (see Discussion). The lumenal protein plastocyanin partitioned completely to the lumen, as detected by protein gel blot analysis, indicating that any plastocyanin in the peripheral fractions represents <1% of total plastocyanin content. The peripheral proteins OEC33 and OEC23 at the lumenal side were found in both the lumenal fraction and the peripheral fraction; however, as expected from the structure of the OEC complex, OEC33 partitioned more to the peripheral protein population, whereas OEC23 partitioned more to the lumen. The sonication and extraction completely removed the peripheral and lumenal marker proteins from the remaining integral membrane protein fraction (Figure 1B, lane 7).

To improve the resolution of the two-dimensional electrophoresis maps, we made separate maps for the low pH range (4.0 to 7.0) and the high pH range (6.0 to 11.0). Subfractionation of the thylakoid proteins helped to improve the resolution of these maps because the different physical-chemical properties of soluble versus peripheral membrane proteins necessitate the use of different detergent mixtures for optimal separation and migration in the first dimension (Rabilloud et al., 1997; Herbert, 1999). The subfractionation and the use of separate IPGs for low and high pH ranges also tend to increase the probability that proteins of low abundance will be identified, especially if the protein population is dominated by a number of very abundant proteins, as is the case for the thylakoid membrane.

The two-dimensional electrophoresis maps of the lumenal and peripheral proteins separated in the first dimension between pH 4.0 to 7.0 (acidic map) and pH 7.0 to 11.0 (basic map) are shown in Figures 2A to 2D. For the basic maps, only the region from pH 7.0 to 11.0 is shown, to avoid overlap with the acidic maps, although the IPG strips were from 6.0 to 11.0. Between 360 and 400 protein spots were detected by silver staining (detection limit below 1 ng) in each acidic map, and ∼50 protein spots were detected in each basic map. Computer-aided image analysis indicated a 39% overlap between the two acidic two-dimensional electrophoresis maps and a 75% overlap between the two basic two-dimensional electrophoresis maps. The maps were made in triplicate from independent chloroplast preparations by using different batches of pea leaves, and we observed an excellent reproducibility (data not shown).

Silver-Stained Two-Dimensional Electrophoresis Maps of Lumenal and Peripheral Proteins.

Proteins were separated by two-dimensional gel electrophoresis with denaturing isoelectric focusing in the first dimension and SDS-PAGE in the second dimension. Gels were calibrated for molecular mass (in kilodaltons) and pI (in pH units) by internal (pH and mass) and external (mass) standards, which are indicated. Numbers indicate protein spots listed in Tables 1 to 4. For a selected number of spots, the identity (in addition to the number) has been listed on the two-dimensional electrophoresis map. Spots on the acidic maps (pH 4.0 to 7.0) for the peripheral and the lumenal proteins are, respectively, numbered from 1 to 99 and 100 to 199. Spots on the basic maps (pH 7.0 to 11.0) of the peripheral and lumenal proteins are numbered from 200 to 249 and 250 to 300, respectively. The same numbers are used on the lumenal and peripheral maps if spots could be matched by image analysis and confirmed by mass fingerprints or sequence tags. If proteins were identified on both maps, the number was chosen for the map in which the spot was most abundant.

(A) and (C) Peripheral proteins were separated in the first dimension on IPGs between pH 4.0 and 7.0 (A) and between pH 7.0 and 11.0 (C).

(B) and (D) Lumenal proteins were separated in the first dimension on IPGs between 4.0 and 7.0 (B) and between 7.0 and 11.0 (D).

Strategies for Identification of Peripheral and Lumenal Proteins

The principles of our proteomics approach are shown schematically in Figure 3. After two-dimensional electrophoresis, protein spots were selected and analyzed by MALDI-TOF. To denote a protein as unambiguously identified, we set the following criteria (Parker et al., 1998): coverage of the mature protein (i.e., excluding cleavable presequences) by the matching peptides must reach a minimum of 15%, and at least four independent (i.e., with no sequence overlap) peptides should match, within a stringent 15 ppm maximum deviation of mass accuracy (thus, maximum 0.0015% difference between the experimental and theoretical mass of the MALDI peptides). If a protein could not be identified unambiguously by using MALDI-TOF, peptide sequence tags were obtained by ESI-MS/MS or Edman sequencing and were used for protein identification. Three to five precursor ions from each sample were selected for ESI-MS/MS analysis. The peptide masses and obtained sequence tags were used to search the public databases with the freely available software program MS-Tag, developed at the University of California, San Francisco MS Facility (http://prospector.ucsf.edu), and with FASTA. To obtain sequence tags by Edman sequencing, we stained gels with Coomassie Brilliant Blue R 250 before blotting to increase the staining sensitivity and to facilitate matching with other gels. If information about the length of cleavable transit peptides was obtained, the theoretical pI and molecular masses were calculated after removal of the presequences and compared with the experimental pI and molecular masses on the two-dimensional electrophoresis maps.
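The identification criteria above can be expressed as a small check. The following Python sketch is illustrative only and is not the software used in this study; the names `PeptideMatch`, `ppm_error`, and `is_unambiguous` are hypothetical, and coverage is assumed here to be the fraction of mature-protein residues covered by accepted peptides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeptideMatch:
    start: int               # 1-based first residue in the mature protein
    end: int                 # last residue (inclusive)
    measured_mass: float     # experimental peptide mass
    theoretical_mass: float  # mass predicted from the database sequence


def ppm_error(measured: float, theoretical: float) -> float:
    """Absolute mass deviation in parts per million (15 ppm = 0.0015%)."""
    return abs(measured - theoretical) / theoretical * 1e6


def is_unambiguous(matches, mature_length, max_ppm=15.0,
                   min_coverage=15.0, min_peptides=4) -> bool:
    """Apply the three criteria: mass accuracy, independent peptides, coverage."""
    accepted = [m for m in matches
                if ppm_error(m.measured_mass, m.theoretical_mass) <= max_ppm]

    # Count independent peptides: keep a peptide only if it does not
    # overlap the previously kept one (greedy selection by end position).
    independent = 0
    last_end = 0
    for m in sorted(accepted, key=lambda m: m.end):
        if m.start > last_end:
            independent += 1
            last_end = m.end

    # Coverage: percentage of mature-protein residues hit by any accepted peptide.
    covered = set()
    for m in accepted:
        covered.update(range(m.start, m.end + 1))
    coverage = 100.0 * len(covered) / mature_length

    return independent >= min_peptides and coverage >= min_coverage
```

Passing `max_ppm=50.0` gives the relaxed 50-ppm setting that also appears in the tables; the other thresholds are keyword arguments so that they can be varied in the same way.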

Schematic Explanation of the Proteomics Strategy for Systematic Analysis of the Lumenal and Peripheral Thylakoid Proteins.

The proteins were separated according to their isoelectric point (pI) and then according to their molecular mass, resulting in a two-dimensional gel. The spots were then visualized by Coomassie blue or silver staining, and the gels were scanned for image analysis. Individual protein spots then were selected (exemplified by the encircled spot), excised from the gel, and digested with the site-specific protease trypsin (cleavage C-terminal of either a K residue or an R residue), resulting in a set of tryptic peptides. The peptides were extracted, and their masses were measured by MALDI-TOF MS. The list of measured peptide masses was compared with the masses of the predicted tryptic peptides for each entry in the sequence databases (NCBI, SWISS-Prot, and PIR). Multiple search rounds were performed as described in Methods. In case the protein was not unambiguously identified by MALDI-TOF MS, peptide sequence tags were obtained by ESI-MS/MS or Edman sequencing. The peptide masses and obtained sequence tags were used to search the public databases with the program MS-Tag and FASTA. To obtain sequence tags by Edman sequencing, we stained gels with Coomassie blue before blotting to increase the sensitivity and to allow easier matching of the gels. Spots containing 10 to 15 pmol or more were selected.
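The comparison of measured masses with predicted tryptic peptides can be sketched as follows. This is a minimal illustration, not the search software used here: it cleaves C-terminal of every K or R as described above, allows no missed cleavages, and ignores the common refinement that trypsin usually does not cleave before proline. The function names and the example sequence are hypothetical; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # added once per peptide (free N and C termini)
PROTON = 1.007276   # singly protonated ion, (M+H)+


def tryptic_digest(sequence: str) -> list[str]:
    """Cleave after every K or R; no missed cleavages are generated."""
    peptides, current = [], []
    for residue in sequence:
        current.append(residue)
        if residue in "KR":
            peptides.append("".join(current))
            current = []
    if current:  # C-terminal peptide that does not end in K or R
        peptides.append("".join(current))
    return peptides


def mh_plus(peptide: str) -> float:
    """Monoisotopic (M+H)+ mass of an unmodified peptide."""
    return sum(MONO[r] for r in peptide) + WATER + PROTON


def predicted_masses(sequence: str) -> dict[str, float]:
    """Map each predicted tryptic peptide to its (M+H)+ mass."""
    return {p: mh_plus(p) for p in tryptic_digest(sequence)}


# Hypothetical example sequence assembled only for illustration.
example = predicted_masses("AFVSSAAAFEKGYLKDWER")
```

Each measured mass from a spot would then be compared with such a list for every database entry, within the chosen ppm tolerance.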

General Comments on the Two-Dimensional Electrophoresis Maps

Four hundred spots were analyzed by using MALDI-TOF, 20 of which were further analyzed by ESI-MS/MS and 55 of which were analyzed by N-terminal Edman sequencing. It is likely that none of the spots analyzed by Edman sequencing was blocked at the N terminus (see Discussion). The protein spots on the maps (Figure 2) were only numbered if the protein was identified or if sufficient amino acid sequence tags were obtained to identify the corresponding gene if it were present in the public databases. Unidentified proteins that were analyzed only by MALDI-TOF are not numbered.

Information about these numbered spots is summarized in Tables 1 to 4. Spots from the acidic maps (pH 4.0 to 7.0) of peripheral and lumenal proteins are numbered 1 to 99 (Figure 2A) and 100 to 199 (Figure 2B), respectively. Spots from the basic maps (pH 7.0 to 11.0) of peripheral (Figure 2C) and lumenal proteins (Figure 2D) are numbered from 200 to 249 and 250 to 300, respectively. The same numbers are used on the lumenal and peripheral maps if spots could be matched by image analysis and confirmed by mass fingerprints or sequence tags. If proteins were identified on both maps, the number was chosen for the map in which the spot was most abundant. Spots for which no significant information was obtained were not numbered. These were mostly high (>60 kD) or low (<10 kD) molecular mass proteins of low abundance or spots located very close to an abundant photosynthetic protein on the two-dimensional electrophoresis maps.
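The numbering scheme can be captured in a small lookup, shown here as an illustrative Python sketch (the function name `home_map` is hypothetical). Because a spot matched on both maps keeps a single number, the result indicates only the map on which the spot was most abundant, not every map on which it appears.

```python
def home_map(spot: int) -> tuple[str, str]:
    """Return (fraction, pH range) of the map that a spot number belongs to."""
    ranges = [
        (1, 99, ("peripheral", "pH 4.0 to 7.0")),
        (100, 199, ("lumenal", "pH 4.0 to 7.0")),
        (200, 249, ("peripheral", "pH 7.0 to 11.0")),
        (250, 300, ("lumenal", "pH 7.0 to 11.0")),
    ]
    for low, high, label in ranges:
        if low <= spot <= high:
            return label
    raise ValueError(f"spot number {spot} is outside the numbering scheme")
```

For example, spot 123 carries a lumenal-range number even though it is also visible on the peripheral acidic map.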

Table 1.

Proteins Involved in Photosynthetic Electron and Carbon Metabolism Identified from the Two-Dimensional Electrophoresis Gels of the Lumenal and Peripheral Fractions (pI 4.0 to 7.0 and pI 7.0 to 11.0) Shown in Figures 2A to 2Da

MALDI-TOF	Edmanf or ESI-MS/MSg	Localization and Cleavage Site Prediction
Spot No.	Masses (kD)	pI	Identityb (Theoretical Mass in kD)	Accession Numberc	Cover % 15 ppmd	Cover % 50 ppme	Sequence	N Terminus of the Proteinh	PSORTi	ChloroPj	SignalPk
203	17.1	9.3	OEC16b(25.4)	AF026400	YYAI/LAVSTgI/LNDVLSK	84-EAKPI	Lumen: 0.96	54–55	AVL-AE82–83AEA-KPI85–86
11–14	21.9	5.7–6.4	OEC23 (28.0)	P16059	23–37	27–46	74-AYGEA	Mito: 0.75Tk: 0.28	22–23	ADA-AY73–74
30–33	29.8–30.2	5.5–5.8	OEC33 (34.9)	P14226	27–35	32–42	82-EGAPK	Lumen: 0.94	44–45	ASA-EG
10l	21.5	6.3	OEC33 (34.9)	31	31	81–82
100	9.8	4.7	Plastocyanin (17.1)	P16002	35	35	VEVLLGASDg	70-VEVLL	Tk: 0.66	55–56	ALA-VE
116	26.9	4.7	Plastocyanin (17.1)	Lumen: 0.52	69–70
38	38.8	5.7	Ferredoxinb,mNADH red (40.6)	729479	14	15	53	ER: 0.55	51–52	None
117	27.9	5.1	Ferredoxinb (14.4)	P27789	ATYNIKLITPELf	39-ATYNV	Mito: 0.81Stroma: 0.57	25–26	AQA-TV39–40
34, 35, 37	37.3–39.9	6.1–6.3	CF1γ (41.3)	114640	15–24	22–24	53	Stroma: 0.94Tk: 0.75	35–36	None
7, 8	17.8–19.5	6.0–6.1	CF1δ (27.6)	399082	41	42	65	Cyto: 0.45Tk: 0.20	72–73	ALA-DL77–78
202	9.7	8.5	PsaEb (16.2)	1217601	16	23	Not clear	ER: 0.55	51–52	EEA-AP56–57
201	7.2	9.0	PsaNb (15.5)	P31093	SVFDEYLEKSKANKf	(61)-SVFDE	Lumen: 0.86	52–53	ARA-SV60–61
120, 121	37.5–40.6	6.2–6.5	Aldolasen (39.2)	399024	23–27	26–33	(39)	Cyto: 0.65Tk: 0.28	37–38	None
102	11.9	6.4	RbcSn (20.2)	P00869	32	45	(57)	Stroma: 0.91	56–57	None
129	54.3	6.6	RbcLn (47.3)	P04717	18	21	1	Chloroplast encoded
48–53	55–55.5	6.0–6.7	CF1α (55.1)	114522	21–44	26–51	1	Chloroplast encoded
5l	12.6	5.2	CF1α (55.1)	19	24
43–46	51.5–52.5	5.6–5.8	CF1β (53.1)	114560	37–45	46–59	1	Chloroplast encoded
41l	37.3	5.6	CF1β (53.1)	24	34

Table 4.

Identification of Proteins of Unknown Function (Hypothetical Proteins) from the Two-Dimensional Electrophoresis Gels of the Lumenal and Peripheral Fractions (pI 4.0 to 7.0 and pI 7.0 to 11.0) Shown in Figures 2A to 2D by ESI-MS/MS or N-Terminal Edman Degradationad

Edmanb or MS/MSc
Spot No.	Masses (kD)	pI	Sequence	Precursor Ion (M+H)+d	Suggestions/Remarks
118	28.0	5.7	EEQEQEQEQDTKMAb	RecA like prot.?
106	18.2	5.3	AKAGVNKPELLPb
114	24.7	5.7	EQQQQQQP(QN)RRF(R)Eb
125	45.8	5.8	(F)AEIE(A)EQNIEb
117	28.0	5.1	VXVKVXDXDXDb
128	52.3	4.4	(A)EDLGAEKPTSb(A)SXTGAEKPGb
251	11.9	8.6	(AG)EVAP(E)IL(D)VXQ(F)b
109	21.5	6.1	(F)SI/LFEI/LVc	1394.44
105	16.8	6.7	SVVAAYMV(EM)c	1706.74
203	17.1	9.3	NK/QPI/L(YK/Q)c	1531.9
9	21.5	6.0	I/LDSFPDFKcTI/LYI/LWI/L(T)c	1115.601239.68
104	17.1	6.2	GYI/LK/QDWEc	1314.66
3	9.4	5.0	K/QRWYAK/QAI/Lc	1315.65
4	11.3	6.2	(PyroE)-SSPA-443.35-Kc262.05-YTI/LI/LK/QSK/QI/LPGKc	1043.481396.66	See spots 2, 3, and 6 (Table 2)
23	27.1	6.3	STSI/LI/LE-321.18-Rc	1126.64
20	25.8	5.8	201.96-(NG)HSK/QI/LPPI/LEVc(K/QP)I/L-635.45-Rc316.06-FEVTY(I/LDWTR)c	2532.551645.78
21	25.7	5.6	Sequences found in spot 14226.11-PI/LTI/LP-462.29-RcI/LI/LYS-624.40c241.93-EDAGGI/LV-443.24-DKc	1384.861130.861816.92
22	27.5	5.5	VNVI/LK/QK/QI/L-406.3-RcPTTSP-446.3-RcVPI/LSG(S)-275.2-RcFK/QE(NG)EI/L(VDI/L)-290.2-(N)-Kc(DP)FENFP-287.2-A(NI/L)SKcYSSAA(PI/LS)-282.2-Rc	1873.221104.72990.661941.101665.921233.76	See spot 115 (Table 3)

Examples of MALDI-TOF and ESI-MS/MS

Two examples in which proteins were identified successfully have been selected to demonstrate the utility of our proteomics approach, the quality of the MS spectra, and the potential for homology-based searching with MS data that has been realized in this study. Homology-based searching is an important issue, because no plant genome has been sequenced completely and because sequencing of plant genomes from different species is in progress. It is also an important issue for those experimental plant systems for which no genomic sequences or expressed sequence tags are expected to be available in the near future.

Figure 4B shows the MALDI-TOF spectrum from protein spot number 123 from the peripheral map (pH 4.0 to 7.0), containing a mixture of two proteins. Five of the measured peptides matched (i.e., no miscleavage; within 50 ppm; no oxidation) the recently identified gene product Hcf136 from Arabidopsis; these five peptides are indicated in the protein sequence shown in Figure 4A. Hcf136 also can be seen on the lumenal map (Figure 2B, spot 123), as identified by N-terminal Edman sequencing (Table 2). This protein was determined earlier to be on the lumenal side of the thylakoid membrane, where it is involved in the assembly of PSII (Meurer et al., 1998). Thus, homology-based searching with MALDI-TOF data from a pea protein resulted in the successful identification of a protein that previously had been sequenced only in Arabidopsis.

Table 2.

Proteins Involved in Nonphotosynthetic Functions, Identified from the Two-Dimensional Electrophoresis Gels of the Lumenal and Peripheral Fractions (pI 4.0 to 7.0 and pI 7.0 to 11.0) Shown in Figures 2A to 2Da

MALDI-TOF	Edmane or ESI-MS/MSf	Localization and Cleavage Site Prediction
Spot No.	Masses (kD)	pI	Identityb (Theoretical Mass in kD)	Accession Numberc	Cover % 50 ppmd	Sequence	% of Identityg	N Terminus of the Proteinh	PSORTi	ChloroPj	SignalPk
123	37.3	5.6	Hcf136b(44.1)	O82660	24	EETLSE-ERVYLe	67	79-DE	ER: 0.60Tk: 0.49	60–61	ARA-DE78–79
127	46.0	5.3	DegPb (46.2)	2565436	15	104-FV	Tk: 0.91Lumen: 0.80	42–43	AVE-SA99–100VES-AS100–101
119	29.9	4.9	RNA binding protein (32.0)	PSY14557	AAQEGETLTVEETVe	86	61-AA	Mito: 0.61Stroma: 0.50	60–61	LFA-AQ61–62
124	39.1	4.6	Plastoglobule ass. prot. PG1 (38.4)	4105180	25	48-AG	Stroma: 0.52	46–47	ISA-AG47–48
113	24.2	5.8	Cpn21 (26.9)	O65282	ATVVAPKYTAIKe	83	52-AS	Stroma: 0.89Lumen: 0.58	50–51(25–26)	AQS-KP93–94VKA-AS51–52
130	57.5	5.3	Cpn60α (14.4)	P08926	26	Stroma: 0.92Lumen: 0.68	45–46(23–24)	AAA-KD49–50
131, 132	74.5	5.3	Hsp70 (41.3)	399942	27–35	68-KV	Stroma: 0.94Lumen: 0.74	66–67(46–47)	AVA-AM83–84
2, 3, 4, 6	9.3–15.3	4.4–6.9	Histone H4-likeb (27.6)	1806283	39–40	I/LSGI/LI/LYEETRf	100	Nucleus: 0.95	None	None
40	41.2	5.6	Brittle-1b (16.2)	231654	16	76-DN	Mito: 0.83Stroma: 0.52	44–45	None
42	49.5	5.8	Stearoyl-ACP desat.b (15.5)	2290402	17	Mito: 0.71	33–34	AMA-ST60–61
39	37.3	5.7	ACC oxidaseb (39.2)	4090533	19	Cyto: 0.45	None	None
112	22.9	6.0	Ferritinb (15.5)	AI443623	ATKGSSDNRVLTGVe	64	60-AT	Cyto: 0.65	59–60	None
23–28, 205, 206126	27.1–28.546.0	5.3–6.3	Putative ascorbate perox/partial cDNAbPutative ascorbate perox/partial cDNAb	AI490846	XDLIERRQRX(Y/E)Fe	77–93
5.8	AW156024AW185405AW185405AI490846	TDYEVDI/LI/LTTFTKf243.12-FSAVGI/LGPRfI/LNYEAYTYPRfADLIERRQRSEFQe	92788093	89-AD	ER: 0.55	45–46	ANA-AD88–89
110	21.3	6.0	FKBP isomerase isolog/partial cDNAb	AU070407AW092542	AGLPTEEKPPLLe	64	117-AG	Tk: 0.88Lumen: 0.70	57–58	ALA-AG117–118

MALDI-TOF MS Peptide Map of Spot Number 123 from the Peripheral Map (pH 4.0 to 7.0).

(A) The precursor protein sequence of Hcf136 from Arabidopsis. The N-terminal part of the sequence in italics is the predicted presequence; the lumenal cleavage site is indicated by an asterisk. Five peptides were identified by MALDI-TOF MS (B) and are indicated in the protein sequence (underlined).

(B) The MALDI-TOF MS spectrum of the peptides generated by tryptic digestion of protein spot 123. The trypsin autodigested peptide ions 842.51 and 2211.11 (not labeled) were used for internal calibration. The MALDI-TOF MS spectrum of the peptides generated by tryptic digestion of the protein spot 123 matched (no miscleavage allowed; within 50 ppm; no oxidations) Hcf136 from Arabidopsis. Hcf136 can be seen on the lumenal map (spot 123) and was confirmed by N-terminal Edman sequencing. The second protein in this spot is CF1β, determined by the matching of six peptides (no miscleavage; four within 15 ppm; two within 40 ppm).

The second protein in this spot is CF1β, as determined by the matching of six peptides (without miscleavage) to the pea sequence, illustrating the ease with which proteins in mixtures can be identified by MALDI-TOF. Three peptides in the spectra originate from autodigestion of trypsin (at m/z ratios of 842.51, 2211.11 [peak not labeled], and 2807.29), and the first two peptides were used for internal calibration (Figure 4B). The other nonmatching peptides most likely result from domains of pea Hcf136 that are not 100% conserved with the Arabidopsis homolog; if there is a single amino acid residue mismatch between the pea peptide and the Arabidopsis sequence, then this peptide often will not match (at a <50-ppm mass resolution). We also observed that for a number of pea proteins present in the public databases, sequence conservation among different pea cultivars is incomplete. Therefore, it is likely that some nonmatching peptides in the spectrum are derived from CF1β.
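The effect of a single residue mismatch on a mass match can be estimated directly. The Python sketch below is illustrative (the function name `substitution_ppm` is hypothetical) and uses standard monoisotopic residue masses. It shows that most substitutions shift a peptide mass far beyond 50 ppm, whereas I/L exchanges are isobaric and K/Q exchanges differ by only about 0.036 D, which is consistent with the statement that a mismatched peptide often, but not always, fails to match.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def substitution_ppm(peptide: str, position: int, new_residue: str) -> float:
    """Mass shift (in ppm of the original (M+H)+) caused by replacing the
    residue at a 0-based position with new_residue."""
    original = sum(MONO[r] for r in peptide) + WATER + PROTON
    delta = MONO[new_residue] - MONO[peptide[position]]
    return abs(delta) / original * 1e6


# Illustrative peptide; the substitutions are chosen only as examples.
ala_to_gly = substitution_ppm("AFVSSAAAFEK", 6, "G")   # far outside 50 ppm
lys_to_gln = substitution_ppm("AFVSSAAAFEK", 10, "Q")  # inside 50 ppm
ile_to_leu = substitution_ppm("GYIK", 2, "L")          # exactly isobaric
```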

Figure 5 shows an example of the identification of a protein with unknown function from Arabidopsis by ESI-MS/MS (spot 104). With ESI-MS/MS, a peptide from the protein digest (referred to as the precursor ion) is selected within the mass spectrometer and further fragmented along the peptide backbone by additional energy. Several precursor ions can be selected from the same sample and are measured consecutively. In this study, we typically selected three to five ions per protein digest for MS/MS analysis. The ESI-MS/MS spectrum of the doubly charged precursor ion at m/z 620.92 ([M+2H]2+) is shown. The complete y-ion series (from y1 to y11) could be assigned unambiguously as indicated; y ions are the C-terminal ions after fragmentation of the precursor ion and are generally the predominant ions (for an overview, see Chapman, 1996; Burlingame et al., 1998). The experimentally determined pea peptide sequence tag (by reading the sequence from y11 to y1) matched (for 10 of 11 amino acid residues) a hypothetical Arabidopsis protein. This identification was further confirmed by two other MS/MS sequence tags and an N-terminal Edman tag, as indicated in Figure 5 (see also Table 3, spot 104).
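Reading a sequence from a y-ion series relies on the fact that consecutive y ions differ by the mass of one residue. The Python sketch below illustrates this principle under simplifying assumptions (singly charged y ions, unmodified residues, and a hypothetical matching tolerance `tol` of 0.02 D); it is not the procedure or software used to interpret the spectra in this study, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses (Da).
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def y_ions(peptide: str) -> list[float]:
    """Singly charged y-ion m/z values, ordered y1 (C-terminal residue) to yn."""
    ions = []
    running = WATER + PROTON
    for residue in reversed(peptide):
        running += MONO[residue]
        ions.append(running)
    return ions


def read_ladder(ions: list[float], tol: float = 0.02) -> list[str]:
    """Read residues from the highest y ion down to y1 (N to C terminus).

    Residues that cannot be distinguished within tol are reported together,
    e.g. 'I/L'; an unexplained mass difference is reported as '?'.
    """
    ladder = [WATER + PROTON] + list(ions)
    calls = []
    for upper, lower in zip(reversed(ladder[1:]), reversed(ladder[:-1])):
        diff = upper - lower
        hits = sorted(r for r, m in MONO.items() if abs(m - diff) <= tol)
        calls.append("/".join(hits) if hits else "?")
    return calls


# Round trip on a tag reported for spot 104.
tag = read_ladder(y_ions("AFVSSAAAFEK"))
```

With a wider tolerance (for example, `tol=0.05`), K and Q also become indistinguishable, which corresponds to the K/Q notation used in the sequence tags in the tables.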

Table 3.

Proteins without Assigned Functions, Identified from the Two-Dimensional Electrophoresis Gels of the Lumenal and Peripheral Fractions (pI 4.0 to 7.0 and pI 7.0 to 11.0) Shown in Figures 2A to 2Da

Edmanc or MS/MSd	Localization and Cleavage Site Prediction
Spot No.	Masses (kD)	pI	Identity (Theoretical Mass in kD)	Accession Numberb	Sequence	% of Identitye	N Terminus of the Proteinf	Domain Predictiong	BLAST Searchh	PSORTi	ChloroPj	SignalPk
107	18.3	6.0	Unknown function (15.0)	T21992	ATQRLPPLSTEPNRc	93	76-AX	Pentapeptide repeat and many others	Similar to A. thal. DNA clone AB015476 and with hyp. prot. Syn. sp. BAA17756	Stroma: 0.86Lumen: 0.45	No cTP	VIA-AX75–76
104	16.0	6.2	Unknown function (46.2)	AAC78263(W43350/ T45153)	AILEADDDVELLEcAFVSSAAAFEKdI/LEADDDVEI/LI/LEK/QdGYI/LK/QDWEd	1009110071	80-AI	Many but nothing obvious	Lumen: 0.80	64–65	LVA-IG64–65
103	14.3	5.7	Unknown function (32.0)	2344892 (AAC31832)	FKGGGPYGQGVTRGc	100	48-FK	Pentapeptide repeat and many others	Hyp. prot. Syn sp./D90917	PM: 0.68	34–35	ALA-FK47–48
108	21.3	6.3	Unknown function (38.4)	AAC00624	VVKQGLLAGRIPGLc	93	71-VV	Many but nothing obvious	Cyto: 0.45	None	ALA-FP63–64
115	27.4	5.5	Unknown function (26.9)	AAC28768	YSSAAPI/L(I/L)dSPTEQ/KPd	7267	?	2 main rhodopsin GPCR domains? and many others	Other A. thal. clones	Tk: 0.95	44–45	None
36	39.6	6.0	Hyp. protein (14.4)	3395429	17% of coverage at 50 ppm with MALDI-TOF	?	Many but nothing obvious	Perox: 0.80	35–36	None
19	24.0	5.6	Hyp. protein (41.3)	Z97339	VI/LNK/QYLTE-482.2-RdI/LYYK/QVEANNKdSYASNNEI/LAVFPDQRd	898087	?	Eukar. Mo-pterin redox prot or euk RNA pol. heptapep. repeat? and many others	Hyp. prot. Syn. sp./BAA18019	Stroma: 0.94Tk: 0.74Lumen: 0.72	75–76	AFA-ST104–105
22	27.6	5.4	Hyp. protein (16.2)	AI600799	I/LYSLSAS(TI/LS)-401.2-KdGPI/LFK/QAVSSFRd	9067	12–22l18–29	Type-1 copper prot? and many others	Hyp. prot. Oryza. sat. AA754382 partial cDNA	Partial cDNA
111	24.1	6.5	Hyp. protein	AW201127	RDVAVGSFLPPSc	83	(54)-RDm	Nothing obvious	Stroma: 0.84Lumen: 0.34	(14–15)	SHA-RE(54–55)
204	16.9	8.5	Hyp. protein	D47525AI855536	AESGFQPVVDRKGDc	64	(68)-AEm	Nothing obvious	Similar to A. thal. DNA clone AL132954	Perox. 0.64	(33–34)	SFA-AE(68–69)

Figure 5.

Identification of a Thylakoid Protein in Spot 104 by ESI-MS/MS.

(A) Protein sequence of a hypothetical precursor protein from Arabidopsis identified in spot 104. The presequence is in italics, and the lumenal cleavage site is indicated by an asterisk. Protein spot 104 was identified by three experimental sequence tags determined by ESI-MS/MS and an N-terminal Edman tag. The four sequence tags are indicated in the protein sequence in boldface for ESI-MS/MS (I/LEADDDVELI/LEK; AFVSSAAAFEK; GYI/LK/QD) or underlined for Edman (AILEADDDVELLEK). Determination of the sequence tag AFVSSAAAFEK by ESI-MS/MS is shown in (B).

(B) Typical ESI-MS/MS mass spectrum of a peptide recovered after in-gel tryptic digestion of protein spot 104. Fragmentation of the doubly charged precursor ion at an m/z ratio of 620.92 yielded the y-ion series (y1 to y11) for which the sequence is indicated. The experimental sequence tag from the pea protein matched (for 10 of 11 amino acid residues) a hypothetical protein of Arabidopsis as indicated in (A). Note that the sequence tag should be read backward from y11 to y1.

Proteins Involved in Photosynthetic Electron Transport

Table 1 lists 15 abundant proteins that are known to be involved in photosynthetic electron transport reactions and carbon metabolism, as identified on the two-dimensional electrophoresis maps and as shown in Figures 2A to 2D. These proteins were all unambiguously identified by MALDI-TOF or N-terminal Edman degradation as indicated in Table 1. With MALDI-TOF, a large number (six to 20) of matching peptides were found at high mass accuracy (15 ppm) and with a coverage of the protein by the matching peptides ranging from 15 to 45% at 15 ppm and 21 to 59% at 50 ppm as indicated (calculated for the precursor protein).

The most abundant proteins are OEC16, OEC23, OEC33, and the soluble electron transporter plastocyanin. In addition, four proteins from the ATP synthase complex (CF1α, CF1β, CF1δ, and CF1γ) were identified. Three other stromal-side peripheral proteins, ferredoxin, ferredoxin-NADPH reductase, and PsaE, and the lumenal-side peripheral protein PsaN also were identified. PsaN was found on the basic map (spot 201), at approximately threefold lower abundance than OEC16 (spot 203 on the same basic map), in approximate agreement with the stoichiometry between PSII and PSI. These 12 proteins (Table 1) represent nearly all expected photosynthetic proteins (only CF1ε and some of the peripheral PSI proteins have not been identified), indicating that our maps give a good representation of the proteins present in the lumen and periphery of the thylakoid.

In addition, three stromal soluble proteins involved in carbon metabolism (i.e., RbcS, RbcL, and aldolase) were found on the maps. On the peripheral maps, only a very small amount of aldolase was observed, especially considering the high abundance of this protein. On the lumenal map, only a very small amount of RbcS was detected (in agreement with the protein gel analysis in Figure 1), whereas somewhat more RbcL and aldolase were present. In contrast, none of the very abundant soluble kinases involved in carbon metabolism (e.g., phosphoribulose kinase) was discovered on any of our two-dimensional electrophoresis gels. This could indicate that a specific subset of stromal proteins (RbcL and aldolase) has a strong affinity for the thylakoid membrane. In this respect, it is important to realize that very small amounts of protein (i.e., in the femtomole range) can be detected by using two-dimensional electrophoresis and state-of-the-art MS instruments; thus, minor cross-contamination can be detected easily.

As is clear from Tables 1 (OEC33, OEC23, CF1α, and CF1β) and 2 (heat shock protein Hsp70 and chaperone Cpn60), several proteins were present at different pI values at a similar molecular mass, forming trains or beads of protein spots, indicating post-translational modifications, different isoforms, or RNA editing (see Discussion). We currently are investigating these modifications in detail.

Several breakdown products of OEC33, CF1α, and CF1β were identified, totaling <0.1% for each protein. These breakdown products also were present when two-dimensional electrophoresis maps were made from preparations in which a protease inhibitor cocktail was included throughout the complete isolation procedure (data not shown). Considering the broad specificity of this inhibitor cocktail, it is likely that these breakdown products reflect the proteolytic process naturally occurring in the thylakoid.

Proteins with a Nonphotosynthetic Function or No Obvious Functional Domains

In addition to the abundant proteins involved in photosynthetic electron transport and carbon metabolism, 18 proteins with clear functions (Table 2) and 10 proteins without obvious functional domains (Table 3) were identified. The identified proteins with a clear function were involved in DNA binding and transcription (i.e., four histone H4-like proteins and an RNA binding protein), oxygen radical scavenging (two ascorbate peroxidases), ADP–glucose transport (Brittle-1), proteolysis (DegP), chaperone or isomerase activity (Hsp70, Cpn60, Cpn21, and FKBP), protein assembly (Hcf136), Fe2+ binding (ferritin), lipid storage (plastoglobule-associated PG1 protein), and the biosynthetic pathway of fatty acids (stearoyl-acyl carrier protein desaturase). Amino-cyclopropane carboxylate oxidase, which catalyzes the last step of ethylene biosynthesis, also was provisionally identified, but this result needs to be further confirmed by sequence tags.

The histone-like proteins were found only on the two-dimensional electrophoresis maps from the peripheral fraction and not on those from the lumenal fraction. For these histone-like proteins, only highly conserved homologous genes, and no genes for chloroplast-localized proteins, were found in the database. The ascorbate peroxidase is most likely a lumenal protein; it has a twin arginine motif and is 40% identical to other peroxidases. The sequence tags matched expressed sequence tags from tomato and cotton and, interestingly, also from the moss Physcomitrella patens. These peroxidases were present in several spots in a wide pI range (pI 5.8 to 8.3) and at two different molecular masses, and they matched partial cDNAs.

A set of 10 proteins without assigned function but with identified full-length genes is listed in Table 3, and these proteins were analyzed for the presence of functional domains. Three of the identified proteins (spots 19, 103, and 107) are homologous with Synechocystis proteins (Table 3). Spots 103 and 107 are related to each other and were assigned to be part of a pentapeptide repeat family involved in lipid transport or assembly, but this assignment is based only on very weak similarity (23%) to a protein family in the cyanobacteria (Table 3). One of these postulated pentapeptide repeat proteins (spot 107) was identified earlier (Kieselbach et al., 1998), and the corresponding full-length Arabidopsis gene was identified in this study. An N-terminal sequence for spot 104 was found earlier in a two-dimensional electrophoresis map from total leaf extracts (Tsugita et al., 1996) as well as on a one-dimensional SDS electrophoretic gel of a lumenal preparation (Kieselbach et al., 1998). This protein was experimentally placed in the TAT pathway (Mant et al., 1999).

Proteins without Matching Expressed Sequence Tags or Genomic Sequence in the Public Database

Table 4 lists 18 protein spots from the two-dimensional electrophoresis maps (Figures 2A to 2D) for which a significant amount of amino acid sequence information was obtained. However, at the time this article was accepted for publication, the corresponding genes could not be identified by MALDI-TOF mass fingerprinting (at least 10 tryptic peptides after filtering for frequently recurring peptides, contamination, etc.), followed by database searching with sequence tags obtained by ESI-MS/MS or N-terminal Edman sequencing. The sequence tags are listed in Table 4.

Localization Prediction and Determination of Cleavage Sites

All identified proteins were analyzed by using the programs PSORT, ChloroP, and SignalP to verify the predicted location as well as the expected cleavage site by stromal processing peptidase(s) or thylakoid-bound lumenal peptidase (Tables 1 to 3). This analysis was conducted to verify the predictive strength of these programs. If the programs indeed are predicting the localization and cleavage site of newly identified proteins with a high degree of confidence, then they can help to analyze ambiguous hits in a proteomics screen and could be used to perform a genome-wide search for thylakoid membrane and lumenal proteins. Our set of newly identified proteins provided us with an excellent sample on which we could test these programs.

The ChloroP program predicted that all of the nuclear-encoded photosynthetic proteins listed in Table 1 would be located in the chloroplast, which is not surprising because most of these proteins or their homologs were part of the training set for the development of the program (Emanuelsson et al., 1999). The chloroplast transit peptide was correctly predicted for those proteins localized on the stromal side, assuming that in all cases an additional amino acid was removed after processing, as described by Emanuelsson et al. (1999). Approximately 94% of the nonphotosynthetic proteins in Tables 2 and 3 (excluding the histone-like proteins because we did not find the corresponding genes) were predicted to be localized in the chloroplast by ChloroP.

The program PSORT makes localization predictions for proteins in any of the plant organelles (i.e., nucleus, endoplasmic reticulum, mitochondria, peroxisomes, and chloroplasts; Nakai and Horton, 1999). However, only 52% of the proteins in Tables 1 to 3 were predicted by PSORT to be in the chloroplast.

Lumenal Transit Peptides

The lumenal transit peptides were analyzed by alignment of 26 nonredundant proteins in a so-called logoplot (Figure 6). In a logoplot, the sequence alignment is represented by a sequence of stacked letters in which the total height of the stack at each position shows the amount of information (conservation), whereas the relative height of each letter shows the relative abundance of the corresponding amino acid (Schneider and Stephens, 1990). Such logoplots have been used successfully to analyze different signal peptides of Gram-positive and Gram-negative bacteria (Nielsen, 1999). Seventeen nonredundant proteins that had been shown experimentally to possess lumenal transit peptides or lumenal localization were found in the public databases, whereas nine were identified in this study. The N termini of six of the nine newly identified lumenal proteins were obtained by Edman sequencing.

Lumenal transit peptides have been well studied during the past decade. The semiconserved consensus sequence for the cleavage site of the transit peptide is AXA↓X, with the arrow representing the cleavage site and X representing any amino acid. In addition, a high frequency of alanine residues as well as leucine residues upstream of this consensus sequence have been observed (reviewed in Settles and Martienssen, 1998; Keegstra and Cline, 1999). A similar structure has been reported for signal peptides of Gram-negative and Gram-positive bacteria (Nielsen et al., 1997; Nielsen, 1999).

After alignment of the 26 sequences according to their cleavage site in a logoplot (Figure 6), we observed a nearly complete conservation for the −1 position (alanine), in agreement with site-directed mutagenesis studies (Shackleton and Robinson, 1991). At the −3 position, there is a preference for alanine as well as valine, serine (small and neutral residues), and, unexpectedly, aspartic acid. At the +1 position, there is a preference for alanine, valine (small, neutral), glutamic acid, and aspartic acid (negatively charged). After the alanine/leucine–rich hydrophobic region, a fairly high frequency of prolines at the −6, −5, and −4 positions can be seen. This is likely to stimulate helix breaking of the hydrophobic region and possibly ensures interaction with the thylakoid processing peptidase. Further downstream (+2 and +4), a preference for the negatively charged glutamic acid is observed, which is not found in Gram-positive and Gram-negative bacteria (Nielsen et al., 1997; Cristobal et al., 1999).
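The quantity plotted in a logoplot can be sketched as follows. This is a minimal illustration, not the program used to draw Figure 6: it assumes a 20-letter amino acid alphabet, omits any small-sample correction, and uses a hypothetical four-sequence alignment spanning positions −3 to +1. The stack height at each position is the maximal entropy (log2 of 20) minus the observed entropy, and each letter takes a share of the stack proportional to its frequency.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content (bits) of one alignment column."""
    total = len(column)
    counts = Counter(column)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def logo_heights(sequences):
    """Per-position letter heights: information times relative frequency."""
    heights = []
    for column in zip(*sequences):
        info = column_information(column)
        total = len(column)
        heights.append({aa: info * n / total
                        for aa, n in Counter(column).items()})
    return heights


# Hypothetical aligned windows covering positions -3, -2, -1, +1.
toy_alignment = ["AQAA", "VNAE", "AKAD", "SLAV"]
heights = logo_heights(toy_alignment)
```

In this toy alignment the fully conserved alanine at −1 reaches the maximal stack height of about 4.32 bits, whereas the mixed −3 position is shorter and is shared among A, V, and S.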

Figure 6.

Logoplot of Thylakoid Proteins with Lumenal Transit Peptides Aligned According to the Predicted Cleavage Site (between −1 and +1) of Their Lumenal Transit Peptide.

The main figure shows the logoplot of 26 different proteins with lumenal transit peptides, without any redundancy. The top inset shows the logoplot of a subset of 13 proteins targeted via the ΔpH/TAT pathway. The bottom inset shows the logoplot for the remaining 13 proteins. The height of the stack of letters at each position shows the amount of information, defined as the difference between the maximal and actual entropy (Schneider and Stephens, 1990), whereas the relative height of each letter shows the relative abundance of the corresponding amino acid. Positively and negatively charged amino acids are shown in black and red, respectively; external polar residues (N and Q) are shown in yellow, internal apolar (F, L, I, M, and V) in green, and ambivalent (P, T, S, C, A, G, Y, and W) in blue. Proteins used in the logoplots are for the ΔpH/TAT pathway (top inset): OEC16 (P12301); OEC23 (P16059); PsaN (P49107); Hcf136 (O82660); polyphenol oxidase (Q08303); PsbT (Q39195); the new ascorbate peroxidases in spots 23 to 28, 205, 206, and 125 in Table 2; and spots 19, 104, 108, 110, 111, and 204 in Table 3. The remaining proteins (bottom inset) are as follows: plastocyanin (P16002); OEC33 (P14226); DegP (AAC39436); CFoII (BAA09134); violaxanthin-deepoxidase (AAC50032); rotamase (CAA72792); PsbY1 (P80470); PsbY2 (P80470); CtpA (BAA09134); PsbX (AAD25151); PsaF (P13192); and spots 107 and 103 in Table 3.

Thirteen of these 26 lumenal proteins contain a twin arginine motif (R-R-X-o-o, with o representing a hydrophobic residue) and are predicted to translocate through the thylakoid membrane via the ΔpH/TAT pathway (Settles and Martienssen, 1998; Dalbey and Robinson, 1999; Mori et al., 1999; legend to Figure 6). Seven of the newly identified proteins contain this twin arginine motif. Alignment of the signal sequences by the twin arginines points to a very strong preference for a leucine or a methionine residue at the third position after the two arginines (data not shown), which would favor R-R-X-o-L/M as the general consensus motif for substrates of the TAT pathway. The 13 proteins were aligned in a separate logoplot (Figure 6, top inset). A striking feature is that the hydrophobic region in these TAT-dependent proteins is less hydrophobic than it is in the other 13 proteins (Figure 6, bottom inset). The TAT pathway proteins show a strong preference for a serine or alanine at the −8 position. No conservation of other “Sec avoidance” signals was found, such as a postulated basic residue directly after the hydrophobic region (Dalbey and Robinson, 1999).
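A search for the twin arginine motif can be sketched with regular expressions. This is a minimal illustration under stated assumptions: the hydrophobic class "o" is taken here to be F, L, I, M, and V (the internal apolar group of the Figure 6 legend), the function name is hypothetical, and the test sequences are invented. A pattern match is only a candidate and does not by itself establish TAT-dependent translocation.

```python
import re

# Assumed hydrophobic class "o": the internal apolar residues F, L, I, M, V.
HYDROPHOBIC = "FLIMV"

# R-R-X-o-o, written as a lookahead so that overlapping hits are reported.
TAT_GENERAL = re.compile(rf"(?=(RR.[{HYDROPHOBIC}][{HYDROPHOBIC}]))")

# Narrower R-R-X-o-L/M consensus suggested by the alignment.
TAT_STRICT = re.compile(rf"(?=(RR.[{HYDROPHOBIC}][LM]))")


def find_twin_arginine(sequence, pattern=TAT_GENERAL):
    """Return (zero-based start, matched segment) for every motif hit."""
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Invented presequence fragments, for illustration only.
general_hits = find_twin_arginine("MASSRRELLKGAALA")
strict_hits = find_twin_arginine("MASSRRELVKGAALA", TAT_STRICT)
no_hits = find_twin_arginine("MKRRAASGG")
```

Because the strict pattern requires L or M at the last position, the second fragment (RRELV) is found by the general pattern but not by the strict one.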

We decided to systematically test, based on this alignment, whether the program SignalP could predict the correct cleavage sites of lumenal transit peptides (as listed in Tables 1 to 3) in either the Gram-negative or Gram-positive mode. In all cases in which we knew that the protein was located in the lumen, a lumenal cleavage site was found by SignalP. In most of these cases, the SignalP prediction corresponded to the experimentally determined N terminus (Tables 1 to 3). In a few other cases (e.g., OEC16 and DegP), the predicted site was shifted a number of residues with respect to the experimentally determined cleavage site. A possible reason is that in bacteria the −1 position is less strongly biased toward alanine than it is in chloroplast lumenal transit peptides. Removal of the stromal transit peptide generally did not influence the predicted lumenal cleavage site. This is important because it allows genome-wide screening of the Arabidopsis genome for lumenal transit peptides. However, SignalP also identified one or more lumenal cleavage sites for a number of proteins that are located at the stromal side of the thylakoid membrane (i.e., ferredoxin, PsaE, CF1δ, and an RNA binding protein). The selectivity might be improved if SignalP were to be equipped with a specific option for chloroplast lumenal transit peptides.

DISCUSSION

Proteomics already has become an important tool for drug discovery and the analysis of yeast and Escherichia coli protein expression patterns; however, it has not been widely applied in plant biology. This study demonstrates some of the potential of proteomics that can be realized for plant sciences. This potential will become more evident once the full genome sequences of Arabidopsis, rice, or other plants are available. The total proteome of higher plants is estimated to consist of ∼21,000 to 25,000 proteins (see, e.g., Bouchez and Hofte, 1998; Meinke et al., 1998), and chloroplasts contain 10 to 25% of this total, an illustration of the central importance of the chloroplast for the plant cell. Our long-term goals are to analyze systematically the proteome of the thylakoids and to understand thylakoid biogenesis and maintenance through further functional analysis.

Total Number of Peripheral and Lumenal Thylakoid Proteins

It was estimated from image analysis that the two acidic maps of lumenal and peripheral proteins each contain at least 360 to 400 protein spots, whereas each basic map contained at least 50 to 60 spots. Thus, there is a total of ∼820 to 920 protein spots that can be detected easily by silver staining. In most cases, the apparent molecular mass corresponded with the theoretical mass of the mature protein. In a few cases (e.g., ferredoxin, plastocyanin, and CF1γ), the same protein was observed at a molecular mass higher than the theoretical molecular mass. In the case of plastocyanin, this is likely due to aggregation during focusing in the first dimension because of the very high abundance of the protein. The 11-kD ferredoxin was identified in a small protein spot (spot 117) at 30 kD and pI 5.0 and therefore must be in an aggregate or oligomeric form, modifying both molecular weight and experimental pI value (the theoretical pI value of monomeric ferredoxin is below 4.0 and thus was outside the range of the two-dimensional electrophoresis maps).

To estimate the total number of proteins, it is necessary to conduct a correction for the 39% overlap between the lumenal and peripheral maps, calculated by computer-aided image analysis. This strong overlap between the lumenal and peripheral proteins was expected because proteins can be transiently bound to the thylakoid membrane as part of their function (e.g., plastocyanin) or during thylakoid biogenesis and protein complex assembly (e.g., the OEC proteins). In addition, some of the thylakoid-bound proteins are released during the sonication procedure. However, through image analysis, one can clearly distinguish between proteins that are mostly soluble and mostly membrane bound. Naturally, protein partitioning into lumenal or peripheral fractions can be influenced by altering the purification procedure, for example, by using shorter sonication times (data not shown) or French or Yeda presses.

To further calculate the total number of functionally different proteins from the observed number of protein spots, it is necessary to perform a correction for different isoforms, post-translational modifications, and proteolysis. Isoforms can be expected for several of the photosynthetic proteins, because multigene families have been reported in pea, spinach, and tobacco for OEC23 (e.g., Hua et al., 1992), OEC33 (e.g., Wales et al., 1989), and ferredoxin–NADPH reductase (e.g., Gozzer et al., 1997). Post-translational modifications are likely to occur not only through phosphorylation but also by post-translational methylation (e.g., of RbcS; Grimm et al., 1997), carbamylation (e.g., for RbcS and RbcL; Smith et al., 1988), glycosylation (e.g., for CF1; Maione and Jagendorf, 1984), and palmitoylation (e.g., for the D1 protein; Mattoo and Edelman, 1987). In addition, alternative splicing produced both stromal and thylakoid-bound forms of ascorbate peroxidase (Mano et al., 1997), and editing of mRNA leading to heterogeneous proteins has been reported (Sugita and Sugiura, 1996). As discussed earlier (Table 1), proteolytic fragments also were observed. Finally, beads of spots can result from carbamylation induced by sample preparation, even when appropriate precautions are taken.

After taking these multiple forms into account, we estimate that there is a total of at least 200 to 230 proteins in the lumen and periphery of the thylakoid membrane. However, this number is conservative because it is likely that several proteins of lower abundance were masked on the acidic maps by the predominant OEC23 and OEC33 proteins. Visualization of these masked proteins would be possible after removal of the abundant OEC proteins, for instance, by affinity chromatography and/or by two-dimensional electrophoretic analysis using isoelectric focusing strips with narrower pI ranges.

MS- and Homology-Based Searches

In this study, ∼400 spots were analyzed by MALDI-TOF MS, followed by analysis of 20 spots by ESI-MS/MS and 55 by N-terminal Edman sequencing. As a result, 61 different proteins were identified. Breakdown products and modified forms of several of these proteins also were identified (Tables 1 to 3). For ∼100 spots, insufficient information was obtained for positive identification, and ∼30 of those will be analyzed by ESI-MS/MS. Because Edman degradation sequencing resulted in good N-terminal sequence from all 55 protein spots, with the quality and size of the peaks on the chromatograms proportional to the intensity of the spots, it is likely that none of the spots analyzed by Edman degradation was blocked at the N terminus. Note also that for several spots, two or even three amino acid sequences could be determined by Edman sequencing. Thus, it is likely that the N termini of most mature proteins in the chloroplast are not modified in vivo. This is in contrast to a two-dimensional electrophoresis study of rice and Arabidopsis total cellular proteins, in which 40 to 60% of the measured proteins were blocked (Tsugita et al., 1996). We should point out that all nuclear-encoded chloroplast proteins are N-terminally processed after import to remove the transit peptide; thus, N-terminal modifications that occurred in the cytosol are removed.

With the completion of the sequencing of the different plant genomes, it is expected that most (if not all) of the proteins listed in Table 4 will be identified. It was, however, possible to identify a number of the proteins by their similarity to proteins from other species based only on the mass fingerprints. This was possible because many plant proteins are well conserved, an encouraging prospect for future high-throughput plant proteomics once a number of plant genomes have been fully sequenced. Digestion of a protein spot with a second protease (such as V8 or chymotrypsin) in parallel to digestion with trypsin also could help to identify proteins with mass fingerprints (MALDI) alone, because the two different proteases generate different peptides, thereby increasing the total number of peptides available for identification by database searching.
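Why a second protease adds information can be sketched with an in silico digestion. This is a minimal illustration under simplifying assumptions: trypsin is taken to cleave after K or R except before P, V8 protease is taken to cleave only after E (cleavage after D, which can also occur depending on buffer conditions, is ignored), no miscleavage is modeled, and the test sequence is a toy string assembled from the spot 104 sequence tags, not a real protein sequence.

```python
def digest(sequence, cleave_after, not_before=""):
    """Cut C-terminal to any residue in cleave_after, unless the next
    residue is listed in not_before. Returns peptides in sequence order."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        blocked = following != "" and following in not_before
        if residue in cleave_after and not blocked:
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def trypsin(sequence):
    # Simplified rule: after K/R, not before P.
    return digest(sequence, "KR", not_before="P")


def v8_protease(sequence):
    # Simplified rule: after E only (an assumption; see lead-in).
    return digest(sequence, "E")


# Toy sequence assembled from sequence tags, for illustration only.
toy = "AILEADDDVELLEKAFVSSAAAFEKGYLK"
tryptic = trypsin(toy)
v8 = v8_protease(toy)
combined = set(tryptic) | set(v8)
```

The two digests share no peptide in this example, so the second digest supplies an independent set of masses for database searching.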

To avoid contamination with nonchloroplast proteins, we used intact, purified chloroplasts as the starting material for the two-dimensional electrophoresis maps in this study. Pea (or spinach) is preferred over other species, such as Arabidopsis, because of the ease with which intact chloroplasts can be purified. The disadvantage is that homology-based searching with mass fingerprints must be conducted. This type of analysis is more demanding and often requires confirmation by ESI-MS/MS or Edman sequencing. However, with the combination of these three techniques, it was possible to identify the corresponding genes in other plant species (Tables 1 to 4).

Functional Role of Lumenal and Peripheral Thylakoid Proteins

The 58 proteins (excluding the three stromal proteins in Table 1) listed in Tables 1 to 4 were classified in a pie diagram according to their function (Figure 7), following the categories used in the analysis of a 1.9-Mb contiguous sequence of chromosome IV of Arabidopsis (Bevan et al., 1998). This is a helpful tool to visualize the more general role of the lumenal and peripheral membrane proteins. For 17% of the proteins, no clear function could be predicted, whereas for 31% of the proteins, no expressed sequence tag or full-length gene could be assigned. The remaining 52% could be classified according to their function.

Figure 7.

Assignment of the Identified Proteins to Functional Categories by Using Classifications as Described by Bevan et al. (1998).

In total, 58 proteins were classified. The categories are as follows: energy (the 12 nonstromal proteins in Table 1), transcription/translation (spot 119 in Table 2), metabolism (spot 42 in Table 2), growth and division (spots 2, 3, 4, 6, 39, and 124 in Table 2), protein destination and storage (spots 110, 112, 113, 123, 127, and 131 to 133 in Table 2), transport (spot 40 in Table 2), defense (spots 23 to 28, 126, 205, and 206 in Table 2), no assigned function (the 10 proteins in Table 3), and no identified gene or homolog (the 18 proteins in Table 4).

Naturally, a significant fraction of the proteins is involved in energy production (21%), either in photosynthetic electron transport or in ATP production. So far, only one transporter, Brittle-1, was found on the two-dimensional electrophoresis maps of both the lumenal and peripheral fractions. Brittle-1 has an adenylate translocator function in the transfer of ADP glucose in amyloplasts. Brittle-1 was cloned using a transposon-tagged maize mutant (Sullivan et al., 1991) and was found to be localized in amyloplast membranes (Sullivan and Kaneko, 1995). Because starch also accumulates in chloroplasts, Brittle-1 is likely to have a function there similar to that in amyloplasts. Indeed, in vitro import assays showed that Brittle-1 could be targeted to the chloroplast inner envelope membranes, but the protein had not been found previously in chloroplasts in vivo (Li et al., 1992). Transmembrane prediction programs (such as TopPred2, DAS, and Tmpred; see http://www.expasy.ch/tools/#transmem) indicate that Brittle-1 could be an integral membrane protein, but the predicted transmembrane domains are shorter and less hydrophobic than are usually observed. We have not detected any integral membrane protein on the two-dimensional electrophoresis maps, which is expected because the protein fractions were centrifuged at high gravity values to remove any membranes. Thus, it was surprising to find Brittle-1 on both the lumenal and peripheral maps, an observation that calls into question the suggestion that Brittle-1 is an integral membrane protein. Because the protein had not been detected previously in chloroplasts, further experimentation is needed to more precisely define the localization of Brittle-1.

The two-dimensional electrophoresis maps also revealed a lipid storage protein (PG1), which was classified in the cell growth and division category (Figure 7). PG1 recently was identified in plastoglobules extracted from pea chloroplast membranes, and in vitro import assays confirmed its localization in the chloroplast (Kessler et al., 1999). Plastoglobules have been observed in different types of plastids and are thought to serve as lipid reservoirs for thylakoid membranes. In addition to PG1, plastoglobules possess several other proteins associated with the outer surface, such as the 30-kD plastid lipid–associated protein (Pozueta-Romero et al., 1997). It is quite likely that several of those proteins are present on the two-dimensional electrophoresis maps, and they should be identified in the near future.

Other proteins involved in cell growth include a set of four histone H4-like proteins in the 10- to 14-kD size range. A large fragment of histone H4 is extremely well conserved among many eukaryotes, varying little in monocotyledons and dicotyledons, green algae, insects, and mammals. The identified peptides of the four proteins all corresponded to this conserved region. Two chloroplast proteins that are electrophoretically similar to histones H2A, H2B, and H3 of pea cell nuclei were isolated earlier from chloroplast nucleoids. The amino acid composition of these proteins demonstrates high similarity with the HU proteins of E. coli (Yurina et al., 1995), but the identified peptides of the histone-like proteins in this study did not match those HU proteins. In total, ∼30 polypeptides from 94 to 12 kD in size were detected in nucleoids. These nucleoids have been found on the thylakoid surface (Liu and Rose, 1992) and chloroplast inner envelopes (e.g., Sato et al., 1993). The histone-like proteins were found exclusively in our two-dimensional electrophoresis maps of the peripheral proteins, which would be in agreement with a fairly tight binding (i.e., sonication resistant) of plastid DNA to the thylakoid surface. An alternative explanation for our finding of these histone-like proteins is that a minor contamination with nuclei has occurred. To avoid such a contamination, we purified the intact chloroplasts on linear Percoll gradients, in which nuclei are known to sediment to the bottom, whereas intact chloroplasts can be collected between ∼50 and 70% Percoll (Gallagher and Ellis, 1982).

The finding in this study of a known chloroplast-localized RNA binding protein points to the role of the thylakoid as a surface for transcription and translation, in agreement with observations of membrane-associated polysomes (Hattorie and Margulies, 1986; Jagendorf and Michaels, 1990) and plastid DNA.

One enzyme involved in fatty acid biosynthesis was identified and was the only protein classified under metabolism (Figure 7). It is indeed well known that fatty acid and lipid metabolism takes place both in the plastid and in the endoplasmic reticulum (Miquel and Browse, 1992). Further studies are required to determine where in the chloroplast these enzymes are localized.

Approximately 12% of the proteins were classified under protein destination and storage (Figure 7). They include the protease DegP, the assembly factor Hcf136, ferritin, three chaperones, and a cis-trans isomerase that possesses a lumenal transit peptide. Ferritin concentrates and stores cellular iron to ∼10^11 times the solubility of the free ion and earlier had been determined to be a plastid protein located at the stromal side of thylakoid membranes (e.g., Waldo et al., 1995). Ferritin accumulation is positively correlated with iron loading of the plant, is regulated at a transcriptional level (Gaymard et al., 1996), and is under developmental control (Lobreaux and Briat, 1991). The FKBP isomerase is only the second isomerase identified in the thylakoid; the other one was named TLP40 (Fulgosi et al., 1998).

The presence of fairly high amounts of the chaperones Hsp70, Cpn60, and Cpn21 (Table 2) on the lumen map (Figure 2B), but not on the peripheral map (Figure 2A), is intriguing. These ATP-dependent chaperones have been demonstrated to be functionally important for protein import across the envelope and for assembly of protein complexes (e.g., Lubeck et al., 1997; Keegstra and Cline, 1999). The MALDI-TOF or Edman sequencing data matched best to the genes for stromal chaperones. We therefore conclude that either (1) the genes for the lumenal chaperones have not been sequenced or (2) the gene products have a dual location due either to dual targeting to the stroma and lumen or to alternative splicing, as observed for stromal and thylakoid ascorbate peroxidases (Mano et al., 1997; Yoshimura et al., 1999). ChloroP predicts alternate stromal cleavage sites for the three chaperones if the program is provided with only the N-terminal sequences truncated before the experimentally determined (Hsp70) or postulated N terminus of the mature protein (Cpn60 and Cpn21; see Table 2; prediction for alternate cleavage site). This often is not the case for other substrates (data not shown). When these alternatively cleaved precursors then are analyzed by SignalP, lumenal cleavage sites are detected within the first 100 amino acid residues of the precursor proteins (Table 2). Thus, if the actual stromal transit peptide is cleaved more toward the N terminus, it might be possible that these chaperones are targeted to the lumen. At this point, note that different stromal-processing peptidases may exist (Su and Boschetti, 1994; Koussevitzky et al., 1998) and that a similar alternative cleavage behavior and dual targeting have been proposed for polyphenol oxidase (Koussevitzky et al., 1998).

A third possible explanation for the concentration of chaperones on the lumen map is that the proteins are bound to the stromal side of the thylakoid membrane. They might be released together with the lumenal proteins upon sonication, just like CF1α and CF1β (Figure 1). It is likely that the stromal chaperones are involved in protein assembly at the thylakoid surface. Schlichter and Soll (1996) searched for lumenal chaperones and protease-treated thylakoids before release of the lumenal content; they found shorter isoforms of the stromal chaperones, based on immunodetection with antisera generated against the stromal chaperones. Their results might be explained by only partial digestion of the stromal chaperones when they are assembled in membrane-bound complexes at the thylakoid surface. Clearly, additional experiments are required to draw any final conclusions.

Consensus Sequences and Prediction of Localization and Cleavage Sites

The neural network program ChloroP was most successful in localizing the newly discovered chloroplast proteins, whereas PSORT assigned 52% of the identified proteins to the chloroplast. What does this mean for the use of these programs to confirm localization of chloroplast proteins? If the proteins indeed are localized in the chloroplast, ChloroP recognizes ∼94% of them, whereas PSORT probably identifies ∼52%. ChloroP was trained with a positive test set of 75 known transit peptide containing chloroplast proteins (excluding lumenal proteins) and with a negative test set (75 proteins from other nonchloroplast localizations; Emanuelsson et al., 1999). When the authors tested 715 Arabidopsis entries in the SWISS-Prot database, they observed that 96% of those annotated in the database as being chloroplast localized were indeed predicted to be in the chloroplast. By contrast, 11% of proteins annotated as nonchloroplast proteins were predicted to be in the chloroplast, which would result in ∼2500 false positives for the full genome. In addition, the program is somewhat biased toward the known chloroplast proteins, as exemplified by the 100% score for the 12 well-known photosynthetic proteins on our maps. The PSORT program is much more ambitious in that it predicts proteins to 17 different locations (three within the chloroplast) in the plant cell. Judging from information available on the PSORT WWW server (http://psort.nibb.ac.jp:8800/), the program has not been as rigorously tested as ChloroP has been (Nakai and Horton, 1999). It is also possible that PSORT is more conservative in its prediction of proteins to be localized in the chloroplast. Testing newly identified proteins by both programs is probably the best option at the moment to get additional support/confirmation for chloroplast localization. 
It has been noted that in many cases, an additional amino acid residue needs to be removed to obtain the mature protein sequence (Richter and Lamppa, 1998; Emanuelsson et al., 1999).
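
The false-positive estimate quoted above for ChloroP can be reproduced with simple arithmetic. The following sketch is illustrative only: the proteome size (25,000 proteins) and the chloroplast fraction (10%) are assumptions chosen for the example and are not values reported in this study; only the 96% recognition rate and the 11% false-positive rate are taken from the text.

```python
def prediction_counts(n_proteins, chloroplast_fraction, sensitivity, false_positive_rate):
    """Return (true positives, false positives) expected from a localization predictor."""
    n_chloroplast = n_proteins * chloroplast_fraction
    n_other = n_proteins - n_chloroplast
    true_pos = n_chloroplast * sensitivity          # chloroplast proteins correctly flagged
    false_pos = n_other * false_positive_rate       # nonchloroplast proteins wrongly flagged
    return true_pos, false_pos


# Assumed proteome size and chloroplast fraction (hypothetical values).
true_pos, false_pos = prediction_counts(
    n_proteins=25_000,
    chloroplast_fraction=0.10,
    sensitivity=0.96,
    false_positive_rate=0.11,
)
share_false = false_pos / (true_pos + false_pos)
print(f"true positives:  {true_pos:.0f}")
print(f"false positives: {false_pos:.0f}")
print(f"fraction of positive predictions that are false: {share_false:.2f}")
```

Under these assumed inputs, the number of false positives is of the same order as the number of true positives, which illustrates why agreement between independent programs is useful supporting evidence.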

The lumenal transit peptides of a set of 26 nonredundant proteins were analyzed by aligning the sequences according to the experimentally determined cleavage sites. As expected, the presequence and cleavage sites had features similar to those of signal peptides in Gram-negative and Gram-positive bacteria. However, several of these features are more pronounced in lumenal transit peptides. These include the presence of the prolines at the end of the hydrophobic domain, the nearly complete conservation (25 out of 26) of the alanine at the −1 position, and the predominance of glutamic acid at the +2 and +4 positions. Aside from the twin arginine motif, the subset of 13 ΔpH/TAT proteins has a less A/L–rich hydrophobic region than the other proteins and a striking preference for a serine or alanine at the −8 position. The Sec avoidance motif (Dalbey and Robinson, 1999; Keegstra and Cline, 1999) does not seem to be a basic residue directly after the hydrophobic domain but is more likely to be the overall hydrophobicity, as was recently discussed for E. coli (Cristobal et al., 1999). A higher hydrophobicity (i.e., more leucines) in the signal peptide is likely to favor targeting via the Sec pathway. Alignment of the signal sequences by the twin arginines suggests a strong preference for a leucine or a methionine residue at the third residue after RR (data not shown). The twin arginines are positioned between −21 and −29 with respect to the lumenal cleavage site. These features taken together should make it possible to predict, with high confidence, lumenal transit peptides of plant proteins on a genome-wide scale. Adaptation of SignalP for chloroplast proteins might make this possible. It is important to note that for these programs to work correctly, the initiating methionine needs to be correctly assigned in the database; we observed several cases (spots 104 and 110) in which the assignment was incorrect, resulting in a negative (incorrect) prediction.
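
The positional features described above can be checked on a candidate precursor with a short script. This is a minimal sketch, not the logoplot analysis used in this study: the function name, the example sequence, and the choice to report the position of the first arginine of the pair are all assumptions made for illustration. Positions are counted relative to the lumenal cleavage site, with −1 being the last residue of the transit peptide.

```python
def twin_arginine_sites(precursor, transit_length, window=(-29, -21)):
    """Find RR pairs inside the transit peptide and describe their context.

    transit_length is the number of residues removed on lumenal processing,
    so the residue at index transit_length - 1 is position -1.
    """
    sites = []
    for i in range(transit_length - 1):
        if precursor[i:i + 2] == "RR":
            position = i - transit_length            # position of the first R
            # Third residue after the RR pair (index of second R plus 3).
            third = precursor[i + 4] if i + 4 < len(precursor) else None
            sites.append({
                "position": position,
                "in_window": window[0] <= position <= window[1],
                "third_after": third,
                "leu_or_met": third in ("L", "M"),
            })
    return sites


# Hypothetical precursor assembled from labeled parts (not a real protein).
n_region = "MASSLQTSLFSPSTS"
h_region = "EVLKGLAAVAAGLSLSPAAFAYA"   # ends with alanine at position -1
mature = "AEEGKSVDLT"
transit = n_region + "RR" + h_region
sites = twin_arginine_sites(transit + mature, len(transit))
for site in sites:
    print(site)
```

A genome-wide screen would need additional criteria (hydrophobicity of the core region, the −1 alanine, and a stromal transit peptide upstream), so this check is only one component of such a prediction.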

Future Perspectives and Conclusions

In this study, we have presented high-resolution two-dimensional electrophoresis maps from thylakoids of higher plants. Forty-five percent of the visible proteins were analyzed, and in total, 61 different proteins were identified. For 18 of those, no corresponding full-length gene could be found, but we expect to identify most of these genes once the sequencing of complete plant genomes is completed. A reverse genetics approach using tagged Arabidopsis mutants is now in progress to identify the function of the newly identified proteins. The two-dimensional electrophoresis maps in this study, their updates, and accession numbers in SWISS-Prot will become available via our website located at http://www.biokemi.su.se/chloroplast.

METHODS

Chemicals and Materials for Two-Dimensional Electrophoresis Analysis

Pharmalyte, pH 3.0 to 10.0, immobilized pH gradient (IPG) buffer 6 to 11, reswelling tray, and equipment for running the IPG gels (Multiphor II and dry strip kit) came from Pharmacia Biotech (Uppsala, Sweden). 3-([3-Cholamidopropyl]dimethylammonio)-1-propane-sulfonate (CHAPS), caprylyl sulfobetaine, Tris-HCl, and Triton X-100 were purchased from Sigma, and tributylphosphine (TBP) was from Fluka (Buchs, Switzerland). Acrylamide was obtained from BDH (Poole, UK) or Bio-Rad. Piperazine diacrylamide (PDA) was obtained from Bio-Rad. Urea, thiourea, and glycine came from Labasco (Stockholm, Sweden). Ammonium persulfate, N,N,N′,N′-tetramethylethylenediamine (TEMED), and Tricine came from Bio-Mol (Hamburg, Germany) and Bio-Rad. DTT was produced by Kodak (Rochester, NY).

Growth of Plants

Pea (Pisum sativum var De Grace) plants were grown for 12 to 14 days in a growth chamber at 25/21°C day/night temperatures with 12 hr of artificial light of ∼100 μmol of photons m−2 sec−1. The first expanded leaves were collected and used for chloroplast isolation.

Isolation and Fractionation of Intact Chloroplasts

Intact chloroplasts were isolated and purified on Percoll gradients according to Cline (1986). For preparation of thylakoid lumen, chloroplasts (equivalent to 20 to 40 mg of chlorophyll) were ruptured by osmotic shock in 50 mM Tris-HCl, pH 8.0, and 5 mM MgCl2 at a chlorophyll concentration of 1 mg mL−1 for 10 min at 4°C. This lysis medium and all solutions used in subsequent steps contained a protease inhibitor cocktail (50 μg mL−1 Pefabloc [Bio-Mol], and 1 μg μL−1 antipain, leupeptin, and phosphoramidon). Thylakoids were recovered by centrifugation at 10,000g for 10 min at 4°C, washed three times with 10 mM Tris-HCl, pH 8.0, and resuspended in 10 mM Tris-HCl, pH 8.0, and 5 mM MgCl2, at a chlorophyll concentration of 0.5 mg mL−1. To liberate the soluble lumenal proteins, we then sonicated the thylakoids 10 times for 30 sec each at 4°C (power 10.0; Misonix Inc., Farmingdale, NY). The thylakoid membranes were separated from the soluble lumenal proteins by centrifugation for 1 hr at 145,000g at 4°C. The clear supernatant containing the lumenal proteins was concentrated in an Amicon (Beverly, MA) cell (3-kD cut-off filter) to a protein concentration of 5 to 7 mg mL−1.

For isolation of the peripheral thylakoid proteins, the sonicated and pelleted thylakoid membranes were washed once with 10 mM Tris-HCl, pH 8.0, centrifuged at 145,000g for 30 min at 4°C, and resuspended in 25 mM Mes, pH 6.5, and 0.5 M CaCl2 at a chlorophyll concentration of 0.1 mg mL−1. This thylakoid suspension was gently stirred for 30 min at 4°C and centrifuged for 1 hr at 220,000g at 4°C to separate extracted peripheral thylakoid proteins from the remaining thylakoid membranes. The clear supernatant, containing the peripheral proteins, was concentrated to 12 to 15 mg protein mL−1 in an Amicon cell (3-kD cut-off filter) while reverting to 10 mM Tris-HCl, pH 8.0. The yield of lumenal and peripheral proteins was 0.01 to 0.03 and 0.02 to 0.06 mg protein mg−1 chlorophyll of isolated intact chloroplasts, respectively.

Two-Dimensional Gel Electrophoresis

Two solutions, A and B, were used to solubilize the samples for isoelectric focusing. Solution A contained 9 M urea, 4% CHAPS, 2 mM TBP, and 2% Pharmalyte, pH 3.0 to 10.0 (for the pH 4.0 to 7.0 two-dimensional electrophoresis map), or 2% IPG buffer 6.0 to 11.0 (for the pH 7.0 to 11.0 two-dimensional electrophoresis map), and 0.5% Triton X-100. Solution B contained 5 M urea, 2 M thiourea, 2 mM TBP, 2% CHAPS, 2% SB 3-10, 0.5% Triton X-100, and 2% Pharmalyte, pH 3.0 to 10.0 (pH 4.0 to 7.0 map) or 2% IPG buffer 6.0 to 11.0 (pH 7.0 to 11.0 map) (Rabilloud, 1998). For the analytical and preparative gels, individual 13-cm IPG strips (pH 4.0 to 7.0 or 6.0 to 11.0) were rehydrated overnight with 250 μL of protein sample in solution A (for lumen) or B (for peripheral) in a reswelling tray at room temperature. The isoelectric focusing was conducted at 18°C by using a Pharmacia Multiphor II with a DryStrip kit and a Pharmacia 3500XL power supply, following the running conditions in Rouquié et al. (1997). Isoelectric focusing strips were focused for ∼80 kVhr.

The focused strips were equilibrated in a solution containing 6 M urea, 30% glycerol, 50 mM Tris-HCl, pH 6.8, 5 mM TBP, and 2% SDS (w/v) for 20 min (Rabilloud et al., 1997; Herbert et al., 1998). Separation in the second dimension was conducted at room temperature on gradient Tricine-SDS gels (8 to 16% acrylamide) (Schägger and von Jagow, 1987). After equilibration, IPGs were embedded in an agarose solution at the top of the Tricine-SDS gel as described by Rabilloud et al. (1994a). The protein spots in the analytical gels were visualized by staining with silver nitrate (pH 4.0 to 7.0 map; Rabilloud et al., 1994b) or silver ammonia (pH 7.0 to 11.0 map; Hochstrasser and Merril, 1988). Preparative gels were stained with Coomassie Brilliant Blue R 250. The pI and molecular mass scales of the two-dimensional electrophoresis maps were internally calibrated by mixing carbamylated standards (Pharmacia Biotech) with the lumenal and peripheral samples before two-dimensional electrophoresis analysis. For external calibrations, molecular mass markers were loaded onto the second dimension.

Image Analysis of Two-Dimensional Electrophoresis Gels

After staining, gels were scanned using a flatbed scanner, and the data were analyzed using Melanie II software (Bio-Rad). After selecting so-called landmarks and the assignment of all features, two-dimensional electrophoresis images were aligned and matched.

Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometry and Electrospray Tandem Mass Spectrometry

Coomassie Brilliant Blue R 250–stained protein spots were excised from the gel and prepared for mass spectrometry (MS) analysis (Edvardsson et al., 1999). The peptide extract (1 μL) from each tryptic digest was crystallized in 0.5 μL of matrix solution (α-cyano-4-hydroxycinnamic acid in methanol; Hewlett-Packard, Böblingen, Germany) on the matrix-assisted laser desorption/ionization–time of flight (MALDI-TOF) target plate. Molecular mass information of the peptides was obtained by using a MALDI-TOF mass spectrometer, equipped with a nitrogen laser and operating in reflector/delayed extraction mode (Voyager-DE-STR; Perseptive Biosystems Inc.). All MALDI-TOF spectra were internally calibrated using either trypsin autodigestion peptides (842.51 D and 2211.11 D) or ACTH (18 to 39) and bradykinin.
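
The idea behind internal calibration with two reference peaks can be illustrated with a two-point linear correction. This is a simplified sketch and not the calibration routine of the instrument software, which operates on flight times; the observed peak values below are hypothetical, and only the two reference masses of the trypsin autodigestion peptides come from the text.

```python
def two_point_calibration(obs_a, obs_b, ref_a, ref_b):
    """Return a function mapping observed masses onto a scale fixed by two calibrants."""
    slope = (ref_b - ref_a) / (obs_b - obs_a)
    intercept = ref_a - slope * obs_a
    return lambda mass: slope * mass + intercept


# Reference masses of the trypsin autodigestion peptides (from the text).
REF_LOW, REF_HIGH = 842.51, 2211.11

# Hypothetical observed positions of the same two peaks in one spectrum.
obs_low, obs_high = 842.55, 2211.20

calibrate = two_point_calibration(obs_low, obs_high, REF_LOW, REF_HIGH)

# Apply the correction to another (hypothetical) peak from the same spectrum.
corrected = calibrate(1500.00)
print(f"corrected mass: {corrected:.3f}")
```

Because both calibrants are present in the same spectrum as the sample peptides, the correction accounts for spot-to-spot drift, which is what makes a narrow search tolerance in the later database step realistic.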

To obtain sequence information by electrospray ionization tandem MS (ESI-MS/MS) (Q-TOF, Micromass, and SCIEX API-365; Perkin-Elmer), we purified the remainder of each peptide extract by using Poros 50 R2 beads (Perseptive Biosystems), as described in Gobom et al. (1998) and Edvardsson et al. (1999). The peptides were eluted from the Poros beads with 8 μL of 50% (v/v) methanol and 5% (v/v) formic acid, and the solution was loaded into a nanoelectrospray needle (Au/Pd-coated glass capillaries; Protana A/S, Odense, Denmark). The instrument was calibrated with polypropylene glycol, according to the manufacturer's specifications.

Protein Gel Blotting, Edman Sequencing, and Antisera

For N-terminal Edman sequencing, Coomassie Brilliant Blue–stained gels were equilibrated in 100 mM boric acid, pH 8.5 (NaOH), and 0.12% SDS twice for ∼1 hr, principally according to Bauw et al. (1987, 1989). Protein spots then were electroblotted (16 hr at 20 V) onto polyvinylidene difluoride membrane (0.2 μm from Bio-Rad), by using 50 mM Tris-HCl and 50 mM boric acid, pH 8.5, plus 0.01% SDS as transfer buffer. After matching the protein patterns with the reference gels by computer-aided image analysis, we excised the spots from the dried blots and stored them at −20°C for Edman sequencing. To analyze the purity of the thylakoid preparations, we conducted a protein gel blot analysis with different polyclonal antisera, according to standard procedures, using chemiluminescence for detection.

Database Searching

For the more abundant spots (>25 kD), usually >25 peptide masses were obtained by MALDI-TOF, and a very good coverage of the full-length proteins was typically found (30 to 60%; Table 1) within the specified 50-ppm mass accuracy. The more abundant ions in the MALDI spectra were used directly for database searches using the software MS-Fit, developed at the University of California at San Francisco MS Facility (http://prospector.ucsf.edu), to match known proteins or translated open reading frames in databases at the National Center for Biotechnology Information (NCBI) and SWISS-Prot. Database searches with MS-Fit were set up and performed on the basis of accumulating experience as well as suggestions from Parker et al. (1998) and were performed as follows: (1) In the first round of database searching, the maximal molecular mass was restricted to 120 kD to avoid hits of polyproteins or very large proteins, and no miscleavage was allowed. Mass accuracy was set at 15 ppm (thus, for a 1-kD peptide, the maximum allowed difference between the measured and theoretical peptide masses was defined as 0.015 D), and minimally four matching peptides were required. Oxidation of methionines was allowed, and cysteines could be modified by carbamidomethylation. Contamination (i.e., from keratin, the matrix, and/or the instrument), trypsin fragments, and systematic reoccurring ions were removed from the data set. No strict molecular mass and pI filters were applied to find possible breakdown products or unexpected splicing or incorrect annotations and to account for the expected variable lengths of the presequences. (2) If a plant protein was found, the accuracy was set at 50 ppm and one missed cleavage was allowed. This step was added to permit the identification of all possible peptide masses derived from an individual protein. (3) In the third round, we investigated whether the gel spot contained a second protein, after eliminating all peptide masses matching to the protein identified in the first and second rounds. If at least eight peptides remained, another search with MS-Fit was conducted.
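
The mass-tolerance criterion used in the first two search rounds can be expressed compactly. The sketch below only illustrates the ppm arithmetic and the minimum-peptide rule; it is not the MS-Fit algorithm, and the peptide masses are hypothetical values chosen for the example.

```python
def within_ppm(measured, theoretical, ppm):
    """True if the measured mass lies within the ppm tolerance of the theoretical mass."""
    return abs(measured - theoretical) <= theoretical * ppm * 1e-6


def count_matches(measured_masses, theoretical_masses, ppm):
    """Count measured masses that match at least one theoretical peptide mass."""
    return sum(
        any(within_ppm(m, t, ppm) for t in theoretical_masses)
        for m in measured_masses
    )


MIN_MATCHING_PEPTIDES = 4   # requirement stated for the first round

# Hypothetical theoretical digest and measured peak list (masses in D).
theoretical = [842.510, 1000.000, 1479.795, 1838.915, 2211.105]
measured = [842.545, 1000.012, 1300.700, 1479.810, 1838.930, 2211.120]

round1 = count_matches(measured, theoretical, ppm=15)   # strict first round
round2 = count_matches(measured, theoretical, ppm=50)   # relaxed second round
print(f"round 1 (15 ppm): {round1} matches, accepted: {round1 >= MIN_MATCHING_PEPTIDES}")
print(f"round 2 (50 ppm): {round2} matches")
```

At 15 ppm, a 1-kD peptide may deviate by at most 0.015 D, as stated above; widening the tolerance to 50 ppm in the second round recovers additional peptides from a protein that has already been identified under the stricter setting.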

A number of protein spots with uncertain identities were selected and analyzed by nano-ESI-MS/MS to yield fragment ion tag data. Searches with MS-Tag (http://prospector.ucsf.edu) were performed in nonerror mode, using the following values: all species; protein molecular mass range of 5 to 250 kD; precursor ion mass tolerance of 1 D; allowed fragment ion types of a, b, y, a-NH3, b-NH3, y-NH3, b-H2O, and internal ions; and trypsin digest (only one missed cleavage allowed). Alternatively, sequence tags were interpreted from the ESI-MS/MS spectra manually and were used in FASTA to search through the protein and genome databases (PIR, Atdb, SWISS-Prot, and NCBI). Database searching with N-terminal sequence tags from Edman degradation sequencing was also performed with FASTA. Analysis of hypothetical proteins was conducted using software at servers accessible on the Internet (Blast, Pfam, Prosite, Blocks, Prints, Prodom, and Proclass).

Predictions for chloroplast localization and chloroplast and lumenal transit peptides were made using the software programs PSORT (http://psort.nibb.ac.jp:8800/), ChloroP (http://www.cbs.dtu.dk/services/ChloroP/), and SignalP (http://www.cbs.dtu.dk/services/SignalP/).

Miscellaneous

Protein determination was conducted according to Bradford (1976). Chlorophyll concentrations were spectroscopically determined in 80% acetone (Porra et al., 1989).

Acknowledgments

Mass spectrometry was conducted at the Department of Bioanalytical Chemistry at AstraZeneca R&D Mölndal (MALDI-TOF and ESI-MS/MS) and at the Department of Molecular Biology, Odense University (ESI-MS/MS). We thank Ann-Christine Nyström for performing the MS/MS analysis at AstraZeneca and Helena Brockenhuus von Löwenhielm for her advice regarding MALDI-TOF analysis. Dr. Per-Ingvar Ohlsson at Umeå University is gratefully acknowledged for his excellent Edman sequencing analysis. We thank Drs. Thierry Rabilloud and Véronique Santoni for their advice and stimulating discussions, Jimmy Ytterberg for critically reading the manuscript, and Olof Emanuelsson and Jacob Halaska for their help with the logoplot analysis and discussions. M.-Amin Bakali Haraiki is acknowledged for his help with database searching and organizing the MS data in the initial stage of this study.

This study was supported by a postdoctoral fellowship to J.-B.P. from the Wenner-Grenska Samfundet; by the Nordisk Kontaktorgan för Jordsbrukforskning and the Swedish Foundation for Strategic Research (SSF), which provided general support to K.J.v.W.; and by the Swedish National Research Council, which provided financial support for the purchase of two-dimensional electrophoresis equipment to K.J.v.W. P.R. and D.E.K. are members of the Center for Experimental Bioinformatics, which is sponsored by the Danish National Research Foundation, and D.E.K. was also supported by a grant from the Brazilian Postgraduate Federal Agency.

References

Adam, Z. (1996). Protein stability and degradation in chloroplasts. Plant Mol. Biol. 32, 773–783.
Bauw, G., De Loose, M., Inzé, D., van Montagu, M., and Vandekerckhove, J. (1987). Alterations in the phenotype of plant cells studied by NH2-terminal amino acid–sequence analysis of proteins electroblotted from two-dimensional gel-separated total extracts. Proc. Natl. Acad. Sci. USA 84, 4806–4810.
Bauw, G., Van Damme, J., Puype, M., Vandekerckhove, J., Gesser, B., Ratz, G.P., Lauridsen, J.B., and Celis, J.E. (1989). Protein-electroblotting and microsequencing strategies in generating protein data bases from two-dimensional gels. Proc. Natl. Acad. Sci. USA 86, 7701–7705.
Belanger, F.C., Leustek, T., Chu, B., and Kriz, A.L. (1995). Evidence for the thiamine biosynthetic pathway in higher-plant plastids and its developmental regulation. Plant Mol. Biol. 29, 809–821.
Bevan, M., et al. (1998). Analysis of 1.9 Mb of contiguous sequence from chromosome 4 of Arabidopsis thaliana. Nature 391, 485–488.
Bouchez, D., and Hofte, H. (1998). Functional genomics in plants. Plant Physiol. 118, 725–732.
Bradford, M.M. (1976). A rapid and sensitive method for the quantitation of microgram quantities of protein utilizing the principle of protein–dye binding. Anal. Biochem. 72, 248–254.
Burlingame, A.L., Boyd, R.K., and Gaskell, S. (1998). Mass spectrometry. Anal. Chem. 70, 647R–716R.
Chapman, J.R. (1996). Proteins and Peptide Analysis by Mass Spectrometry. Methods in Molecular Biology. (Totowa, NJ: Humana Press).
Cline, K. (1986). Import of proteins into chloroplasts. Membrane integration of a thylakoid precursor protein reconstituted in chloroplast lysates. J. Biol. Chem. 261, 14804–14810.
Cristobal, S., de Gier, J.W., Nielsen, H., and von Heijne, G. (1999). Competition between Sec- and TAT-dependent protein translocation in Escherichia coli. EMBO J. 18, 2982–2990.
Dainese, P., Staudenmann, W., Quadroni, M., Korostensky, C., Gonnet, G., Kertesz, M., and James, P. (1997). Probing protein function using a combination of gene knockout and proteome analysis by mass spectrometry. Electrophoresis 18, 432–442.
Dalbey, R.E., and Robinson, C. (1999). Protein translocation into and across the bacterial plasma membrane and the plant thylakoid membrane. Trends Biochem. Sci. 24, 17–22.
Doremus, H.D. (1986). Organization of the pathway of de novo pyrimidine nucleotide biosynthesis in pea (Pisum sativum L. cv Progress No. 9) leaves. Arch. Biochem. Biophys. 250, 112–119.
Edvardsson, U., Alexandersson, M., Brockenhuus von Löwenhielm, H., Nystrom, A.C., Ljung, B., Nilsson, F., and Dahllof, B. (1999). A proteome analysis of livers from obese (ob/ob) mice treated with the peroxisome proliferator WY14,643. Electrophoresis 20, 935–942.
Emanuelsson, O., Nielsen, H., and von Heijne, G. (1999). ChloroP, a neural network-based method for predicting chloroplast transit peptides and their cleavage sites. Protein Sci. 8, 978–984.
Essigmann, B., Guler, S., Narang, R.A., Linke, D., and Benning, C. (1998). Phosphate availability affects the thylakoid lipid composition and the expression of SQD1, a gene required for sulfolipid biosynthesis in Arabidopsis thaliana. Proc. Natl. Acad. Sci. USA 95, 1950–1955.
Fulgosi, H., Vener, A.V., Altschmied, L., Herrmann, R.G., and Andersson, B. (1998). A novel multi-functional chloroplast protein: Identification of a 40-kDa immunophilin-like protein located in the thylakoid lumen. EMBO J. 17, 1577–1587.
Gallagher, T.F., and Ellis, R.J. (1982). Light-stimulated transcription of genes for two chloroplast polypeptides in isolated pea nuclei. EMBO J. 1, 1493–1498.
Gaymard, F., Boucherez, J., and Briat, J.F. (1996). Characterization of a ferritin mRNA from Arabidopsis thaliana accumulated in response to iron through an oxidative pathway independent of abscisic acid. Biochem. J. 318, 67–73.
Gobom, J., Nordhoff, E., Mirgorodskaya, E., Ekman, R., and Roepstorff, P. (1998). A sample purification and preparation technique based on nano-scale RP-columns for the sensitive analysis of complex peptide mixtures by MALDI-MS. J. Mass Spectrom. 34, 105–116.
Görg, A., Postel, W., and Gunther, S. (1988). The current state of two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis 9, 531–546.
Gozzer, C., Zanetti, G., Galliano, M., Sacchi, G.A., Minchiotti, L., and Curti, B. (1997). Molecular heterogeneity of ferredoxin–NADP+ reductase from spinach leaves. Biochim. Biophys. Acta 485, 278–290.
Grimm, R., Grimm, M., Eckerskorn, C., Pohlmeyer, K., Rohl, T., and Soll, J. (1997). Postimport methylation of the small subunit of ribulose-1,5-bisphosphate carboxylase in chloroplasts. FEBS Lett. 408, 350–354.
Hattorie, T., and Margulies, M.M. (1986). Synthesis of large subunit of ribulose bisphosphate carboxylase by thylakoid-bound polyribosomes from spinach chloroplasts. Arch. Biochem. Biophys. 244, 630–640.
Heldt, H.-W. (1997). Plant Biochemistry and Molecular Biology. (Oxford, UK: Oxford University Press).
Herbert, B. (1999). Advances in protein solubilisation for two-dimensional electrophoresis. Electrophoresis 20, 660–663.
Herbert, B.R., Molloy, M.P., Gooley, A.A., Walsh, B.J., Bryson, W.G., and Williams, K.L. (1998). Improved protein solubility in two-dimensional electrophoresis using tributyl phosphine as reducing agent. Electrophoresis 19, 845–851.
Ho, C.L., Noji, M., and Saito, K. (1999). Plastidic pathway of serine biosynthesis. Molecular cloning and expression of 3-phosphoserine phosphatase from Arabidopsis thaliana. J. Biol. Chem. 274, 11007–11012.
Hochstrasser, D.F., and Merril, C.R. (1988). Catalyst for polyacrylamide gel polymerization and detection of proteins by silver staining. Appl. Theor. Electrophor. 1, 35–40.
Hua, S., Dube, S.K., Barnett, N.M., and Kung, S.D. (1992). Photosystem II 23 kDa polypeptide of oxygen-evolving complex is encoded by a multigene family in tobacco. Plant Mol. Biol. 18, 997–999.
Jagendorf, A., and Michaels, A. (1990). Rough thylakoids: Translation on photosynthetic membranes. Plant Sci. 71, 137–145.
Keegstra, K., and Cline, K. (1999). Protein import and routing systems of chloroplasts. Plant Cell 11, 557–570.
Keller, Y., Bouvier, F., d'Harlingue, A., and Camara, B. (1998). Metabolic compartmentation of plastid prenyllipid biosynthesis—Evidence for the involvement of a multifunctional geranylgeranyl reductase. Eur. J. Biochem. 251, 413–417.
Kessler, F., Schnell, D., and Blobel, G. (1999). Identification of proteins associated with plastoglobules isolated from pea (Pisum sativum L.) chloroplasts. Planta 208, 107–113.
Kieselbach, T., Hagman, Å., Andersson, B., and Schröder, W.P. (1998). The thylakoid lumen of chloroplasts. Isolation and characterization. J. Biol. Chem. 273, 6710–6716.
Koussevitzky, S., Ne'eman, E., Sommer, A., Steffens, J.C., and Harel, E. (1998). Purification and properties of a novel chloroplast stromal peptidase. Processing of polyphenol oxidase and other imported precursors. J. Biol. Chem. 273, 27064–27069.
Kuster, B., and Mann, M. (1998). Identifying proteins and post-translational modifications by mass spectrometry. Curr. Opin. Struct. Biol. 8, 393–400.
Lange, T. (1998). Molecular biology of gibberellin synthesis. Planta 204, 409–419.
Li, H.M., Sullivan, T.D., and Keegstra, K. (1992). Information for targeting to the chloroplast inner envelope membrane is contained in the mature region of the maize Bt1-encoded protein. J. Biol. Chem. 267, 18999–19004.
Liu, J.W., and Rose, R.J. (1992). The spinach chloroplast chromosome is bound to the thylakoid membrane in the region of the inverted repeat. Biochem. Biophys. Res. Commun. 184, 993–1000.
Lobreaux, S., and Briat, J.F. (1991). Ferritin accumulation and degradation in different organs of pea (Pisum sativum) during development. Biochem. J. 274, 601–606.
Lubeck, J., Heins, L., and Soll, J. (1997). Protein import into chloroplasts. Physiol. Plant. 100, 53–64.
Maione, T.E., and Jagendorf, A.T. (1984). Partial deglycosylation of chloroplast coupling factor 1 (CF1) prevents the reconstitution of photophosphorylation. Proc. Natl. Acad. Sci. USA 81, 3733–3736.
Mano, S., Yamaguchi, K., Hayashi, M., and Nishimura, M. (1997). Stromal and thylakoid-bound ascorbate peroxidases are produced by alternative splicing in pumpkin. FEBS Lett. 413, 21–26.
Mant, A., Kieselbach, T., Schröder, W.P., and Robinson, C. (1999). Characterisation of an Arabidopsis thaliana cDNA encoding a novel thylakoid lumen protein imported by the Δ pH-dependent pathway. Planta 207, 624–627.
Marin, E., Nussaume, L., Quesada, A., Gonneau, M., Sotta, B., Hugueney, P., Frey, A., and Marion-Poll, A. (1996). Molecular identification of zeaxanthin epoxidase of Nicotiana plumbaginifolia, a gene involved in abscisic acid biosynthesis and corresponding to the ABA locus of Arabidopsis thaliana. EMBO J. 15, 2331–2342.
Mattoo, A.K., and Edelman, M. (1987). Intramembrane translocation and post-translational palmitoylation of the chloroplast 32-kDa herbicide-binding protein. Proc. Natl. Acad. Sci. USA 84, 1497–1501.
McLafferty, F.W., Fridriksson, E.K., Horn, D.M., Lewis, M.A., and Zubarev, R.A. (1999). Biomolecule mass spectrometry. Science 284, 1289–1290.
Meinke, D.W., Cherry, J.M., Dean, C., Rounsley, S.D., and Koornneef, M. (1998). Arabidopsis thaliana: A model plant for genome analysis. Science 282 679–682. [PubMed] [Google Scholar]
Meurer, J., Plucken, H., Kowallik, K.V., and Westhoff, P. (1998). A nuclear-encoded protein of prokaryotic origin is essential for the stability of photosystem II in Arabidopsis thaliana. EMBO J. 17 5286–5297. [PMC free article] [PubMed] [Google Scholar]
Miquel, M., and Browse, J. (1992). Arabidopsis mutants deficient in polyunsaturated fatty acid synthesis. Biochemical and genetic characterization of a plant oleoyl-phosphatidylcholine desaturase. J. Biol. Chem. 267 1502–1509. [PubMed] [Google Scholar]
Molloy, M.P., Herbert, B.R., Walsh, B.J., Tyler, M.I., Traini, M., Sanchez, J.C., Hochstrasser, D.F., Williams, K.L., and Gooley, A.A. (1998). Extraction of membrane proteins by differential solubilization for separation using two-dimensional gel electrophoresis. Electrophoresis 19 837–844. [PubMed] [Google Scholar]
Mori, H., Summer, E.J., Ma, X., and Cline, K. (1999). Component specificity for the thylakoidal sec and delta pH-dependent protein transport pathways. J. Cell Biol. 146 45–56. [PMC free article] [PubMed] [Google Scholar]
Nakai, K., and Horton, P. (1999). PSORT: A program for detecting sorting signals in proteins and predicting their subcellular localization. Trends Biochem. Sci. 24 34–36. [PubMed] [Google Scholar]
Nielsen, H. (1999). From Sequence to Sorting. Prediction of Signal Peptides. PhD Dissertation (Stockholm, Sweden: Stockholm University).
Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997). Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10 1–6. [PubMed] [Google Scholar]
Ort, D.R., and Yocum, C.F. (1996). Oxygenic Photosynthesis: The Light Reactions. Advances in Photosynthesis, Vol. 4. (Dordrecht, The Netherlands: Kluwer Academic Publishers).
Parker, K.C., Garrels, J.I., Hines, W., Butler, E.M., McKee, A.H., Patterson, D., and Martin, S. (1998). Identification of yeast proteins from two-dimensional gels: Working out spot cross-contamination. Electrophoresis 19 1920–1932. [PubMed] [Google Scholar]
Porra, R.J., Thompson, W.A., and Kriedemann, P.E. (1989). Determination of accurate extinction coefficients and simultaneous equations for assaying chlorophylls a and b extracted with four different solvents: Verification of the concentration of chlorophyll standards by atomic absorption spectroscopy. Biochim. Biophys. Acta 975 384–394. [Google Scholar]
Pozueta-Romero, J., Rafia, F., Houlne, G., Cheniclet, C., Carde, J.P., Schantz, M.L., and Schantz, R. (1997). A ubiquitous plant housekeeping gene, PAP, encodes a major protein component of bell pepper chromoplasts. Plant Physiol. 115 1185–1194. [PMC free article] [PubMed] [Google Scholar]
Rabilloud, T. (1998). Use of thiourea to increase the solubility of membrane proteins in two-dimensional electrophoresis. Electrophoresis 19 758–760. [PubMed] [Google Scholar]
Rabilloud, T., Valette, C., and Lawrence, J.-J. (1994a). Sample application by in-gel rehydration improves the resolution of two-dimensional electrophoresis with immobilized pH gradients in the first dimension. Electrophoresis 15 1552–1558. [PubMed] [Google Scholar]
Rabilloud, T., Vuillard, L., Gilly, C., and Lawrence, J.-J. (1994b). Silver-staining of proteins in polyacrylamide gels: A general overview. Cell. Mol. Biol. 40 57–75. [PubMed] [Google Scholar]
Rabilloud, T., Adessi, C., Giraudel, A., and Lunardi, J. (1997). Improvement of the solubilization of proteins in two-dimensional electrophoresis with immobilized pH gradients. Electrophoresis 18 307–316. [PMC free article] [PubMed] [Google Scholar]
Richter, S., and Lamppa, G.K. (1998). A chloroplast processing enzyme functions as the general stromal processing peptidase. Proc. Natl. Acad. Sci. USA 95 7463–7468. [PMC free article] [PubMed] [Google Scholar]
Roepstorff, P. (1997). Mass spectrometry in protein studies: From genome to function. Curr. Opin. Biotechnol. 8 6–13. [PubMed] [Google Scholar]
Rouquié, D., Peltier, J.B., Marquis-Mansion, M., Tournaire, C., Doumas, P., and Rossignol, M. (1997). Construction of a directory of tobacco plasma membrane proteins by combined two-dimensional gel electrophoresis and protein sequencing. Electrophoresis 18 654–660. [PubMed] [Google Scholar]
Sato, N., Albrieux, C., Joyard, J., Douce, R., and Kuroiwa, T. (1993). Detection and characterization of a plastid envelope DNA-binding protein which may anchor plastid nucleoids. EMBO J. 12 555–561. [PMC free article] [PubMed] [Google Scholar]
Schägger, H., and von Jagow, G. (1987). Tricine–sodium dodecyl sulfate–polyacrylamide gel electrophoresis for the separation of proteins in the range from 1 to 100 kDa. Anal. Biochem. 166 368–379. [PubMed] [Google Scholar]
Schlichter, T., and Soll, J. (1996). Molecular chaperones are present in the thylakoid lumen of pea chloroplasts. FEBS Lett. 379 302–304. [PubMed] [Google Scholar]
Schneider, G., and Stephens, R.M. (1990). Sequence logos: A new way to display consensus sequences. Nucleic Acids Res. 18 6097–6100. [PMC free article] [PubMed] [Google Scholar]
Settles, A.M., and Martienssen, R. (1998). Old and new pathways of protein export in chloroplasts and bacteria. Trends Cell Biol. 8 494–501. [PubMed] [Google Scholar]
Shackleton, J.B., and Robinson, C. (1991). Transport of proteins into chloroplasts. The thylakoidal processing peptidase is a signal-type peptidase with stringent substrate requirements at the −3 and −1 positions. J. Biol. Chem. 266 12152–12156. [PubMed] [Google Scholar]
Shevchenko, A., Jensen, O.N., Podtelejnikov, A.V., Sagliocco, F., Wilm, M., Vorm, O., Mortensen, P., Shevchenko, A., Boucherie, H., and Mann, M. (1996). Linking genome and proteome by mass spectrometry: Large-scale identification of yeast proteins from two-dimensional gels. Proc. Natl. Acad. Sci. USA 93 14440–14445. [PMC free article] [PubMed] [Google Scholar]
Smith, H.B., Larimer, F.W., and Hartman, F.C. (1988). Subtle alteration of the active site of ribulose bisphosphate carboxylase/oxygenase by concerted site-directed mutagenesis and chemical modification. Biochem. Biophys. Res. Commun. 152 579–584. [PubMed] [Google Scholar]
Smith, P.M., Mann, A.J., Goggin, D.E., and Atkins, C.A. (1998). AIR synthetase in cowpea nodules: A single gene product targeted to two organelles? Plant Mol. Biol. 36 811–820. [PubMed] [Google Scholar]
Snyders, S., and Kohorn, B.D. (1999). TAKs, thylakoid membrane protein kinases associated with energy transduction. J. Biol. Chem. 274 9137–9140. [PubMed] [Google Scholar]
Su, Q., and Boschetti, A. (1994). Substrate- and species-specific processing enzymes for chloroplast precursor proteins. Biochem. J. 300 787–792. [PMC free article] [PubMed] [Google Scholar]
Sugita, M., and Sugiura, M. (1996). Regulation of gene expression in chloroplasts of higher plants. Plant Mol. Biol. 32 315–326. [PubMed] [Google Scholar]
Sullivan, T.D., and Kaneko, Y. (1995). The maize brittle1 gene encodes amyloplast membrane polypeptides. Planta 196 477–484. [PubMed] [Google Scholar]
Sullivan, T.D., Strelow, L.I., Illingworth, C.A., Phillips, R.L., and Nelson, O.E., Jr. (1991). Analysis of maize brittle-1 alleles and a defective Suppressor-mutator–induced mutable allele. Plant Cell 3 1337–1348. [PMC free article] [PubMed] [Google Scholar]
Tsugita, A., Kamo, M., Kawakami, T., and Ohki, Y. (1996). Two-dimensional electrophoresis of plant proteins and standardization of gel patterns. Electrophoresis 17 855–865. [PubMed] [Google Scholar]
Vener, A.V., Ohad, I., and Andersson, B. (1998). Protein phosphorylation and redox sensing in chloroplast thylakoids. Curr. Opin. Plant Biol. 1 217–223. [PubMed] [Google Scholar]
Waldo, G.S., Wright, E., Whang, Z.H., Briat, J.F., Theil, E.C., and Sayers, D.E. (1995). Formation of the ferritin iron mineral occurs in plastids. Plant Physiol. 109 797–802. [PMC free article] [PubMed] [Google Scholar]
Wales, R., Newman, B.J., Pappin, D., and Gray, J.C. (1989). The extrinsic 33 kDa polypeptide of the oxygen-evolving complex of photosystem II is a putative calcium-binding protein and is encoded by a multi-gene family in pea. Plant Mol. Biol. 12 439–451. [PubMed] [Google Scholar]
Whitelegge, J.P., Gundersen, C.B., and Faull, K.F. (1998). Electrospray-ionization mass spectrometry of intact intrinsic membrane proteins. Protein Sci. 7 1423–1430. [PMC free article] [PubMed] [Google Scholar]
Wilkins, M.R., Gasteiger, E., Gooley, A.A., Herbert, B.R., Molloy, M.P., Binz, P.A., Ou, K., Sanchez, J.C., Bairoch, A., Williams, K.L., and Hochstrasser, D.F. (1999). High-throughput mass spectrometric discovery of protein post-translational modifications. J. Mol. Biol. 289 645–657. [PubMed] [Google Scholar]
Wollman, F.A., Minai, L., and Nechushtai, R. (1999). The biogenesis and assembly of photosynthetic proteins in thylakoid membranes. Biochim. Biophys. Acta 1411 21–85. [PubMed] [Google Scholar]
Yates III, J.R. (1998). Mass spectrometry and the age of the proteome. J. Mass Spectrom. 33 1–19. [PubMed] [Google Scholar]
Yoshimura, K., Yabuta, Y., Tamoi, M., Ishikawa, T., and Shigeoka, S. (1999). Alternatively spliced mRNA variants of chloroplast ascorbate peroxidase isoenzymes in spinach leaves. Biochem. J. 338 41–48. [PMC free article] [PubMed] [Google Scholar]
Yurina, N.P., Belkina, G.G., Karapetyan, N.V., and Odintsova, M.S. (1995). Nucleoids of pea chloroplasts: Microscopic and chemical characterization. Occurrence of histone-like proteins. Biochem. Mol. Biol. Int. 36 145–154. [PubMed] [Google Scholar]
Zheleva, D., Sharma, J., Panico, M., Morris, H.R., and Barber, J. (1998). Isolation and characterization of monomeric and dimeric CP47-reaction center photosystem II complexes. J. Biol. Chem. 273 16122–16127. [PubMed] [Google Scholar]

Articles from The Plant Cell are provided here courtesy of Oxford University Press