Chromosomal DNA Deletions Explain Phenotypic Characteristics of Two Antigenic Variants, Phase II and RSA 514 (Crazy), of the Coxiella burnetii Nine Mile Strain†

Abstract

After repeated passages through embryonated eggs, the Nine Mile strain of Coxiella burnetii exhibits antigenic variation, a loss of virulence characteristics, and transition to a truncated lipopolysaccharide (LPS) structure. In two independently derived strains, Nine Mile phase II and RSA 514, these phenotypic changes were accompanied by a large chromosomal deletion (M. H. Vodkin and J. C. Williams, J. Gen. Microbiol. **132:**2587-2594, 1986). In the work reported here, additional screening of a cosmid bank prepared from the wild-type strain was used to map the deletion termini of both mutant strains and to accumulate all the segments of DNA that comprise the two deletions. The corresponding DNAs were then sequenced and annotated. The Nine Mile phase II deletion was completely nested within the deletion of the RSA 514 strain. Basic alignment and homology studies indicated that a large group of LPS biosynthetic genes, arranged in an apparent O-antigen cluster, was deleted in both variants. Database homologies identified, in particular, mannose pathway genes and genes encoding sugar methylases and nucleotide sugar epimerase-dehydratase proteins. Candidate genes for addition of sugar units to the core oligosaccharide for synthesis of the rare sugar 6-deoxy-3-_C_-methylgulose (virenose) were identified in the deleted region. Repeats, redundancies, paralogous genes, and two regions with reduced G+C contents were found within the deletions.


Coxiella burnetii is an obligately parasitic bacterium that replicates within phagolysosomes of eukaryotic hosts (1, 16, 27). This organism is the causative agent of Q fever in humans. It is endemic in many species of domestic animals, especially sheep, goats, and cattle. C. burnetii is extremely infectious; inhalation of one organism is enough to initiate an acute illness in a guinea pig (32). In humans, the disease usually is a moderate to severe flu-like illness with headache, fever and chills, malaise, myalgia, and anorexia or an atypical pneumonia with a dry cough (26). Recrudescence and chronic illness, usually in the form of hepatitis or intractable endocarditis, are the most serious manifestations of the disease (24, 25, 28, 55). Humans usually contract Q fever by inhaling contaminated dust, often in barnyard, stockyard, or abattoir settings (34, 45).

Some strains of C. burnetii have been observed to undergo variation in surface antigens (18, 40). Continual passage of the Nine Mile strain (the tick-derived United States prototype strain) through immunocompetent hosts (306 successive passages in guinea pigs) apparently maintained the organism in its original, native (wild-type) antigenic form. However, eight subsequent passages through embryonated eggs resulted in the emergence of organisms that were easily distinguished serologically from the original isolate (40). The new phenotype reacted strongly with complement-fixing antibody found in acute-phase serum, compared with the lack of a reaction demonstrated by the guinea pig-passaged antigenic form (40). The two antigenic types differed in their virulence properties as well (3, 31, 40). The less virulent form of the organism, which was termed antigenic phase II, was found to have lipopolysaccharide (LPS) characteristics different from those of the wild-type form, termed antigenic phase I (2). The composition of phase I LPS is considerably more complicated (2, 36). Phase II LPS possesses 2-keto-3-deoxyoctulosonic acid (KDO), d-mannose, d-_glycero_-d-_manno_-heptose, lipid A or a lipid A analog, and a very complex mixture of fatty acids, many of which are branched (2, 36, 47, 48, 58). Phase I LPS also possesses these components but, in addition, has virenose, dihydrohydroxystreptose (3, 37, 47), and galactosaminuronyl-α(1,6)-glucosamine (3). These observations are consistent with a smooth-to-rough transition analogous to that seen in gram-negative enteric bacteria (2, 13, 17, 36). Phase II organisms were found to display a truncated LPS chemotype and, in addition, lacked the branched-chain sugars virenose and dihydrohydroxystreptose present in the wild-type phase I organisms (3, 36).
The likely mechanism underlying the pathogenic correlation was defined by Vishwanath and Hackstadt, who showed that the attenuated phase II organism readily bound complement factor C3′, whereas the wild type did not (52). The different mechanisms could be explained by a lack of interaction between αvβ3 integrin and CR3 during monocyte interaction with phase I organisms but not during monocyte interaction with phase II organisms (7); involvement of the integrin and LPS is also a factor in _C. burnetii_-stimulated tumor necrosis factor release (10).

The antigenic phase transition seen in the Nine Mile strain is accompanied by a chromosomal deletion (53). The presence of the deletion was identified by the existence of a large _Hae_III digestion fragment that was obtained from the phase I genome but was absent in the phase II genome (33, 53). A representative cosmid bank was constructed from the DNA of a phase I strain and then screened by hybridization with the _Hae_III fragment. A restriction map was made of one of the three positive clones obtained, pJB153, and the approximate locations of the deletion termini were determined (53). In this paper, sequencing results for the phase II Nine Mile deletion are presented. Another C. burnetii Nine Mile descendant was found to express an intermediate-length LPS (17). This variant (designated RSA 514) was isolated from a guinea pig placenta months after infection with phase I Nine Mile (17). Interestingly, this strain (designated crazy because it behaved like neither phase I nor phase II Nine Mile) also has a portion of its chromosome deleted (53). Its virulence properties were characterized as intermediate: its pyrogenic properties were roughly equivalent to those of Nine Mile phase I, but viable RSA 514 could not be recovered from guinea pig spleens after infection, as Nine Mile phase I was (31). Nine Mile phase II, in contrast, did not cause fever, except at very high doses (>10^7 organisms), nor were any viable organisms recovered after infection. We show here that the deletion found in RSA 514 includes all of the region missing from the phase II variant and extends beyond the borders of this region. Ftacek and coworkers also described the presence of two truncated LPS species in the Priscilla strain of Coxiella. In their experiments, they found that the prevalence of the two shortened LPS species increased with successive passages in embryonated eggs (14). It is not known how these variants are related to the two classes of shortened LPS species described in this paper.

MATERIALS AND METHODS

Subcloning and sequencing of deletion DNA.

The initial cosmid clone was subcloned into M13 vectors, and single-stranded sequencing was performed by using the dideoxy chain termination method. Gaps in the sequence and unconnected regions were then resolved by using the Deletion Factory system, version 2.0 (GIBCO-BRL), and pDelta2 as the vector. Three large inserts, designated DLC, BRC, and CBC (about 12 kb each), which were previously cloned into pBluescript vectors, were inserted into pDelta2, and transpositions were generated in the insert DNA by γδ transposition. Selection and counterselection yielded cloned DNA which was sequenced directly from transposition sites into the inserts by using the junction primers. PCR cycle sequencing was also used to close gaps in the region and to resolve apparent frameshifts.

DNA sequence analysis.

DNA sequence data were generated by primer extension with fluorescent-dye-labeled dideoxy terminators followed by electrophoresis with an Applied Biosystems model 373 DNA sequence apparatus. DNA sequences were edited and assembled with Applied Biosystems AutoAssembler software.

Analysis of sequence data.

After the final sequence had been assembled, its coding potential was examined by using a number of algorithms. Open reading frames (ORFs) were initially delineated by the Map function (version 10.1; University of Wisconsin Genetics Computer Group, Inc., Madison, Wis.). The entire sequence was also visually scanned with Map in all six frames to eliminate small ORFs that grossly overlapped larger ORFs in the reverse frame and to reveal potential sequencing errors that artificially terminated a frame. Such areas were resequenced by using PCR-generated genomic DNA fragments. All the ORFs were then screened for homology against the nonredundant protein database with (gapped) BLAST 2.1 (blastp option; http://www.ncbi.nlm.nih.gov/) and against families of conserved proteins (clusters of orthologous groups; http://www.ncbi.nlm.nih.gov/). For quality control purposes, the entire nucleotide sequence was searched for homology in all six frames against the protein database with the blastx option (http://www.ncbi.nlm.nih.gov/). Two types of searches were performed for ORFs that failed to show homology to any entries in the database. Distant relationships were sought with either dynamic searching with PSI-BLAST and FASTA-SWAP or more intensive searching with FASTA (http://www.hgsc.bcm.tmc.edu/SearchLauncher/). Searches that looked for homology with conserved motifs were performed so that clues might be found to infer the function of the whole protein; these searches included searches with RPS-BLAST, Pfam, PRINTS, PROSITE, and ProDom (http://www.ncbi.nlm.nih.gov/; http://www.ebi.ac.uk/interpro/).
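The six-frame ORF delineation described above can be illustrated with a short Python sketch. This is not the GCG Map program used in the study; it is a simplified stand-in that assumes ORFs begin at ATG and end at the first in-frame stop codon, and the names `find_orfs` and `min_codons` are hypothetical.

```python
# Minimal six-frame ORF scan; an illustrative sketch, not the GCG Map program.
# Assumption: an ORF runs from an ATG to the first in-frame stop codon.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=100):
    """Yield (strand, frame, start, end, n_codons) for each ORF found.

    start and end are 0-based, end-exclusive offsets on the scanned strand;
    n_codons counts sense codons only (the stop codon is excluded).
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon == "ATG":
                    orf_start = i
                elif orf_start is not None and codon in STOPS:
                    n_codons = (i - orf_start) // 3
                    if n_codons >= min_codons:
                        yield strand, frame, orf_start, i + 3, n_codons
                    orf_start = None


# Toy input: a 6-codon ORF in forward frame 2.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "G"
toy_orfs = list(find_orfs(toy, min_codons=3))
```

Raising `min_codons` mirrors the kind of length filter applied in the Results, where short ORFs that overlap longer ones are set aside before homology searching.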

Genes that potentially code for tRNA were sought with tRNAscan (http://www.genetics.wustl.edu/). BLASTN was used to search the two large noncoding segments for homology based strictly on the nucleic acid level. Other preliminary sequence data were obtained from The Institute for Genomic Research website (http://www.tigr.org).

Nucleotide sequence accession number.

The GenBank accession number for the deletions studied in this analysis is AF387640.

RESULTS

Cosmid clone pJB153 contains approximately 40 kb of C. burnetii Nine Mile phase I genomic DNA. The Nine Mile phase II and RSA 514 deletions were determined by PCR by using primers complementary to various fragments of Nine Mile phase I DNA, as suggested by the mapping experiments of Vodkin and Williams (53). While the approximate deletion endpoints of both mutant strains were found in the right end of pJB153, the left endpoints were found to lie outside pJB153. To find an overlapping fragment containing the deletion endpoints, a DNA library consisting of _Eco_RI restriction fragments from C. burnetii Nine Mile phase I genomic DNA in pBluescript KS was probed by PCR by using primers that amplified a segment of DNA near the left-end _Bam_HI site in pJB153. One clone, designated pCBE700, contained a 7-kb fragment that yielded a PCR fragment of the correct size and, furthermore, contained a single _Bam_HI site separated from the flanking _Eco_RI site by 2 kb, which corresponded to the length predicted from the pJB153 sequence data. Sequencing this fragment added 4.8 kb of data and enabled us to locate the left end of the deletion of the Nine Mile phase II strain but not the left end of the deletion of the RSA 514 strain. To find the left end of the deletion in RSA 514, the Nine Mile phase I cosmid library was searched again by PCR. Primers designed from sequence data obtained from pCBE700 to the left of the _Bam_HI site (which delineates the _Bam_HI-_Bam_HI C. burnetii genomic insert in pJB153) enabled amplification of a DNA fragment from one of the cosmid library clones. Clone pJB167 was isolated in this way and was used to generate _Hin_dIII and _Pst_I fragments that were subcloned and sequenced, resulting in extension of the contiguous DNA sequence of the region being examined by another 1.6 kb. The additional sequence data allowed determination of the left end of the deletion in RSA 514.
The exact deletion endpoints in both mutants were obtained by sequencing small amplimers generated by using primers designed from sequences determined to be just outside the junctions and, therefore, still remaining in the genomes. The primer pairs that permitted determination of the deletion endpoints in both mutant strains did not result in amplification of fragments of any size when Nine Mile phase I genomic DNA was used as the target under the experimental conditions used in the PCRs (1-min extensions at 72°C). Such amplification might, however, be theoretically possible with the appropriate reagents and extension times necessary to generate such large fragments. The deletion in Nine Mile phase II was 25,992 bp long, and there were no changes, such as short duplications, at either deletion terminus. RSA 514 had a 31,568-bp deletion and had a single change, an A-T base pair added between the rejoined ends.

The total DNA sequenced in this study was 38,584 bp long and contained 249 ORFs more than 60 codons long. Most of these ORFs overlapped much longer ORFs and were not examined further; many of the longer ORFs appeared to exist in operons. The list of candidate ORFs to analyze further, therefore, was reduced to the 31 ORFs more than 100 codons long depicted in Fig. 1. Table 1 lists these ORFs along with their locations, sizes, and any homologies found in the nucleic acid or protein databases, as well as the calculated similarity values. Table 2 lists the paralogous ORFs in this region of the C. burnetii genome. A number of these genes can be sorted into common pathways, which include intermediary metabolism, synthesis or interconversion of specific sugars (23), O-antigen synthesis (23), S assimilation (22), and regulatory pathways (44).
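As a quick arithmetic check on the sizes quoted above, the following Python sketch compares the two deletions with each other and with the sequenced region. It uses only the figures stated in the text; the variable names are illustrative.

```python
# Sizes quoted in the text, in base pairs.
SEQUENCED_REGION = 38_584
PHASE_II_DELETION = 25_992
RSA_514_DELETION = 31_568

# Because the phase II deletion is nested within the RSA 514 deletion,
# the difference is the DNA lost only in RSA 514.
extra_in_rsa_514 = RSA_514_DELETION - PHASE_II_DELETION

# Share of the sequenced region removed by each deletion.
fraction_phase_ii = PHASE_II_DELETION / SEQUENCED_REGION
fraction_rsa_514 = RSA_514_DELETION / SEQUENCED_REGION

print(f"RSA 514 lacks {extra_in_rsa_514} bp beyond the phase II deletion")
print(f"phase II: {fraction_phase_ii:.1%} of the sequenced region")
print(f"RSA 514:  {fraction_rsa_514:.1%} of the sequenced region")
```

On these numbers, roughly two-thirds of the sequenced region is absent from phase II and about four-fifths from RSA 514.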

FIG. 1.


Map of C. burnetii Nine Mile strain deletion regions. Annotated ORFs believed to function in O-antigen synthesis are indicated by grey arrows. ORFs believed to be related to carbohydrate metabolism or to other LPS biosynthetic steps are indicated by boldface arrows. The locations of endpoints of cosmid clones pJB167 and pJB153 are indicated by cross-hatched bars, and the location of the pCBE700 _Eco_RI clone is indicated by the stippled bar. Beginning on the left, the designations of the ORFs correspond to JB167-1 through JB167-6, followed by JB153-1 through JB153-25. The approximate locations of useful restriction sites are indicated.

TABLE 1.

Designations, coordinates, and annotations of ORFs in Nine Mile strain deletions

| ORF no. | Designation | Coordinates^a | No. of codons | 9MII^b | 514 | Best homology | E value^c | Conserved domain | E value^c | pfam no. |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | JB167-1 | 138-1151 | 337 | + | +/− | dTDP-glucose 4,6-dehydratase | 5.00E-30 | NAD-dependent epimerase/dehydratase, nucleotide-sugar substrates | 7.00E-40 | pfam01370 |
| 2 | JB167-2 | 1151-2185 | 344 | + |  | GDP-d-mannose dehydratase | 2.00E-46 | NAD-dependent epimerase/dehydratase, nucleotide-sugar substrates | 1.00E-67 | pfam01370 |
| 3 | JB167-3 | 2160-3740 | 526 | +/− |  | ADP-heptose synthase | 2.00E-15 | pfkB family carbohydrate kinase | 9.00E-18 | pfam00294 |
|  |  |  |  |  |  |  |  | Cytidylyltransferase | 1.00E-11 | pfam01467 |
| 4 | JB167-4 | 3743-4887 | 314 |  |  | No significant homology |  | Oxidoreductase, NADP or NAD | 7.00E-07 | pfam01408 |
| 5 | JB167-5 | 4672-5982 | 436 |  |  | UDP-glucose 6-dehydrogenase | 3.00E-32 | UDP-glucose/GDP-mannose dehydrogenase | 2.00E-48 | pfam00984 |
| 6 | JB167-6 | 5975-6880 | 301 |  |  | UDP-glucose 4-epimerase | 1.00E-14 | NAD-dependent epimerase/dehydratase, nucleotide-sugar substrates | 6.00E-25 | pfam01370 |
| 7 | JB153-1 | 6994-7536 | 180 |  |  | No significant homology |  | RNA adenine dimethylases | 1.00E-05 | smart00650 |
|  |  |  |  |  |  |  |  | UbiE/COQ5 methyltransferase family | 6.00E-04 | pfam01209 |
| 8 | JB153-2 | (7533-8720) | 395 |  |  | No significant homology |  |  |  |  |
| 9 | JB153-3 | 9134-10552 | 472 |  |  | 3-Alpha-hydroxysteroid sulfotransferase | 1.3 |  |  |  |
| 10 | JB153-4 | 10997-14335 | 1112 |  |  | No significant homology |  |  |  |  |
| 11 | JB153-5 | 14566-15273 | 235 |  |  | Acetoin:2,6-dichlorophenolindophenol oxidoreductase, beta subunit | 2.00E-16 | Transketolase, C-terminal domain | 9.00E-23 | pfam02780 |
|  |  |  |  |  |  |  |  | Dehydrogenase E1 component | 2.00E-05 | pfam00676 |
| 12 | JB153-6 | 15290-16933 | 547 |  |  | No significant homology |  |  |  |  |
| 13 | JB153-7 | 17313-18311 | 332 |  |  | GDP-l-fucose synthetase | 6.00E-82 | NAD-dependent epimerase/dehydratase, nucleotide-sugar substrates | 4.00E-19 | pfam01370 |
| 14 | JB153-8 | 18304-19350 | 348 |  |  | GDP-d-mannose dehydratase | 1.00E-134 | NAD-dependent epimerase/dehydratase, nucleotide-sugar substrates | 8.00E-34 | pfam01370 |
| 15 | JB153-9 | 19377-20216 | 279 |  |  | Glycosyl transferase | 2.00E-19 | Glycosyl transferase | 1.00E-13 | pfam00535 |
| 16 | JB153-10 | 20209-21447 | 412 |  |  | NDP-hexose 3-C-methyltransferase | 1.00E-103 |  |  |  |
| 17 | JB153-11 | (21468-22478) | 336 |  |  | Pyruvate dehydrogenase, beta subunit | 1.00E-56 | Transketolase, C-terminal domain | 1.00E-22 | pfam02780 |
|  |  |  |  |  |  |  |  | Transketolase, pyridine binding domain | 1.00E-22 | pfam02779 |
| 18 | JB153-12 | (22526-23551) | 341 |  |  | Pyruvate dehydrogenase, alpha subunit | 1.00E-52 | Dehydrogenase E1 component | 2.00E-54 | pfam00676 |
| 19 | JB153-13 | (23702-24814) | 370 |  |  | Glycosyl transferase | 5.00E-09 | Glycosyl transferase | 1.00E-13 | pfam00535 |
| 20 | JB153-14 | (24876-26252) | 458 |  |  | No significant homology |  |  |  |  |
| 21 | JB153-15 | (26245-27393) | 382 |  |  | Pleiotropic transcriptional control | 2.00E-72 | DegT-DnrJ-EryC1-StrS family, DNA binding proteins | 2.00E-85 | pfam01041 |
| 22 | JB153-16 | (27430-28590) | 386 |  |  | Pleiotropic transcriptional control | 3.00E-74 | DegT-DnrJ-EryC1-StrS family, DNA binding proteins | 3.00E-102 | pfam01041 |
| 23 | JB153-17 | (28613-29782) | 389 | +/− |  | No significant homology |  |  |  |  |
| 24 | JB153-18 | 29985-31544 | 519 | + |  | Steroid sulfotransferase-like protein | 3.4 | Sulfotransferase | 4.00E-05 | pfam00685 |
| 25 | JB153-19 | (31580-33241) | 553 | + | +/− | ATP sulfurylase | 1.00E-140 | Sulfate adenylyltransferase | 1.00E-90 | pfam01747 |
|  |  |  |  |  |  |  |  | Adenylylsulfate kinase | 4.00E-55 | pfam01583 |
| 26 | JB153-20 | (33319-34152) | 277 | + | + | Sulfur assimilation | 2.00E-45 | Inositol monophosphatase | 4.00E-41 | pfam00459 |
| 27 | JB153-21 | 34289-35038 | 249 | + | + | 3-Oxoacyl-(acyl-carrier protein) reductase | 1.00E-10 | Short-chain dehydrogenase | 8.00E-12 | pfam00106 |
| 28 | JB153-22 | 35048-35911 | 287 | + | + | ABC transporter permease, O-antigen export | 2.00E-35 | ABC-2-type transporter | 1.00E-21 | pfam01061 |
| 29 | JB153-23 | 35921-36697 | 258 | + | + | ABC transporter permease, O-antigen export | 2.00E-63 | ABC transporter | 2.00E-26 | pfam00005 |
| 30 | JB153-24 | (37409-37738) | 109 | + | + | No significant homology |  |  |  |  |
| 31 | JB153-25 | (37755-38485) | 236 | + | + | No significant homology |  |  |  |  |
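The relation between the coordinates and codon counts in Table 1 can be probed with a small Python sketch. It assumes, as an interpretation that is not stated in the table, that coordinates are 1-based and inclusive and that the tabulated codon count excludes the stop codon; `codons_from_coordinates` is a hypothetical helper.

```python
def codons_from_coordinates(start, end):
    """Sense codons in an ORF spanning start..end (1-based, inclusive).

    Assumes the span ends with a stop codon that is not counted.
    """
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length // 3 - 1


# (start, end, codons as listed) for a handful of Table 1 rows.
ROWS = {
    "JB167-1": (138, 1151, 337),
    "JB167-2": (1151, 2185, 344),
    "JB167-3": (2160, 3740, 526),
    "JB153-1": (6994, 7536, 180),
    "JB153-8": (18304, 19350, 348),
}

for name, (start, end, listed) in ROWS.items():
    computed = codons_from_coordinates(start, end)
    print(f"{name}: computed {computed}, listed {listed}")
```

Not every row satisfies the relation as printed (JB167-4, for example, spans 1,145 bp, which is not a multiple of 3), so the check is best read as a consistency probe rather than a rule.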

TABLE 2.

Occurrence of redundant sequences within the deletion of the Nine Mile strain^a

| First ORF no. | First ORF designation | Second ORF no. | Second ORF designation | E value^b | Length (no. of amino acids)^c |
|---|---|---|---|---|---|
| 9 | JB153-3 | 24 | JB153-18 | 2.00E-52 | 400 |
| 10 | JB153-4 | 20 | JB153-14 | 7.00E-07 | 381 |
| 11 | JB153-5 | 17 | JB153-11 | 4.00E-87 | 170 |
| 11 | JB153-5 | 18 | JB153-12 | 2.00E-25 | 68 |
| 21 | JB153-15 | 22 | JB153-16 | 3.00E-76 | 386 |

2-Oxoacid dehydrogenases (pyruvate or acetoin).

Complete oxidative decarboxylation of a 2-oxo acid, such as pyruvate or acetoin, is accomplished by large multiprotein enzymes in almost all organisms. These dehydrogenases are generally composed of multiple copies of three enzymes, E1 (or E1α and E1β), E2, and E3. Central to the reaction is binding of the cofactor thiamine pyrophosphate by the E1α subunit. Database searches revealed that ORFs JB153-11 and JB153-12 could encode the β and α subunits, respectively, of a gram-positive-like pyruvate dehydrogenase or acetoin dehydrogenase (Table 1). In other organisms the gene loci encoding the E1α and E1β enzymes are almost always located in tandem with E2 and E3 coding regions, whereas JB153-11 and JB153-12 are segregated. The Institute for Genomic Research database indicates that there is another C. burnetii locus that encodes another set of E1α and β subunits, and these genes exist in tandem with E2 and E3 genes at that site. Interestingly, portions of the alpha and beta subunits represented by JB153-11 and JB153-12 are repeated with high homology in JB153-5; however, the sequence of the ORF JB153-5 product does not contain a thiamine pyrophosphate binding site and is therefore probably not functional as an oxoacid dehydrogenase.

Fucose synthesis.

The ORF JB153-8 product is very similar to a database GDP-d-mannose dehydratase (E value, 10^-132; 64% identity for the entire protein). As such, it should use GDP-d-mannose as a substrate and form GDP-4-keto-6-deoxy-d-mannose. The JB153-7 product has very good homology to fucose synthetase and should complete the synthesis of fucose from the 4-keto sugar by providing the GDP-mannose epimerase-reductase (46% identity for most of the protein) activity necessary to effect epimerization of CH3 at C-5 and epimerization of OH at C-3. However, the presence of fucose in C. burnetii LPS is doubtful, as none of the three independent groups of workers who previously studied the carbohydrate structure of this organism reported the presence of fucose (2, 3, 8, 37). Fucose may be used elsewhere in the organism, such as in a colanic acid-like molecule (20, 39, 59) or in a capsular polysaccharide. Alternatively, fucose could serve as a precursor to make another necessary product.

ADP-heptose synthase and inner core synthesis.

The polypeptide encoded by JB167-3 is homologous to ADP-heptose synthase. This ORF was completely deleted in RSA 514 and partially deleted in the phase II strain, which resulted in the loss of 27 residues from the carboxyl end and the addition of several apparently random codons derived from the newly juxtaposed downstream sequence past the deletion junction. The unequivocal loss of ADP-heptose synthase should lead to a complete lack of synthesis of any of the inner (heptose) core; however, neither of the deletion strains lacks heptose. As a result of The Institute for Genomic Research sequencing initiative with C. burnetii, another enzyme for ADP-heptose synthesis was found in the C. burnetii genome. This second ADP-heptose synthase actually has a higher degree of similarity to other rfaE genes in the database, and furthermore, its two functional domains (50) are arranged in the conventional orientation, whereas the rfaE homolog in this study (JB167-3) displays the reverse orientation. In fact, no other known bifunctional rfaE has this orientation. (After submission of the manuscript, the rfa designation was changed to hld; thus, rfaE was renamed hldE [51].)

O-antigen gene cluster in deletions.

The greatest similarities in the databanks for several ORFs within the deletions are similarities to genes potentially involved in O-antigen synthesis. These genes are (i) pathway genes, which are usually genes for synthesis and interconversion of nucleotide sugars; (ii) genes for transferases, mostly glycosyl transferases, which are used for synthesis and modification of the O units; and (iii) processing genes, which are used for transport and polymerization of O units (23). Completely or partially lost with the phase II deletion were five pathway genes (JB167-5, JB167-6, JB153-7, JB153-8, and JB153-10) and two glycosyl transferase genes (JB153-9 and JB153-13), but no obvious processing genes were lost (Fig. 1). The remainder of the genes in the phase II deletion are as follows: one gene (JB153-5) containing sequences paralogous to the genes encoding 2-oxoacid dehydrogenase subunits (JB153-11 and JB153-12), two genes believed to encode a combined membrane sensor and transcriptional regulator (JB153-15 and JB153-16), and six genes that are unknown or have very poor homology at best to anything (JB153-3, JB153-4, JB153-6, JB153-14, JB153-17, and JB153-18). The crazy Q strain is missing two additional pathway genes (JB167-1 and JB167-2), a potential sulfotransferase gene (JB153-18), and a portion of an ATP sulfurylase gene (JB153-19). Outside both deletions and thus still in each genome are at least two processing genes (JB153-22 and JB153-23) that function in O-antigen export (Fig. 1 and Table 1).
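The distinction between ORFs that are retained, partially lost, or completely lost can be expressed as a simple interval comparison, sketched below in Python. The deletion endpoints used here are illustrative assumptions, not values reported in the text: the start is placed about 28 codons (27 residues plus a stop codon) upstream of the end of JB167-3, and the end follows from the stated 25,992-bp length. The `classify` helper and the `-` symbol for a fully deleted ORF are likewise hypothetical.

```python
def classify(orf_start, orf_end, del_start, del_end):
    """Classify an ORF against a deleted interval (1-based, inclusive).

    '+'   : ORF lies entirely outside the deletion (retained)
    '-'   : ORF lies entirely inside the deletion (lost)
    '+/-' : ORF straddles a deletion junction (partially lost)
    """
    if orf_end < del_start or orf_start > del_end:
        return "+"
    if del_start <= orf_start and orf_end <= del_end:
        return "-"
    return "+/-"


# ASSUMED phase II deletion endpoints, for illustration only.
PHASE_II_LENGTH = 25_992                   # stated in the text
DEL_START = 3_657                          # assumption, see lead-in
DEL_END = DEL_START + PHASE_II_LENGTH - 1  # 29,648 under that assumption

# ORF coordinates taken from Table 1.
ORFS = {
    "JB167-2": (1151, 2185),
    "JB167-3": (2160, 3740),
    "JB153-8": (18304, 19350),
    "JB153-17": (28613, 29782),
    "JB153-18": (29985, 31544),
}

status = {
    name: classify(start, end, DEL_START, DEL_END)
    for name, (start, end) in ORFS.items()
}
```

Under these assumed endpoints, the two ORFs that straddle a junction are JB167-3 and JB153-17, which agrees with the partial (+/−) entries for the phase II strain in Table 1.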

DISCUSSION

Within the 38-kb region of the C. burnetii Nine Mile genome are the junctions of the large, nested deletions from two Nine Mile derivatives that display altered LPS chemotypes, RSA 514 and Nine Mile phase II. The left ends of the deletions in both mutant strains were improperly mapped previously and have now been placed beyond the boundary of the original cosmid clone, pJB153 (53). The location of the right junction of the phase II deletion is still valid.

On the basis of its 16S ribosomal DNA sequence, C. burnetii is classified in the γ subdivision of the Proteobacteria (56). A recently prevailing view of prokaryotic evolution is that many genes have undergone horizontal transfer; phylogenetic analysis of some genes then contradicts the ribosomal DNA-based phylogeny. The proteins that have been identified in the database that most closely match homologs encoded in the C. burnetii deletion represent a diverse background. Fourteen of the proteins are eubacterial, seven are archaeal, and two are eukaryotic (by domains). Within the eubacterial homologs, nine proteins are from gram-positive organisms, one is from the γ subdivision of the Proteobacteria, two are from the α subdivision of the Proteobacteria, one is from the ɛ subdivision of the Proteobacteria, and one is from the cyanobacteria.

Previous gel electrophoresis experiments showed that the phase II variant had a single LPS species, which was approximately the same size as the Re glycolipid characteristic of the deep rough Salmonella strains (i.e., about 2.5 to 3.0 kDa). It was composed of a KDO-like compound, glucose, mannose, d-_glycero_-d-_manno_-heptose, and lipid A (3, 58). The presence of the KDO was later confirmed (48). The crazy Q isolate RSA 514 was found to produce a single band at about 10.5 kDa and to be rich in amino sugars, containing glucosamine, galactosaminuronyl-α(1,6)-glucosamine (termed compound X), and a hexosaminuronic acid; the phase II isolate contained extremely small amounts of these compounds. The M44 strain (European vaccine strain [21]), which like Nine Mile phase II has only a small (and presumably uncomplicated) 2.5- to 4.0-kDa LPS species, likewise contains no compound X or hexosaminuronic acid (2, 3). Because the Australian strain in antigenic phase II (C. burnetii AUSTII) possessed an LPS species whose size was identical to the size of the 10.5-kDa species seen in RSA 514 and because it was also found to be rich in the same three amino sugars, these constituents were tentatively identified as markers for this intermediate-size LPS species (3). The basis for these two non-phase-I variants is not a frank deletion within the 38-kb region in question (53; unpublished results). In preliminary work, we also used PCR and sequencing to determine if smaller deletions or aberrations occurred in these other phase II strains. In the Australian phase II strain, we found no evidence of any smaller lesions within the 38-kb region, although the search has not been exhaustive yet. Because M44 was reported to lack virenose (3), we looked specifically in the region from JB153-8 to JB153-11 in that strain and found only a few conservative amino acid changes.
These studies are not complete yet, and sequencing of a considerable portion of the M44 and AUSTII genomes may be necessary to understand this phenomenon more fully. Nonetheless, we must conclude that a deletion mechanism of this magnitude and in this chromosomal region thus far is limited to Nine Mile strains.

Both AUSTII and RSA 514 also possessed 3-_C_-(hydroxymethyl)-lyxose (dihydrohydroxystreptose) and a few other unidentified components that were probably neutral sugars. Notably, these strains with the intermediate LPS species lacked 6-deoxy-3-_C_-methylgulose (virenose). Virenose was therefore suspected to be a component of only longer or more complex heteropolysaccharide O-antigen chains or perhaps a linkage sugar between the 10.5-kDa species and the longer, more heterodiffuse LPS species seen in the Nine Mile phase I strain, in the Ohio phase I strain, and in the Henzerling phase I strain (3).

Virenose synthesis.

The greatest database similarities for JB153-10 are those with the genes encoding _S_-adenosylmethionine-dependent _C_-methyltransferases that add CH3 to the C-3′ position of 4-keto dideoxyhexoses. This protein may be a methylase that functions in a pathway resulting in l-virenose (6-deoxy-3-_C_-methyl-l-gulose). The similar database enzymes function in the mycarose sugar biosynthetic pathway for the eventual synthesis of macrolide antibiotics (4, 15). l-Mycarose is an epimer of l-virenose. This assignment is consistent with the observations that neither the RSA 514 (crazy) nor phase II Nine Mile strain possesses virenose and that both strains lack this gene. Because of this gene's proximity to JB153-7 and JB153-8 and because a 5′ epimerase activity must be employed to make l-virenose from d-hexose, we suggest that this methyltransferase may function in a virenose synthesis pathway by using the product of the reaction catalyzed by the JB153-8 product, a 4-keto sugar, and producing an intermediate methylated 4-keto-dideoxyhexose that could be subsequently epimerized at C-3′ and reduced at C-4′ by the JB153-7 gene product. This suggestion requires that the latter epimerase-reductase enzyme accommodate a methyl group stereospecifically positioned at C-3′; it is clear that the enzymes that function at this step in macrolide biosynthesis do so, and they are also members of the GDP-mannose epimerase-reductase family and the reductase-epimerase-dehydratase superfamily (4, 15, 20). Detracting from assignment of JB153-10 to the synthesis of virenose are our recent unpublished findings that suggest that this gene and its surrounding region are present in the M44 strain, which lacks virenose in its LPS.

In passing, the functional and sequence similarities between the _O_-methylases required to make mycinose for macrolide assembly and the JB153-1 and JB153-2 ORFs discovered here are also interesting. The latter may somehow be involved in the synthesis of 3-_C_-(hydroxymethyl)-lyxose. The JB153-9 product is a glycosyl transferase whose sugar specificity cannot be easily inferred from sequence data alone. JB153-9 may or may not be functionally related to the gene cluster in which it resides. It is noteworthy that one protein with database similarity to the product of this ORF is a dolichol mannose phosphate synthetase. Dolichol is found primarily in eukaryotes, but it is also found in a few bacterial species (30). The significance of this for Coxiella is not known.

Amino sugar differences between phase II and RSA 514 (crazy).

Less satisfying is the search for the genetic explanation for the amino sugar richness of RSA 514 compared to that of the Nine Mile phase II strain. Hypothetically, the region studied here has candidate genes that are presumably pertinent to the synthesis of these compounds or critical intermediates. Possibly the JB167-5 gene product could form UDP-glucuronic acid or galacturonic acid from UDP-glucose or galactose. The JB167-2 product matches well as a UDP-glucose epimerase to make UDP-galactose, and the JB167-4 product, which shuttles reducing equivalents between fructose and glucose, produces gluconolactone and sorbitol. Any or all of these could be involved in syntheses leading to galactosaminuronyl-α-(1,6)-glucosamine, glucosamine, or aminoglucuronic acid. However, both strains lack JB167-4 and JB167-5, and RSA 514 lacks JB167-2, whereas the phase II Nine Mile strain does not. Thus, the sequence data for these two deletions do not suggest a frank or direct explanation for the differences between the two. Furthermore, this study revealed no candidate gene(s) for synthesis of dihydrohydroxystreptose, which was expected to be missing in phase II and present in RSA 514. In this regard, the JB167-1 product has suspicious similarity to dTDP-glucose 4,6-dehydratase, which is used in the pathway leading to dTDP-l-dihydrostreptose synthesis. However, almost in contraindication to what we expected to find, JB167-1 is probably not functional in RSA 514 but is functional in Nine Mile phase II. Because of this and many other results, we are forced to conclude that other genetic loci may be involved in the differences between the phase II and crazy phenotypes, and further genome comparison work is necessary to elucidate this. An alternative explanation of this conundrum is to propose that there is a second pathway of synthesis for LPS and that the pathway is active in RSA 514 as well as in the wild type but is not active in phase II.
Certain products may accumulate and repress or inhibit the secondary pathway, or partially deleted or fused genes at the junctions could act epistatically with the same results. Thus, some of the nucleotide sugar intermediates could be used in both pathways and serve as regulators to coordinate their synthesis. For example, the putative product of the JB167-2-encoded protein, a 6-deoxynucleotide sugar, may be present but unutilized in phase II and thus able to inhibit the secondary pathway.

ATP sulfurylase and adenosine phosphosulfate synthesis.

The RSA 514 deletion terminates within JB153-19. This gene and the adjacent one, JB153-20, are involved in sulfur and energy metabolism. The former encodes a sulfate adenylyltransferase, and the latter encodes a homolog of a regulatory protein called cysteine Q (CysQ). The sulfate adenylyltransferase enzyme activity is critical for assimilation of inorganic sulfur (as SO₄²⁻) into organic compounds, and most microbes possess it. As is the case with the homologs in many other organisms, the Coxiella enzyme has two subdomains, an ATP-sulfurylase on the amino end and an adenosine phosphosulfate kinase on the carboxyl end. If both are active, then this enzyme synthesizes adenosine phosphosulfate (APS) from ATP and inorganic sulfate and synthesizes 3′-phosphoadenosine 5′-phosphosulfate by phosphorylating APS at the expense of a second ATP. In some organisms, such as Penicillium, the APS kinase domain is in fact not an enzyme but serves as a regulatory site for the sulfuryltransferase activity (12); the same may be true for Aspergillus (35). In the Rhizobium enzyme both domains are present and functional (38). In Escherichia coli, these activities are on separate enzymes (CysD, CysN, and CysC). It is possible that in the intact Coxiella enzyme the APS kinase domain serves only a regulatory function and the functional APS kinase locus is located elsewhere. The absence of either one or both of these activities would likely result in cysteine auxotrophy, as is the case in E. coli (22). In RSA 514 crazy, the sulfuryltransferase, at least, could be functional since the deletion occurs so near the C terminus and the sulfurylase subdomain is in the N terminus (Fig. 1).

Redundancies, paralogs, and possible mechanism of excision.

Within the 38-kb segment are three paralogous pairs of genes (JB153-3 and JB153-18, JB153-5 and JB153-11 plus JB153-12, and JB153-15 and JB153-16), 1.5 incomplete copies of one gene (JB153-14) that are present in another gene (JB153-4), and a domain shared by three genes (JB167-1, JB167-2, and JB167-6). This means that about 2.6 kb or 7% of this nucleotide sequence is redundant. Only members of the last paralogous pair are contiguous. If tandem duplication is invoked as a mechanism that generated paralogs from an ancestral gene, then subsequent evolution of the genome was associated with dispersal of the pairs. Two of the three duplications probably occurred in the distant past, as shown by similar levels (55%) of aligned and identical nucleotides. However, the pair consisting of JB153-5 and JB153-11 plus JB153-12 seems to be much more recent because the genes exhibit >85% aligned and identical nucleotides. There is evidence which supports the idea that multiple chromosomal aberrations have also occurred in this area. Two independent, spontaneous deletions have been documented during genesis of strains that were derived from a common ancestor within the last 40 years. Two further considerations also suggest rearrangements. First, the previously cited reverse orientation of the two domains of JB167-3 is consistent with a translocation. Second, JB153-5 is truncated on both sides relative to JB153-11 and JB153-12 and also has the reverse polarity relative to these ORFs. The generation of JB153-5 is consistent with intrachromosomal pairing followed by a double crossover. Although there are 19 copies of an insertion sequence (IS_1111_) within the C. burnetii genome (19), there are no copies in this area that explain the high level of activity observed.
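The identity levels quoted above for the paralogous pairs (about 55% versus >85% aligned and identical nucleotides) can be illustrated with a minimal sketch. It assumes that identity is counted only over alignment columns in which neither sequence has a gap, which is one common convention and not necessarily the one used in this study. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the C. burnetii data.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical bases over gap-free columns of a pairwise alignment.

    Both inputs must already be aligned (same length, gaps marked with `gap`).
    Assumption: columns containing a gap in either sequence are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    aligned = 0    # columns where both sequences have a base
    identical = 0  # aligned columns where the bases match
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        aligned += 1
        if x == y:
            identical += 1

    if aligned == 0:
        return 0.0
    return 100.0 * identical / aligned


# Toy example (hypothetical sequences): 8 gap-free columns, 7 identical.
toy_identity = percent_identity("ACGT-ACGT", "ACGTTACGA")
```

With a definition of this kind, an older duplication would be expected to give a lower value than a recent one, which is the reasoning applied to the three paralogous pairs.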

Strains that are defective for LPS production might have a selective advantage when they are grown in ovo or in cell culture. Perhaps the amount of redundancy in this area has contributed to the probability that deletion, as opposed to physiological or other genetic events, is a favored mechanism for generation of phase II variants in the Nine Mile strain. On the other hand, there may be other physiological forces at play. Although the selective advantage theory for phase II over phase I is an attractive one from the standpoint of energy expenditure, it is not consistent with the observed growth yields of phase I and phase II cells in eggs, in which the phase I yield is greater in terms of wet weight (54) and quantity (J. D. Miller and H. A. Thompson, in press). This could be explained if the phase II organisms are favored in growth within the cellular vacuoles but disfavored in spread from cell to cell because of differences in the binding mechanism for internalization in the preferred growth niche, which is reportedly different in phase I and phase II variants (7).

G+C content and mobile islands.

The G+C content of C. burnetii is 42 to 43 mol% (51). When the deletion region was analyzed in 1-kb increments (upper strand), two stretches with reduced G+C contents were identified. The region from nucleotide 7000 to nucleotide 12000 had (on average) a G+C content of 35.26 mol%, and the region from nucleotide 15000 to nucleotide 21000 had (on average) a G+C content of 36.62 mol%. The first low-G+C region contains genes encoding two methyltransferase candidates, JB153-1 and JB153-2, plus unannotated sequence. It is known that a mobile methylase gene plays a role in serotype reversal, between Inaba and Ogawa, in Vibrio cholerae (41), and species of Vibrio are also known to shuffle large sections of O-antigen and other LPS gene regions (6, 9, 11, 42, 43). The DNA change in cholera strains was associated with a transposon, although the transposon was probably not responsible for the larger cointegration events (11). The second low-G+C-content region begins with JB153-5 and ends with JB153-11 and JB153-12. The first region contains sequences with 80 to 90% identity to parts of the second region, and there are two similarities between the region including JB153-7 to JB153-10 and the Mycobacterium avium ser2 gene cluster (5, 30, 46), which (apparently) is a genetic island for LPS biosynthesis. It is therefore possible that portions of the C. burnetii O-antigen region are a result of lateral genetic transfer. However, such an event could not explain the regular, predictable, repeated observation of phase reversal in C. burnetii.
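The windowed G+C analysis described above can be sketched as follows. This is only an illustration of the general approach (consecutive, nonoverlapping 1-kb windows on one strand), not the software used in this study. The function names, the toy sequence, and the 40 mol% cutoff are all assumptions chosen for the example.

```python
def gc_content_windows(seq, window=1000):
    """Return (start, end, gc_percent) for consecutive nonoverlapping windows.

    Coordinates are 1-based and inclusive. Only A, C, G, and T are counted,
    so ambiguous bases (for example, N) do not affect the percentage.
    """
    seq = seq.upper()
    results = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        counted = sum(chunk.count(base) for base in "ACGT")
        if counted == 0:
            continue  # window contains no unambiguous bases
        gc = chunk.count("G") + chunk.count("C")
        results.append((start + 1, start + len(chunk), 100.0 * gc / counted))
    return results


def low_gc_windows(windows, threshold):
    """Keep only windows whose G+C percentage is below `threshold`."""
    return [w for w in windows if w[2] < threshold]


# Toy 3-kb sequence (hypothetical): one all-GC, one all-AT, one 50% G+C window.
toy_seq = "GC" * 500 + "AT" * 500 + "GCAT" * 250
toy_windows = gc_content_windows(toy_seq, window=1000)
toy_low = low_gc_windows(toy_windows, threshold=40.0)  # 40 mol% is an assumed cutoff
```

Adjacent low-G+C windows found in this way would then be merged and averaged to give regional values of the kind reported for the two stretches above.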

In summary, the two deleted strains studied, both of which have a truncated LPS, lost approximately the same chromosomal region. This region contains a mixture of gene types and functions but most notably has two operon-like sections that possess ORFs with homology to LPS biosynthetic genes, especially genes encoding epimerases, dehydratases, and glycosyl transferases of nucleotide sugars. One gene (represented by ORF JB153-10) encodes a possible C-methyltransferase for virenose synthesis. Also present are some genes with strong homology to genes that encode members of the pyruvate dehydrogenase-acetoin dehydrogenase family and one gene that encodes a sulfate adenylyltransferase for synthesis of APS. Although the RSA 514 deletion is larger, extending on both ends beyond the phase II deletion junctions, the crazy Q strain possesses an LPS that is larger or more complicated than the LPS of phase II. The reason for this discrepancy is not obvious from the sequence data or annotations. The mechanism of excision remains unknown.

Acknowledgments

Part of the cost of this research was supported by NIAID grant RO1 AI34984 to H.A.T. while he was at West Virginia University, Morgantown. Computer support was provided to M.H.V. by the Pittsburgh Supercomputer Center (grant PSCB DMB890077P).

Preliminary sequence data for other regions of the C. burnetii chromosome were obtained from The Institute for Genomic Research website (http://www.tigr.org). We thank the members of the Viral and Rickettsial Zoonoses Branch, National Center for Infectious Diseases, for their many helpful discussions and suggestions during the course of this work and for their constructive criticisms of the manuscript.

REFERENCES