Population structure, biogeography and transmissibility of Mycobacterium tuberculosis

Nat Commun. 2021 Oct 20;12(1):6099.

doi: 10.1038/s41467-021-26248-1.

Luca Freschi, Roger Vargas Jr, Ashaque Husain, S M Mostofa Kamal, Alena Skrahina, Sabira Tahseen, Nazir Ismail, Anna Barbova, Stefan Niemann, Daniela Maria Cirillo, Anna S Dean, Matteo Zignol, Maha Reda Farhat

Luca Freschi et al. Nat Commun. 2021.

Abstract

Mycobacterium tuberculosis is a clonal pathogen proposed to have co-evolved with its human host for millennia, yet our understanding of its genomic diversity and biogeography remains incomplete. Here we use a combination of phylogenetics and dimensionality reduction to reevaluate the population structure of M. tuberculosis, providing an in-depth analysis of the ancient Indo-Oceanic Lineage 1 and the modern Central Asian Lineage 3, and expanding our understanding of Lineages 2 and 4. We assess sub-lineages using genomic sequences from 4939 pan-susceptible strains, and find 30 new genetically distinct clades that we validate in a dataset of 4645 independent isolates. We find a consistent geographically restricted or unrestricted pattern for 20 groups, including three groups of Lineage 1. The distribution of terminal branch lengths across the M. tuberculosis phylogeny supports the hypothesis of a higher transmissibility of Lineages 2 and 4, in comparison with Lineages 3 and 1, on a global scale. We define an expanded barcode of 95 single nucleotide substitutions that allows rapid identification of 69 M. tuberculosis sub-lineages and 26 additional internal groups. Our results paint a higher resolution picture of the M. tuberculosis phylogeny and biogeography.
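As a minimal sketch of how a SNP barcode can be used for typing, the following Python assigns a sub-lineage label from an isolate's variant calls. The positions, alleles and labels in `BARCODE` are hypothetical placeholders, not the 95 substitutions defined in the paper, and resolving nested labels by taking the deepest match is an assumption made for illustration.

```python
# Hypothetical barcode: (genome position, alternate allele) -> sub-lineage label.
# These entries are illustrative placeholders, NOT the paper's 95 substitutions.
BARCODE = {
    (1000, "A"): "1",
    (2000, "T"): "1.1",
    (3000, "G"): "1.1.3",
    (4000, "C"): "2",
}


def assign_sublineage(variants):
    """Return the deepest barcode label supported by an isolate's variants.

    `variants` maps genome position -> observed allele. Labels are
    hierarchical (e.g. "1.1.3" nests inside "1.1"), so the label with the
    most dot-separated levels is treated as the most specific call.
    Returns None when no barcode SNP is present.
    """
    hits = [
        label
        for (pos, alt), label in BARCODE.items()
        if variants.get(pos) == alt
    ]
    if not hits:
        return None
    return max(hits, key=lambda label: label.count("."))
```

For example, an isolate carrying all three placeholder lineage 1 markers would be called at the most specific level, `"1.1.3"`, while an isolate with none of the barcode alleles yields `None`.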

© 2021. The Author(s).

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Phylogenetic tree reconstruction of lineage 1 (binary tree).

Gray circles mark splits where the F_ST (fixation index), calculated from the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.
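To illustrate the 0.33 threshold used in these trees, here is a minimal sketch of a fixation index between the descendants of two child nodes. It assumes a Hudson-style estimator, F_ST = 1 - H_w / H_b, computed from pairwise SNP differences between aligned sequences; the estimator actually used by the authors is not stated in the caption, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    return sum(1 for x, y in zip(a, b) if x != y)


def mean_pairwise(seqs):
    """Mean pairwise distance within one group (needs >= 2 sequences)."""
    dists = [hamming(s, t) for s, t in combinations(seqs, 2)]
    if not dists:
        raise ValueError("each group needs at least two sequences")
    return sum(dists) / len(dists)


def fst(group_a, group_b):
    """Hudson-style F_ST = 1 - H_w / H_b (an assumed estimator).

    H_w is the average of the two within-group mean distances and
    H_b is the mean distance over all between-group pairs.
    """
    h_within = (mean_pairwise(group_a) + mean_pairwise(group_b)) / 2
    between = [hamming(s, t) for s, t in product(group_a, group_b)]
    h_between = sum(between) / len(between)
    if h_between == 0:
        return 0.0  # identical groups: no differentiation
    return 1.0 - h_within / h_between


# Toy alignment (hypothetical): two groups separated by two fixed sites.
left = ["AAAA", "AAAT"]
right = ["GGAA", "GGAT"]
is_split = fst(left, right) > 0.33
```

With these toy sequences the within-group mean distance is 1 and the between-group mean is 2.5, so the split would pass the 0.33 threshold.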

Fig. 2. Phylogenetic tree reconstruction of lineage 3 (binary tree).

Gray circles mark splits where the F_ST (fixation index), calculated from the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.

Fig. 3. Phylogenetic tree reconstruction of lineage 2 (binary tree).

Gray circles mark splits where the F_ST (fixation index), calculated from the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.

Fig. 4. Phylogenetic tree reconstruction of lineage 4 (binary tree).

Gray circles mark splits where the F_ST (fixation index), calculated from the descendants of the two child nodes, is greater than 0.33. The sub-lineages are defined by colored areas (blue: sub-lineages already described in the literature; green: sub-lineages described here; purple: internal sub-lineages). Source data are provided as a Source Data file.

Fig. 5. Histogram of the Simpson diversity index calculated for sub-lineages of lineages 1–4.

A data set of 17,432 isolates from 74 countries was used to perform this analysis. Yellow triangles designate the Simpson diversity index values of sub-lineages designated as geographically restricted by Stucki et al. Light gray circles designate the Simpson diversity index values of sub-lineages designated as geographically unrestricted by Stucki et al. Source data are provided as a Source Data file.
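The Simpson diversity index summarizes how evenly a sub-lineage is spread across locations. The sketch below assumes the Gini-Simpson form, 1 - Σ p_i², computed over the countries of isolation; the exact variant and geographic unit used in the paper are not given in this caption, and the country codes are toy values.

```python
from collections import Counter


def simpson_diversity(countries):
    """Gini-Simpson index 1 - sum(p_i^2) over the countries of isolation.

    0 means every isolate came from a single country; values approaching 1
    mean isolates are spread evenly over many countries.
    """
    counts = Counter(countries)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one isolate")
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Toy examples (hypothetical isolates, one country code per isolate).
restricted = simpson_diversity(["VN", "VN", "VN", "VN"])    # single country
unrestricted = simpson_diversity(["IN", "PK", "BD", "ZA"])  # evenly spread
```

Under this definition a geographically restricted sub-lineage scores near 0 and an unrestricted one scores closer to 1, which is the contrast the histogram displays.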

Fig. 6. Geographic distribution of internal sub-lineage 1.1.3.i1.

Colors represent the percentage of lineage 1 strains isolated in a given country that belong to 1.1.3.i1. Source data are provided as a Source Data file.

Fig. 7. Geographic distribution of internal sub-lineage 1.1.1.1.

Colors represent the percentage of lineage 1 strains isolated in a given country that belong to 1.1.1.1. Source data are provided as a Source Data file.

Fig. 8. Geographic distribution of internal sub-lineage 1.1.2.

Colors represent the percentage of lineage 1 strains isolated in a given country that belong to 1.1.2. Source data are provided as a Source Data file.

Fig. 9. Distributions of terminal branch lengths for the four global Mtb lineages (L1–L4).

Two-sided Wilcoxon rank sum tests were performed to test whether each pair of distributions differed significantly. Medians: 6.2 × 10⁻⁵ (L4), 8.2 × 10⁻⁵ (L2), 10.2 × 10⁻⁵ (L3), 17.5 × 10⁻⁵ (L1). Comparisons: L1 vs L2, L3 or L4 (p-value < 2.2 × 10⁻¹⁶); L2 vs L3 (p-value = 3.6 × 10⁻⁶); L2 vs L4 (p-value < 2.2 × 10⁻¹⁶); L3 vs L4 (p-value < 2.2 × 10⁻¹⁶). Description of the distributions (L1: n = 739, Min: 0.5 × 10⁻⁵, 1st Quartile: 6.7 × 10⁻⁵, Median: 17.5 × 10⁻⁵, 3rd Quartile: 28 × 10⁻⁵, Max: 120 × 10⁻⁵; L2: n = 2193, Min: 0.7 × 10⁻⁵, 1st Quartile: 5.3 × 10⁻⁵, Median: 8.2 × 10⁻⁵, 3rd Quartile: 12 × 10⁻⁵, Max: 110 × 10⁻⁵; L3: n = 1103, Min: 0.5 × 10⁻⁵, 1st Quartile: 4.5 × 10⁻⁵, Median: 10.2 × 10⁻⁵, 3rd Quartile: 20 × 10⁻⁵, Max: 80 × 10⁻⁵; L4: n = 5514, Min: 0.2 × 10⁻⁵, 1st Quartile: 2.6 × 10⁻⁵, Median: 6.2 × 10⁻⁵, 3rd Quartile: 13 × 10⁻⁵, Max: 70 × 10⁻⁵). Source data are provided as a Source Data file.
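A two-sided Wilcoxon rank sum comparison of two sets of terminal branch lengths can be sketched in plain Python as follows. This version uses midranks for ties and a normal approximation without continuity correction, which is an assumption and may differ from the software the authors used; the function name and the toy branch lengths are hypothetical.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected).

    Returns (U statistic for sample x, two-sided p-value).
    """
    if not x or not y:
        raise ValueError("both samples must be non-empty")
    n1, n2 = len(x), len(y)
    n = n1 + n2
    # Pool the samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i                    # size of this tie group
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u, 1.0  # all values tied: no evidence of a difference
    z = (u - mean_u) / math.sqrt(var_u)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return u, p


# Toy branch lengths (hypothetical values, not the study's data).
short_branches = [1e-5, 2e-5, 3e-5, 4e-5, 5e-5]
long_branches = [6e-5, 7e-5, 8e-5, 9e-5, 10e-5]
u_stat, p_value = rank_sum_test(short_branches, long_branches)
```

For samples as large as those in the figure (hundreds to thousands of isolates per lineage) the normal approximation is usually adequate; for very small samples an exact test would be preferable.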
