CarthaGene: multipopulation integrated genetic and radiation hybrid mapping

Simon de Givry, Martin Bouchez, Patrick Chabrier, Denis Milan, Thomas Schiex

INRA, Biométrie et Intelligence Artificielle/Génétique Cellulaire BP 27, 31326 Castanet-Tolosan Cedex, France

*To whom correspondence should be addressed.

Received: 14 September 2004
Revision received: 03 November 2004
Accepted: 03 November 2004
Published: 14 December 2004

Cite: Simon de Givry, Martin Bouchez, Patrick Chabrier, Denis Milan, Thomas Schiex, CarthaGene: multipopulation integrated genetic and radiation hybrid mapping, Bioinformatics, Volume 21, Issue 8, April 2005, Pages 1703–1704, https://doi.org/10.1093/bioinformatics/bti222

Abstract

Summary: CarthaGene is an integrated genetic and radiation hybrid (RH) mapping tool which can deal with multiple populations, including mixtures of genetic and RH data. CarthaGene performs multipoint maximum likelihood estimations with accelerated expectation–maximization algorithms for some pedigrees and has sophisticated algorithms for marker ordering. Dedicated heuristics for framework mapping are also included. CarthaGene can be used as a C++ library, through a shell command or through a graphical interface. XML output for companion tools is integrated.

Availability: The program is available free of charge, as open source, from www.inra.fr/bia/T/CarthaGene for Linux, Windows and Solaris machines.

Contact: tschiex@toulouse.inra.fr

INTRODUCTION

Genetic mapping locates polymorphic markers on chromosomes by making use of a probabilistic model of crossing-over. Many genetic mapping tools are available to analyze data from experimental crosses. Most of them are designed to analyze line crosses, one family at a time; few can integrate data from several crosses to build consensus maps (see http://linkage.rockefeller.edu/soft).

Radiation hybrid (RH) mapping is a somatic cell technique that complements genetic mapping and allows finer resolution. Most existing RH mapping packages are listed at compgen.rutgers.edu/rhmap.

Although capable of handling line crosses, CarthaGene has been designed to create consensus maps from multiple populations. Instead of directly integrating existing maps, CarthaGene computes multipoint maximum likelihood maps that take into account all the available information, which offers additional reliability. It can also integrate RH and genetic data.

METHODS

Parametric probabilistic models based on hidden Markov models (HMM) describe crossover during meiosis (Lander et al., 1987) and chromosome breakage and retention during RH panel construction (Lange et al., 1995). Their parameters include the recombination and breakage probabilities between adjacent markers as well as the retention probability. Given experimental data, and assuming some marker ordering, the values of these parameters can be estimated by maximizing the likelihood, i.e. the probability that the observed data were generated by the model. The maximized likelihood therefore allows simultaneous evaluation of the assumed marker ordering and estimation of the corresponding parameters (distances). The maximization is done by an expectation–maximization (EM) algorithm. Models for backcross, F2 intercross, recombinant inbred lines (self and sibs) and phase-known outbreds are available. For RH estimation, the equal retention model is used in both its haploid and diploid forms. The EM forward–backward algorithm used in CarthaGene has been accelerated by taking into account specific properties of backcross and haploid RH data (see Schiex et al., 2001). Compared with usual EM implementations, the accelerated algorithm can run one or two orders of magnitude faster with no loss of precision.
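To illustrate how a likelihood can score an assumed marker order, here is a minimal Python sketch for the simplest case only: backcross data with no missing genotypes, where the maximum likelihood recombination fraction of each interval is just the observed fraction of recombinants and no EM iteration is needed. The function name `backcross_order_loglik` and the 0/1 genotype encoding are assumptions made for this example; this is not CarthaGene's code, nor the accelerated algorithm of Schiex et al. (2001).

```python
import math

def backcross_order_loglik(genotypes, order):
    """Log-likelihood of a marker order for complete backcross data.

    genotypes: one list per individual, holding 0/1 alleles indexed by marker
               (hypothetical encoding; no missing data allowed).
    order:     list of marker indices giving the assumed order.
    Returns (log-likelihood, per-interval recombination fractions).
    """
    n = len(genotypes)
    # The first marker of the order is 0 or 1 with probability 1/2.
    loglik = n * math.log(0.5)
    thetas = []
    for a, b in zip(order, order[1:]):
        # Count individuals showing a recombination between adjacent markers.
        rec = sum(1 for ind in genotypes if ind[a] != ind[b])
        # Maximum likelihood estimate, constrained to theta <= 0.5.
        theta = min(rec / n, 0.5)
        thetas.append(theta)
        if rec > 0:
            loglik += rec * math.log(theta)
        if rec < n:
            loglik += (n - rec) * math.log(1.0 - theta)
    return loglik, thetas

# Toy data: 4 individuals typed at 3 markers.
genotypes = [[0, 0, 0], [0, 0, 1], [1, 1, 1], [1, 0, 0]]
for order in ([0, 1, 2], [0, 2, 1]):
    print(order, backcross_order_loglik(genotypes, order))
```

On this toy dataset the order 0-1-2 needs fewer recombinations than 0-2-1 and therefore gets the higher log-likelihood. With missing data or other pedigree types the counts are no longer directly observed, which is where the EM algorithm comes in.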

CarthaGene provides two ways to merge data files:

  1. If one assumes that the merged datasets represent a single map (same order and distances, either genetic or RH), a so-called genetic merging is done. Markers that are untyped in a population/panel are treated as missing data and one consensus map is produced. RH and genetic data cannot be merged under this model because they use different distances.
  2. Otherwise, it is assumed that the data files are representative of maps with a common marker ordering but specific distances per dataset. Here, a single consensus ordering is produced but a specific set of distances is estimated for each merged dataset. Any type of data, genetic or RH, can be merged here. This is called order merging.

These two methods can be combined freely.

For genetic merging, the EM implementation deals with combined datasets by performing an E step on each dataset and then an M step that takes into account the merging performed. For datasets merged by order, independent log-likelihoods are obtained per dataset and summed.
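The difference between the two merging modes can be written compactly. The notation below is introduced here for illustration and does not come from the article: $\pi$ is a marker order, $D_1, \dots, D_K$ are the merged datasets, $\theta$ is a shared set of distance parameters and $\theta_k$ is a set specific to dataset $k$.

```latex
% Genetic merging: one parameter set shared by all datasets
\ell_{\mathrm{gen}}(\pi) = \max_{\theta} \sum_{k=1}^{K} \log P(D_k \mid \pi, \theta)

% Order merging: common order, one parameter set per dataset
\ell_{\mathrm{ord}}(\pi) = \sum_{k=1}^{K} \max_{\theta_k} \log P(D_k \mid \pi, \theta_k)
```

When both criteria apply to the same datasets, $\ell_{\mathrm{ord}}(\pi) \ge \ell_{\mathrm{gen}}(\pi)$, since each dataset is free to choose its own parameters in the second expression.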

The main problem in genetic or RH mapping arises from the number of possible marker orders. For n markers, there are n!/2 different possible orders. The connection between mapping and the traveling salesman problem (TSP) is well known for genetic mapping (Schiex and Gaspin, 1997) and for RH mapping (Ben-Dor et al., 2000). CarthaGene relies on this connection and provides extensions of TSP solving algorithms.
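The size of the search space and the TSP view of ordering can be made concrete with a short Python sketch. It counts the n!/2 orders and, for a handful of markers, finds the order minimizing the sum of adjacent pairwise distances by exhaustive enumeration. The functions `count_orders` and `best_order` and the toy distance matrix are assumptions made for this example. The criterion is a simple two-point stand-in, not the multipoint likelihood, and brute force is only feasible for very small n, which is why TSP heuristics are needed in practice.

```python
import math
from itertools import permutations

def count_orders(n):
    """Number of distinct marker orders: an order and its mirror image are the same map."""
    return math.factorial(n) // 2 if n > 1 else 1

def best_order(dist):
    """Exhaustively find the order minimizing the sum of adjacent distances.

    dist: symmetric matrix (list of lists) of pairwise marker distances.
    This is a Hamiltonian path problem, i.e. a TSP without the return trip.
    Returns (total length, order as a tuple of marker indices).
    """
    n = len(dist)
    best = None
    for perm in permutations(range(n)):
        if perm[0] > perm[-1]:
            continue  # skip the mirror image of an order already considered
        length = sum(dist[a][b] for a, b in zip(perm, perm[1:]))
        if best is None or length < best[0]:
            best = (length, perm)
    return best

# Toy example: four markers whose (hypothetical) true positions are 3, 0, 6 and 1.
positions = [3, 0, 6, 1]
dist = [[abs(p - q) for q in positions] for p in positions]
print(count_orders(4), count_orders(10), count_orders(20))
print(best_order(dist))
```

Already for 20 markers the number of orders exceeds 10^18, so exhaustive enumeration is out of reach.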

A unique feature of CarthaGene is that, instead of producing a single supposedly optimal map, it produces an ordered set of alternative maps, which allows the reliability of each marker's position in the ordering to be estimated. The set of all these maps can be explored manually and compared graphically (with a PostScript output; Fig. 1). Dedicated automatic tools facilitate the identification of unreliable markers for further analysis.

Final maps can be produced in MapMaker (Lander et al., 1987) and XML formats for data exchange with MCQTL (Jourjon et al., 2004), a multiple population QTL mapping software, and BioMercator (Arcade et al., 2004), a software for integrating genetic maps and QTL detected in independent experiments.

IMPLEMENTATION

CarthaGene is implemented as a C++ library. A programmable Tcl shell command is available for automated mapping, and a graphical Tcl/Tk interface for interactive mapping. Binaries for Windows, Solaris and Linux are provided with an open source distribution (using the G-Forge site mulcyber.toulouse.inra.fr).

Fig. 1. A typical graphical session of CarthaGene, displaying possible maps with distances and log-likelihoods.

This work was supported by the GENOPLANTE project ‘Integrative Tools for Genetic Mapping’.

REFERENCES

Agarwala, R., Applegate, D.L., Maglott, D., Schuler, G.D., Schaffer, A.A. (2000) A fast and scalable radiation hybrid map construction and integration strategy. Genome Res., 10, 350–364.

Arcade, A., Labourdette, A., Falque, M., Mangin, B., Chardon, F., Charcosset, A., Joets, J. (2004) Biomercator: integrating genetic maps and QTL towards discovery of candidate genes. Bioinformatics, 20, 2324–2326.

Ben-Dor, A., Chor, B., Pelleg, D. (2000) RHO—radiation hybrid ordering. Genome Res., 10, 365–378.

Helsgaun, K. (2000) An effective implementation of the Lin-Kernighan traveling salesman heuristic. Eur. J. Oper. Res., 126, 106–130.

Jourjon, M.-F., Jasson, S., Marcel, J., Ngom, B., Mangin, B. (2004) MCQTL: Multi-allelic QTL mapping in multi-cross design. Bioinformatics, 21, 128–130.

Lander, E.S., Green, P., Abrahamson, J., Barlow, A., Daly, M.J., Lincoln, S.E., Newburg, L. (1987) MAPMAKER: an interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics, 1, 174–181.

Lange, K., Boehnke, M., Cox, D.R., Lunetta, K.L. (1995) Statistical methods for polyploid radiation hybrid mapping. Genome Res., 5, 136–150.

Lin, S. and Kernighan, B.W. (1973) An effective heuristic algorithm for the traveling salesman problem. Oper. Res., 21, 498–516.

Schiex, T. and Gaspin, C. (1997) Cartagene: constructing and joining maximum likelihood genetic maps. Proceedings of ISMB'97, Porto Carras, Halkidiki, Greece, pp. 258–267.

Schiex, T., Chabrier, P., Bouchez, M., Milan, D. (2001) Boosting EM for radiation hybrid and genetic mapping. Proceedings of WABI'01, LNCS, vol. 2149, pp. 41–51.

© The Author 2004. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oupjournals.org
