Identifying spatially similar gene expression patterns in early stage fruit fly embryo images: binary feature versus invariant moment digital representations
Rajalakshmi Gurunathan et al. BMC Bioinformatics. 2004.
Abstract
Background: Modern developmental biology relies heavily on the analysis of embryonic gene expression patterns. Investigators manually inspect hundreds or thousands of expression patterns to identify those that are spatially similar and to ultimately infer potential gene interactions. However, the rapid accumulation of gene expression pattern data over the last two decades, facilitated by high-throughput techniques, has produced a need for the development of efficient approaches for direct comparison of images, rather than their textual descriptions, to identify spatially similar expression patterns.
Results: The effectiveness of the Binary Feature Vector (BFV) and Invariant Moment Vector (IMV) based digital representations of the gene expression patterns in finding biologically meaningful patterns was compared for a small (226 images) and a large (1819 images) dataset. For each dataset, an ordered list of images, with respect to a query image, was generated to identify overlapping and similar gene expression patterns, in a manner comparable to what a developmental biologist might do. The results showed that the BFV representation consistently outperforms the IMV representation in finding biologically meaningful matches when spatial overlap of the gene expression pattern and the genes involved are considered. Furthermore, we explored the value of conducting image-content based searches in a dataset where individual expression components (or domains) of multi-domain expression patterns were also included separately. We found that this technique improves performance of both IMV and BFV based searches.
Conclusions: We conclude that the BFV representation consistently produces a more extensive and better list of biologically useful patterns than the IMV representation. The high quality of results obtained scales well as the search database becomes larger, which encourages efforts to build automated image query and retrieval systems for spatial gene expression patterns.
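The contrast between the two representations can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions of the BFV or IMV measures, so everything here is an assumption made for illustration: a Jaccard-style overlap stands in for a binary-feature similarity, the first two Hu invariant moments stand in for the invariant moment vector, and the function names and toy stripe patterns are hypothetical.

```python
import numpy as np

def binary_overlap(a, b):
    """Jaccard-style overlap of two boolean masks (assumed stand-in for a BFV score)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 1.0  # two empty patterns are treated as identical
    return float(np.logical_and(a, b).sum() / union)

def moment_vector(mask):
    """First two Hu invariant moments of a binary mask (assumed stand-in for an IMV)."""
    m = np.asarray(mask, dtype=float)
    total = m.sum()
    if total == 0:
        return np.zeros(2)
    ys, xs = np.indices(m.shape)
    cx = (xs * m).sum() / total
    cy = (ys * m).sum() / total

    def eta(p, q):
        # Normalized central moment: mu_pq / mu_00^(1 + (p + q) / 2)
        mu = (((xs - cx) ** p) * ((ys - cy) ** q) * m).sum()
        return mu / total ** (1 + (p + q) / 2)

    n20, n02, n11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return np.array([n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2])

def rank_by_overlap(query, database):
    """Order database image names from most to least overlapping with the query."""
    scores = {name: binary_overlap(query, img) for name, img in database.items()}
    return sorted(scores, key=scores.get, reverse=True)

def moment_distance(a, b):
    """Euclidean distance between the moment vectors of two masks."""
    return float(np.linalg.norm(moment_vector(a) - moment_vector(b)))

def stripe(first_col, last_col, shape=(8, 16)):
    """Toy expression pattern: a vertical stripe covering the given columns."""
    mask = np.zeros(shape, dtype=bool)
    mask[:, first_col:last_col + 1] = True
    return mask

query = stripe(2, 4)
database = {
    "same": stripe(2, 4),       # identical position
    "partial": stripe(3, 5),    # shifted by one column, still overlapping
    "shifted": stripe(10, 12),  # same shape, no spatial overlap
}
ranking = rank_by_overlap(query, database)
shifted_distance = moment_distance(query, database["shifted"])
```

In this toy case the overlap ranking places the non-overlapping stripe last, whereas its moment distance to the query is essentially zero because Hu moments are translation invariant. That behaviour is at least consistent with the abstract's observation about spatial overlap, although the sketch does not reproduce the paper's actual measures.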
Figures
Figure 1
BESTi search results with smaller dataset. Results from the BESTi-search for the same query image [22] based on (A) BFV [S_S], (B) IMV [D_φ] and (C) BFV [S_C] representations in the original dataset (226 images); and based on (D) BFV [S_S] and (E) IMV [D_φ] representations in the domain database (in which distinct domains of the multi-domain expression patterns were added to the original dataset as additional data points). The search argument and the results retrieved are shown on the left and right of the arrow, respectively. The original data used to generate these expression patterns are shown above this row. BESTi-matches are arranged in descending order starting with the best hit for the given search image. Values of difference in centroids (ΔC_XY) and principal angles (Δθ) are also given. Each image is identified by the last name of the first author of the original research article and the figure number with the following abbreviations: Ashe [19]; Casares [20]; Gaul1 [28]; Grossniklaus [22]; Hartmann [24]; Hulskamp1 [27]; Hulskamp3 [26].
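The centroid and principal-angle differences reported in these captions can be approximated from second-order image moments. The following Python sketch is an assumption rather than the authors' procedure: it treats each expression pattern as a binary mask, takes the principal angle as the orientation of the major axis, and reports angle differences modulo 180 degrees. The function names are hypothetical.

```python
import numpy as np

def centroid_and_angle(mask):
    """Return ((cx, cy), angle_in_degrees) for a non-empty binary mask."""
    m = np.asarray(mask, dtype=float)
    total = m.sum()
    if total == 0:
        raise ValueError("mask contains no expression pixels")
    ys, xs = np.indices(m.shape)
    cx = (xs * m).sum() / total
    cy = (ys * m).sum() / total
    # Second-order central moments describe the spread around the centroid.
    mu20 = (((xs - cx) ** 2) * m).sum()
    mu02 = (((ys - cy) ** 2) * m).sum()
    mu11 = ((xs - cx) * (ys - cy) * m).sum()
    # Orientation of the major axis relative to the x axis.
    theta = 0.5 * np.arctan2(2 * mu11, mu20 - mu02)
    return (float(cx), float(cy)), float(np.degrees(theta))

def pattern_offsets(a, b):
    """Centroid distance (pixels) and principal-angle difference (degrees, 0 to 90)."""
    (ax, ay), a_theta = centroid_and_angle(a)
    (bx, by), b_theta = centroid_and_angle(b)
    d_centroid = float(np.hypot(ax - bx, ay - by))
    d_theta = abs(a_theta - b_theta) % 180
    d_theta = min(d_theta, 180 - d_theta)  # an axis has no direction
    return d_centroid, d_theta

# Toy patterns: a horizontal and a vertical bar sharing the same centre.
horizontal = np.zeros((9, 9), dtype=bool)
horizontal[4, 1:8] = True
vertical = np.zeros((9, 9), dtype=bool)
vertical[1:8, 4] = True
d_centroid, d_theta = pattern_offsets(horizontal, vertical)
```

For the two toy bars, the centroid difference is zero and the principal-angle difference is 90 degrees, which shows how two patterns can occupy the same location while differing in orientation.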
Figure 2
BESTi search results for S_S with larger dataset. Comparison of search results from the small (226 images) and large (1819 images) datasets using the S_S measure for the same query image (Figure 1A) [22]. Panels (A-K) are based on the genes whose expression patterns were retrieved as follows (A) slp1, (B) slp1 and otd, (C) otd, (D) slp2, (E) Kr, (F) hb, (G) hb and bcd, (H) Hb, bcd and nanos, (I) snail, (J) hts and (K) hairy. Images are referenced with the last name of the first author of the original article and its figure number: Grossniklaus [22]; Zhao [43]; Gao [44]; Wimmer [45]; Schulz1 [46]; Tsai [47]; Janody [48]; Stathopoulos [31]; Brent [32]; Zhang [33]. Common search results between the small and large datasets are indicated with dark blue image names.
Figure 3
BESTi search results for D_φ with larger dataset. Comparison of search results from the small (226 images) and large (1819 images) datasets using the D_φ measure for the same query image (Figure 1A) [22]. Panels (A-O) are based on the genes whose expression patterns were retrieved as follows (A) slp1, (B) bcd, (C) Kr, (D) hb (D1, D3) and Hb (D2), (E) tll, (F) gt, (G) hairy, (H) AS-C, (I) hb and Kr, (J) kni, (K) iab (type I transcript), (L) IAB5 enhancer, (M) vnd, (N) sog and (O) nanos, bcd and cnc. Images are referenced with the last name of the first author of the original article and its figure number: Grossniklaus [22]; Sauer [49]; Tsai [47]; Hulskamp1 [27]; Gaul1 [28]; Strunk [50]; Colas [51]; Wu [52]; Ghiglione [53]; Pankratz [54]; Melnick [55]; Janody [48]; Zhang [33]; Parkhurst [56]; Zhou [57]; Stathopoulos [31]. Common search results between the small and large datasets are indicated with dark blue image names.
Figure 4
BESTi search results for S_C with larger dataset. Comparison of search results from the small (226 images) and large (1819 images) datasets using the S_C measure for the same query image (Figure 1A) [22]. Panels (A-Z) are based on the genes whose expression patterns were retrieved as follows (A) slp1, (B) otd, (C) hb, (D) AS-C, (E) nanos, bcd and Hb, (F) Kr, (G) sc, (H) snail, (I) en and hb, (J) bcd and hb, (K) kni and hb, (L) tll, (M) eve, (N) twist, (O) dpp, (P) en, (Q) arm, (R) hairy, (S) zen, (T) run, (U) Hsp83, (V) nmo, (W) Tc'hb, (X) iab, (Y) hts and (Z) sog. Images are referenced with the last name of the first author of the original article and its figure number: Grossniklaus [22]; Gao [44]; Hulskamp1 [27]; Hulskamp3 [26]; Zhao [43]; Gaul1 [28]; Tsai [47]; Niessing [58]; Sauer [49]; Parkhurst [56]; Janody [48]; Schulz2 [46]; Yagi [59]; Cowden [60]; Stathopoulos [31]; Miskiewicz [61]; Schulz1 [62]; Goff [63]; Sackerson [64]; Rusch [65]; Steingrimsson [66]; Hamada [67]; Zhang [33]; Klingler [68]; Bashirullah [69]; Verheyen [70]; Wolff [71]; Casares [20]; Brent [32]. Common search results between the small and large datasets are indicated with dark blue image names.
Figure 5
BESTi search results with multiple domains of expression using smaller database. Results from BESTi-search for a query image with multiple domains of expression. (A) BFV [S_S], (B) IMV [D_φ] and (C) BFV [S_C] searches for the same expression pattern in the original database (226 images). (D) BFV [S_S] search using the complete multi-domain expression in the original database and (E) BFV [S_S] search using only the pattern on the left in the domain database. The search argument and the results retrieved are shown on the left and right of the arrow, respectively. The original data used to generate these expression patterns are shown above this row. BESTi-matches are arranged in descending order starting with the best hit for the given search statistic. Values of difference in centroids (ΔC_XY) and principal angles (Δθ) are also given for panels A, B and C. Each image is identified by the last name of the first author of the original research article and the figure number, with the abbreviations as follows: Ashe [19]; Arnosti [17]; Borggreve [18]; Casares [20]; Gaul1 [28]; Gaul2 [29]; Grossniklaus [22]; Hartmann [24]; Hulskamp1 [27]; Hulskamp2 [25]; Hulskamp3 [26].
Similar articles
- Classification of Drosophila embryonic developmental stage range based on gene expression pattern images.
  Ye J, Chen J, Li Q, Kumar S. Comput Syst Bioinformatics Conf. 2006:293-8. PMID: 17369647
- BEST: a novel computational approach for comparing gene expression patterns from early stages of Drosophila melanogaster development.
  Kumar S, Jayaraman K, Panchanathan S, Gurunathan R, Marti-Subirana A, Newfeld SJ. Genetics. 2002 Dec;162(4):2037-47. doi: 10.1093/genetics/162.4.2037. PMID: 12524369 Free PMC article.
- Automated annotation of developmental stages of Drosophila embryos in images containing spatial patterns of expression.
  Yuan L, Pan C, Ji S, McCutchan M, Zhou ZH, Newfeld SJ, Kumar S, Ye J. Bioinformatics. 2014 Jan 15;30(2):266-73. doi: 10.1093/bioinformatics/btt648. Epub 2013 Dec 3. PMID: 24300439 Free PMC article.
- Development of high-throughput tools to unravel the complexity of gene expression patterns in the mammalian brain.
  Herzig U, Cadenas C, Sieckmann F, Sierralta W, Thaller C, Visel A, Eichele G. Novartis Found Symp. 2001;239:129-46; discussion 146-59. doi: 10.1002/0470846674.ch11. PMID: 11529308 Review.
- The shape of things to come: Topological data analysis and biology, from molecules to organisms.
  Amézquita EJ, Quigley MY, Ophelders T, Munch E, Chitwood DH. Dev Dyn. 2020 Jul;249(7):816-833. doi: 10.1002/dvdy.175. Epub 2020 Apr 13. PMID: 32246730 Free PMC article. Review.
Cited by
- FlyExpress 7: An Integrated Discovery Platform To Study Coexpressed Genes Using in Situ Hybridization Images in Drosophila.
  Kumar S, Konikoff C, Sanderford M, Liu L, Newfeld S, Ye J, Kulathinal RJ. G3 (Bethesda). 2017 Aug 7;7(8):2791-2797. doi: 10.1534/g3.117.040345. PMID: 28667017 Free PMC article.
- Medium-throughput processing of whole mount in situ hybridisation experiments into gene expression domains.
  Crombach A, Cicin-Sain D, Wotton KR, Jaeger J. PLoS One. 2012;7(9):e46658. doi: 10.1371/journal.pone.0046658. Epub 2012 Sep 28. PMID: 23029561 Free PMC article.
- Learning sparse representations for fruit-fly gene expression pattern image annotation and retrieval.
  Yuan L, Woodard A, Ji S, Jiang Y, Zhou ZH, Kumar S, Ye J. BMC Bioinformatics. 2012 May 23;13:107. doi: 10.1186/1471-2105-13-107. PMID: 22621237 Free PMC article.
- Comparison of embryonic expression within multigene families using the FlyExpress discovery platform reveals more spatial than temporal divergence.
  Konikoff CE, Karr TL, McCutchan M, Newfeld SJ, Kumar S. Dev Dyn. 2012 Jan;241(1):150-60. doi: 10.1002/dvdy.22749. Epub 2011 Sep 29. PMID: 21960044 Free PMC article.
- Drosophila Gene Expression Pattern Annotation Using Sparse Features and Term-Term Interactions.
  Ji S, Yuan L, Li YX, Zhou ZH, Kumar S, Ye J. KDD. 2009 Jun 28;2009:407-415. doi: 10.1145/1557019.1557068. PMID: 21614142 Free PMC article.
References
- Carroll SB, Grenier JK, Weatherbee SD. From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Massachusetts, MA, Blackwell Scientific; 2000.
- Davidson E. Genomic Regulatory Systems: Development and Evolution. New York, NY, Academic Press; 2000.