Microarray Data Analysis: General Concepts, Gene Selection, and Classification

Discoveries from the genome sequencing projects facilitated the development of novel techniques able to screen thousands of molecules in parallel and to identify sets of potentially interesting sequences associated with physiological or pathological conditions. As a consequence, high-throughput, large-scale experimental methodologies, combined with bioinformatics analysis of DNA, RNA, and protein data, projected the biological sciences into the so-called post-genomic, functional genomics era. The systematic exploration of all genes or proteins at once represents a revolution of sorts, shifting molecular biology and medical research from a reductionistic, hypothesis-driven approach toward deciphering how genes and their products work, how they interact in pathways within cells, and what roles they play in health and disease (Chipping Forecast I, 1999; Chipping Forecast II, 2002). Oligonucleotide and cDNA microarrays for transcriptional profiling (Lockhart et al., 1996; Schena et al., 1995) make it possible to measure such interaction patterns, and thus represent an unprecedented opportunity to boost the identification of diagnostic and therapeutic targets (Brown et al., 1999). The principle of a microarray for gene expression analysis is essentially that of the classical northern blot extended to the whole-genome level. Specifically, mRNA from a given cell line or tissue is labeled with a fluorescent dye and hybridized to a large number of DNA sequences, immobi