Alon, U., Barkai, N., Notterman, D., Gish, K., Ybarra, S., Mack, D., & Levine, A. (1999). Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Natl. Acad. Sci. USA, 96, 6745–6750.
Ambroise, C., & McLachlan, G. J. (2002). Selection bias in gene extraction on the basis of microarray gene-expression data. Proc. Natl. Acad. Sci. USA, 99:10, 6562–6566.
Anthony, M., & Bartlett, P. L. (1999). _Neural Network Learning: Theoretical Foundations_. Cambridge University Press.
Ben-Dor, A., Bruhn, L., Friedman, N., Nachman, I., Schummer, M., & Yakhini, Z. (2000). Tissue classification with gene expression profiles. Journal of Computational Biology, 7, 559–584.
Breiman, L. (1998). Arcing classifiers. The Annals of Statistics, 26:3, 801–849.
Dubhashi, D., & Ranjan, D. (1998). Balls and bins: A study in negative dependence. Random Structures and Algorithms, 13:2, 99–124.
Duda, R. O., & Hart, P. E. (1973). _Pattern Classification and Scene Analysis_. Wiley.
Dudoit, S., Fridlyand, J., & Speed, T. P. (2002). Comparison of discrimination methods for the classification of tumors using gene expression data. Journal of the American Statistical Association, 97:457, 77–87.
Freund, Y. (1995). Boosting a weak learning algorithm by majority. Information and Computation, 121:2, 256–285.
Freund, Y., & Schapire, R. (1996). Experiments with a new boosting algorithm. In _Proceedings of the Thirteenth International Conference on Machine Learning_, pp. 148–156.
Freund, Y., & Schapire, R. E. (1997). A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences, 55:1, 119–139.
Golub, T. R., Slonim, D. K., Tamayo, P., Huard, C., Gaasenbeek, M., Mesirov, J. P., Coller, H., Loh, M., Downing, J. R., Caligiuri, M. A., Bloomfield, C. D., & Lander, E. S. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science, 286, 531–537.
Guyon, I., Weston, J., Barnhill, S., & Vapnik, V. (2002). Gene selection for cancer classification using support vector machines. Machine Learning, 46:1–3, 389–422.
Haussler, D. (1992). Decision theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100:1, 78–150.
Haussler, D., Littlestone, N., & Warmuth, M. K. (1994). Predicting {0, 1}-functions on randomly drawn points. Information and Computation, 115:2, 129–161.
Iba, W., & Langley, P. (1992). Induction of one-level decision trees. In _Proc. of the 9th International Workshop on Machine Learning_.
Joachims, T. (1998). Making large-scale support vector machine learning practical. In B. Schölkopf, C. Burges, & A. Smola (Eds.), _Advances in Kernel Methods: Support Vector Learning_, pp. 169–184.
Kearns, M., Mansour, Y., Ng, A. Y., & Ron, D. (1997). An experimental and theoretical comparison of model selection methods. Machine Learning, 27, 7–50.
Kivinen, J., & Warmuth, M. (1999). Boosting as entropy projection. In _Proc. COLT'99_.
Li, Y., Campbell, C., & Tipping, M. (2002). Bayesian automatic relevance determination algorithms for classifying gene expression data. Bioinformatics, 18:10, 1332–1339.
Li, Y., Long, P. M., & Srinivasan, A. (2001). Improved bounds on the sample complexity of learning. Journal of Computer and System Sciences, 62:3, 516–527.
Mason, L., Bartlett, P. L., & Baxter, J. (2000). Improved generalization through explicit optimization of margins. Machine Learning, 38:3, 243–255.
Miller, L. D., Long, P. M., Wong, L., Mukherjee, S., McShane, L. M., & Liu, E. T. (2002). Optimal gene expression analysis by microarrays. Cancer Cell, 2:5, 353–361.
Panchenko, D., & Koltchinskii, V. (2002). Empirical margin distributions and bounding the generalization error of combined classifiers. Annals of Statistics, 30:1, 1–50.
Parker, C. W. (1990). Immunoassays. In M. P. Deutscher (Ed.), Guide to Protein Purification. Academic Press.
Pomeroy, S. L., Tamayo, P., Gaasenbeek, M., Sturla, L. M., Angelo, M., McLaughlin, M. E., Kim, J. Y., Goumnerova, L. C., Black, P. M., Lau, C., Allen, J. C., Zagzag, D., Olson, J. M., Curran, T., Wetmore, C., Biegel, J. A., Poggio, T., Mukherjee, S., Rifkin, R., Califano, A., Stolovitzky, G., Louis, D. N., Mesirov, J. P., Lander, E. S., & Golub, T. R. (2002). Prediction of central nervous system embryonal tumour outcome based on gene expression. Nature, 415, 436–442.
Rätsch, G., Onoda, T., & Müller, K.-R. (2001). Soft margins for AdaBoost. Machine Learning, 42:3, 287–320. Also NeuroCOLT Technical Report NC-TR-1998-021.
Schapire, R., & Singer, Y. (1999). Improved boosting algorithms using confidence-rated predictions. Machine Learning, 37:3, 297–336.
Schapire, R. E. (1990). The strength of weak learnability. Machine Learning, 5:2, 197–226.
Schapire, R. E., Freund, Y., Bartlett, P., & Lee, W. S. (1998). Boosting the margin: A new explanation for the effectiveness of voting methods. The Annals of Statistics, 26:5, 1651–1686.
Shawe-Taylor, J., Bartlett, P., Williamson, R., & Anthony, M. (1998). Structural risk minimization over data-dependent hierarchies. IEEE Transactions on Information Theory, 44:5, 1926–1940.
Talagrand, M. (1994). Sharper bounds for Gaussian and empirical processes. Annals of Probability, 22, 28–76.
Valiant, L. G. (1984). A theory of the learnable. Communications of the ACM, 27:11, 1134–1142.
Vapnik, V. (1998). _Statistical Learning Theory_. Wiley, New York.
Vapnik, V. N. (1982). _Estimation of Dependences Based on Empirical Data_. Springer-Verlag.
Vapnik, V. N. (1989). Inductive principles of the search for empirical dependences (methods based on weak convergence of probability measures). In _Proceedings of the 1989 Workshop on Computational Learning Theory_.
Vapnik, V. N. (1995). _The Nature of Statistical Learning Theory_. Springer.
Vapnik, V. N., & Chervonenkis, A. Y. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16:2, 264–280.
West, M., Blanchette, C., Dressman, H., Huang, E., Ishida, S., Spang, R., Zuzan, H., Olson, J. A., Jr., Marks, J. R., & Nevins, J. R. (2001). Predicting the clinical status of human breast cancer by using gene expression profiles. Proc. Natl. Acad. Sci. USA, 98:20, 11462–11467.
Weston, J., Mukherjee, S., Chapelle, O., Pontil, M., Poggio, T., & Vapnik, V. (2000). Feature selection for SVMs. In NIPS, pp. 668–674.
Xing, E., Jordan, M., & Karp, R. (2001). Feature selection for high-dimensional genomic microarray data. In _Eighteenth International Conference on Machine Learning_.