Technical Note: Naive Bayes for Regression