Rigorous Learning Curve Bounds from Statistical Mechanics
References
Amari, S., Fujita, N., & Shinomoto, S. (1992). Four types of learning curves. Neural Computation, 4(4):605–618.
Baum, E.B., & Lyuu, Y.-D. (1991). The transition to perfect generalization in perceptrons. Neural Computation, 3:386–401.
Benedek, G., & Itai, A. (1991). Learnability with respect to fixed distributions. Theoret. Comput. Sci., 86(2):377–389.
Cohn, D., & Tesauro, G. (1992). How tight are the Vapnik-Chervonenkis bounds? Neural Computation, 4:249–269.
Cover, T., & Thomas, J. (1991). Elements of Information Theory. Wiley.
Devroye, L., & Lugosi, G. (1994). Lower bounds in pattern recognition and learning. Preprint.
Dudley, R.M. (1978). Central limit theorems for empirical measures. Annals of Probability, 6(6):899–929.
Ehrenfeucht, A., Haussler, D., Kearns, M., & Valiant, L. (1989). A general lower bound on the number of examples needed for learning. Information and Computation, 82(3):247–251.
Engel, A., & Fink, W. (1993). Statistical mechanics calculation of Vapnik-Chervonenkis bounds for perceptrons. J. Phys., A26:6893–6914.
Engel, A., & van den Broeck, C. (1993). Systems that can learn from examples: replica calculation of uniform convergence bounds for the perceptron. Phys. Rev. Lett., 71:1772–1775.
Gardner, E. (1988). The space of interactions in neural network models. J. Phys., A21:257–270.
Gardner, E., & Derrida, B. (1989). Three unfinished works on the optimal storage capacity of networks. J. Phys., A22:1983–1994.
Goldman, S.A., Kearns, M.J., & Schapire, R.E. (1990). On the sample complexity of weak learning. In Proceedings of the 3rd Workshop on Computational Learning Theory (pp. 217–231), San Mateo, CA: Morgan Kaufmann.
Györgyi, G. (1990). First-order transition to perfect generalization in a neural network with binary synapses. Phys. Rev., A41:7097–7100.
Györgyi, G., & Tishby, N. (1990). Statistical theory of learning a rule. In W.K. Theumann & R. Köberle (Eds.), Neural Networks and Spin Glasses, World Scientific.
Haussler, D. (1992). Decision-theoretic generalizations of the PAC model for neural net and other learning applications. Information and Computation, 100(1):78–150.
Haussler, D., Kearns, M., & Schapire, R.E. (1991). Bounds on the sample complexity of Bayesian learning using information theory and the VC dimension. In Proceedings of the 4th Workshop on Computational Learning Theory (pp. 61–74), San Mateo, CA: Morgan Kaufmann.
Levin, E., Tishby, N., & Solla, S. (1989). A statistical approach to learning and generalization in neural networks. In R. Rivest (Ed.), Proc. 3rd Annu. Workshop on Comput. Learning Theory, Morgan Kaufmann.
Lyuu, Y.-D., & Rivin, I. (1992). Tight bounds on transition to perfect generalization in perceptrons. Neural Computation, 4:854–862.
Martin, G.L., & Pittman, J.A. (1991). Recognizing hand-printed letters and digits using backpropagation learning. Neural Computation, 3:258–267.
Oblow, E. (1992). Implementing Valiant's learnability theory using random sets. Machine Learning, 8(1):45–74.
Pollard, D. (1984). Convergence of Stochastic Processes. Springer-Verlag.
Sarrett, W., & Pazzani, M. (1992). Average case analysis of empirical and explanation-based learning algorithms. Machine Learning, 9(4):349–372.
Seung, H.S., Sompolinsky, H., & Tishby, N. (1992). Statistical mechanics of learning from examples. Phys. Rev., A45:6056–6091.
Simon, H.U. (1993). General bounds on the number of examples needed for learning probabilistic concepts. In Proceedings of the 6th Annual ACM Conference on Computational Learning Theory (pp. 402–411), New York, NY: ACM Press.
Sompolinsky, H., Seung, H.S., & Tishby, N. (1991). Learning curves in large neural networks. In Proc. 4th Annu. Workshop on Comput. Learning Theory (pp. 112–127), San Mateo, CA: Morgan Kaufmann.
Sompolinsky, H., Tishby, N., & Seung, H.S. (1990). Learning from examples in large neural networks. Phys. Rev. Lett., 65(13):1683–1686.
Vapnik, V., Levin, E., & LeCun, Y. (1994). Measuring the VC dimension of a learning machine. Neural Computation, 6(5):851–876.
Vapnik, V.N. (1982). Estimation of Dependences Based on Empirical Data. Springer-Verlag, New York.
Vapnik, V.N., & Chervonenkis, A.Y. (1971). On the uniform convergence of relative frequencies of events to their probabilities. Theory of Probability and its Applications, 16(2):264–280.
Watkin, T.L.H., Rau, A., & Biehl, M. (1993). The statistical mechanics of learning a rule. Rev. Mod. Phys., 65:499–556.