Learning Team Strategies: Soccer Case Studies (original) (raw)
References
Albus, J.S. (1975). A new approach to manipulator control: The cerebellar model articulation controller (CMAC). Dynamic Systems, Measurement and Control, 97, 220–227. Google Scholar
Asada, M., Uchibe, E., Noda, S., Tawaratsumida, S., & Hosoda, K. (1994). A vision-based reinforcement learning for coordination of soccer playing behaviors. Proceedings of AAAI-94 Workshop on AI and A-life and Entertainment (pp. 16–21).
Baluja, S. (1994). Population-based incremental learning: A method for integrating genetic search based function optimization and competitive learning (Technical Report CMU-CS-94-163), Carnegie Mellon University, Pittsburgh. Google Scholar
Baluja, S., & Caruana, R. (1995). Removing the genetics from the standard genetic algorithm. In A. Prieditis, & S. Russell (Eds.), Machine Learning: Proceedings of the Twelfth International Conference (pp. 38–46). San Francisco, CA: Morgan Kaufmann Publishers. Google Scholar
Cramer, N.L. (1985). A representation for the adaptive generation of simple sequential programs. In J. Grefenstette (Ed.), Proceedings of an International Conference on Genetic Algorithms and their Applications (pp. 183–187). Hillsdale, NJ: Lawrence Erlbaum Associates. Google Scholar
Crites, R., & Barto, A. (1996). Improving elevator performance using reinforcement learning. In D. Touretzky, M. Mozer, & M. Hasselmo (Eds.), Advances in Neural Information Processing Systems (Vol. 8, pp. 1017–1023). Cambridge, MA: MIT Press. Google Scholar
Dickmanns, D., Schmidhuber, J., & Winklhofer, A. (1987). Der genetische Algorithmus: Eine Implementierung in Prolog. Fortgeschrittenenpraktikum, Institut für Informatik, Lehrstuhl Prof. Radig, Technische Universität München.
Gallant, S.I. (1993). Neural Network Learning and Expert Systems. Cambridge, MA: MIT Press. Google Scholar
Koza, J.R. (1992). Genetic Programming—On the Programming of Computers by Means of Natural Selection. Cambridge, MA: MIT Press. Google Scholar
Levin, L.A. (1973). Universal sequential search problems. Problems of Information Transmission, 9(3), 265–266. Google Scholar
Levin, L.A. (1984). Randomness conservation inequalities: Information and independence in mathematical theories. Information and Control, 61, 15–37. Google Scholar
Li, M., & Vitányi, P.M.B. (1993). An Introduction to Kolmogorov Complexity and its Applications. New York, NY: Springer-Verlag. Google Scholar
Lin, L.J. (1993). Reinforcement learning for robots using neural networks. Ph.D. thesis, Carnegie Mellon University, Pittsburgh, PA. Google Scholar
Littman, M.L. (1994). Markov games as a framework for multi-agent reinforcement learning. In A. Prieditis, & S. Russell (Eds.), Machine Learning: Proceedings of the Eleventh International Conference (pp. 157–163). San Francisco, CA: Morgan Kaufmann Publishers. Google Scholar
Luke, S., Hohn, C., Farris, J., Jackson, G., & Hendler, J. (1997). Co-evolving soccer softbot team coordination with genetic programming. Proceedings of the First International Workshop on RoboCup, at the International Joint Conference on Artificial Intelligence (IJCAI-97).
Matsubara, H., Noda, I., & Hiraki, K. (1996). Learning of cooperative actions in multi-agent systems: A case study of pass play in soccer. In S. Sen (Ed.), Working Notes for the AAAI-96 Spring Symposium on Adaptation, Coevolution and Learning in Multi-agent Systems (pp. 63–67). Menlo Park, CA: AAAI Press. Google Scholar
Nadella, R., & Sen, S. (1996). Correlating internal parameters and external performance: Learning soccer agents. In G. Weiss (Ed.), Distributed Artificial Intelligence Meets Machine Learning. Learning in Multi-Agent Environments (pp. 137–150). Berlin: Springer-Verlag. Google Scholar
Nowlan, S.J., & Hinton, G.E. (1992). Simplifying neural networks by soft weight sharing. Neural Computation, 4, 173–193. Google Scholar
Peng, J., & Williams, R. (1996). Incremental multi-step Q-learning. Machine Learning, 22, 283–290. Google Scholar
Sahota, M. (1993). Real-time intelligent behaviour in dynamic environments: Soccer-playing robots. Master's thesis, University of British Columbia.
Sałustowicz, R.P., & Schmidhuber, J. (1997). Probabilistic incremental program evolution. Evolutionary Compu-tation, 5(2), 123–141. Google Scholar
Sałustowicz, R.P., Wiering, M.A., & Schmidhuber, J. (1997a). Evolving soccer strategies. Proceedings of the Fourth International Conference on Neural Information Processing (ICONIP'97) (pp. 502–506). Singapore: Springer-Verlag. Google Scholar
Sałustowicz, R.P., Wiering, M.A., & Schmidhuber, J. (1997b). On learning soccer strategies. In W. Gerstner, A. Germond, M. Hasler, & J.-D. Nicoud (Eds.), Proceedings of the Seventh International Conference on Artificial Neural Networks (ICANN'97), volume 1327 of Lecture Notes in Computer Science (pp. 769–774). Berlin Heidelberg: Springer-Verlag. Google Scholar
Schmidhuber, J. (1997a). Discovering neural nets with low Kolmogorov complexity and high generalization capability. Neural Networks, 10(5), 857–873. Google Scholar
Schmidhuber, J. (1997b). A general method for incremental self-improvement and multi-agent learning in unrestricted environments. In X. Yao (Ed.), Evolutionary Computation: Theory and Applications. Singapore: Scientific Publ. Co., in press.
Schmidhuber, J., Zhao, J., & Schraudolph, N. (1997a). Reinforcement learning with self-modifying policies. In S. Thrun & L. Pratt (Eds.), Learning to Learn (pp. 293–309). Boston, MA: Kluwer. Google Scholar
Schmidhuber, J., Zhao, J., & Wiering, M. (1997b). Shifting inductive bias with success-story algorithm, adaptive Levin search, and incremental self-improvement. Machine Learning, 28, 105–130. Google Scholar
Solomonoff, R. (1986). An application of algorithmic probability to problems in artificial intelligence. In L.N. Kanal & J.F. Lemmer (Eds.), Uncertainty in Artificial Intelligence (pp. 473–491). Elsevier Science Publishers.
Stone, P., & Veloso, M. (1996a). Beating a defender in robotic soccer: Memory-based learning of a continuous function. In G. Tesauro, D.S. Touretzky, & T.K. Leen (Eds.), Advances in Neural Information Processing Systems (Vol. 8, pp. 896–902). Cambridge, MA: MIT Press. Google Scholar
Stone, P., & Veloso, M. (1996b). A layered approach to learning client behaviors in the robocup soccer server. Applied Artificial Intelligence (AAI), 1998, to appear.
Sutton, R.S. (1988). Learning to predict by the methods of temporal differences. Machine Learning, 3, 9–44. Google Scholar
Sutton, R.S. (1996). Generalization in reinforcement learning: Successful examples using sparse coarse coding. In D.S. Touretzky, M.C. Mozer, & M.E. Hasselmo (Eds.), Advances in Neural Information Processing Systems (Vol. 8, pp. 1038–1045). Cambridge, MA: MIT Press. Google Scholar
Sutton, R.S. (1997). Personal communication at the Seventh International Conference on Artificial Neural Networks (ICANN'97).
Tesauro, G. (1994). TD-gammon, a self-teaching backgammon program, achieves master-level play. Neural Computation, 6(2), 215–219. Google Scholar
Versino, C., & Gambardella, L.M. (1997). Learning real team solutions. In G. Weiss (Ed.), DAI Meets Machine Learning, volume 1221 of Lecture Notes in Artificial Intelligence (pp. 40–61). Berlin: Springer-Verlag. Google Scholar
Watkins, C. (1989). Learning from Delayed Rewards. PhD thesis, King's College, Cambridge. Google Scholar
Weiss, G. (1996). Adaptation and learning in multi-agent systems: Some remarks and a bibliography. In G.Weiss & S. Sen (Eds.), Adaptation and Learning in Multi-Agent Systems, volume 1042 of Lecture Notes in Artificial Intelligence (pp. 1–21). Berlin Heidelberg: Springer-Verlag. Google Scholar
Widrow, B., & Hoff, M.E. (1960). Adaptive switching circuits. 1960 IRE WESCON Convention Record (Vol. 4, pp. 96–104). New York: IRE. Reprinted in Anderson and Rosenfeld (1988). Google Scholar
Wiering, M.A., & Schmidhuber, J. (1996). Solving POMDPs with Levin search and EIRA. In L. Saitta (Ed.), Machine Learning: Proceedings of the Thirteenth International Conference (pp. 534–542). San Francisco, CA: Morgan Kaufmann Publishers. Google Scholar
Wiering, M.A., & Schmidhuber, J. (1997). Fast online Q(λ) (Technical Report IDSIA-21-97), IDSIA,Lugano, Switzerland. Google Scholar