SUBGOAL LEARNING - HIERARCHICAL LEARNING

12. M. Ring, T. Schaul, J. Schmidhuber. The Two-Dimensional Organization of Behavior. In Proc. Joint IEEE International Conference on Development and Learning (ICDL) and on Epigenetic Robotics (ICDL-EpiRob 2011), Frankfurt, 2011.

11. B. Bakker and J. Schmidhuber. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization. In F. Groen, N. Amato, A. Bonarini, E. Yoshida, and B. Kröse (Eds.), Proceedings of the 8th Conference on Intelligent Autonomous Systems, IAS-8, Amsterdam, The Netherlands, p. 438-445, 2004.

10. B. Bakker and J. Schmidhuber. Hierarchical Reinforcement Learning with Subpolicies Specializing for Learned Subgoals. In M. H. Hamza (Ed.), Proceedings of the 2nd IASTED International Conference on Neural Networks and Computational Intelligence, NCI 2004, Grindelwald, Switzerland, p. 125-130, 2004.

9. An optimal way of creating and solving subgoals in general reinforcement learning settings is the Goedel machine (J. Schmidhuber, 2003).

8. A bias-optimal way of creating and solving subgoals in the context of ordered problem sequences is the Optimal Ordered Problem Solver (J. Schmidhuber, 2002-2004).

7. R. Salustowicz and J. Schmidhuber. Learning to predict through PIPE and automatic task decomposition. Technical Report IDSIA-11-98, IDSIA, April 1998.

6. M. Wiering and J. Schmidhuber. HQ-Learning. Adaptive Behavior 6(2):219-246, 1997.

5. M. Wiering and J. Schmidhuber. HQ-Learning: Discovering Markovian subgoals for non-Markovian reinforcement learning. Technical Report IDSIA-95-96, IDSIA, October 1996.

4. J. Schmidhuber. Netzwerkarchitekturen, Zielfunktionen und Kettenregel. (Net architectures, objective functions, and chain rule.) Habilitation (postdoctoral thesis - qualification for a tenure professorship), Institut für Informatik, Technische Universität München, 1993.

3. J. Schmidhuber and R. Wahnsiedler. Planning simple trajectories using neural subgoal generators. In J. A. Meyer, H. L. Roitblat, and S. W. Wilson, editors, Proc. of the 2nd International Conference on Simulation of Adaptive Behavior, pages 196-202. MIT Press, 1992.

2. J. Schmidhuber. Learning to generate sub-goals for action sequences. In T. Kohonen, K. Mäkisara, O. Simula, and J. Kangas, editors, Artificial Neural Networks, pages 967-972. Elsevier Science Publishers B.V., North-Holland, 1991.

1. J. Schmidhuber. Towards compositional learning with dynamic neural networks. Technical Report FKI-129-90, Institut für Informatik, Technische Universität München, 1990.

Related work on hierarchies of Recurrent Neural Networks with multiple self-organizing time scales:

**(B)** J. Schmidhuber. Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2):234-242, 1992.

**(A)** J. Schmidhuber. Learning unambiguous reduced sequence descriptions. In J. E. Moody, S. J. Hanson, and R. P. Lippmann, editors, Advances in Neural Information Processing Systems 4, NIPS'4, pages 291-298. San Mateo, CA: Morgan Kaufmann, 1992.