John Loch | University of Colorado, Boulder
Papers by John Loch
effect of eligibility traces on finding optimal memoryless
A reflexive hierarchical control architecture is developed which generates a trajectory that attains a goal position while avoiding obstacles and underwater terrain for an autonomous underwater vehicle (AUV) being developed by the M.I.T. Sea Grant College Program and C.S. Draper Laboratory. The reflexive hierarchical control architecture consists of multiple modules that independently generate control commands for the vehicle control system to track. Instead of satisfying all mission requirements in one executive module, the reflexive modules generate commands which satisfy only their subset of the mission requirements. These modules vie for control of the vehicle through an arbitration algorithm which yields control to the most critical module. By limiting the planning scope of the individual modules, they are able to generate commands in real time. Simulation results are shown which demonstrate that the vehicle is capable of traversing an unknown underwater environment to a goal po...
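The arbitration idea in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the module names, the integer `criticality` ranking, the sensor keys, the thresholds, and the `Command` fields are all hypothetical. The sketch only shows the general pattern of several narrow modules each proposing a command, with control yielded to the most critical module that has something to say.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Sequence, Tuple

Sensors = Dict[str, float]  # hypothetical flat sensor snapshot


@dataclass(frozen=True)
class Command:
    heading_deg: float
    depth_m: float


@dataclass(frozen=True)
class ReflexModule:
    name: str
    criticality: int  # hypothetical ranking: larger means more critical
    propose: Callable[[Sensors], Optional[Command]]


def goal_seek(s: Sensors) -> Optional[Command]:
    # Always has an opinion: steer toward the goal.
    return Command(s["goal_bearing_deg"], s["goal_depth_m"])


def avoid_obstacle(s: Sensors) -> Optional[Command]:
    # Only speaks up when something is close ahead (threshold is made up).
    if s["sonar_range_m"] < 20.0:
        return Command((s["heading_deg"] + 90.0) % 360.0, s["depth_m"])
    return None


def avoid_terrain(s: Sensors) -> Optional[Command]:
    # Only speaks up when the bottom is close (threshold is made up).
    if s["altitude_m"] < 5.0:
        return Command(s["heading_deg"], s["depth_m"] - 5.0)
    return None


MODULES = (
    ReflexModule("goal_seek", 1, goal_seek),
    ReflexModule("avoid_obstacle", 2, avoid_obstacle),
    ReflexModule("avoid_terrain", 3, avoid_terrain),
)


def arbitrate(modules: Sequence[ReflexModule], sensors: Sensors) -> Tuple[str, Command]:
    """Yield control to the most critical module that proposes a command."""
    proposals = [(m, m.propose(sensors)) for m in modules]
    active = [(m, c) for m, c in proposals if c is not None]
    if not active:
        raise RuntimeError("no module proposed a command")
    winner, command = max(active, key=lambda mc: mc[0].criticality)
    return winner.name, command
```

Because each module looks only at its own slice of the sensor snapshot, each `propose` call is cheap, which is the property the abstract credits for real-time operation.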
Semiautonomous rover vehicle serves as testbed for evaluation of navigation and obstacle-avoidance techniques. Designed to traverse variety of terrains. Concepts developed applicable to robots for service in dangerous environments as well as to robots for exploration of remote planets. Called Robby, vehicle 4 m long and 2 m wide, with six 1-m-diameter wheels. Mass of 1,200 kg and surmounts obstacles as large as 1 1/2 m. Optimized for development of machine-vision-based strategies and equipped with complement of vision and direction sensors and image-processing computers. Front and rear cabs steer and roll with respect to centerline of vehicle. Vehicle also pivots about central axle, so wheels comply with almost any terrain.
Recent research on hidden-state reinforcement learning (RL) problems has concentrated on overcoming partial observability by using memory to estimate state. However, such methods are computationally extremely expensive and thus have very limited applicability. This emphasis on state estimation has come about because it has been widely observed that the presence of hidden state or partial observability renders popular RL methods such as Q-learning and Sarsa useless. However, this observation is misleading in two ways: first, the theoretical results supporting it only apply to RL algorithms that do not use eligibility traces, and second, these results are worst-case results, which leaves open the possibility that there may be large classes of hidden-state problems in which RL algorithms work well without any state estimation. In this paper we show empirically that Sarsa(λ), a well-known family of RL algorithms that use eligibility traces, can work very well on hidden-state problems tha...
This paper describes a series of experiments that were performed on the Rocky III robot. Rocky III is a small autonomous rover capable of navigating through rough outdoor terrain to a predesignated area, searching that area for soft soil, acquiring a soil sample, and depositing the sample in a container at its home base. The robot is programmed according to a reactive behavior-control paradigm using the ALFA programming language. This style of programming produces robust autonomous performance while requiring significantly less computational resources than more traditional mobile robot control systems. The code for Rocky III runs on an 8-bit processor and uses about 10k of memory.
1993 Proceedings IEEE International Conference on Robotics and Automation, May 2, 1993
Small, light, highly mobile robotic vehicles called "microrovers" use sensors and artificial intelligence to perform complicated tasks autonomously. Vehicle navigates, avoids obstacles, and picks up objects using reactive control scheme selected from among few preprogrammed behaviors to respond to environment while executing assigned task. Under development for exploration and mining of other planets. Also useful in firefighting, cleaning up chemical spills, and delivering materials in factories. Reactive control scheme and principle of behavior-description language useful in reducing computational loads in prosthetic limbs and automotive collision-avoidance systems.
Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems II, Jul 20, 1999
Agents acting in the real world are confronted with the problem of making good decisions with limited knowledge of the environment. Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited sensor feedback. Recent work has shown empirically that a reinforcement learning (RL) algorithm called Sarsa(λ) can efficiently find optimal memoryless policies, which map current observations to actions, for POMDP problems (Loch and Singh 1998). The Sarsa(λ) algorithm uses a form of short-term memory called an eligibility trace, which distributes temporally delayed rewards to observation-action pairs which lead up to the reward. This paper explores the effect of eligibility traces on the ability of the Sarsa(λ) algorithm to find optimal memoryless policies. A variant of Sarsa(λ) called k-step truncated Sarsa(λ) is applied to four test problems taken from the recent work of Littman; Littman, Cassandra and Kaelbling; Parr and Russell; and Chrisman. The empirical results show that eligibility traces can be significantly truncated without affecting the ability of Sarsa(λ) to find optimal memoryless policies for POMDPs.
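The abstract does not spell out how the trace is truncated, so the following is only one plausible reading, offered as a sketch: the eligibility trace is kept as a sliding window over the last k observation-action pairs, and each temporal-difference error is credited to those pairs with weight (γλ)^age. The function names, the `deque`-based window, and the use of accumulating (rather than replacing) credit are assumptions of this sketch, not details confirmed by the paper.

```python
from collections import defaultdict, deque
from typing import Deque, DefaultDict, Hashable, Tuple

Pair = Tuple[Hashable, Hashable]  # (observation, action)


def td_error(q: DefaultDict[Pair, float], current: Pair, reward: float,
             nxt: Pair, gamma: float) -> float:
    """Sarsa temporal-difference error for a memoryless (observation-based) Q."""
    return reward + gamma * q[nxt] - q[current]


def truncated_update(q: DefaultDict[Pair, float], window: Deque[Pair],
                     delta: float, alpha: float, gamma: float, lam: float) -> None:
    """Credit delta to the pairs in the window, most recent first.

    The window is assumed to be a deque(maxlen=k), so pairs visited more
    than k steps ago have already dropped out and receive no credit.
    """
    for age, pair in enumerate(reversed(window)):
        q[pair] += alpha * delta * (gamma * lam) ** age


# Tiny usage example with k = 2: only the last two pairs are updated.
q: DefaultDict[Pair, float] = defaultdict(float)
window: Deque[Pair] = deque(maxlen=2)
for pair in [("o1", "a"), ("o2", "a"), ("o3", "a")]:
    window.append(pair)
truncated_update(q, window, delta=1.0, alpha=0.5, gamma=1.0, lam=0.5)
```

With a full trace every previously visited pair would be touched on each step; the window bounds the per-step work at k updates, which is what makes the question of how far the trace can be truncated practically interesting.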
Missions Technologies and Design of Planetary Mobile Vehicles, 1993
A series of experiments that were performed on the Rocky 3 robot is described. Rocky 3 is a small autonomous rover capable of navigating through rough outdoor terrain to a predesignated area, searching that area for soft soil, acquiring a soil sample, and depositing the sample in a container at its home base. The robot is programmed according to a reactive behavior control paradigm using the ALFA programming language. This style of programming produces robust autonomous performance while requiring significantly less computational resources than more traditional mobile robot control systems. The code for Rocky 3 runs on an eight-bit processor and uses about ten k of memory.
SPIE Proceedings, 1992
The performance of autonomous mobile robots performing complex navigation tasks can be dramatically improved by directing expensive sensing and planning in service of the task. The task-direction algorithms can be quite simple. In this paper we describe a simple task-directed vision system which has been implemented on a real outdoor robot which navigates using stereo vision. While the performance of this particular robot was improved by task-directed vision, the performance of task-directed vision in general is influenced in complex ways by many factors. We briefly discuss some of these, and present some initial simulated results.
Sensor Fusion V, 1992
This paper describes the control system for Rocky IV, a prototype microrover designed to demonstrate proof-of-concept for a low-cost scientific mission to Mars. Rocky IV uses a behavior-based control architecture which implements a large variety of functions displaying various degrees of autonomy, from completely autonomous long-duration conditional sequences of actions to very precisely described actions resembling classical AI operators. The control system integrates information from infrared proximity sensors, proprioceptive encoders which report on the state of the articulation of the rover's suspension system and other mechanics, a homing beacon, a magnetic compass, and contact sensors. In addition, significant functionality is implemented as 'virtual sensors', computed values which are presented to the system as if they were sensor values. The robot is able to perform a variety of useful tasks, including soil sample collection, removal of surface weathering layers from rocks, spectral imaging, instrument deployment, and sample return, under realistic mission-like conditions in Mars-like terrain.
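The 'virtual sensor' idea can be sketched as a computed value hidden behind the same read interface as a physical sensor, so that behaviors cannot tell the two apart. The class names, the `read()` method, and the heading-error example below are hypothetical and chosen only for illustration; the abstract does not say which quantities Rocky IV computed this way or how its sensor interface looked.

```python
from typing import Callable, Protocol


class Sensor(Protocol):
    def read(self) -> float: ...


class FixedSensor:
    """Stand-in for a physical sensor; returns whatever value it was last set to."""

    def __init__(self, value: float) -> None:
        self.value = value

    def read(self) -> float:
        return self.value


class VirtualSensor:
    """A computed value exposed through the same read() interface."""

    def __init__(self, compute: Callable[[], float]) -> None:
        self._compute = compute

    def read(self) -> float:
        return self._compute()


def signed_angle_deg(target: float, current: float) -> float:
    """Smallest signed rotation from current to target, in [-180, 180)."""
    return (target - current + 180.0) % 360.0 - 180.0


compass = FixedSensor(350.0)        # hypothetical magnetic compass heading
beacon_bearing = FixedSensor(10.0)  # hypothetical homing-beacon bearing

# A behavior can consume heading_error exactly as it would a physical sensor.
heading_error: Sensor = VirtualSensor(
    lambda: signed_angle_deg(beacon_bearing.read(), compass.read())
)
```

Wrapping the computation this way keeps the behaviors themselves unchanged when a derived quantity is later replaced by a real instrument, or the other way round.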
IEEE Transactions on Automation Science and Engineering, 1994
This paper describes a series of robots developed at JPL to demonstrate the feasibility of using a behavior-control approach to control small robots on planetary surfaces. The round-trip light-time delay makes direct teleoperation of a mobile robot on a planetary surface impossible. Planetary rovers must therefore possess a certain degree of autonomy. However, small robots can only support small computers
... vehicles to perform intelligent robotic behaviors. A primary reason for slow progress in these areas of research is the limited availability of AUV platforms to mission planning and control researchers. The lack of AUV testbeds has ...
Proceedings of the National Conference on Artificial …
This paper describes a series of experiments that were performed on the Rocky III robot. Rocky III is a small autonomous rover capable of navigating through rough outdoor terrain to a predesignated area, searching that area for soft ...
Proceedings of the Fifteenth International Conference …
Recent research on hidden-state reinforcement learning (RL) problems has concentrated on overcoming partial observability by using memory to estimate state. However, such methods are computationally extremely expensive and ...