Continuous control of an underground loader using deep reinforcement learning

Adaptation of a wheel loader automatic bucket filling neural network using reinforcement learning

2020 International Joint Conference on Neural Networks (IJCNN), 2020

Bucket-filling is a repetitive task in earth-moving operations with wheel-loaders, which needs to be automated to enable efficient remote control and autonomous operation. Ideally, an automated bucket-filling solution should work for different machine-pile environments, with a minimum of manual retraining. It has been shown that for a given machine-pile environment, a time-delay neural network can efficiently fill the bucket after imitation-based learning from 100 examples by one expert operator. Can such a bucket-filling network be automatically adapted to different machine-pile environments without further imitation learning by optimization of a utility or reward function? This paper investigates the use of a deterministic actor-critic reinforcement learning algorithm for automatic adaptation of a neural network in a new pile environment. The algorithm is used to automatically adapt a bucket-filling network for medium coarse gravel to a cobble-gravel pile environment. The experime...

Expert Level Control of Ramp Metering Based on Multi-Task Deep Reinforcement Learning

IEEE Transactions on Intelligent Transportation Systems, 2017

This article shows how the recent breakthroughs in Reinforcement Learning (RL) that have enabled robots to learn to play arcade video games, walk, or assemble colored bricks can be used to perform other tasks that are currently at the core of engineering cyberphysical systems. We present the first use of RL for the control of systems modeled by discretized non-linear Partial Differential Equations (PDEs) and devise a novel algorithm to use non-parametric control techniques for large multi-agent systems. Cyberphysical systems (e.g., hydraulic channels, transportation systems, the energy grid, electromagnetic systems) are commonly modeled by PDEs, which historically have been a reliable way to enable engineering applications in these domains. However, the control of these PDE models is notoriously difficult. We show how neural-network-based RL enables the control of discretized PDEs whose parameters are unknown, random, and time-varying. We introduce an algorithm of Mutual Weight Regularization (MWR), which alleviates the curse of dimensionality of multi-agent control schemes by sharing experience between agents while giving each agent the opportunity to specialize its action policy so as to tailor it to the local parameters of the part of the system in which it is located. A discretized PDE such as the scalar Lighthill-Whitham-Richards (LWR) PDE can indeed be considered a macroscopic freeway traffic simulator, one that presents the most salient challenges for learning to control large cyberphysical systems with multiple agents. We consider two different discretization procedures and show the opportunities offered by applying deep reinforcement learning for continuous control on both. Using a neural RL PDE controller on a traffic flow simulation based on a Godunov discretization of the San Francisco Bay Bridge, we are able to achieve precise adaptive metering without model calibration, thereby beating the state of the art in traffic metering. Furthermore, with the more accurate BeATS simulator we achieve control performance on par with ALINEA, a state-of-the-art parametric control scheme, and show how using MWR improves the learning procedure.
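As a concrete illustration of the traffic model this abstract refers to, the scalar LWR conservation law ρ_t + f(ρ)_x = 0 can be discretized with a Godunov scheme using the standard demand/supply interface flux. The sketch below is a minimal assumption-laden example, not the paper's simulator: the Greenshields flux, grid size, time step, and fixed boundary cells are all illustrative choices.

```python
import numpy as np

def greenshields_flux(rho, v_max=1.0, rho_max=1.0):
    """Greenshields fundamental diagram: f(rho) = rho * v_max * (1 - rho/rho_max)."""
    return rho * v_max * (1.0 - rho / rho_max)

def godunov_step(rho, dt, dx, v_max=1.0, rho_max=1.0):
    """One Godunov update of the scalar LWR PDE using the demand/supply flux."""
    rho_c = rho_max / 2.0                       # critical density of the concave flux
    f_max = greenshields_flux(rho_c, v_max, rho_max)
    # demand of the upstream cell, supply of the downstream cell
    demand = np.where(rho[:-1] <= rho_c,
                      greenshields_flux(rho[:-1], v_max, rho_max), f_max)
    supply = np.where(rho[1:] >= rho_c,
                      greenshields_flux(rho[1:], v_max, rho_max), f_max)
    flux = np.minimum(demand, supply)           # Godunov flux at interior interfaces
    rho_next = rho.copy()
    rho_next[1:-1] -= (dt / dx) * (flux[1:] - flux[:-1])
    return rho_next                             # boundary cells kept fixed for simplicity

# uniform traffic is a steady state: one step leaves it unchanged
rho = np.full(10, 0.5)
rho_after = godunov_step(rho, dt=0.5, dx=1.0)
```

With the CFL condition dt * v_max / dx ≤ 1 satisfied, this update keeps densities within [0, rho_max]; an RL metering controller would act on such a discretized state, one agent per cell or on-ramp.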

Intelligent Control of Groundwater in Slopes with Deep Reinforcement Learning

Sensors

The occurrence of landslides has been increasing in recent years due to intense and prolonged rainfall events. Lowering the groundwater in natural and man-made slopes can help to mitigate the hazards. Subsurface drainage systems equipped with pumps have traditionally been regarded as a temporary remedy for lowering the groundwater in geosystems, whereas long-term usage of pumping-based techniques is uncommon due to the associated high operational costs in labor and energy. This study investigates the intelligent control of groundwater in slopes enabled by deep reinforcement learning (DRL), a subfield of machine learning for automated decision-making. The purpose is to develop an autonomous geosystem that can minimize the operating cost and enhance the system’s safety without introducing human errors and interventions. To prove the concept, a seepage analysis model was implemented using a partial differential equation solver, FEniCS, to simulate the geosystem (i.e., a slope equipped ...

Field test of neural-network based automatic bucket-filling algorithm for wheel-loaders

Automation in Construction, 2019

Automation of the earth-moving industries (construction, mining, and quarrying) requires automatic bucket-filling algorithms for efficient operation of front-end loaders. Autonomous bucket-filling has been an open problem for three decades due to the difficulty of developing useful earth models (soil, gravel, and rock) for automatic control. Operators make use of vision, sound, and vestibular feedback to perform the bucket-filling operation with high productivity and fuel efficiency. In this paper, field experiments with a small time-delayed neural network (TDNN) implemented in the bucket control-loop of a Volvo L180H front-end loader filling medium coarse gravel are presented. The total delay time parameter of the TDNN is found to be an important hyperparameter due to the variable delay present in the hydraulics of the wheel-loader. The TDNN successfully performs the bucket-filling operation after an initial period (100 examples) of imitation learning from an expert operator. The demonstrated solution shows only 26% longer bucket-filling time, an improvement over manual tele-operation performance.
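The core idea of a TDNN is that each output depends on a fixed window of delayed input samples rather than on the current sample alone. A minimal sketch of that structure follows; the window length, channel count, layer sizes, and output semantics are illustrative assumptions, not the configuration reported in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_delay_windows(signal, delay):
    """Stack the last `delay` samples of every sensor channel into one input
    vector per time step: the time-delay structure of a TDNN."""
    T, channels = signal.shape
    return np.stack([signal[t - delay:t].ravel() for t in range(delay, T)])

# illustrative dimensions only (not the paper's actual network)
delay, channels, hidden = 5, 3, 16
sensors = rng.normal(size=(100, channels))      # e.g. pressures, speeds, angles
X = make_delay_windows(sensors, delay)          # shape: (95, delay * channels)

# a single-hidden-layer network mapping each delayed window to lever commands
W1 = rng.normal(scale=0.1, size=(delay * channels, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, 2))    # e.g. lift and tilt commands
commands = np.tanh(X @ W1) @ W2
```

The "total delay time" hyperparameter discussed in the abstract corresponds to `delay` here: it sets how far back the network can see, which matters when the hydraulics respond with a variable lag.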

Control of rough terrain vehicles using deep reinforcement learning

ArXiv, 2021

We explore the potential to control terrain vehicles using deep reinforcement learning in scenarios where human operators and traditional control methods are inadequate. This letter presents a controller that perceives, plans, and successfully controls a 16-tonne forestry vehicle with two frame articulation joints, six wheels, and actively articulated suspensions to traverse rough terrain. The carefully shaped reward signal promotes safe, environmentally sound, and efficient driving, which leads to the emergence of unprecedented driving skills. We test the learned skills in a virtual environment, including terrains reconstructed from high-density laser scans of forest sites. The controller displays the ability to handle obstructing obstacles, slopes up to 27°, and a variety of natural terrains, all with limited wheel slip and smooth, upright traversal that makes intelligent use of the active suspensions. The results confirm that deep reinforcement learning has the potential to enhance control of vehicl...

An Imitation Learning Approach for Truck Loading Operations in Backhoe Machines

15th International Conference on Climbing and Walking Robots and the Support Technologies for Mobile Machines (CLAWAR), 2012

This paper presents the development of a motion planning and control system architecture for autonomous earthmoving operations in excavating machines, such as loading a dump truck. The motion planning system is based on imitation learning, a general approach for learning motor skills from human demonstration. This scheme of supervised learning uses dynamical movement primitives (DMPs) as control policies (CPs). A DMP is a non-linear differential equation that encodes a movement; these primitives are used to learn tasks on backhoe machines. A general architecture to achieve autonomous truck-loading operations is described. The effectiveness of our approach for the truck-loading task is also demonstrated, with the machine able to adapt to different operating scenarios.
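In the standard (Ijspeert-style) formulation, a DMP couples a damped spring pulling toward the goal with a learned forcing term that a canonical system phases out over time. The one-dimensional sketch below illustrates that structure; the gains, basis placement, and explicit Euler integration are illustrative choices, not the paper's implementation.

```python
import numpy as np

def dmp_rollout(y0, goal, weights, tau=1.0, dt=0.01, alpha_z=25.0, alpha_x=4.0):
    """Integrate a one-dimensional discrete DMP and return its trajectory."""
    beta_z = alpha_z / 4.0                       # critically damped spring gains
    n_basis = len(weights)
    centers = np.exp(-alpha_x * np.linspace(0, 1, n_basis))
    widths = n_basis ** 1.5 / centers            # heuristic basis widths
    y, z, x = y0, 0.0, 1.0
    traj = [y]
    for _ in range(int(tau / dt)):
        psi = np.exp(-widths * (x - centers) ** 2)
        forcing = (psi @ weights) / (psi.sum() + 1e-10) * x * (goal - y0)
        z += dt / tau * (alpha_z * (beta_z * (goal - y) - z) + forcing)
        y += dt / tau * z
        x += dt / tau * (-alpha_x * x)           # canonical system phases out the forcing
        traj.append(y)
    return np.array(traj)

# with zero weights the DMP converges to the goal like a damped spring
path = dmp_rollout(y0=0.0, goal=1.0, weights=np.zeros(10))
```

Fitting `weights` to a demonstrated bucket or boom trajectory is what turns this generic attractor into a task-specific movement primitive, which is the supervised-learning step the abstract describes.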

Automated Excavator Based on Reinforcement Learning and Multibody System Dynamics

IEEE Access, 2020

Fully autonomous earth-moving heavy equipment able to operate without human intervention can be seen as the primary goal of automated earth construction. Achieving this objective requires that the machines have the ability to adapt autonomously to complex and changing environments. Recent developments in automation have focused on the application of different machine learning approaches, of which the use of reinforcement learning algorithms is considered the most promising. The key advantage of reinforcement learning is the ability of the system to learn, adapt, and work independently in a dynamic environment. This paper investigates an application of a reinforcement learning algorithm to heavy mining machinery automation. To this end, the training associated with reinforcement learning is done using a multibody approach. The procedure combines a multibody approach and proximal policy optimization with a covariance matrix adaptation learning algorithm to simulate an autonomous excavator. The multibody model includes a representation of the hydraulic system, multiple sensors observing the state of the excavator, and deformable ground. The task of loading a hopper with soil taken from a chosen point on the ground is simulated. The excavator is trained to load the hopper effectively within a given time while avoiding collisions with the ground and the hopper. The proposed system demonstrates the desired behavior after short training times.
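For reference, the proximal policy optimization component named here maximizes the standard clipped surrogate objective (the covariance-matrix-adaptation part of the combined procedure adapts the exploration distribution and is a separate mechanism, not shown):

```latex
L^{\mathrm{CLIP}}(\theta)
  = \hat{\mathbb{E}}_t\!\left[\min\!\Big(r_t(\theta)\,\hat{A}_t,\;
    \mathrm{clip}\big(r_t(\theta),\,1-\epsilon,\,1+\epsilon\big)\,\hat{A}_t\Big)\right],
\qquad
r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\mathrm{old}}}(a_t \mid s_t)},
```

where Â_t is an advantage estimate and ε the clipping parameter; clipping the probability ratio r_t keeps each policy update close to the previous policy, which is what makes PPO stable enough for long multibody training runs.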

Longitudinal Deep Truck: Deep learning and deep reinforcement learning for modeling and control of longitudinal dynamics of heavy duty trucks

ArXiv, 2021

A heavy-duty truck's mechanical configuration is often tailor-designed and built for specific truck mission requirements. This renders the precise derivation of analytical dynamical models and controls for these trucks from first principles challenging and tedious, and it often requires several theoretical and applied areas of expertise to carry through. This article investigates deep learning and deep reinforcement learning as truck-configuration-agnostic longitudinal modeling and control approaches for heavy-duty trucks. The article outlines a process to develop and validate such models and controllers and highlights relevant practical considerations. The process is applied to simulated and real full-size trucks for validation and experimental performance evaluation. The results presented demonstrate the applicability of this approach to trucks of multiple configurations; the models generated were accurate for control development purposes both in simulation and in the field.

Smart Autonomous Vehicles in High Dimensional Warehouses Using Deep Reinforcement Learning Approach

2021

In this paper, we propose a smart planning and control system for autonomous vehicles in a high-dimensional space. It is a completely unsupervised scheduler and motion planner. Many warehouses take advantage of an automated material handling process for product transshipment to speed up procedures. However, the growth of the space dimensions becomes a major issue, as the control system grows increasingly complex. The introduced model uses, as input, a kernel with a control system based on Deep Reinforcement Learning for the low-dimensional space. In addition, it employs a global transition-control system to intelligently coordinate communications between the kernels. The global transition-control system creates virtual paths for each product, assigns tasks to kernels for handling products in their zones, and ensures transitions between blocks to bring each product to its destination. Our approach yields good performance in terms of speed and number of movements. The system i...

Position Control of a Mobile Robot through Deep Reinforcement Learning

Applied Sciences

This article proposes the use of reinforcement learning (RL) algorithms to control the position of a simulated Khepera IV mobile robot in a virtual environment. The simulated environment uses the OpenAI Gym library in conjunction with CoppeliaSim, a 3D simulation platform, to perform the experiments and control the position of the robot. The RL agents used are the deep deterministic policy gradient (DDPG) and the deep Q-network (DQN), and their results are compared with two control algorithms called Villela and IPC. The results obtained from experiments in environments with and without obstacles show that DDPG and DQN manage to learn and infer the best actions in the environment, allowing the position control of different target points to be performed effectively and yielding the best results on several metrics and indices.
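One defining ingredient of DDPG, one of the two agents compared in this last paper, is the pair of slowly moving target networks updated by Polyak averaging. The sketch below shows only that update, on toy parameter dictionaries standing in for actor/critic weights; the value of tau and the shapes are illustrative assumptions.

```python
import numpy as np

def soft_update(target, online, tau=0.005):
    """DDPG-style Polyak averaging of target-network parameters:
    theta_target <- tau * theta_online + (1 - tau) * theta_target."""
    return {k: tau * online[k] + (1.0 - tau) * target[k] for k in target}

# toy parameter dictionaries standing in for actor/critic weights
online = {"w": np.ones((2, 2)), "b": np.zeros(2)}
target = {"w": np.zeros((2, 2)), "b": np.zeros(2)}

for _ in range(10):
    target = soft_update(target, online)
# target["w"] has drifted a small, predictable fraction toward online["w"]
```

In full DDPG these slowly updated copies are what evaluate the critic's bootstrap target y = r + γ · Q_target(s', μ_target(s')), decoupling it from the fast-moving online networks and stabilizing learning.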