Luis Paulo Reis | Universidade do Minho
Papers by Luis Paulo Reis
Journal of Intelligent and Robotic Systems, Jan 8, 2019
Group Decision and Negotiation, Jan 13, 2020
When compared with their single-agent counterpart, multi-agent systems pose an additional set of challenges for reinforcement learning algorithms, including increased complexity, non-stationary environments, credit assignment, partial observability, and achieving coordination. Deep reinforcement learning has been shown to achieve successful policies through implicit coordination, but does not handle partial observability. This paper describes a deep reinforcement learning algorithm, based on multi-agent actor-critic, that simultaneously learns action policies for each agent and communication protocols that compensate for partial observability and help enforce coordination. We also study the effects of noisy communication, where messages can be late, lost, noisy, or jumbled, and how that affects the learned policies. We show how agents are able to learn both high-level policies and complex communication protocols for several different partially observable environments. We also show how our proposal outperforms other state-of-the-art algorithms that do not take advantage of communication, even with noisy communication channels.
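The message corruptions studied in this abstract (late, lost, or noisy messages) can be modelled with a simple step-based channel. The sketch below is illustrative only; the class and parameter names are assumptions, not the authors' implementation.

```python
import random

class NoisyChannel:
    """Toy model of a lossy, delaying, noisy inter-agent channel.
    drop_p: probability a message is lost entirely.
    delay_p: probability a message arrives one step late.
    noise_std: std of additive Gaussian noise on delivered values."""

    def __init__(self, drop_p=0.1, delay_p=0.1, noise_std=0.05, seed=0):
        self.drop_p, self.delay_p, self.noise_std = drop_p, delay_p, noise_std
        self.rng = random.Random(seed)
        self.pending = []   # messages sent during the current step
        self.delayed = []   # messages held back until the next step

    def send(self, msg):
        """Queue a message (a list of floats); it may be dropped here."""
        if self.rng.random() >= self.drop_p:
            self.pending.append(list(msg))

    def deliver(self):
        """Return this step's surviving messages plus last step's late ones,
        each perturbed with Gaussian noise."""
        arriving, self.delayed = self.delayed, []
        for msg in self.pending:
            if self.rng.random() < self.delay_p:
                self.delayed.append(msg)      # arrives next step instead
            else:
                arriving.append(msg)
        self.pending = []
        return [[x + self.rng.gauss(0.0, self.noise_std) for x in m]
                for m in arriving]
```

In a training loop, each agent's emitted message vector would pass through such a channel before being concatenated to the other agents' observations.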
arXiv (Cornell University), Mar 28, 2023
Advances in intelligent systems and computing, Nov 20, 2019
Given the plethora of Reinforcement Learning algorithms available in the literature, it can prove challenging to decide on the most appropriate one to solve a given Reinforcement Learning task. This work presents a benchmark study on the performance of several Reinforcement Learning algorithms for discrete learning environments. The study includes several deep as well as non-deep learning algorithms, with special focus on the Deep Q-Network algorithm and its variants. Neural Fitted Q-Iteration (the predecessor of Deep Q-Network), as well as Vanilla Policy Gradient and a planner, were also included in this assessment to provide a wider range of comparison between different approaches and paradigms. Three learning environments were used to carry out the tests: a 2D maze and two OpenAI Gym environments, namely a custom-built Foraging/Tagging environment and the CartPole environment.
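A minimal non-deep baseline of the kind such benchmarks compare against is tabular Q-learning on a small gridworld. The sketch below uses an illustrative 4x4 maze and hyperparameters of my choosing, not the paper's exact environments or settings.

```python
import random

def q_learning_maze(episodes=500, alpha=0.5, gamma=0.95, eps=0.1, seed=0):
    """Tabular Q-learning on a 4x4 gridworld: start (0,0), goal (3,3),
    reward -1 per step and 0 on reaching the goal. An illustrative
    stand-in for the non-deep baselines in a benchmark of this kind."""
    rng = random.Random(seed)
    moves = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up
    Q = {}
    def q(s):
        return Q.setdefault(s, [0.0] * 4)
    for _ in range(episodes):
        s = (0, 0)
        while s != (3, 3):
            # epsilon-greedy action selection
            if rng.random() < eps:
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: q(s)[i])
            s2 = (min(max(s[0] + moves[a][0], 0), 3),
                  min(max(s[1] + moves[a][1], 0), 3))
            r = 0.0 if s2 == (3, 3) else -1.0
            # standard Q-learning temporal-difference update
            q(s)[a] += alpha * (r + gamma * max(q(s2)) - q(s)[a])
            s = s2
    return Q
```

After training, the greedy policy recovered from `Q` reaches the goal along a shortest path; the deep variants in the study replace the table `Q` with a neural network.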
Lecture Notes in Computer Science, 2019
Reinforcement learning techniques bring a new perspective to enduring problems. Developing skills from scratch is not only appealing due to the artificial creation of knowledge; it can also replace years of work and refinement in a matter of hours. Of all the skills developed in the RoboCup 3D Soccer Simulation League, running remains considerably relevant in determining the winner of any match. However, current approaches do not make full use of the robotic soccer agents' potential. To narrow this gap, we propose a way of leveraging Proximal Policy Optimization using the information provided by the simulator for official RoboCup matches. To do this, our algorithm uses a mix of raw, computed and internally generated data. The final result is a sprinting and a stopping behavior that work in tandem to bring the agent from point A to point B in a very short time. The sprinting speed stabilizes at around 2.5 m/s, which is a great improvement over current solutions. Both the sprinting and stopping behaviors are remarkably stable.
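At the core of the Proximal Policy Optimization method mentioned above is the clipped surrogate objective. The function below is a minimal sketch of that loss on pre-computed probability ratios and advantages; it is generic PPO (Schulman et al.), not the authors' training code.

```python
def ppo_clip_loss(ratios, advantages, eps=0.2):
    """Clipped surrogate objective of PPO.
    ratios: pi_new(a|s) / pi_old(a|s) for each sampled transition.
    advantages: advantage estimates for the same transitions.
    Returns the negated mean objective (a loss to minimize)."""
    total = 0.0
    for r, adv in zip(ratios, advantages):
        clipped = min(max(r, 1.0 - eps), 1.0 + eps)
        # take the pessimistic (smaller) of the two surrogates
        total += min(r * adv, clipped * adv)
    return -total / len(ratios)
```

Clipping the ratio to `[1 - eps, 1 + eps]` removes the incentive to move the new policy far from the old one in a single update, which is what makes PPO stable enough for skills such as sprinting.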
IEEE transactions on games, 2023
Frontiers in Robotics and AI, Apr 10, 2023
Integrated Computer-aided Engineering, Sep 11, 2020
Advances in intelligent systems and computing, 2019
Deep learning models have recently emerged as popular function approximators for single-agent reinforcement learning challenges, accurately estimating the value function of complex environments and generalizing to new unseen states. In multi-agent settings, agents must cope with the non-stationarity of the environment, due to the presence of other agents, and can take advantage of information-sharing techniques for improved coordination. We propose a neural-based actor-critic algorithm, which learns communication protocols between agents and implicitly shares information during the learning phase. Large numbers of agents communicate with a self-learned protocol during distributed execution, and reliably learn complex strategies and protocols for partially observable multi-agent environments.
Electrical Impedance Tomography (EIT) is a recent medical imaging technique based on multiple impedance measurements using surface electrodes. The electrical impedance of a tissue, which characterizes its ability to conduct electrical current, depends on its structure and physiological state. In cellular mediums, the transmission of electrical signals involves the ionic conduction in the interstitial space and cytoplasm and the capacitive properties of the cell membranes (1). The ac impedance of a tissue varies with the applied signal ...
arXiv (Cornell University), Nov 27, 2020
arXiv (Cornell University), Mar 1, 2021
arXiv (Cornell University), Nov 27, 2020
Journal of Intelligent and Robotic Systems, May 12, 2021
ICERI proceedings, Nov 1, 2017
Advances in intelligent systems and computing, Dec 21, 2017
There are many open issues and challenges in the reinforcement learning field, such as handling high-dimensional environments. Function approximators, such as deep neural networks, have been successfully used in both single- and multi-agent environments with high-dimensional state spaces. The multi-agent learning paradigm faces even more problems, due to the effect of several agents learning simultaneously in the environment. One of its main concerns is how to learn mixed policies that prevent opponents from exploiting them in competitive environments, achieving a Nash equilibrium. We propose an extension of several algorithms able to achieve Nash equilibria in single-state games to the deep-learning paradigm. We compare their deep-learning and table-based implementations, and demonstrate how WPL is able to achieve an equilibrium strategy in a complex environment, where agents must find each other in an infinite-state game and play a modified version of the Rock Paper Scissors game.
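The WPL (Weighted Policy Learner) algorithm named in the abstract weights each action's policy-gradient step by how close that action's probability is to the simplex boundary, which helps play converge to mixed equilibria such as the uniform strategy in Rock Paper Scissors. The sketch below is a single tabular update step under my own simplifying assumptions (estimated action values as input, plain renormalisation as the projection), not the paper's deep-learning implementation.

```python
def wpl_update(policy, values, lr=0.01):
    """One Weighted Policy Learner step (after Abdallah and Lesser):
    the gradient for action a is scaled by (1 - pi(a)) when positive
    and by pi(a) when negative, slowing movement near the boundary.
    policy: current action probabilities; values: estimated action values."""
    avg = sum(p * v for p, v in zip(policy, values))
    new = []
    for p, v in zip(policy, values):
        grad = v - avg                       # advantage of this action
        weight = (1.0 - p) if grad > 0 else p
        new.append(max(p + lr * weight * grad, 1e-6))
    total = sum(new)                         # project back onto the simplex
    return [p / total for p in new]
```

At the uniform policy of Rock Paper Scissors all action values are equal, so the update is a fixed point; when one action looks better, probability mass shifts toward it, but ever more slowly as its probability grows.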
Robotics and Autonomous Systems, Dec 1, 2021