Luis Paulo Reis | Universidade do Minho

Papers by Luis Paulo Reis

Contextual Direct Policy Search

Journal of Intelligent and Robotic Systems, Jan 8, 2019

Game Adaptation by Using Reinforcement Learning Over Meta Games

Group Decision and Negotiation, Jan 13, 2020

Multi-Agent Deep Reinforcement Learning with Emergent Communication

When compared with their single-agent counterparts, multi-agent systems pose an additional set of challenges for reinforcement learning algorithms, including increased complexity, non-stationary environments, credit assignment, partial observability, and the need for coordination. Deep reinforcement learning has been shown to achieve successful policies through implicit coordination, but it does not handle partial observability. This paper describes a deep reinforcement learning algorithm, based on multi-agent actor-critic, that simultaneously learns action policies for each agent and communication protocols that compensate for partial observability and help enforce coordination. We also study the effects of noisy communication, where messages can be late, lost, noisy, or jumbled, and how this affects the learned policies. We show how agents are able to learn both high-level policies and complex communication protocols for several different partially observable environments, and how our proposal outperforms other state-of-the-art algorithms that do not take advantage of communication, even with noisy communication channels.
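The data flow described in the abstract — each agent conditioning its action on its own observation plus a received message, over an unreliable channel — can be sketched in miniature. Everything below (the `CommAgent` class, the dimensions, the drop/noise channel model) is an illustrative assumption of mine, not the paper's actual architecture, and the weights are random and frozen rather than trained:

```python
import numpy as np

rng = np.random.default_rng(0)

class CommAgent:
    """Toy stand-in for an actor network: maps (observation, received
    message) to a discrete action and an outgoing real-valued message."""

    def __init__(self, obs_dim, msg_dim, n_actions):
        self.W_act = rng.normal(size=(obs_dim + msg_dim, n_actions))
        self.W_msg = rng.normal(size=(obs_dim + msg_dim, msg_dim))

    def step(self, obs, msg_in):
        x = np.concatenate([obs, msg_in])
        action = int(np.argmax(x @ self.W_act))   # greedy for the sketch
        msg_out = np.tanh(x @ self.W_msg)         # bounded message vector
        return action, msg_out

def noisy_channel(msg, drop_p=0.2, sigma=0.1):
    """Message loss and additive noise, of the kind the paper studies."""
    if rng.random() < drop_p:
        return np.zeros_like(msg)                 # message lost entirely
    return msg + rng.normal(scale=sigma, size=msg.shape)

# Two agents, each observing only part of the state, exchange messages.
a, b = CommAgent(4, 2, 3), CommAgent(4, 2, 3)
msg_ab = msg_ba = np.zeros(2)
for _ in range(5):
    obs_a, obs_b = rng.normal(size=4), rng.normal(size=4)
    act_a, msg_ab = a.step(obs_a, noisy_channel(msg_ba))
    act_b, msg_ba = b.step(obs_b, noisy_channel(msg_ab))
```

In the paper the action and message heads are trained jointly (multi-agent actor-critic); this sketch only shows how messages extend each agent's effective observation under an unreliable channel.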

FC Portugal 3D Simulation Team: Team Description Paper 2020

arXiv (Cornell University), Mar 28, 2023

Benchmarking Deep and Non-deep Reinforcement Learning Algorithms for Discrete Environments

Advances in Intelligent Systems and Computing, Nov 20, 2019

Given the plethora of Reinforcement Learning algorithms available in the literature, it can prove challenging to decide on the most appropriate one to solve a given Reinforcement Learning task. This work presents a benchmark study on the performance of several Reinforcement Learning algorithms in discrete learning environments. The study includes several deep as well as non-deep learning algorithms, with special focus on the Deep Q-Network algorithm and its variants. Neural Fitted Q-Iteration, the predecessor of Deep Q-Network, as well as Vanilla Policy Gradient and a planner, were also included in this assessment in order to provide a wider range of comparison between different approaches and paradigms. Three learning environments were used to carry out the tests: a 2D maze and two OpenAI Gym environments, namely a custom-built Foraging/Tagging environment and the CartPole environment.
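To give a flavor of the non-deep side of such a benchmark, here is a tabular Q-learning baseline on a toy one-dimensional corridor. The environment is my own construction, far smaller than the paper's 2D maze or Gym tasks; it only illustrates the kind of value-table learner that the deep methods are compared against:

```python
import numpy as np

# Tabular Q-learning on a 1-D corridor: states 0..4, reward 1 for
# reaching state 4, after which the episode restarts at state 0.
rng = np.random.default_rng(1)
n_states, n_actions = 5, 2            # actions: 0 = left, 1 = right
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9

s = 0
for _ in range(20000):
    a = int(rng.integers(n_actions))  # uniform exploration (Q-learning is off-policy)
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    r = 1.0 if s2 == n_states - 1 else 0.0
    Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])  # TD update
    s = 0 if s2 == n_states - 1 else s2                     # episode reset

greedy = [int(Q[s].argmax()) for s in range(n_states - 1)]
print(greedy)   # prints [1, 1, 1, 1]: the greedy policy always moves right
```

Deep Q-Network and its variants replace the table `Q` with a neural network plus replay and target networks, which is what makes them applicable when the state space is too large to enumerate.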

Learning to Run Faster in a Humanoid Robot Soccer Environment Through Reinforcement Learning

Lecture Notes in Computer Science, 2019

Reinforcement learning techniques bring a new perspective to enduring problems. Developing skills from scratch is appealing not only due to the artificial creation of knowledge: it can also replace years of manual work and refinement in a matter of hours. Among all the skills developed in the RoboCup 3D Soccer Simulation League, running remains highly relevant in determining the winner of any match. However, current approaches do not make full use of the robotic soccer agents' potential. To narrow this gap, we propose a way of leveraging Proximal Policy Optimization using the information provided by the simulator for official RoboCup matches. To do this, our algorithm uses a mix of raw, computed, and internally generated data. The final result is a sprinting behavior and a stopping behavior that work in tandem to bring the agent from point A to point B in a very short time. The sprinting speed stabilizes at around 2.5 m/s, a considerable improvement over current solutions, and both behaviors are remarkably stable.
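For reference, the clipped surrogate objective that defines Proximal Policy Optimization — the general algorithm, not this paper's specific state features or reward shaping — fits in a few lines. The function name and the scalar toy inputs are mine:

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate, to be maximized:
    min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the new/old policy probability ratio and A the advantage."""
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# A good action (A > 0) stops being rewarded for ratios above 1 + eps:
print(ppo_clip_objective(1.5, 1.0))    # 1.2, not 1.5
# A bad action (A < 0) is penalized pessimistically:
print(ppo_clip_objective(0.5, -1.0))   # -0.8, not -0.5
```

The clipping removes the incentive to move the policy more than a factor of `1 ± eps` away from the data-collecting policy in a single update, which is what makes large-scale skill learning like the sprinting behavior stable in practice.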

An Adversarial Approach for Automated Pokémon Team Building and Meta-Game Balance

IEEE Transactions on Games, 2023

Learning hybrid locomotion skills—Learn to exploit residual actions and modulate model-based gait control

Frontiers in Robotics and AI, Apr 10, 2023

Exploring communication protocols and centralized critics in multi-agent deep learning

Integrated Computer-Aided Engineering, Sep 11, 2020

Multi-agent Neural Reinforcement-Learning System with Communication

Advances in Intelligent Systems and Computing, 2019

Deep learning models have recently emerged as popular function approximators for single-agent reinforcement learning challenges, accurately estimating the value function of complex environments and generalizing to new, unseen states. In multi-agent settings, agents must cope with the non-stationarity of the environment, due to the presence of other agents, and can take advantage of information-sharing techniques for improved coordination. We propose a neural-based actor-critic algorithm that learns communication protocols between agents and implicitly shares information during the learning phase. Large numbers of agents communicate with a self-learned protocol during distributed execution, and reliably learn complex strategies and protocols for partially observable multi-agent environments.

Pattern recognition in raw data of electrical impedance tomography using neural networks

Electrical Impedance Tomography (EIT) is a recent medical imaging technique based on multiple impedance measurements using surface electrodes. The electrical impedance of a tissue, which characterizes its ability to conduct electrical current, depends on its structure and physiological state. In cellular mediums, the transmission of electrical signals involves the ionic conduction in the interstitial space and cytoplasm and the capacitive properties of the cell membranes (1). The ac impedance of a tissue varies with the applied signal ...

Learning Hybrid Locomotion Skills -- Learn to Exploit Residual Dynamics and Modulate Model-based Gait Control

arXiv (Cornell University), Nov 27, 2020

A CPG-Based Agile and Versatile Locomotion Framework Using Proximal Symmetry Loss

arXiv (Cornell University), Mar 1, 2021

A Hybrid Biped Stabilizer System Based on Analytical Control and Learning of Symmetrical Residual Physics

arXiv (Cornell University), Nov 27, 2020

Generic Coordination Methodologies Applied to the RoboCup Simulation Leagues

FC Portugal: RoboCup 2022 3D Simulation League and Technical Challenge Champions

6D Localization and Kicking for Humanoid Robotic Soccer

Journal of Intelligent and Robotic Systems, May 12, 2021

Fostering Efficient Learning in the Technical Field of Robotics by Changing the Autonomous Driving Competition of the Portuguese Robotics Open

ICERI Proceedings, Nov 1, 2017

Mixed-Policy Asynchronous Deep Q-Learning

Advances in Intelligent Systems and Computing, Dec 21, 2017

There are many open issues and challenges in the reinforcement learning field, such as handling high-dimensional environments. Function approximators, such as deep neural networks, have been successfully used in both single- and multi-agent environments with high-dimensional state spaces. The multi-agent learning paradigm faces even more problems, due to the effect of several agents learning simultaneously in the environment. One of its main concerns is how to learn mixed policies that prevent opponents from exploiting them in competitive environments, achieving a Nash equilibrium. We propose an extension of several algorithms, able to achieve Nash equilibria in single-state games, to the deep-learning paradigm. We compare their deep-learning and table-based implementations, and demonstrate how WPL is able to achieve an equilibrium strategy in a complex environment, where agents must find each other in an infinite-state game and play a modified version of the Rock-Paper-Scissors game.
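The WPL (Weighted Policy Learner) update weights the policy gradient asymmetrically: by π(a) when the gradient for action a is negative and by 1 − π(a) when it is positive, which slows movement near the simplex boundary and pushes play toward mixed strategies. A tabular sketch on matrix Rock-Paper-Scissors follows; the starting policies, step size, and simplex projection are my own illustrative choices, and the paper's deep variant replaces this exact gradient with learned estimates:

```python
import numpy as np

# Row player's payoffs for Rock-Paper-Scissors (zero-sum, symmetric);
# the unique Nash equilibrium is the mixed strategy (1/3, 1/3, 1/3).
R = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]], dtype=float)

def project_simplex(p, floor=1e-3):
    """Keep the policy a valid probability distribution."""
    p = np.clip(p, floor, None)
    return p / p.sum()

def wpl_step(pi, pi_opp, eta=0.05):
    grad = R @ pi_opp                 # expected payoff of each action
    grad = grad - grad @ pi           # advantage over the current value
    weight = np.where(grad < 0, pi, 1.0 - pi)   # WPL's asymmetric weighting
    return project_simplex(pi + eta * grad * weight)

pi_a = np.array([0.8, 0.1, 0.1])      # both players start far from equilibrium
pi_b = np.array([0.1, 0.8, 0.1])
for _ in range(5000):
    pi_a, pi_b = wpl_step(pi_a, pi_b), wpl_step(pi_b, pi_a)
print(pi_a.round(3), pi_b.round(3))
```

A pure strategy here is immediately exploitable, which is why the dynamics cannot settle on a deterministic policy; the asymmetric weighting is what damps the oscillation that plain gradient ascent would exhibit in this game.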

Robust biped locomotion using deep reinforcement learning on top of an analytical control approach

Robotics and Autonomous Systems, Dec 1, 2021
