Discover the key research papers that have shaped the field of reinforcement learning. These papers provide essential insights into the algorithms, methodologies, and applications driving this area of AI. Whether you are a researcher, student, or enthusiast, delve into these top papers to deepen your understanding of reinforcement learning.
Yan Duan, John Schulman, Xi Chen + 3 more
journal unavailable
This paper proposes to represent a “fast” reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data, and shows that its performance on new MDPs is close to human-designed algorithms with optimality guarantees.
A. Issa, A. Aldair
Iraqi Journal for Electrical and Electronic Engineering
DDPG reinforcement learning is proposed to generate and improve the walking motion of the quadruped robot, and the results show that the behaviour of the walking robot has been improved compared with the previous cases.
P. Graf, J. Annoni, C. Bay + 4 more
2019 American Control Conference (ACC)
A new algorithm for distributed Reinforcement Learning (RL), a combination of the Alternating Direction Method of Multipliers (ADMM) and reinforcement learning that allows for integrating learned controllers as subsystems in generally convergent distributed control applications.
Walt Woods
2021 IEEE Security and Privacy Workshops (SPW)
This work proposes a novel set of mechanisms for grammar inference, RL-GRIT, shows that RL can be used to surpass the expressiveness of both classes, and offers a clear path to learning context-sensitive languages.
Shelley Nason
journal unavailable
An architectural modification to Soar is described that gives a Soar agent the opportunity to learn statistical information about the past success of its actions and utilize this information when selecting an operator.
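The operator-selection idea above can be sketched with a small, illustrative stand-in (this is not Soar's actual implementation): the agent keeps a running success rate for each operator and prefers the operator with the best record whenever several are proposed.

```python
class OperatorStats:
    """Track per-operator success statistics and select among proposed operators."""
    def __init__(self):
        self.successes = {}
        self.tries = {}

    def record(self, op, succeeded):
        # Update the running tally after each operator application.
        self.tries[op] = self.tries.get(op, 0) + 1
        self.successes[op] = self.successes.get(op, 0) + (1 if succeeded else 0)

    def success_rate(self, op):
        if self.tries.get(op, 0) == 0:
            return 0.5  # optimistic prior for untried operators
        return self.successes[op] / self.tries[op]

    def select(self, proposed_ops):
        # Prefer the operator with the best historical success rate.
        return max(proposed_ops, key=self.success_rate)

stats = OperatorStats()
stats.record("move-left", False)
stats.record("move-right", True)
stats.record("move-right", True)
choice = stats.select(["move-left", "move-right"])
```

The operator names and the optimistic prior are assumptions made for the example.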
Aldo Pacchiano, Aadirupa Saha, Jonathan Lee
journal unavailable
This work is one of the first to give tight regret guarantees for preference-based RL problems with trajectory preferences, where the trajectory preferences are encoded by a generalized linear model of dimension $d$.
Caglar Gulcehre, Ziyun Wang, Alexander Novikov + 15 more
ArXiv
This paper proposes a benchmark called RL Unplugged to evaluate and compare offline RL methods, a suite of benchmarks that will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
Ahmed Hallawa, Thorsten Born, A. Schmeink + 6 more
Proceedings of the Genetic and Evolutionary Computation Conference Companion
Results show that reinforcement learning algorithms embedded within the Evolutionary-Driven Reinforcement Learning approach significantly outperform the stand-alone versions of the same RL algorithms on OpenAI Gym control problems with rewardless states constrained by the same computational budget.
Bryan M. Li, A. Cowen-Rivers, Piotr Kozakowski + 6 more
J. Open Source Softw.
This work argues that the five fundamental properties of a sophisticated research codebase are modularity, reproducibility, many pre-implemented RL algorithms, speed, and ease of running on different hardware/integration with visualization packages.
H. Jomaa, Josif Grabocka, L. Schmidt-Thieme
ArXiv
Experiments demonstrate that the model does not have to rely on a heuristic acquisition function like SMBO, but can learn which hyperparameters to test next based on the subsequent reduction in validation loss they will eventually lead to, and outperforms the state-of-the-art approaches for hyperparameter learning.
Brijen Thananjeyan, A. Balakrishna, Suraj Nair + 7 more
IEEE Robotics and Automation Letters
This work proposes Recovery RL, an algorithm which navigates this tradeoff by leveraging offline data to learn about constraint violating zones before policy learning and separating the goals of improving task performance and constraint satisfaction across two policies: a task policy that only optimizes the task reward and a recovery policy that guides the agent to safety when constraint violation is likely.
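The two-policy gating described above can be sketched as follows; the names (`q_risk`, `eps_risk`) are assumptions for illustration, not the paper's API. A learned risk estimate decides whether the task policy's proposed action is allowed or the recovery policy takes over.

```python
def recovery_rl_action(state, task_policy, recovery_policy, q_risk, eps_risk=0.3):
    """Pick the task action unless its estimated constraint-violation
    risk exceeds the threshold, in which case fall back to recovery."""
    proposed = task_policy(state)
    if q_risk(state, proposed) > eps_risk:
        return recovery_policy(state)
    return proposed

# Toy 1-D example: state is the distance to a cliff at position 0.
task_policy = lambda s: -1       # task reward pulls toward the cliff
recovery_policy = lambda s: +1   # recovery pushes away from it
q_risk = lambda s, a: 1.0 if s + a <= 0 else 0.0  # stand-in risk model

action_far = recovery_rl_action(5, task_policy, recovery_policy, q_risk)
action_near = recovery_rl_action(1, task_policy, recovery_policy, q_risk)
```

The separation keeps the task policy free to optimize reward only, while safety lives entirely in the risk estimate and recovery policy.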
Zhecheng Yuan, Sizhe Yang, Pu Hua + 4 more
ArXiv
RL-ViGen is a novel Reinforcement Learning Benchmark for Visual Generalization, which contains diverse tasks and a wide spectrum of generalization types, thereby facilitating the derivation of more reliable conclusions and laying a foundation for the future creation of universal visual generalization RL agents suitable for real-world scenarios.
Anant A. Joshi, A. Taghvaei, P. Mehta
ArXiv
A novel simulation-based algorithm, namely an ensemble Kalman filter (EnKF), is introduced and used to obtain formulae for optimal control, expressed entirely in terms of the EnKF particles.
Varun Geetha, Mohamed Ariff Ameedeen
journal unavailable
This paper discusses why Python is required in Reinforcement Learning (RL), covering its wide array of open-source code libraries, its package management, and its ability to work well on platforms other than Windows OS.
Kanishka Rao, Chris Harris, A. Irpan + 3 more
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
The RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning, is obtained by incorporating the RL-scene consistency loss into unsupervised domain translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
Junjie Zhang, Minghao Ye, Zehua Guo + 2 more
IEEE Journal on Selected Areas in Communications
CFR-RL (Critical Flow Rerouting-Reinforcement Learning), a Reinforcement Learning-based scheme that learns a policy to select critical flows for each given traffic matrix automatically and reroutes these selected critical flows to balance link utilization of the network by formulating and solving a simple Linear Programming problem.
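The rerouting step described above can be sketched with a toy version. The paper solves a linear program; here, for brevity, each selected "critical" flow is greedily moved to the currently least-utilized link, and picking the k largest demands stands in for the learned selection policy. All names are illustrative assumptions.

```python
def reroute_critical_flows(flows, link_load, k=1):
    """flows: {flow_id: (link, demand)}; link_load: {link: load}.
    Move the k largest flows to the least-loaded link; return max utilization."""
    # "Critical" flows = the k largest demands (CFR-RL learns this choice).
    critical = sorted(flows, key=lambda f: flows[f][1], reverse=True)[:k]
    for f in critical:
        link, demand = flows[f]
        link_load[link] -= demand                 # remove flow from its link
        best = min(link_load, key=link_load.get)  # greedy stand-in for the LP
        link_load[best] += demand                 # place it on the emptiest link
        flows[f] = (best, demand)
    return max(link_load.values())

flows = {"f1": ("A", 8.0), "f2": ("A", 2.0), "f3": ("B", 1.0)}
link_load = {"A": 10.0, "B": 1.0}
max_util = reroute_critical_flows(flows, link_load, k=1)
```

Moving only the single largest flow drops the maximum link load from 10.0 to 9.0, which is the paper's point: rerouting a few critical flows suffices to rebalance the network.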
Agustin Castellano, J. Bazerque, Enrique Mallada
ArXiv
This work defines value and action-value functions that satisfy a barrier-based decomposition which allows for the identification of feasible policies independently of the reward process and develops a Barrier-learning algorithm, based on Q-Learning, that identifies unsafe state-action pairs.
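The barrier idea above admits a small illustrative sketch (not the paper's exact algorithm): a state-action pair is flagged unsafe if it violates the constraint directly, or if every action available afterwards is itself unsafe, so unsafety propagates backwards independently of the reward.

```python
def learn_unsafe_pairs(transitions, violating_states, n_iters=10):
    """transitions: {(s, a): s'} for a deterministic toy MDP.
    Returns {(s, a): bool} marking pairs from which violation is inevitable."""
    unsafe = {sa: (transitions[sa] in violating_states) for sa in transitions}
    for _ in range(n_iters):
        for (s, a), s2 in transitions.items():
            next_pairs = [p for p in transitions if p[0] == s2]
            # If every follow-up action is unsafe, this pair is unsafe too.
            if next_pairs and all(unsafe[p] for p in next_pairs):
                unsafe[(s, a)] = True
    return unsafe

# Chain 0 -> 1 -> 2 (cliff); from state 1 every action falls into the
# cliff, so even entering state 1 from state 0 is unsafe.
transitions = {(0, "stay"): 0, (0, "go"): 1, (1, "a"): 2, (1, "b"): 2}
unsafe = learn_unsafe_pairs(transitions, violating_states={2})
```

In the paper this backup is done with a Q-Learning-style update from samples; the fixed-point computation above is the model-based analogue.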
João Miguel Proença Abreu
journal unavailable
Novel active exploration strategies that combine and extend existing approaches for exploration with Deep RL architectures are contributed, showcasing the positive impact of active exploration in the learning performance of RL algorithms with neural network approximations.
Giulia Milan, L. Vassio, I. Drago + 1 more
2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS)
Our life is getting filled by Internet of Things (IoT) devices. These devices often rely on closed or poorly documented protocols, with unknown formats and semantics. Learning how to interact with such devices in an autonomous manner is the key for interoperability and automatic verification of their capabilities. In this paper, we propose RL-IoT, a system that explores how to automatically interact with possibly unknown IoT devices. We leverage reinforcement learning (RL) to recover the semantics of protocol messages and to take control of the device to reach a given goal, while minimizing th...
Bruno C. da Silva, Eduardo W. Basso, A. Bazzan + 1 more
journal unavailable
A method for managing multiple partial models of the environment is proposed and described and previous results show that the proposed mechanism has better convergence times comparing to standard RL algorithms.
S. Mohanty, Erik Nygren, Florian Laurent + 11 more
ArXiv
A two-dimensional simplified grid environment called "Flatland" that allows for faster experimentation and demonstrates that ML has potential in solving the VRSP on Flatland and identifies key topics that need further research.
Xiaoxiao Liang, Yikang Ouyang, Haoyu Yang + 2 more
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
This article pioneers the introduction of a reinforcement learning (RL) model for mask optimization, which directly optimizes the preferred objective without leveraging a differentiable proxy, and outperforms state-of-the-art solutions.
Shaoteng Liu, Haoqi Yuan, Minda Hu + 5 more
ArXiv
This work introduces a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent: the slow agent analyzes which actions are suitable for coding, while the fast agent executes the coding tasks.
Yingjie Miao, Xingyou Song, Daiyi Peng + 3 more
ArXiv
Throughout this training process, it is shown that the supernet gradually learns better cells, leading to alternative architectures which are highly competitive against manually designed policies and which also verify previous design choices for RL policies.
Johanna Hansen, F. Hogan, D. Rivkin + 3 more
2022 International Conference on Robotics and Automation (ICRA)
This paper focuses on the problem setting where both visual and tactile sensors provide pixel-level feedback for Visuotactile reinforcement learning agents, and investigates the challenges associated with multimodal learning and proposes several improvements to existing RL methods.
Vially Kazadi Mutombo, Seungyeon Lee, Jusuk Lee + 1 more
Mob. Inf. Syst.
The proposed EER-RL, an energy-efficient routing protocol based on reinforcement learning, allows devices to adapt to network changes, such as mobility and energy level, and improve routing decisions, and the results show that the proposed protocol performs better in terms of energy efficiency and network lifetime and scalability.
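The energy-aware next-hop choice such a protocol learns can be sketched as below. The weighting and field names are assumptions for illustration, not EER-RL's actual update rule: the score simply trades off a neighbor's remaining battery against its distance to the sink.

```python
def pick_next_hop(neighbors, alpha=0.5):
    """neighbors: {node: (residual_energy in [0, 1], hops_to_sink)}.
    Score rewards remaining battery and penalizes distance to the sink."""
    def score(node):
        energy, hops = neighbors[node]
        return alpha * energy - (1 - alpha) * hops
    return max(neighbors, key=score)

neighbors = {
    "n1": (0.9, 3),   # full battery, but far from the sink
    "n2": (0.8, 1),   # good battery, near the sink
    "n3": (0.1, 1),   # nearly drained, near the sink
}
hop = pick_next_hop(neighbors)
```

Because the score re-evaluates residual energy on every decision, the choice adapts as batteries drain or nodes move, which is the adaptivity the abstract highlights.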
Alexandru Rinciog, Anne Meyer
2021 Winter Simulation Conference (WSC)
This work introduces FabricatioRL, an RL compatible, customizable and extensible benchmarking simulation framework that can interface with both traditional approaches and RL, and ensure that generic production setups can be covered, and experiments are reproducible.
Farzad Peyravi, V. Derhami, A. Latif
2015 The International Symposium on Artificial Intelligence and Signal Processing (AISP)
Comparison of RLS with the Simple Search Algorithm, the Referral Algorithm, and SNPageRank shows an increase in both precision and recall.
Xin Wang, Ziwei Luo, Jing Hu + 5 more
journal unavailable
This work reformulate I2IT as a step-wise decision-making problem via deep reinforcement learning (DRL) and proposes a novel framework that performs RL-based I2IT (RL-I2IT), to decompose a monolithic learning process into small steps with a lightweight model to progressively transform a source image successively to a target image.
Chih-Kai Ho, C. King
IEEE Access
A contrastive spatio-temporal representation learning framework for RL, called CST-RL, is introduced, which leverages 3D Convolutional Neural Network (3D CNN) alongside contrastive learning for sample-efficient RL.
Tunde Aderinwale, Charles W Christoffer, D. Kihara
Frontiers in Molecular Biosciences
A novel method is introduced, RL-MLZerD, which builds multiple protein complexes using reinforcement learning (RL), and it emerged that the docking order of multi-chain complexes can be naturally predicted by examining preferred paths of episodes in the RL computation.
Dongsheng Zuo, Yikang Ouyang, Yuzhe Ma
2023 60th ACM/IEEE Design Automation Conference (DAC)
RL-MUL is proposed, a multiplier design optimization framework based on reinforcement learning that utilizes matrix and tensor representations for the compressor tree of a multiplier, based on which convolutional neural networks can be seamlessly incorporated as the agent network.
Xiaoyu Chen, Jiachen Hu, Lihong Li + 1 more
ArXiv
A new formulation of constrained RL, known as RL with knapsack constraints (RLwK), is studied, and the first sample-efficient algorithm based on FMDP-BF is provided; the analysis indicates that the algorithm is near-optimal with respect to the timestep $T$, the horizon $H$, and the cardinality of the factored state-action subspace.
Anutusha Dogra, R. Jha, K. R. Jha
2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development (OTCON)
An RL-based approach in which the agent will select the access point with the highest SNR for establishing a communication link with the user is proposed, which helps in effective spectrum utilization and minimizing power wastage.
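The selection rule described above can be sketched with a toy stub; a real RL agent would learn from noisy SNR feedback, while this illustrative version (names assumed) just maintains a per-access-point running SNR estimate and connects greedily to the best one.

```python
class APSelector:
    """Track a running SNR estimate per access point and connect greedily."""
    def __init__(self, aps):
        self.snr_estimate = {ap: 0.0 for ap in aps}
        self.samples = {ap: 0 for ap in aps}

    def observe(self, ap, snr_db):
        # Incremental mean update of the SNR estimate for this AP.
        self.samples[ap] += 1
        n = self.samples[ap]
        self.snr_estimate[ap] += (snr_db - self.snr_estimate[ap]) / n

    def connect(self):
        # Establish the link with the highest-SNR access point.
        return max(self.snr_estimate, key=self.snr_estimate.get)

sel = APSelector(["ap1", "ap2"])
for snr in (12.0, 14.0):
    sel.observe("ap1", snr)
sel.observe("ap2", 20.0)
chosen = sel.connect()
```

Connecting at the highest SNR is what lets the agent avoid retransmissions, which is where the spectrum- and power-efficiency claims come from.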
Xiaoyan Zhang, Yukai Song, Zhuopeng Li + 1 more
IEEE Transactions on Multimedia
The experiments show that the proposed PR-RL method outperforms state-of-the-art methods in generating locally effective and interpretable high resolution relighting results for wild portrait images.
Sivaraman Sivaraj, S. Rajendran
OCEANS 2022 - Chennai
The heading control of a KVLCC2 tanker in calm water and waves is investigated, and the ship dynamics are represented using a 3DoF numerical model.
H. Handa
Proceedings of the 11th Annual conference on Genetic and evolutionary computation
Conditional Random Fields, by Lafferty et al., are newly introduced into EDAs in this paper, which are extended to solve reinforcement learning problems that arise naturally in a framework for autonomous agents.
Malcolm R. K. Ryan, Mark D. Pendrith
journal unavailable
The RL-TOPs architecture for robot learning, a hybrid system combining teleo-reactive planning and reinforcement learning techniques, is introduced, to speed up learning by decomposing complex tasks into hierarchies of simple behaviours which can be learnt more easily.
Yi-Fan Jin, Greg Slabaugh, Simon Lucas
ArXiv
This paper delves into the integration of adapters in reinforcement learning, presenting an innovative adaptation strategy that demonstrates enhanced training efficiency and improvement of the base-agent, experimentally in the nanoRTS environment, a real-time strategy (RTS) game simulation.
Sharda Tripathi, Carla-Fabiana Chiasserini
GLOBECOM 2023 - 2023 IEEE Global Communications Conference
The solution, named MERGE, leverages the knowledge of the radio connectivity dynamics that each DU can acquire through the local use of a deep reinforcement learning radio agent to create up-to-date radio agents of the right size to fit the computing constraints of the individual DUs.
Nico Bohlinger, Klaus Dorer
journal unavailable
The new Deep Reinforcement Learning library RL-X provides a flexible and easy-to-extend codebase with self-contained single directory algorithms and its application to the RoboCup Soccer Simulation 3D League and classic DRL benchmarks is presented.
Maiyue Chen, Ying Tan
2023 IEEE Congress on Evolutionary Computation (CEC)
The main idea is to enhance the explosion operator with policy-gradient-guided explosion and to conduct firework cooperation via distillation; the proposed algorithm outperforms state-of-the-art pure reinforcement learning (RL) algorithms and other hybrid evolutionary reinforcement learning (EARL) algorithms on the standard MuJoCo benchmark suite for continuous control.
Harshit S. Sikchi, Qinqing Zheng, Amy Zhang + 1 more
journal unavailable
This work casts several state-of-the-art offline RL and offline imitation learning algorithms as instances of dual RL approaches with shared structures, and proposes a new discriminator-free method ReCOIL that learns to imitate from arbitrary off-policy data to obtain near-expert performance.
Nicholas Zolman, Urban Fasel, J. Kutz + 1 more
ArXiv
SINDy-RL is introduced, a unifying framework for combining SINDy and DRL to create efficient, interpretable, and trustworthy representations of the dynamics model, reward function, and control policy that results in an interpretable control policy orders of magnitude smaller than a deep neural network policy.
Masato Fujitake
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
RL-LOGO, a deep reinforcement learning localization method for logo recognition, identifies a logo region in images without annotations of the positions, thereby improving classification accuracy.
Runxuan Jiang, Tarun Gogineni, Joshua A Kammeraad + 3 more
Journal of Computational Chemistry
Conformer-RL is an open-source Python package for applying deep reinforcement learning (RL) to the task of generating a diverse set of low-energy conformations for a single molecule; it contains modular class interfaces for RL environments and agents, allowing users to easily swap components with their own implementations.
Geoffrey Cideron, Thomas Pierrot, Nicolas Perrin + 2 more
ArXiv
A novel reinforcement learning algorithm that incorporates the strengths of off-policy RL algorithms into Quality Diversity approaches, QD-RL, that can solve challenging exploration and control problems with deceptive rewards while being more than 15 times more sample efficient than its evolutionary counterparts.