Discover the key research papers that have shaped the field of reinforcement learning. These papers provide essential insights into the algorithms, methodologies, and applications driving this area of AI. Whether you are a researcher, student, or enthusiast, delve into these top papers to deepen your understanding of reinforcement learning.
A. Issa, A. Aldair
Iraqi Journal for Electrical and Electronic Engineering
Deep Deterministic Policy Gradient (DDPG) reinforcement learning is proposed to generate and improve the walking motion of a quadruped robot, and the results show that the robot's walking behaviour is improved compared with the previous cases.
Aldo Pacchiano, Aadirupa Saha, Jonathan Lee
journal unavailable
This work is one of the first to give tight regret guarantees for preference-based RL problems with trajectory preferences, where the trajectory preferences are encoded by a generalized linear model of dimension $d$.
Walt Woods
2021 IEEE Security and Privacy Workshops (SPW)
This work proposes a novel set of mechanisms for grammar inference, RL-GRIT, and shows that RL can be used to surpass the expressiveness of both classes, and offers a clear path to learning context-sensitive languages.
Xiaoxiao Liang, Yikang Ouyang, Haoyu Yang + 2 more
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
This article pioneers the introduction of a reinforcement learning (RL) model for mask optimization, which directly optimizes the preferred objective without leveraging a differentiable proxy, and outperforms state-of-the-art solutions.
Shaoteng Liu, Haoqi Yuan, Minda Hu + 5 more
ArXiv
This work introduces RL-GPT, a two-level hierarchical framework comprising a slow agent, which analyzes actions suitable for coding, and a fast agent, which executes coding tasks.
Zhecheng Yuan, Sizhe Yang, Pu Hua + 4 more
ArXiv
RL-ViGen is a novel Reinforcement Learning Benchmark for Visual Generalization, which contains diverse tasks and a wide spectrum of generalization types, thereby facilitating the derivation of more reliable conclusions and laying a foundation for the future creation of universal visual generalization RL agents suitable for real-world scenarios.
Johanna Hansen, F. Hogan, D. Rivkin + 3 more
2022 International Conference on Robotics and Automation (ICRA)
This paper focuses on the problem setting where both visual and tactile sensors provide pixel-level feedback for Visuotactile reinforcement learning agents, and investigates the challenges associated with multimodal learning and proposes several improvements to existing RL methods.
Yan Duan, John Schulman, Xi Chen + 3 more
journal unavailable
This paper proposes to represent a “fast” reinforcement learning algorithm as a recurrent neural network (RNN) and learn it from data, and shows that its performance on new MDPs is close to human-designed algorithms with optimality guarantees.
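The RL² idea summarized above can be made concrete with a toy sketch: the policy is a recurrent network whose hidden state persists across episodes of the same MDP, so the (slowly learned) weights can encode a "fast" adaptation algorithm. Everything below is a hypothetical illustration of that interface, not the paper's implementation; all class and variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentPolicy:
    """Tiny recurrent policy: h' = tanh(W_h h + W_x x), logits = W_o h'.
    The input x concatenates the observation, the previous action
    (one-hot), and the previous reward, as in the RL^2 setup."""

    def __init__(self, obs_dim, n_actions, hidden=8):
        self.W_h = rng.normal(0, 0.5, (hidden, hidden))
        self.W_x = rng.normal(0, 0.5, (hidden, obs_dim + n_actions + 1))
        self.W_o = rng.normal(0, 0.5, (n_actions, hidden))
        self.n_actions = n_actions
        self.reset()

    def reset(self):
        # Called once per *trial* (new MDP), not per episode: the hidden
        # state carried across episodes is what lets the agent adapt.
        self.h = np.zeros(self.W_h.shape[0])

    def act(self, obs, prev_action, prev_reward):
        a_onehot = np.eye(self.n_actions)[prev_action]
        x = np.concatenate([obs, a_onehot, [prev_reward]])
        self.h = np.tanh(self.W_h @ self.h + self.W_x @ x)
        logits = self.W_o @ self.h
        p = np.exp(logits - logits.max())
        p /= p.sum()
        return int(rng.choice(self.n_actions, p=p))

policy = RecurrentPolicy(obs_dim=2, n_actions=3)
policy.reset()
action = policy.act(np.array([0.1, -0.2]), prev_action=0, prev_reward=0.0)
```

In the full method the weights themselves would be trained across many sampled MDPs with an outer-loop ("slow") RL algorithm; the sketch only shows the inner, hidden-state-driven interface.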
P. Graf, J. Annoni, C. Bay + 4 more
2019 American Control Conference (ACC)
A new algorithm for distributed Reinforcement Learning (RL) is presented: a combination of the Alternating Direction Method of Multipliers (ADMM) and reinforcement learning that allows learned controllers to be integrated as subsystems in generally convergent distributed control applications.
Shengyi Huang, Quentin Gallouédec, Florian Felten + 30 more
ArXiv
This document presents Open RL Benchmark, a set of fully tracked RL experiments, including not only the usual data such as episodic return, but also all algorithm-specific and system metrics, and is the first RL benchmark of its kind.
Yingjie Miao, Xingyou Song, Daiyi Peng + 3 more
ArXiv
Throughout this training process, it is shown that the supernet gradually learns better cells, leading to alternative architectures that are highly competitive with manually designed policies and that also verify previous design choices for RL policies.
Harshit S. Sikchi, Qinqing Zheng, Amy Zhang + 1 more
journal unavailable
This work casts several state-of-the-art offline RL and offline imitation learning algorithms as instances of dual RL approaches with shared structures, and proposes a new discriminator-free method ReCOIL that learns to imitate from arbitrary off-policy data to obtain near-expert performance.
Nicholas Zolman, Urban Fasel, J. Kutz + 1 more
ArXiv
SINDy-RL is introduced, a unifying framework combining SINDy and DRL to create efficient, interpretable, and trustworthy representations of the dynamics model, reward function, and control policy, yielding an interpretable control policy that is orders of magnitude smaller than a deep neural network policy.
Caglar Gulcehre, Ziyun Wang, Alexander Novikov + 15 more
ArXiv
This paper proposes RL Unplugged, a suite of benchmarks to evaluate and compare offline RL methods that will increase the reproducibility of experiments and make it possible to study challenging tasks with a limited computational budget, thus making RL research both more systematic and more accessible across the community.
Dongsheng Zuo, Yikang Ouyang, Yuzhe Ma
2023 60th ACM/IEEE Design Automation Conference (DAC)
A multiplier design optimization framework based on reinforcement learning is proposed, utilizing matrix and tensor representations for the compressor tree of a multiplier, enabling seamless integration of convolutional neural networks as the agent network.
Giulia Milan, L. Vassio, I. Drago + 1 more
2021 IEEE International Conference on Omni-Layer Intelligent Systems (COINS)
Our life is getting filled by Internet of Things (IoT) devices. These devices often rely on closed or poorly documented protocols, with unknown formats and semantics. Learning how to interact with such devices in an autonomous manner is the key for interoperability and automatic verification of their capabilities. In this paper, we propose RL-IoT, a system that explores how to automatically interact with possibly unknown IoT devices. We leverage reinforcement learning (RL) to recover the semantics of protocol messages and to take control of the device to reach a given goal, while minimizing th...
Vially Kazadi Mutombo, Seungyeon Lee, Jusuk Lee + 1 more
Mob. Inf. Syst.
The proposed EER-RL, an energy-efficient routing protocol based on reinforcement learning, allows devices to adapt to network changes, such as mobility and energy level, and to improve routing decisions; the results show that the proposed protocol performs better in terms of energy efficiency, network lifetime, and scalability.
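The routing decision described above can be sketched with tabular Q-learning over next-hop neighbours. The reward shape below (favouring neighbours with high residual energy and fewer hops to the sink) and all numbers are my assumptions for illustration, not EER-RL's actual formulation.

```python
def q_update(Q, node, next_hop, reward, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update for one routing decision."""
    best_next = max(Q[next_hop].values()) if Q[next_hop] else 0.0
    Q[node][next_hop] += alpha * (reward + gamma * best_next - Q[node][next_hop])

# Toy topology: node 0 can forward via node 1 (high residual energy)
# or node 2 (low residual energy); both are two hops from the sink.
Q = {0: {1: 0.0, 2: 0.0}, 1: {}, 2: {}}
energy = {1: 0.9, 2: 0.2}
hops_to_sink = {1: 2, 2: 2}

for _ in range(50):
    for nh in (1, 2):
        reward = energy[nh] - 0.1 * hops_to_sink[nh]  # assumed reward shape
        q_update(Q, 0, nh, reward)

best = max(Q[0], key=Q[0].get)  # node 0 learns to prefer the high-energy hop
```

The point of learning rather than hard-coding the preference is that when `energy` changes at runtime (batteries drain, nodes move), the same update rule shifts traffic automatically.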
Tunde Aderinwale, Charles W Christoffer, D. Kihara
Frontiers in Molecular Biosciences
A novel method is introduced, RL-MLZerD, which builds multiple protein complexes using reinforcement learning (RL), and it emerged that the docking order of multi-chain complexes can be naturally predicted by examining preferred paths of episodes in the RL computation.
Brijen Thananjeyan, A. Balakrishna, Suraj Nair + 7 more
IEEE Robotics and Automation Letters
This work proposes Recovery RL, an algorithm which navigates this tradeoff by leveraging offline data to learn about constraint violating zones before policy learning and separating the goals of improving task performance and constraint satisfaction across two policies: a task policy that only optimizes the task reward and a recovery policy that guides the agent to safety when constraint violation is likely.
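The two-policy control flow described in the Recovery RL summary can be sketched in a few lines. The threshold, critics, and toy policies below are stand-ins of my own, not the paper's implementation: a safety critic estimates the risk of future constraint violation, and the recovery policy takes over whenever the task policy's proposed action is too risky.

```python
def select_action(state, task_policy, recovery_policy, q_risk, eps_risk=0.3):
    """Switch between task and recovery policies based on estimated risk."""
    a_task = task_policy(state)
    if q_risk(state, a_task) > eps_risk:
        return recovery_policy(state), "recovery"
    return a_task, "task"

# Toy 1-D example: state is distance to an obstacle, action is velocity.
task_policy = lambda s: 1.0        # task policy always drives forward
recovery_policy = lambda s: -1.0   # recovery policy backs away
q_risk = lambda s, a: 0.9 if (s < 0.5 and a > 0) else 0.1  # assumed critic

a_near, which_near = select_action(0.2, task_policy, recovery_policy, q_risk)
a_far, which_far = select_action(2.0, task_policy, recovery_policy, q_risk)
```

The design point the paper makes is the separation of objectives: the task policy never has to trade reward against safety, because safety is delegated to the recovery policy.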
Chengwen Zhang, Yuhao Zhang, Bo Cheng
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
A novel Reinforcement Learning framework for the multimodal EMOtion recognition task (RL-EMO), which combines a Multi-modal Graph Convolution Network (MMGCN) module with a Reinforcement Learning (RL) module to model context at both the semantic and emotional levels respectively.
Marc Rigter, Bruno Lacerda, Nick Hawes
ArXiv
This work presents Robust Adversarial Model-Based Offline RL (RAMBO), a novel approach to model-based offline RL that addresses the problem as a two-player zero sum game against an adversarial environment model and demonstrates that it outperforms existing state-of-the-art baselines.
H. Jomaa, Josif Grabocka, L. Schmidt-Thieme
ArXiv
Experiments demonstrate that the model does not have to rely on a heuristic acquisition function like SMBO, but can learn which hyperparameters to test next based on the subsequent reduction in validation loss they will eventually lead to, and outperforms the state-of-the-art approaches for hyperparameter learning.
Ahmed Hallawa, Thorsten Born, A. Schmeink + 6 more
Proceedings of the Genetic and Evolutionary Computation Conference Companion
Results show that reinforcement learning algorithms embedded within the Evolutionary-Driven Reinforcement Learning approach significantly outperform the stand-alone versions of the same RL algorithms on OpenAI Gym control problems with rewardless states constrained by the same computational budget.
Xiaoyan Zhang, Yukai Song, Zhuopeng Li + 1 more
IEEE Transactions on Multimedia
The experiments show that the proposed PR-RL method outperforms state-of-the-art methods in generating locally effective and interpretable high resolution relighting results for wild portrait images.
Nico Bohlinger, Klaus Dorer
journal unavailable
The new Deep Reinforcement Learning library RL-X provides a flexible and easy-to-extend codebase with self-contained single directory algorithms and its application to the RoboCup Soccer Simulation 3D League and classic DRL benchmarks is presented.
Anutusha Dogra, R. Jha, K. R. Jha
2022 OPJU International Technology Conference on Emerging Technologies for Sustainable Development (OTCON)
An RL-based approach in which the agent will select the access point with the highest SNR for establishing a communication link with the user is proposed, which helps in effective spectrum utilization and minimizing power wastage.
Masato Fujitake
ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
RL-LOGO, a deep reinforcement learning localization method for logo recognition, identifies a logo region in images without annotations of the positions, thereby improving classification accuracy.
S. Sarkar, Ashwin Ramesh Babu, Sajad Mousavi + 5 more
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
This work proposes a novel approach that uses reinforcement learning to generate a visual explanation for CNNs, and demonstrates that the proposed method outperforms the existing techniques, producing more accurate localization masks of regions of interest in the input images.
Alexandru Rinciog, Anne Meyer
2021 Winter Simulation Conference (WSC)
This work introduces FabricatioRL, an RL compatible, customizable and extensible benchmarking simulation framework that can interface with both traditional approaches and RL, and ensure that generic production setups can be covered, and experiments are reproducible.
Kanishka Rao, Chris Harris, A. Irpan + 3 more
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
The RL-CycleGAN, a new approach for simulation-to-real-world transfer for reinforcement learning, is obtained by incorporating the RL-scene consistency loss into unsupervised domain translation, which ensures that the translation operation is invariant with respect to the Q-values associated with the image.
Sivaraman Sivaraj, S. Rajendran
OCEANS 2022 - Chennai
The heading control of a KVLCC2 tanker in calm water and waves is investigated, and the ship dynamics are represented using a 3DoF numerical model.
I Made Aswin Nahrendra, Christian Tirtawardhana + 3 more
IEEE Robotics and Automation Letters
This letter proposes a novel hybrid architecture that reinforces a nominal controller with a robust policy learned using a model-free deep RL algorithm, and employs an uncertainty-aware control mixer to preserve the guaranteed stability of the nominal controller while exploiting the extended robust performance of the learned policy.
Pierre Schumacher, D. Haeufle, Dieter Büchler + 2 more
ArXiv
Muscle-actuated organisms are capable of learning an unparalleled diversity of dexterous movements despite their vast amount of muscles. Reinforcement learning (RL) on large musculoskeletal models, however, has not been able to show similar performance. We conjecture that ineffective exploration in large overactuated action spaces is a key problem. This is supported by the finding that common exploration noise strategies are inadequate in synthetic examples of overactuated systems. We identify differential extrinsic plasticity (DEP), a method from the domain of self-organization, as being able...
Junjie Zhang, Minghao Ye, Zehua Guo + 2 more
IEEE Journal on Selected Areas in Communications
CFR-RL (Critical Flow Rerouting-Reinforcement Learning) is a Reinforcement Learning-based scheme that learns a policy to automatically select critical flows for each given traffic matrix, then reroutes these selected flows to balance link utilization of the network by formulating and solving a simple Linear Programming problem.
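The CFR-RL pipeline (score flows, pick a critical few, reroute only those) can be illustrated with a toy two-link network. The greedy scoring and rerouting below are deliberate simplifications of my own: in the paper a learned policy selects the critical flows and a small LP computes the reroute.

```python
def load(route, flows, capacity):
    """Per-link load given a flow -> link assignment."""
    ld = {l: 0.0 for l in capacity}
    for f, l in route.items():
        ld[l] += flows[f]
    return ld

def max_link_utilization(link_load, capacity):
    return max(link_load[l] / capacity[l] for l in capacity)

# Two links, three flows; default routing puts everything on link "a".
capacity = {"a": 10.0, "b": 10.0}
flows = {"f1": 6.0, "f2": 3.0, "f3": 1.0}
route = {f: "a" for f in flows}

before = max_link_utilization(load(route, flows, capacity), capacity)  # link "a" saturated

# Critical-flow selection stand-in: take the single largest flow
# (CFR-RL learns this selection instead).
critical = max(flows, key=flows.get)

# Reroute only the critical flow onto the least-loaded link
# (the paper solves a small LP here instead of this greedy step).
route.pop(critical)
ld = load(route, flows, capacity)
route[critical] = min(capacity, key=lambda l: ld[l])

after = max_link_utilization(load(route, flows, capacity), capacity)
```

Rerouting only a handful of flows is the key practical trade-off: it reduces the maximum link utilization while avoiding the network churn of recomputing every path.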
BAQIS Quafu Group
journal unavailable
The experimental results demonstrate that the Reinforcement Learning agents are capable of achieving slightly relaxed goals during both the training and inference stages; hardware-efficient PQC architectures in the quantum model are meticulously designed using a multi-objective evolutionary algorithm, and a learning algorithm adaptable to Quafu is developed.
Runxuan Jiang, Tarun Gogineni, Joshua A Kammeraad + 3 more
Journal of Computational Chemistry
Conformer-RL is an open-source Python package for applying deep reinforcement learning (RL) to the task of generating a diverse set of low-energy conformations for a single molecule, and contains modular class interfaces for RL environments and agents, allowing users to easily swap components with their own implementations.
S. S. Eshkevari, S. S. Eshkevari, Debarshi Sen + 1 more
ArXiv
This paper presents a novel RL-based approach for designing active controllers by introducing RL-Controller, a flexible and scalable simulation environment that includes attributes and functionalities that are defined to model active structural control mechanisms in detail.
Tian Xie, Zitian Gao, Qingnan Ren + 7 more
journal unavailable
The 7B model develops advanced reasoning skills (such as reflection, verification, and summarization) that are absent from the logic corpus after training on just 5K logic problems, and demonstrates generalization to the challenging math benchmarks AIME and AMC.
Bryan M. Li, A. Cowen-Rivers, Piotr Kozakowski + 6 more
J. Open Source Softw.
This work argues that the five fundamental properties of a sophisticated research codebase are modularity, reproducibility, many pre-implemented RL algorithms, speed, and ease of running on different hardware along with integration with visualization packages.
Yu-Xin Jin, Hong-Ze Xu, Zheng-An Wang + 29 more
Chinese Physics B
This work takes the first step towards executing benchmark quantum reinforcement problems on real devices equipped with at most 136 qubits on the BAQIS Quafu quantum computing cloud and designs hardware-efficient PQC architectures in the quantum model using a multi-objective evolutionary algorithm and develops a learning algorithm that is adaptable to quantum devices.
Shivam Shandilya, Menglin Xia, Supriyo Ghosh + 4 more
ArXiv
This work proposes a novel and efficient reinforcement learning (RL) based task-aware prompt compression method that leverages an existing Transformer encoder-based token classification model while guiding the learning process with task-specific reward signals using the lightweight REINFORCE algorithm.
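The REINFORCE algorithm named above is a general score-function gradient estimator; here is a minimal, self-contained instance on a two-armed bandit with a softmax policy (the prompt-compression setting itself is far more involved, and all numbers here are toy choices of mine). For a softmax policy, the log-likelihood gradient with respect to logit k is 1[k == a] - p_k.

```python
import math
import random

random.seed(0)
theta = [0.0, 0.0]          # one logit per arm
true_reward = [0.2, 0.8]    # arm 1 pays off more often
lr = 0.1

def softmax(th):
    m = max(th)
    e = [math.exp(t - m) for t in th]
    s = sum(e)
    return [x / s for x in e]

for _ in range(2000):
    p = softmax(theta)
    a = 0 if random.random() < p[0] else 1       # sample an action
    r = 1.0 if random.random() < true_reward[a] else 0.0  # sample a reward
    # REINFORCE update: theta += lr * r * grad log pi(a | theta)
    for k in range(2):
        theta[k] += lr * r * ((1.0 if k == a else 0.0) - p[k])

p = softmax(theta)  # the policy concentrates on the better arm
```

In practice a baseline is subtracted from the reward to reduce the variance of this estimator; the sketch omits it for brevity.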
Benjamin Eysenbach, M. Geist, S. Levine + 1 more
journal unavailable
Applying a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL, drawing a connection between these methods.
Yihao Wang, Ru Zhang, Jianyi Liu
IEEE Signal Processing Letters
This letter proposes a reinforcement learning-based method for linguistic steganalysis that employs an agent (steganalyzer) to interact within an observation space, enabling adaptation to the characteristics of the transformed data and capturing steganalysis features.
S. Mohanty, Erik Nygren, Florian Laurent + 11 more
ArXiv
A two-dimensional simplified grid environment called "Flatland" that allows for faster experimentation, demonstrates that ML has potential in solving the vehicle re-scheduling problem (VRSP) on Flatland, and identifies key topics that need further research.
Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair + 3 more
2021 IEEE International Conference on Robotics and Automation (ICRA)
This paper proposes goal distributions as a general and broadly applicable task representation suitable for contextual policies and develops an off-policy algorithm called distribution-conditioned reinforcement learning (DisCo RL) to efficiently learn these policies.
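The goal-distribution idea above can be illustrated with a one-line reward: instead of conditioning on a single goal state, the policy is conditioned on a distribution over states, and the reward is the log-likelihood of the current state under that distribution. The Gaussian choice and all numbers below are my own illustration, not necessarily the paper's exact objective.

```python
import math

def gaussian_logpdf(x, mean, var):
    """Log-density of a 1-D Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def distribution_reward(state, goal_mean, goal_var):
    """Higher reward for states that are more likely under the goal
    distribution; the variance encodes how much precision the task needs."""
    return gaussian_logpdf(state, goal_mean, goal_var)

# A narrow goal distribution punishes offsets hard; a wide one is lenient:
r_exact       = distribution_reward(0.0, goal_mean=0.0, goal_var=0.01)
r_offset      = distribution_reward(0.5, goal_mean=0.0, goal_var=0.01)
r_offset_wide = distribution_reward(0.5, goal_mean=0.0, goal_var=1.0)
```

This is why distributions are a strictly more general task representation than goal states: a near-delta distribution recovers goal-reaching, while broader or structured distributions express "reach any state in this region" tasks.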
Shelley Nason
journal unavailable
An architectural modification to Soar is described that gives a Soar agent the opportunity to learn statistical information about the past success of its actions and to utilize this information when selecting an operator.
Shweta Pandey, Rohit Agarwal, Sachin Bhardwaj + 3 more
International Journal of Scientific Research in Computer Science, Engineering and Information Technology
This survey thoroughly reviews the state of the art in RL across the literature and its applications in a wide range of industries, including smart grids, robotics, computer vision, healthcare, gaming, transportation, finance, and engineering.
Quantao Yang, J. A. Stork, Todor Stoyanov
IEEE Robotics and Automation Letters
The Multi-Prior Regularized RL (MPR-RL) method is deployed directly on a real world Franka Panda arm, requiring only a set of demonstrated trajectories from similar, but crucially not identical, problem instances.