This work aims to incorporate RL into E in a way that generalizes the approach taken by Eās --auto mode.
Modern saturation based theorem provers such as E [8] and Vampire [5] rely on heuristics to guide their search. Historically, manually engineered heuristics have been the norm. More recently, advances in machine learning have inspired attempts to leverage that power for guiding search. While most efforts have used supervised learning, reinforcement learning (RL) has also been successfully applied [3, 1, 6]. This work aims to incorporate RL into E in a way that generalizes the approach taken by Eās --auto mode. In the words of Eās documentation: