
Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

11 Citations • 2024
Hamed Zamani, Michael Bendersky
ArXiv

Stochastic RAG casts retrieval in RAG as stochastic sampling without replacement and employs straight-through Gumbel-top-k, a differentiable approximation of that sampling, to enable effective end-to-end optimization of RAG.

Abstract

This paper introduces Stochastic RAG, a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models that relaxes the simplifying assumptions of marginalization and document independence made in most prior work. Stochastic RAG casts the retrieval process in RAG as a stochastic sampling without replacement process. Through this formulation, we employ straight-through Gumbel-top-k, which provides a differentiable approximation for sampling without replacement and enables effective end-to-end optimization for RAG. We conduct extensive experiments on seven diverse datasets spanning a wide range of tasks, from open-domain question answering to fact verification, slot-filling for relation extraction, and dialogue systems. By applying this optimization method to a recent and effective RAG model, we advance state-of-the-art results on six of the seven datasets.
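As an illustration of the sampling idea named in the abstract, the following is a minimal sketch of Gumbel-top-k in plain Python. It is not the authors' implementation: the function names (`gumbel_top_k`, `softmax`, `straight_through_forward`), the `temperature` parameter, and the use of a softmax over the perturbed scores as the relaxation are all assumptions made here for illustration. Only the forward pass is shown, since the standard library has no automatic differentiation; the comments describe how a straight-through estimator is commonly wired up in an autodiff framework.

```python
import math
import random


def gumbel_top_k(scores, k, rng):
    """Sample k distinct indices without replacement.

    Adding independent Gumbel(0, 1) noise to each retrieval score and keeping
    the k largest perturbed values is equivalent to drawing k items one at a
    time, without replacement, from softmax(scores).
    """
    perturbed = []
    for s in scores:
        # rng.random() lies in [0, 1); clamp away from 0 to avoid log(0).
        u = max(rng.random(), 1e-12)
        perturbed.append(s - math.log(-math.log(u)))
    order = sorted(range(len(scores)), key=lambda i: perturbed[i], reverse=True)
    return order[:k], perturbed


def softmax(xs, temperature=1.0):
    """Numerically stable softmax with a temperature (hypothetical knob)."""
    m = max(xs)
    exps = [math.exp((x - m) / temperature) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]


def straight_through_forward(scores, k, rng, temperature=1.0):
    """Return the sampled indices, a hard k-hot mask, and a soft relaxation.

    In an autodiff framework, a straight-through estimator would typically
    output hard + soft - stop_gradient(soft): the forward value equals the
    hard mask, while gradients flow through the soft relaxation. The softmax
    relaxation used here is an assumption, not necessarily the paper's choice.
    """
    idx, perturbed = gumbel_top_k(scores, k, rng)
    chosen = set(idx)
    hard = [1.0 if i in chosen else 0.0 for i in range(len(scores))]
    soft = softmax(perturbed, temperature)
    return idx, hard, soft


if __name__ == "__main__":
    rng = random.Random(0)
    doc_scores = [2.0, 1.0, 0.5, -1.0]  # hypothetical retrieval scores
    idx, hard, soft = straight_through_forward(doc_scores, k=2, rng=rng)
    print("sampled documents:", idx)
    print("hard mask:", hard)
```

Each call draws a different subset of documents, with higher-scoring documents selected more often, which is what lets the retriever be treated as a sampling distribution during training rather than a fixed top-k cutoff.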