Stochastic RAG: End-to-End Retrieval-Augmented Generation through Expected Utility Maximization

DOI: 10.1145/3626772.3657923

30 Citations • 2024

Hamed Zamani, Michael Bendersky

Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval

Stochastic RAG casts retrieval in RAG as stochastic sampling without replacement and employs straight-through Gumbel-top-k, a differentiable approximation of sampling without replacement that enables effective end-to-end optimization of RAG.

Abstract

This paper introduces Stochastic RAG, a novel approach for end-to-end optimization of retrieval-augmented generation (RAG) models that relaxes the simplifying assumptions of marginalization and document independence made in most prior work. Stochastic RAG casts the retrieval process in RAG as a stochastic sampling without replacement process. Through this formulation, we employ straight-through Gumbel-top-k, which provides a differentiable approximation for sampling without replacement and enables effective end-to-end optimization for RAG. We conduct extensive experiments on seven diverse datasets spanning a wide range of tasks, from open-domain question answering to fact verification to slot-filling for relation extraction and to dialogue systems. By applying this optimization method to a recent and effective RAG model, we advance state-of-the-art results on six out of seven datasets.
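
The sampling step named in the abstract can be illustrated with a minimal sketch of the general Gumbel-top-k trick. It is not the authors' implementation. Each candidate's log-score is perturbed with independent Gumbel(0, 1) noise, and the k largest perturbed values are kept, which draws k distinct items without replacement. The function names `gumbel_top_k` and `k_hot`, the toy scores, and the use of plain Python lists are assumptions made for illustration. The sketch covers only the forward sampling pass. The straight-through part, where gradients flow through a soft relaxation of the hard selection, needs an autodiff framework and is omitted here.

```python
import math
import random


def gumbel_top_k(log_scores, k, rng):
    """Sample k distinct indices without replacement (Gumbel-top-k trick).

    Adding independent Gumbel(0, 1) noise to each log-score and keeping the
    k largest perturbed values yields an ordered sample without replacement
    whose selection probabilities are proportional to exp(log_scores).
    """
    if not 0 < k <= len(log_scores):
        raise ValueError("k must be between 1 and the number of candidates")
    perturbed = []
    for index, score in enumerate(log_scores):
        # rng.random() lies in [0, 1); clamp away from 0 so log() is defined.
        u = max(rng.random(), 1e-12)
        gumbel_noise = -math.log(-math.log(u))
        perturbed.append((score + gumbel_noise, index))
    # Largest perturbed score first; the first k indices form the sample.
    perturbed.sort(reverse=True)
    return [index for _, index in perturbed[:k]]


def k_hot(indices, n):
    """Turn sampled indices into a hard 0/1 selection mask of length n."""
    mask = [0.0] * n
    for i in indices:
        mask[i] = 1.0
    return mask


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical retrieval log-scores for five candidate documents.
    scores = [2.0, 0.5, 1.0, -1.0, 0.0]
    picked = gumbel_top_k(scores, k=2, rng=rng)
    print(picked, k_hot(picked, len(scores)))
```

In a straight-through variant, the hard mask from `k_hot` would typically be used in the forward pass, while the backward pass would use the gradient of a softmax-style relaxation of the same perturbed scores. How the paper defines that relaxation is not stated on this page.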