Retrieval-Augmented Generation (RAG) has attracted significant research attention as an effective way to mitigate hallucination in Foundational Models (FMs), particularly Large Language Models (LLMs). Although RAG is considered a successful approach for enhancing LLMs, because it supplies a retrieval mechanism for obtaining relevant external knowledge, it remains limited in its ability to acquire high-quality knowledge from diverse data sources. We propose a complementary integration of RAG and data spaces that exploits RAG’s capabilities within data spaces. Data spaces give RAG access to diverse, high-quality data sources from multiple data providers under secure data-sharing mechanisms and direct data exchange negotiations. At the same time, RAG enhances the support services of data spaces. In this paper, we present a high-level architecture for RAG data space models (RAG-DSMs) with a unified lifecycle for RAG and data spaces, and we highlight both the challenges and the potential opportunities of the proposed integration. Moreover, we present two use cases for leveraging RAG-DSMs in the mobility and health domains.