Information Leakage in Embedding Models
2020 • 184 citations
Congzheng Song, Ananth Raghunathan
This work develops three classes of attacks to systematically study information that might be leaked by embeddings, and extensively evaluates the attacks on various state-of-the-art embedding models in the text domain.
Abstract
Embeddings are functions that map raw input data to low-dimensional vector representations while preserving important semantic information about the inputs. Pre-training embeddings on a large amount of unlabeled data and fine-tuning them for downstream tasks is now a de facto standard for achieving state-of-the-art results in many domains.
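As a minimal sketch of the interface described above, an embedding is a function from raw input (here, text) to a fixed-size vector. The toy mapping below, a bag-of-words count over a small hypothetical vocabulary followed by a fixed random projection, only illustrates the shape of this interface; real embedding models learn the mapping from data.

```python
import numpy as np

# Hypothetical toy vocabulary and projection -- illustrative only,
# not a trained embedding model.
VOCAB = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
RNG = np.random.default_rng(0)
PROJECTION = RNG.standard_normal((len(VOCAB), 3))  # 7-dim counts -> 3-dim vector


def embed(text: str) -> np.ndarray:
    """Map a sentence to a low-dimensional vector via word counts
    followed by a fixed linear projection."""
    counts = np.array([text.lower().split().count(w) for w in VOCAB], dtype=float)
    return counts @ PROJECTION


vector = embed("the cat sat on the mat")
print(vector.shape)  # a fixed-size, low-dimensional representation
```

A trained model would replace the random projection with learned parameters so that semantically similar inputs map to nearby vectors; the attacks studied in this paper probe what else such vectors reveal about their inputs.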