Harnessing the Universal Geometry of Embeddings

Rishi Jha, Collin Zhang, Vitaly Shmatikov, John X. Morris

Cornell University, Department of Computer Science

Abstract

We introduce the first method for translating text embeddings from one vector space to another without any paired data, encoders, or predefined sets of matches. Our unsupervised approach translates any embedding to and from a universal latent representation (i.e., a universal semantic structure conjectured by the Platonic Representation Hypothesis). Our translations achieve high cosine similarity across model pairs with different architectures, parameter counts, and training datasets.

The ability to translate unknown embeddings into a different space while preserving their geometry has serious implications for the security of vector databases. An adversary with access only to embedding vectors can extract sensitive information about the underlying documents, sufficient for classification and attribute inference.

Strong Platonic Representation Hypothesis

The Platonic Representation Hypothesis conjectures that all image models of sufficient size have the same latent representation. We propose a stronger, constructive version of this hypothesis for text models:

Strong Platonic Representation Hypothesis: the universal latent structure of text representations not only exists, but can be learned and, furthermore, harnessed to translate representations from one space to another without any paired data or encoders.

Our method, vec2vec, reveals that all encoders—regardless of architecture or training data—learn nearly the same representations (Figs. 1 and 2)!

Preserving Geometry

Leveraging these universal representations, we show that vec2vec can translate embeddings generated from unseen documents by unseen encoders while preserving their geometry: i.e., the cosine similarity of the translated embeddings and the ideal target embeddings is high (Figs. 3 and 4).
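
As a rough illustration of the geometry-preservation metric described above, the sketch below computes the mean cosine similarity between translated embeddings and their ideal target embeddings. It is a minimal example under stated assumptions, not the evaluation code used in this work: the function name `mean_cosine_similarity` is hypothetical, and the "translated" vectors are simulated by adding noise to random targets rather than produced by vec2vec.

```python
import numpy as np


def mean_cosine_similarity(translated: np.ndarray, target: np.ndarray) -> float:
    """Mean row-wise cosine similarity between two (n, d) embedding arrays.

    Row i of `translated` is compared with row i of `target`.
    """
    # Normalize each row to unit length so the dot product equals the cosine.
    t = translated / np.linalg.norm(translated, axis=1, keepdims=True)
    g = target / np.linalg.norm(target, axis=1, keepdims=True)
    return float(np.mean(np.sum(t * g, axis=1)))


# Synthetic stand-ins (assumption): real inputs would be the translator's
# outputs and the target encoder's embeddings of the same documents.
rng = np.random.default_rng(0)
target = rng.normal(size=(100, 64))
translated = target + 0.1 * rng.normal(size=(100, 64))

score = mean_cosine_similarity(translated, target)
print(f"mean cosine similarity: {score:.3f}")
```

A score near 1 means the translated vectors point in almost the same directions as the ideal target embeddings; independent random vectors in high dimensions score near 0.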

The translators are robust to (sometimes very) out-of-distribution inputs, indicating that the shared latent structure is not just a property of the training data, but rather a fundamental property of the text embeddings.

Security Implications

Using vec2vec, we then show that vector databases reveal (almost) as much as their inputs: given just vectors (e.g., from a compromised vector database), an adversary can extract sensitive information about the underlying text.

In particular, we extract sensitive disease information from patient records and partial content from corporate emails (Fig. 6), with access only to document embeddings and no access to the encoder that produced them. Better translation methods will enable higher-fidelity extraction.
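
One way such attribute inference could work, once unknown embeddings have been translated into a space whose encoder the adversary controls, is to rank candidate attribute labels by cosine similarity to each translated document vector. The sketch below illustrates that idea only; it is an assumption about the general approach, not the procedure used in this work. The function `infer_attributes`, the label names, and the random vectors standing in for encoder outputs are all hypothetical.

```python
import numpy as np


def infer_attributes(doc_vecs, label_vecs, labels, k=1):
    """Return the k labels most cosine-similar to each document vector.

    doc_vecs:   (n, d) translated document embeddings
    label_vecs: (m, d) embeddings of candidate labels in the same space
    labels:     list of m label strings
    """
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    v = label_vecs / np.linalg.norm(label_vecs, axis=1, keepdims=True)
    sims = d @ v.T                           # (n, m) cosine similarities
    top = np.argsort(-sims, axis=1)[:, :k]   # indices of the k best labels
    return [[labels[j] for j in row] for row in top]


# Hypothetical labels; random vectors stand in for real label embeddings.
rng = np.random.default_rng(0)
labels = ["diabetes", "asthma", "migraine"]
label_vecs = rng.normal(size=(3, 32))

# Simulated "translated" documents: noisy copies of the labels at index 2 and 0.
doc_vecs = label_vecs[[2, 0]] + 0.05 * rng.normal(size=(2, 32))

predictions = infer_attributes(doc_vecs, label_vecs, labels, k=1)
print(predictions)
```

In this toy setup the adversary never sees the original text or the original encoder; everything is inferred from vectors alone, which is the threat model described above.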

Contact & Acknowledgments

If you have questions about this work, contact Rishi Jha at: rjha at cs dot cornell dot edu. The page design was adapted from these two excellent projects: tufte-css and tufte-project-pages.