Data Exfiltration Using Indirect Prompt Injection

Post Syndicated from Bruce Schneier original https://www.schneier.com/blog/archives/2023/12/data-exfiltration-using-indirect-prompt-injection.html

Interesting attack on a LLM:

In Writer, users can enter a ChatGPT-like session to edit or create their documents. In this chat session, the LLM can retrieve information from sources on the web to assist users in creation of their documents. We show that attackers can prepare websites that, when a user adds them as a source, manipulate the LLM into sending private information to the attacker or perform other malicious activities.

The data theft can include documents the user has uploaded, their chat history or potentially specific private information the chat model can convince the user to divulge at the attacker’s behest.

Noise

Data Exfiltration Using Indirect Prompt Injection

The collective thoughts of the interwebz