Copilot “Reprompt” exploit: Understand the Attacker Journey

In this article I want to discuss one of the recent exploits I found most interesting and how the mental model of the attacker journey helps destigmatize these exploits. More importantly, it helps identify pragmatic ways to reduce the harm they cause.

This example builds on a research report from Varonis Threat Labs, which found a way to extract personal information from a victim. The victim is enticed to click a link that opens their Microsoft Copilot UI and runs a prompt that extracts information such as location, recent files, etc. from the victim’s user context.

What is a “user context”? When we run generative AI tools like Copilot or ChatGPT, they maintain conversational context and may have access to user data and services that the user has explicitly authorized or that are implicitly available through product integrations.

How the exploit unfolded

Microsoft has addressed the specific issue described in the report, but the underlying pattern remains relevant.

The attacker crafts a prompt and packs it into a link that is sent to their victim(s).
The targeted individual opens the message (email, SMS, Slack message, etc.) and clicks on the link.
The link opens Microsoft Copilot and runs the embedded prompt.
The prompt sends the extracted information to an external endpoint on the attacker’s server (e.g. via HTTP requests, webhooks, image loads, etc.), exposing it to the attacker.

The data accessible depends on what Copilot is integrated with and what permissions the user has already granted.

By the time the user realizes what the prompt did (if at all), the information has already been extracted and is visible to the attacker on their backend system.

Critical Attacker Journey

Dissecting the journey of this specific type of attack helps us understand:

Why it happens
What happens
What our options are to detect and block the attacker journey

What is a Critical Attacker Journey? Just like products are designed around critical user journeys (the steps a user must take to achieve a goal), cyber attacks also follow journeys. A Critical Attacker Journey describes the minimum set of steps an attacker needs to successfully cause harm.
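To make the idea concrete, here is a minimal sketch that models the journey as a chain of steps, where the attack only succeeds if every step goes through. The step names follow this article; the control names (`link_filtering`, `prompt_approval`, `egress_blocking`) are hypothetical labels I made up for illustration, not features of any product.

```python
from dataclasses import dataclass


@dataclass
class JourneyStep:
    name: str
    blocked_by: list[str]  # hypothetical control names that would stop this step


# The three steps discussed in this article, each paired with one made-up control.
REPROMPT_JOURNEY = [
    JourneyStep("user receives crafted link", ["link_filtering"]),
    JourneyStep("link opens Copilot and runs the embedded prompt", ["prompt_approval"]),
    JourneyStep("prompt sends information to an external endpoint", ["egress_blocking"]),
]


def first_break(steps: list[JourneyStep], active_controls: set[str]):
    """Return the name of the earliest step an active control stops, or None."""
    for step in steps:
        if set(step.blocked_by) & active_controls:
            return step.name
    return None


def journey_succeeds(steps: list[JourneyStep], active_controls: set[str]) -> bool:
    """The attack causes harm only if no step along the way is blocked."""
    return first_break(steps, active_controls) is None
```

With no controls in place, `journey_succeeds(REPROMPT_JOURNEY, set())` is true. Enabling any single control is enough to break the chain, which is why each step below is worth looking at as a separate opportunity for defence.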

User receives crafted link with embedded prompt

This is the first step at which the target faces the attempt to extract their personal information. Decades of experience with phishing campaigns show that attackers can be very skillful at crafting messages that build on trust and create a sense of urgency, which increases the likelihood of people clicking on those links.

It’s unrealistic and even unfair to expect that people never click on malicious links, no matter how often we tell them and how much we scare them.

From a defence perspective, we have to think about other methods to reduce the likelihood of these links being clicked or even being visible.

Make suspicious (or not particularly trusted) links unclickable: this adds another speed bump to the user’s ability to just click on these links.
Mark emails from outside the company that contain links to internal tools as particularly risky.
Remove links that meet certain criteria from emails coming from outside the organisation.
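The measures above could be sketched roughly as follows: a mail filter checks whether a link points at an AI assistant and carries a prompt-like query parameter, and if the email came from outside, rewrites the link so it is no longer clickable. The host list, the parameter names (`q`, `prompt`) and the function names are assumptions for illustration only; a real deployment would need its own vetted criteria.

```python
from urllib.parse import urlparse, parse_qs

# Assumptions for illustration: neither set is a complete or vetted list.
ASSISTANT_HOSTS = {"copilot.microsoft.com"}
PROMPT_PARAMS = {"q", "prompt"}


def carries_prompt(url: str) -> bool:
    """True if the link targets a listed assistant and has a prompt-like parameter."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host not in ASSISTANT_HOSTS:
        return False
    params = parse_qs(parsed.query)
    return any(name.lower() in PROMPT_PARAMS for name in params)


def defang(url: str) -> str:
    """Rewrite a URL so mail clients no longer render it as a clickable link."""
    return url.replace("http", "hxxp", 1).replace(".", "[.]")


def rewrite_link(url: str, from_external_sender: bool) -> str:
    """Defang prompt-carrying links in external mail; leave everything else alone."""
    if from_external_sender and carries_prompt(url):
        return defang(url)
    return url
```

For example, `rewrite_link("https://copilot.microsoft.com/?q=hi", True)` returns `hxxps://copilot[.]microsoft[.]com/?q=hi`, while the same link from an internal sender is left untouched. The user can still reach the page by hand, but the one-click path is gone.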

Breaking the attacker’s journey at the earliest possible point is one of the most effective ways to prevent such an exploit.

Link opens Microsoft Copilot

If the target still clicks on the link (it happens), Microsoft Copilot opens and runs the embedded prompt.

Giving people the chance to review a prompt and requiring them to approve it can be a powerful way to increase awareness and break the journey, especially when a prompt originates from a link or external source rather than direct user input. This introduces an explicit intent check: something traditional authentication systems were never designed to handle.
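An intent check like this could look roughly like the sketch below. It is not how Copilot works internally; the `PromptSource` values, the `PromptRequest` type and the `ask_user` callback are hypothetical names chosen to illustrate the idea that a prompt’s origin decides whether it runs immediately or waits for approval.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class PromptSource(Enum):
    TYPED_BY_USER = "typed by user"
    URL_PARAMETER = "a link"
    EXTERNAL_DOCUMENT = "an external document"


@dataclass
class PromptRequest:
    text: str
    source: PromptSource


def should_run(request: PromptRequest, ask_user: Callable[[str], bool]) -> bool:
    """Run typed prompts directly; anything else needs explicit user approval."""
    if request.source is PromptSource.TYPED_BY_USER:
        return True
    question = (
        f"This prompt came from {request.source.value}, not from you.\n"
        f"Do you want to run it?\n\n{request.text}"
    )
    # ask_user stands in for whatever confirmation dialog the product would show.
    return ask_user(question)
```

The design choice here is that the default for non-typed prompts is whatever the user answers, so a prompt arriving via a link never runs silently. Showing the full prompt text in the question gives the user a chance to notice instructions they did not write.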

Copilot runs prompt and sends information outside

While the prompt itself is highly dynamic, the fact that it calls an external resource can be detected deterministically and then blocked or policed.
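As a rough sketch of what such a deterministic check could look like, the snippet below scans a prompt (or a model response) for URLs and reports every one whose host is not on an allowlist. The allowlist entries and function names are assumptions for illustration, and a plain regular expression will miss obfuscated or dynamically assembled URLs, so treat this as one layer rather than a complete control.

```python
import re
from urllib.parse import urlparse

# Hypothetical allowlist; a real one would reflect the organisation's own services.
ALLOWED_HOSTS = {"microsoft.com", "sharepoint.com"}

# Matches http(s) URLs up to the next whitespace, bracket or quote character.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]'\"]+", re.IGNORECASE)


def is_allowed(host: str) -> bool:
    """True if the host is an allowlisted domain or one of its subdomains."""
    host = host.lower()
    return any(host == allowed or host.endswith("." + allowed) for allowed in ALLOWED_HOSTS)


def external_references(text: str) -> list[str]:
    """Return every URL in the text whose host is not on the allowlist."""
    flagged = []
    for url in URL_PATTERN.findall(text):
        host = urlparse(url).hostname or ""
        if not is_allowed(host):
            flagged.append(url)
    return flagged
```

A policy layer could then refuse to run a prompt, or refuse to render a response, whenever `external_references(...)` returns a non-empty list. Matching on the suffix with a leading dot means a host like `microsoft.com.evil.example` is still flagged.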

Conclusion

Understanding the attacker journey of these types of exploits is crucial, not only to destigmatize them and avoid a sense of powerlessness, but also to set realistic expectations about how the problem can be solved in the most impactful way.

Avoid blaming people for clicking on links we should protect them from
Understand how to break the attacker journey effectively
Set expectations on products and policies to make these attacks more expensive and less scalable

Additional information:

Video showing the exploit both from the attacker’s and target’s perspective: https://varonis.wistia.com/medias/mqjzbm8h7f
Article going into more details about the exploit: https://www.windowscentral.com/artificial-intelligence/microsoft-copilot/copilot-ai-reprompt-exploit-detailed-2026