
x402 protocol needs a way to avoid paywall fraud

Open kdenhartog opened this issue 3 months ago • 12 comments

The inherent design of a paywall is that the user pays before they see the content. In the foreseeable future, we'll see a flood of empty content pages hidden behind x402 paywalls as a means to conduct phishing-like scams at large scale against AI agents.

As an example:

  1. An agent shows up at malicious.com/page and is asked to pay 1 cent
  2. The agent pays the requested 1 cent via x402
  3. The agent gains access to the content and discovers there's nothing available or it wasn't what they intended to purchase

In this instance, the agent (or user in some circumstances such as a browser based payment) needs a way to request a chargeback as they paid for a product they didn't receive. This would be considered fraud and the protocol needs to address this.

kdenhartog avatar Oct 27 '25 02:10 kdenhartog

Payment only addresses payment-related issues. Your scenario here is a business one and should not be resolved through payment.

fanq0914 avatar Oct 27 '25 02:10 fanq0914

buy side security shouldn't be a priority beyond spoofing. sell side security is really all that's important (beyond identity theft).

You can't fix the problem of vendor trust with this protocol and shouldn't even try. In my opinion, I think that's largely where the protocol is going awry. They are trying to solve problems that cannot be solved by a payment protocol.

The base assumption should be - the client trusts the vendor.

For agentic flows this is a requirement which goes far beyond the shallow concern of payment. Using results inside of AI driven flows requires a deep trust to avoid real problems of hijacking the thinking the agent will do.

qrdlgit avatar Oct 27 '25 04:10 qrdlgit

Payment only addresses payment-related issues. Your scenario here is a business one and should not be resolved through payment.

Even if the protocol doesn't address the issue, the specification needs to at the very least document that this issue occurs within a security considerations section. This is best practice in the development of standards specs.

I'd argue this is a problem for the protocol itself to address. Otherwise, what you'll see is that client side implementors will address this to protect their user or you'll see the protocol centralize around facilitators that support these capabilities.

buy side security shouldn't be a priority beyond spoofing.

The base assumption should be - the client trusts the vendor.

That's not how the Web works, so what you're proposing violates the security model of browsers. I'd suggest taking a look at some Chromium documents about why they sandbox sites. Sites are inherently untrusted. If this does become an issue, then again you'll see centralization around payment facilitators which solve the problem.

These problems will either get solved within the protocol itself or will be addressed by facilitators which will capture and centralize the protocol because they're the dominant facilitator to address the issue. As one example, Cloudflare has already been able to leverage their market power on the Web today to reduce web crawling capabilities via pay per crawl. Pay per crawl will eventually support x402 and at that point it will make Cloudflare the largest payments facilitator who will then be able to control what's supported by sites (aka they only support proprietary methods as the payment gateway rather than open crypto rails) through their CDN sitting in front of the servers.

kdenhartog avatar Oct 27 '25 05:10 kdenhartog

In that case there should be a way to decentralize the facilitator

rajpatil7322 avatar Oct 29 '25 07:10 rajpatil7322

This can be addressed by adding a resource hash to the PaymentRequirements object and introducing a conflict resolution protocol, where the buyer can submit the document for review and potentially receive a refund or cause a reputation penalty for the seller.

This also brings up another point - PaymentRequirements need to be signed by the seller ...

kladkogex avatar Oct 29 '25 19:10 kladkogex

In my mind I was going down the path of a chargeback protocol, which is very similar to what you have in mind @kladkogex. I think what you're describing adds more of a technical answer for it than what I had come up with.

Would you mind providing a deeper explanation of how this protocol might work, so others who contribute here can help too?

kdenhartog avatar Oct 30 '25 21:10 kdenhartog

I get the sense people don't program agents here, or practice any reasonable sense of infosec.

Clicking on an untrusted link is crazy. Much much worse though, is using an untrusted source in an agentic flow. The former you could possibly protect against (though doubtful), the latter is nearly impossible given the billions of parameters in our models.

The only security you need on the client is to make sure there are no MITM attacks. Beyond that, you shouldn't be transacting with untrusted sources.

The real value prop of x402 is to commodify payment and reduce friction as much as possible so it lowers switching costs and doesn't become a differentiation in which services are used by a client. Humble, sure, but trying to solve more than that is why x402 has always failed in the past.

Worse, it seems to be distracted by these non-critical problems and is ignoring the pivotal ones like https://github.com/coinbase/x402/issues/373#issuecomment-3473049816

qrdlgit avatar Oct 31 '25 12:10 qrdlgit

I get the sense people don't program agents here, or practice any reasonable sense of infosec.

I work on the security team at Brave, where people on my team have published and responsibly disclosed attacks on agentic browsers using prompt injection: https://brave.com/series/security-privacy-in-agentic-browsing/. So I think you might be misunderstanding where I'm coming from here, because this issue is being directly informed by this work. It is a problem that will affect real users if we enable both x402 and agentic browsing. That is the goal here right?

Clicking on an untrusted link is crazy.

It's also how the web is designed. You don't know the contents of the page until you visit it. That's why browsers sandbox: to make it safer to access untrusted content. It's also how we see agents behaving within the browsing context today unless trained or instructed to do otherwise.

The only security you need on the client is to make sure there are no MITM attacks. Beyond that, you shouldn't be transacting with untrusted sources.

So uhh... how do you suggest we address a prompt injection attack that tells an agent to visit a site and spend all of its available funds via x402 on the untrusted link that it was told to trust in said prompt injection attack?

kdenhartog avatar Oct 31 '25 21:10 kdenhartog

I use Brave extensively and exclusively for agentic search and I think you folks are poised to take over the world, though you need to get massive investment and very soon. You guys are literally one of the very few that can keep the oligarchs from taking over. x402 has that potential as well, if done right. Scaling is so hugely important if it is going to compete against the big boys. https://arxiv.org/pdf/2510.26658 (hint: it's all about latency)

I think there is a missed opp here though, in that Brave can be the source of trust rather than trying to push it onto the protocol. Having a trust factor in your returned results is necessary anyways as you will need to deal better with spam, so why not profit from it?

spend all of its available funds

You have to whitelist and rate limit budgets for every endpoint. Even with endpoints you 100% trust to the ends of the earth. Bugs happen.
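
A minimal sketch of what per-endpoint allowlisting and budget limiting could look like on the client. It assumes a hypothetical `BudgetGuard` helper (not part of x402 or any SDK) that keys budgets by origin and enforces a cap over a rolling time window:

```python
import time
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class BudgetGuard:
    """Hypothetical client-side allowlist with a rolling per-origin spending cap."""

    limits_cents: Dict[str, int]  # origin -> maximum spend per window
    window_seconds: float = 3600.0
    clock: Callable[[], float] = time.monotonic
    _spent: Dict[str, List[Tuple[float, int]]] = field(default_factory=dict)

    def authorize(self, origin: str, amount_cents: int) -> bool:
        """Return True and record the spend only if it fits the origin's budget."""
        limit = self.limits_cents.get(origin)
        if limit is None or amount_cents <= 0:
            return False  # origin is not allowlisted, or the amount is invalid
        now = self.clock()
        # Keep only the payments that are still inside the rolling window.
        recent = [
            (t, a)
            for t, a in self._spent.get(origin, [])
            if now - t < self.window_seconds
        ]
        self._spent[origin] = recent
        if sum(a for _, a in recent) + amount_cents > limit:
            return False  # paying would exceed the cap for this window
        recent.append((now, amount_cents))
        return True
```

An agent would call `guard.authorize(origin, amount_cents)` before signing any payment and refuse to pay when it returns `False`. The one-hour window and cents as the unit are illustrative choices, not values from the thread.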

In the hierarchy of concerns, paying out will always come at the bottom. When doing agentic on untrusted content there are many worse concerns than losing 1c. You seem immensely familiar with this, so I'm surprised you are not more comfortable with my line of reasoning.

I can assure you, this problem can not be solved. Nobody can code review billions of parameters. And I believe there is plenty of evidence the programmatic side hasn't been solved either. At the very least, you cough up significant amounts of privacy every time you visit a link.

And this is also likely why x402 has not taken off. There really isn't a lot of demand for paying for content from untrusted sources. It's hard enough just getting people to visit the content for free.

qrdlgit avatar Nov 01 '25 12:11 qrdlgit

Scaling is so hugely important if it is going to compete against the big boys

So is doing it the right way so you don't re-encounter the bad emperor problem in Web3.

I think there is a missed opp here though, in that Brave can be the source of trust rather than trying to push it onto the protocol. Having a trust factor in your returned results is necessary anyways as you will need to deal better with spam, so why not profit from it?

This actually is counter-intuitive. The more we (or any player) try to centrally capture the protocol, the less potential usage we'll get from Metcalfe's Law. Each independent, non-interoperable facilitator will form a separate network which will compete for usage. This turns the protocol into a zero-sum game within the protocol, rather than a positive-sum game of growth to compete against the incumbent payment networks and approaches like Apple Pay and Google Pay. Either one of those systems could enable this with an Embrace, Extend, Extinguish strategy and fracture the network further. This is what happened to the Payment Request API, which is why it's largely unused. Therefore, Brave and every other participant building on this system stand to gain more by building a standard that prevents capture and focuses on growing the overall pie by expanding the total addressable market, rather than enabling fracturing within the network and reducing the max size of the network via Metcalfe's Law.

You have to whitelist and rate limit budgets for every endpoint. Even with endpoints you 100% trust to the ends of the earth. Bugs happen.

Are you expecting users to do this? The UX here is going to be cumbersome and won't scale well, and users will be tricked via classic phishing attempts if it's built around "endpoints" (aka an origin in a browser).

In the hierarchy of concerns, paying out will always come at the bottom. When doing agentic on untrusted content there are many worse concerns than losing 1c. You seem immensely familiar with this, so I'm surprised you are not more comfortable with my line of reasoning.

You're right, this is exactly why I'm uncomfortable with focusing too much on scale without getting a good architecture in place. The concern here isn't that one person loses 1c, it's that 3% or 5% of the population is losing 1c / second because a prompt injection attack occurred via an XSS attack on a site. This is an attack that matters most at scale. This is an architecture concern that needs to be addressed up front and not bolted on later.

I can assure you, this problem can not be solved. Nobody can code review billions of parameters. And I believe there is plenty of evidence the programmatic side hasn't been solved either. At the very least, you cough up significant amounts of privacy every time you visit a link.

This is security nihilism and is framing it in such a way that forces this viewpoint. The goal here isn't to secure a model, it's to secure the protocol, so the model can't take an unexpected action. We've already had one suggestion, which is to implement a conflict resolution protocol on top and require its usage in the protocol.

kdenhartog avatar Nov 01 '25 22:11 kdenhartog

In my mind I was going down the path of a chargeback protocol, which is very similar to what you have in mind @kladkogex. I think what you're describing adds more of a technical answer for it than what I had come up with.

Would you mind providing a deeper explanation of how this protocol might work, so others who contribute here can help too?

Thanks :)

Here is a rough outline

  1. The seller wants to sell a file X (for example, a movie).
  2. The seller publishes the encrypted version of the movie E for free, and sells the corresponding encryption key K.
  3. The buyer downloads the encrypted file E for free.
  4. To access the movie, the buyer must purchase the key K using the 402 protocol.
  5. The seller includes the hash of the key (H(K)) and the hash of the document (H(F)) in the signed PaymentRequirements message.
  6. The buyer includes the hash of the PaymentRequirements as the nonce in EIP-3009. This ensures that the payment is cryptographically linked to the specific file X.
  7. In the optimistic case, the 402 protocol completes successfully, the buyer receives the key K, decrypts the file, and is satisfied.
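
The commitment and binding steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the x402 wire format: the field names are hypothetical, SHA-256 stands in for H (an on-chain design would more likely use keccak256), the requirements are canonicalized as sorted JSON, and the seller's signature over the requirements is omitted.

```python
import hashlib
import hmac
import json


def h(data: bytes) -> bytes:
    """SHA-256 as a stand-in for H(.); the real hash function is an assumption."""
    return hashlib.sha256(data).digest()


def build_requirements(key: bytes, document: bytes, amount: str, pay_to: str) -> dict:
    """Seller side: commit to the key and the document (step 5)."""
    return {
        "keyHash": h(key).hex(),            # H(K), hypothetical field name
        "resourceHash": h(document).hex(),  # H(F), hypothetical field name
        "amount": amount,                   # illustrative field
        "payTo": pay_to,                    # illustrative field
    }


def payment_nonce(requirements: dict) -> bytes:
    """Buyer side: derive a 32-byte nonce from the requirements (step 6)."""
    canonical = json.dumps(requirements, sort_keys=True, separators=(",", ":"))
    return h(canonical.encode("utf-8"))


def key_matches(requirements: dict, delivered_key: bytes) -> bool:
    """Buyer side: check the delivered key against the committed H(K) (step 7)."""
    return hmac.compare_digest(h(delivered_key).hex(), requirements["keyHash"])
```

Because the nonce is a hash over the whole requirements object, changing the price, the payee, or either commitment yields a different nonce, so an authorization signed for one offer cannot be replayed against another.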

Dispute Scenarios

1. Seller Withholds the Key

In the pessimistic case, the seller fails to provide the key to the buyer.
In this situation, the buyer can file a complaint to a smart contract, forcing the seller to release the key.
If the seller still fails to release the key to the contract, the seller is penalized.

2. Incorrect or Fraudulent File

If, after decryption, the buyer discovers that the file (e.g., the movie) is incorrect or fraudulent,
the buyer can also file a complaint to the smart contract.
The smart contract then forwards the dispute to a committee for resolution.
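
As a rough illustration of the two dispute paths, the contract's decision logic might look like the Python sketch below. Everything here is hypothetical: the enum names, the use of SHA-256, and the choice to treat a released key that does not match the committed H(K) the same as no release at all.

```python
import hashlib
import hmac
from enum import Enum
from typing import Optional


class Complaint(Enum):
    KEY_WITHHELD = "key_withheld"  # scenario 1
    BAD_FILE = "bad_file"          # scenario 2


class Outcome(Enum):
    KEY_DELIVERED = "key_delivered"
    SELLER_PENALIZED = "seller_penalized"
    SENT_TO_COMMITTEE = "sent_to_committee"


def resolve(
    complaint: Complaint,
    committed_key_hash: bytes,
    released_key: Optional[bytes] = None,
) -> Outcome:
    """Decide a dispute from the committed H(K) and whatever the seller released."""
    if complaint is Complaint.BAD_FILE:
        # Content quality cannot be checked mechanically, so humans decide.
        return Outcome.SENT_TO_COMMITTEE
    if released_key is not None and hmac.compare_digest(
        hashlib.sha256(released_key).digest(), committed_key_hash
    ):
        return Outcome.KEY_DELIVERED  # seller complied once forced to
    # No key, or a key that does not match the commitment (assumed equivalent).
    return Outcome.SELLER_PENALIZED
```

Only the first scenario can be settled automatically from the commitment; the second still depends on the committee's judgment.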

kladkogex avatar Nov 02 '25 21:11 kladkogex

Ok, I like where you're going with this, as it at least introduces a method of chargeback after settlement, but it requires settlement to occur to the facilitator rather than the end server. If I'm running these fraud sites, I'd just run my own facilitator that's not compliant. That at least is good enough for security, because the client can filter on this, but you've now introduced centralization with certainty based on what the client chooses to support. In other words, a browser decides which facilitator wins, which isn't great either. I'd rather avoid that, even though it would mean Brave has an advantage to start and can leverage our user base to dictate control, since it also means we could just become the next bad emperor. That should be designed against IMO.

Also, providing a hash of the site introduces a privacy tradeoff which turns the issue I highlighted in #406 from probabilistic into deterministic. Assuming a site is published on the IPFS network, a person could then just do a lookup for the CID based on the hash of the site and then rebuild the browsing history.

We could likely improve this with a simple Pedersen commitment scheme where the blinded hash (b' = h(F) * b) is published and then, if settling via a centralized service, you send them b so they can validate you paid for F. If settled all on chain, we'd want to do a double-blinded Pedersen commitment scheme, which is a bit more complex but is still viable.
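
For readers unfamiliar with the primitive, here is a toy sketch of a textbook Pedersen commitment, C = g^m * h^r mod p, where m is derived from the document hash and r is the blinding value. It uses the standard form rather than the shorthand in the comment above, and the group parameters are deliberately tiny and insecure; a real deployment would use a vetted elliptic-curve group with generators whose discrete-log relation nobody knows.

```python
import hashlib
import secrets

# Toy parameters, far too small for real use: P = 2*Q + 1 with P and Q prime.
P = 1019
Q = 509
G = 4  # generates the subgroup of order Q
H = 9  # second generator; log_G(H) must be unknown to the committer in practice


def message_from_document(document: bytes) -> int:
    """Map a document to an exponent by hashing it and reducing modulo Q."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % Q


def new_blinding() -> int:
    """Pick a random blinding value r."""
    return secrets.randbelow(Q)


def commit(message: int, blinding: int) -> int:
    """Compute C = G^m * H^r mod P; C alone reveals nothing about m."""
    return (pow(G, message % Q, P) * pow(H, blinding % Q, P)) % P


def verify(commitment: int, message: int, blinding: int) -> bool:
    """Check an opening (m, r) against a published commitment."""
    return commitment == commit(message, blinding)
```

The idea is that the published value would be the commitment rather than the raw hash, so an observer cannot match it against known content hashes, while anyone handed the blinding value can still confirm which document was paid for.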

I like where your structure is at, we probably just need to refine it a bit and check that it scales and how it affects the protocol.

kdenhartog avatar Nov 04 '25 16:11 kdenhartog