
Correct off-chain delegation validation always requires an archive node

Open JoeCap08055 opened this issue 6 months ago • 11 comments

What happened?

Description

In the current Delegation model, we only store the revoked_at block number. A Delegation is either currently in-force (revoked_at == 0), or has been revoked as of some block (revoked_at > 0). This has issues even for the following simple sequence of events:

flowchart TD
    A(["User delegates a schema to Provider A at block 100"]) --> B(["User revokes delegation at block 400"])    

In this scenario, we can always determine that any content posted by Provider A on behalf of the user after block 400 is unauthorized; however, for any content prior to block 400, we do not currently retain enough information to know whether the now-revoked delegation was in force or not, because we do not know at which block it was granted. The only way to determine that would be to access an archive node and query state at the block in question.

So... is the solution to also store the granted_at block number? That would mitigate the above scenario. However, consider the following sequence of events:

flowchart TD
    A(["User delegates a schema to Provider A at block 100"]) --> B(["User revokes delegation at block 400"])    
    B --> C(["User re-delegates at block 1000"])
    C --> D(["User revokes delegation at block 2000"])

In this scenario, with access to the granted_at block, although we are able to validate content past block 2000 as unauthorized, and content between blocks 1000-2000 as authorized, we have no knowledge of the prior delegation from blocks 100-400. For that information, we still require an archive node.


JoeCap08055 avatar Jul 21 '25 20:07 JoeCap08055

Additional thoughts:

This issue is not so much a bug as an inconvenience.

  • For publishing, we're always interested in the state at the current block, which we always have, so, fine.
  • For off-chain content validation, we should always be interested in the state as of the block at which the content was published. This is always available from an archive node.

The issue, then, is this: given that off-chain validation will usually be occurring at a block reasonably close to the block at which the content was published, how can we reduce the likelihood that an off-chain validator would need access to an archive node in order to validate the content?

By adding a granted_at block number, we can easily determine the following without an archive node:

  • If no entry exists -> unauthorized
  • If block_number >= granted_at && revoked_at == 0 -> authorized
  • If block_number >= granted_at && block_number < revoked_at -> authorized
  • If revoked_at > 0 && block_number >= revoked_at -> unauthorized

However, if block_number < granted_at, any of the following scenarios may apply. Determining which one requires querying state at block_number, which would likely require an archive node:

  • No prior delegation exists -> unauthorized
  • Prior delegation(s) exist, but block_number falls outside of any of their ranges -> unauthorized
  • Prior delegation(s) exist, and block_number falls within one of their ranges -> authorized
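The rules above can be sketched as a single three-way check. This is only an illustration: `DelegationRecord`, `Verdict`, and `check_delegation` are hypothetical names, block numbers are assumed to be `u32`, and the record is reduced to the two fields under discussion rather than the actual pallet type.

```rust
/// Hypothetical off-chain view of one delegation entry (not the pallet struct).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct DelegationRecord {
    /// Block at which the most recent grant took effect (proposed field).
    pub granted_at: u32,
    /// Block at which the grant was revoked; 0 means "still in force".
    pub revoked_at: u32,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Authorized,
    Unauthorized,
    /// Current state cannot answer; state at `block_number` must be queried.
    NeedsHistoricalState,
}

/// Classify content published at `block_number` using only current state.
pub fn check_delegation(record: Option<DelegationRecord>, block_number: u32) -> Verdict {
    match record {
        // No entry exists at all.
        None => Verdict::Unauthorized,
        // Content predates the most recent grant: a prior grant may or may not have covered it.
        Some(d) if block_number < d.granted_at => Verdict::NeedsHistoricalState,
        // Inside the most recent grant, either open-ended or before the revocation block.
        Some(d) if d.revoked_at == 0 || block_number < d.revoked_at => Verdict::Authorized,
        // At or after the revocation block.
        Some(_) => Verdict::Unauthorized,
    }
}
```

Only the `NeedsHistoricalState` arm would send a validator to an archive node; every other arm is answered from current state.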

Currently, without granted_at, the only cases we're able to deterministically assert are a subset of the unauthorized cases; we're unable to assert any authorized cases:

  • If no entry exists -> unauthorized
  • If block_number >= revoked_at > 0 -> unauthorized

All other cases require knowing whether a grant was in effect as of block_number, which requires reading state at that block, which may require an archive node.
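For comparison, a minimal sketch of what can be asserted today with only revoked_at. The names `CurrentModelVerdict` and `check_without_granted_at` are hypothetical, and the entry is reduced to an optional `u32` revocation block for illustration.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CurrentModelVerdict {
    Unauthorized,
    /// Cannot be decided from current state; state at the content's block is needed.
    Unknown,
}

/// `revoked_at` is `None` when no delegation entry exists, `Some(0)` when the
/// delegation is in force, and `Some(n)` when it was revoked at block `n`.
pub fn check_without_granted_at(revoked_at: Option<u32>, block_number: u32) -> CurrentModelVerdict {
    match revoked_at {
        // No entry exists.
        None => CurrentModelVerdict::Unauthorized,
        // Revoked, and the content was published at or after the revocation block.
        Some(r) if r > 0 && block_number >= r => CurrentModelVerdict::Unauthorized,
        // Everything else depends on when the grant started, which is not stored.
        Some(_) => CurrentModelVerdict::Unknown,
    }
}
```

Note that this function has no way to return an "authorized" result, which is the gap described above.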

JoeCap08055 avatar Jul 24 '25 13:07 JoeCap08055

TL;DR

Since, in the majority of cases, posted content would fall within the bounds of a currently in-force or recently revoked delegation, adding the granted_at block number would allow off-chain validators to quickly assert authorized status without resorting to an archive node lookup.

Caveat

Storage migration involving Delegations is difficult, as it would (as of this writing) involve > 2M entries on mainnet; however, this difficulty only increases over time. It is therefore crucial to consider this sooner rather than later.

JoeCap08055 avatar Jul 24 '25 13:07 JoeCap08055

I went looking for the original delegation design discussions around why we chose to not support gaps. I was unable to find it, but here's the summary I remember:

  1. Gaps are storage expensive. Storage of the data is unlimited. In theory someone could delegate and revoke continuously.
  2. Gaps are computationally expensive in validation. Given a schemaId and random block, the cost to discover if a delegation is allowed is much more complex than a simple boolean.
  3. Meaning of a delegation. Delegation is
  4. Low need. The number of people needing gapped delegations is very small.
  5. Low risk. Someone might want to undelegate, then later re-delegate. This, however, ONLY has an impact on off-chain delegation-related actions, and only then if someone isn't checking the delegation at that time. Yes, a malicious provider could do things that would then be odd if re-delegated to, but a malicious provider has to act before the re-delegation and has just as much power given the delegation. On-chain delegation checks are at time of execution.

wilwade avatar Jul 25 '25 16:07 wilwade

I went looking for the original delegation design discussions around why we chose to not support gaps. I was unable to find it, but here's the summary I remember:

I get the arguments against supporting arbitrary gaps: avoiding the need for an archive node in the gap case would greatly increase on-chain storage and complexity.

However, as illustrated above, even in the case of a single delegation, we cannot accurately determine its validity at a given block number without an archive node, currently. Consider the following scenario:

  • New user onboards with Provider A and creates/delegates MSA ID 1234
  • Provider B posts some content purportedly from MSA ID 1234.
    • at this point in time, we are able to correctly determine that this content is invalid, because no delegation exists for 1234 -> Provider B
  • MSA 1234 subsequently onboards with Provider B and grants a delegation
  • The previous unauthorized content now appears to be authorized, because a delegation exists, and we don't have the information about when that delegation started. An archive node is therefore required to correctly flag that content as unauthorized. If we simply stored the granted_at block number, it would resolve this particular scenario.
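A small sketch of this false positive, assuming hypothetical block numbers (Provider B posts at block 500, the delegation to Provider B is granted at block 800). `Delegation`, `naive_is_authorized`, and `predates_current_grant` are illustrative names only.

```rust
/// Hypothetical current-state record for MSA 1234 -> Provider B.
#[derive(Debug, Clone, Copy)]
pub struct Delegation {
    pub granted_at: u32, // proposed field
    pub revoked_at: u32, // 0 means "not revoked"
}

/// What a validator can conclude today: an unrevoked entry exists, so any
/// content from this provider appears authorized, whenever it was posted.
pub fn naive_is_authorized(current: Option<Delegation>) -> bool {
    matches!(current, Some(d) if d.revoked_at == 0)
}

/// With granted_at stored, content published before the grant can be
/// flagged for a historical lookup instead of being accepted outright.
pub fn predates_current_grant(current: Option<Delegation>, content_block: u32) -> bool {
    matches!(current, Some(d) if content_block < d.granted_at)
}
```

With the assumed numbers, the naive check accepts the block 500 post, while the granted_at comparison shows that it predates the grant at block 800 and needs a closer look.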

JoeCap08055 avatar Jul 25 '25 17:07 JoeCap08055

@JoeCap08055 so you are suggesting that we store the granted at block, but effectively never change it.

So in the re-delegation scenario it might look like this (not worrying about naming):

  • Alice delegates to Foo Provider Schema 1, at block 111: { 1: {started: 111, ended: null}}
  • Alice revokes delegation to Foo Provider Schema 1, at block 222: { 1: {started: 111, ended: 222}}
  • Alice redelegates delegation to Foo Provider Schema 1, at block 333: { 1: {started: 111, ended: null}}
  • Alice revokes the redelegation delegation to Foo Provider Schema 1, at block 444: { 1: {started: 111, ended: 444}}

That is a minor change, but I think doable.

wilwade avatar Jul 25 '25 17:07 wilwade

@JoeCap08055 so you are suggesting that we store the granted at block, but effectively never change it.

Not quite. I'm suggesting that we store the granted_at block for the most recent delegation. A re-delegation would overwrite granted_at. It's then easy to determine whether you need to access a historical block for further information.

In your example above, this would yield:

  • Alice delegates to Foo Provider Schema 1, at block 111: { 1: {started: 111, ended: null}}
  • Alice revokes delegation to Foo Provider Schema 1, at block 222: { 1: {started: 111, ended: 222}}
  • Alice redelegates delegation to Foo Provider Schema 1, at block 333: { 1: {started: 333, ended: null}}
  • Alice revokes the redelegation delegation to Foo Provider Schema 1, at block 444: { 1: {started: 333, ended: 444}}

If we don't overwrite granted_at, once we re-delegate at block 333, it's impossible to determine whether we need to check a historical block for content in blocks 222-332.
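The two update policies can be compared by replaying Alice's four events. This is a sketch under assumptions: `Grant`, `Event`, `apply`, and `replay` are hypothetical names, and a single schema entry is modelled with `started`/`ended` as in the example.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Grant {
    pub started: u32,
    pub ended: Option<u32>, // None means "still in force"
}

#[derive(Debug, Clone, Copy)]
pub enum Event {
    Delegate(u32), // block at which the user (re-)delegates
    Revoke(u32),   // block at which the user revokes
}

/// Apply one event. `overwrite_start` selects the policy:
/// true  = a re-delegation overwrites `started` (most-recent-grant semantics),
/// false = `started` is set once and never changed.
pub fn apply(state: Option<Grant>, event: Event, overwrite_start: bool) -> Option<Grant> {
    match (event, state) {
        (Event::Delegate(block), None) => Some(Grant { started: block, ended: None }),
        (Event::Delegate(block), Some(prev)) => Some(Grant {
            started: if overwrite_start { block } else { prev.started },
            ended: None,
        }),
        (Event::Revoke(block), Some(prev)) => Some(Grant { ended: Some(block), ..prev }),
        // Revoking a delegation that does not exist leaves nothing to store.
        (Event::Revoke(_), None) => None,
    }
}

/// Replay a sequence of events from an empty state.
pub fn replay(events: &[Event], overwrite_start: bool) -> Option<Grant> {
    events
        .iter()
        .fold(None, |state, &event| apply(state, event, overwrite_start))
}
```

Replaying delegate(111), revoke(222), delegate(333), revoke(444) without overwriting leaves a record spanning 111 to 444, so content from the gap (blocks 222-332) looks covered. With overwriting, the record spans 333 to 444 and anything earlier is clearly outside it.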

It's still a simple change. I suppose we could even avoid a complicated migration by defaulting all current delegations to granted_at = 0 or some other fixed block number; since there's no-one posting content on mainnet (at least, not on behalf of any MSA) yet, that should be OK.

Still could be a large migration, given the number of delegations on mainnet...

JoeCap08055 avatar Jul 25 '25 18:07 JoeCap08055

That, however, would lose us the information and state of the prior delegation. That means we would have to do state historical lookups for every single potential prior delegation.

If we eventually get rid of archive nodes, that also won't work.

Either way, since you know the block you are looking for, you could do a historical block look up every single time, and check the state at that point in time.

wilwade avatar Jul 25 '25 19:07 wilwade

That, however, would lose us the information and state of the prior delegation. That means we would have to do state historical lookups for every single potential prior delegation.

Yes—but to prove the majority of happy-path cases, it would only require information currently on-chain. Verifying any content posted prior to the most-recent delegation would always require an archive node, anyway. However, there are only two scenarios in which you would need to do that:

  • Gaps (ie, delegation-undelegation-redelegation, etc) exist and an app wants to verify valid content posted prior to the most recent delegation
  • Malicious/unauthorized provider posted content prior to the most recent delegation

If we eventually get rid of archive nodes, that also won't work.

If we get rid of archive nodes, we'll have to revisit the issue, I expect; that would likely have additional implications beyond delegations.

I think, for now, this is the best solution. The only other options I see would appear to be non-starters; i.e.:

  • keeping the entire delegation history in the current state
  • having the chain validate delegations for every row in a batch file at write time

JoeCap08055 avatar Jul 26 '25 15:07 JoeCap08055

but to prove the majority of happy-path cases, it would only require information currently on-chain

Possible cases as I see it:

  1. User delegates. User never revokes delegation; they just stop using the app.
  2. User delegates forever and never stops using the app.
  3. User delegates. User revokes delegation forever.
  4. User delegates, revokes, and later re-delegates.

I think we can all agree that the number of users in Case 1 and 2 will absolutely dwarf 3 and 4. I think 3 and 4 will be comparable in numbers.

Obviously, this proposal wants to ensure in #4 that some provider does not post on User's behalf in the interim blocks. If User has revoked a delegation, we can be almost certain that they aren't using Provider's app with that MSA. The only reason Provider would post on their behalf, it seems, would be maliciously.

I find the likelihood of this happening to be extremely low. However I could also see someone going back and doing some checking of historical MSA-related posts if the granted_at field is updated, and falsely conclude the Provider was doing something wrong. I think it's reasonable to assume some people will run an archive node, and some might even be motivated to dig into delegations but it does also seem extremely unlikely.

The additional field doesn't actually prevent Providers from posting on an MSA's behalf when they shouldn't, regardless of whether there is a re-delegation. We have generally expected that there would be DSNP validation services checking batch files when they are announced. If we make Gateway do that checking before submitting batch announcements, then every Provider using Gateway would be in the clear, and only non-Gateway posters need be checked.

So I've flip-flopped on this a bit and talked myself into this idea. Even though MSA Ids are sequential, we don't have a direct way to know at what block one was created without an archive node past a certain # of blocks. I actually think granted_at is a good addition now. It will default to zero anyway because Rust; I believe that struct already derives Default, correct?

shannonwells avatar Jul 29 '25 14:07 shannonwells

@shannonwells Which version of using granted_at are you thinking about? (specifically around #4)

  A. granted_at is set to the initial block of the first delegation and never changed even if the user revokes and re-delegates
  B. granted_at is updated to the granted_at block of any new re-delegation when the user revokes and does a re-delegation

wilwade avatar Jul 29 '25 14:07 wilwade

but to prove the majority of happy-path cases, it would only require information currently on-chain

Possible cases as I see it:

  1. User delegates. User never revokes delegation; they just stop using the app.
  2. User delegates forever and never stops using the app.
  3. User delegates. User revokes delegation forever.
  4. User delegates, revokes, and later re-delegates.

I think we can all agree that the number of users in Case 1 and 2 will absolutely dwarf 3 and 4. I think 3 and 4 will be comparable in numbers.

Obviously, this proposal wants to ensure in # 4 that some provider does not post on User's behalf in the interim blocks. If User has revoked a delegation, we can be almost certain that they aren't using Provider's app with that MSA. The only reason Provider would post on their behalf, it seems, would be maliciously.

Actually, no. This proposal is not about write-time—since that always occurs at the current block, we have no need to rely on anything beyond the current state. In the on-chain write case, granted_at buys us nothing (unless at some point we support creating future delegations).

What this proposal aims to do, is make it easy to positively prove the majority of happy-path cases. That is, it is expected that most content posted would be authorized content in the scope of the latest/current and ONLY delegation.

Any attempts to validate content posted prior to the current granted_at block for a delegation can only fall into one of the following scenarios:

  • Malicious/unauthorized content
  • Authorized content for a user w/delegation gaps

As you note, even in the archival scanning app case, it's unlikely that posted content would fall before the latest granted_at unless one of the two scenarios above applied.

I find the likelihood of this happening to be extremely low. However I could also see someone going back and doing some checking of historical MSA-related posts if the granted_at field is updated, and falsely conclude the Provider was doing something wrong. I think it's reasonable to assume some people will run an archive node, and some might even be motivated to dig into delegations but it does also seem extremely unlikely.

The additional field doesn't actually prevent Providers from posting on an MSA's behalf when they shouldn't, regardless of whether there is a re-delegation. We have generally expected that there would be DSNP validation services checking batch files when they are announced. If we make Gateway do that checking before submitting batch announcements, then every Provider using Gateway would be in the clear, and only non-Gateway posters need be checked.

Well... in the clear "ish":

  • There's no way of deterministically knowing whether content was actually posted by Gateway; we can only make assumptions about that based on the software a Provider purports to be using.
  • Keys can be stolen...
  • Gateway can easily validate individual posts as they're received. However, as Gateway now has the ability to blindly pass-through batch files (POST /v3/content/batchAnnouncement), this becomes slightly more difficult... must we parse the entire batch file and validate every delegation?

So I've flip-flopped on this a bit and talked myself into this idea. Even though MSA Ids are sequential, we don't have a direct way to know at what block one was created without an archive node past a certain # of blocks. I actually think granted_at is a good addition now. It will default to zero anyway because Rust; I believe that struct already derives Default, correct?

It does not currently derive Default, no. But even with that derivation, would that allow us to deserialize/decode existing data that did not contain the granted_at field? I had assumed a migration would be necessary...
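To illustrate the decoding concern, here is a toy sketch. It is not parity-scale-codec and not the real Delegation struct. It assumes only that fields are encoded positionally as fixed-width little-endian integers, which is how SCALE lays out `u32` fields. `OldDelegation`, `NewDelegation`, `encode_old`, `decode_new`, and `migrate` are hypothetical names.

```rust
/// Toy stand-in for the stored value today (real struct has more fields).
#[derive(Debug, PartialEq, Eq)]
pub struct OldDelegation {
    pub revoked_at: u32,
}

/// Toy stand-in for the value after adding the proposed field.
#[derive(Debug, PartialEq, Eq, Default)]
pub struct NewDelegation {
    pub granted_at: u32,
    pub revoked_at: u32,
}

/// Read one little-endian u32 and advance the input, or fail if too short.
fn read_u32(input: &mut &[u8]) -> Option<u32> {
    if input.len() < 4 {
        return None;
    }
    let (head, rest) = input.split_at(4);
    *input = rest;
    Some(u32::from_le_bytes([head[0], head[1], head[2], head[3]]))
}

/// Positional encoding of the old layout: 4 bytes.
pub fn encode_old(d: &OldDelegation) -> Vec<u8> {
    d.revoked_at.to_le_bytes().to_vec()
}

/// Positional decoding of the new layout: needs 8 bytes.
pub fn decode_new(mut input: &[u8]) -> Option<NewDelegation> {
    let granted_at = read_u32(&mut input)?;
    let revoked_at = read_u32(&mut input)?;
    Some(NewDelegation { granted_at, revoked_at })
}

/// What an explicit migration step would do: default the new field to 0.
pub fn migrate(old: OldDelegation) -> NewDelegation {
    NewDelegation { granted_at: 0, revoked_at: old.revoked_at }
}
```

In this toy model, bytes written under the old layout are too short for the new layout and fail to decode. Deriving `Default` affects only how a fresh value is constructed, not how existing bytes are read, so it does not remove the need for a migration step.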

JoeCap08055 avatar Jul 29 '25 16:07 JoeCap08055