
Retrieval Redundancy Level not set (defaults to PARANOID)

Open ldeffenb opened this issue 1 year ago • 3 comments

Context

2.1.0-rc2 (and probably 2.0.* as well)

Summary

See #4693 for details. Tracking that down revealed that SetLevelInContext is never invoked by the /bytes API, so when the joiner invokes GetLevelFromContext, it gets the default value of 4 (PARANOID).
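For illustration, here is a minimal, runnable sketch of the context plumbing being described. The helper names mirror SetLevelInContext/GetLevelFromContext from the issue, but the Level type and values are simplified stand-ins, not bee's actual implementation:

```go
package main

import (
	"context"
	"fmt"
)

// Level is a stand-in for bee's redundancy level type.
type Level uint8

const PARANOID Level = 4 // the default the joiner falls back to

type levelKey struct{}

// SetLevelInContext stores the redundancy level in the context; the
// issue's point is that the /bytes handler never calls this.
func SetLevelInContext(ctx context.Context, l Level) context.Context {
	return context.WithValue(ctx, levelKey{}, l)
}

// GetLevelFromContext returns the stored level, or PARANOID when unset.
func GetLevelFromContext(ctx context.Context) Level {
	if l, ok := ctx.Value(levelKey{}).(Level); ok {
		return l
	}
	return PARANOID
}

func main() {
	ctx := context.Background()
	fmt.Println(GetLevelFromContext(ctx)) // 4: the /bytes bug path
	ctx = SetLevelInContext(ctx, 0)
	fmt.Println(GetLevelFromContext(ctx)) // 0: what the header should achieve
}
```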

Expected behavior

If the Swarm-Redundancy-Strategy header is specified on /bytes, then I would expect the joiner to honor it.

Actual behavior

My hacked logs (see here) prove that the joiner is using rLevel = 4.

Steps to reproduce

Add your own logs, retrieve something with the /bytes API specifying Swarm-Redundancy-Strategy, and verify that the value never reaches the joiner. Or just grep the source code for SetLevelInContext and GetLevelFromContext. The file pipeline invokes the former, but that path isn't taken for the root chunk.

Possible solution

Have the API invoke SetLevelInContext before creating the joiner, perhaps as in this hack that only handles NONE: https://github.com/ldeffenb/bee/blob/ed323f9561f1ae541895153383fcb6edc1d142fc/pkg/api/bzz.go#L546
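A hedged sketch of what that could look like on the handler side, building on the helper sketch above (same package); the header name, handler shape, and the joiner step are illustrative assumptions, not bee's actual API:

```go
import (
	"net/http"
	"strconv"
)

// bytesGetHandler sketches the proposed fix: seed the request context
// with the client-supplied redundancy level before the joiner is built,
// so GetLevelFromContext no longer falls back to PARANOID.
func bytesGetHandler(w http.ResponseWriter, r *http.Request) {
	ctx := r.Context()
	if v := r.Header.Get("Swarm-Redundancy-Level"); v != "" { // header name assumed
		if n, err := strconv.Atoi(v); err == nil {
			ctx = SetLevelInContext(ctx, Level(n)) // helper from the sketch above
		}
	}
	// ... create the joiner with ctx so it observes the requested level ...
	_ = ctx
	_ = w
}
```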

ldeffenb avatar May 24 '24 15:05 ldeffenb

In actual fact, it appears that the API has no way to specify the retrieval redundancy level at all, only the prefetching strategy and/or whether to fall back to another strategy.

This is decidedly bad IMHO.

ldeffenb avatar May 24 '24 15:05 ldeffenb

I added this workaround: https://github.com/ldeffenb/bee/blob/ed323f9561f1ae541895153383fcb6edc1d142fc/pkg/api/bzz.go#L546

ldeffenb avatar May 24 '24 18:05 ldeffenb

Do you really want this to default to PARANOID? It will cause LOTS of canceled retrievals when accessing /bzz files from a browser. Just hit the following URL on a Sepolia testnet node and watch the "retrieval failed" debug logs. http://localhost:1633/bzz/fdfd170f73953bc262d936d3a5329b787980335dc0547032bb2a6239ebe95a76/14/coverage.png

ldeffenb avatar May 24 '24 18:05 ldeffenb

> Do you really want this to default to PARANOID? It will cause LOTS of canceled retrievals when accessing /bzz files from a browser. Just hit the following URL on a Sepolia testnet node and watch the "retrieval failed" debug logs. http://localhost:1633/bzz/fdfd170f73953bc262d936d3a5329b787980335dc0547032bb2a6239ebe95a76/14/coverage.png

Yes, this is kind of deliberate. The idea is that by default (when we do not know the level of encoding for sure) we really try. But it should surely be overridable by specifying the "SWARM-REDUNDANCY-LEVEL" header to the appropriate expectation, including the case that "SWARM-REDUNDANCY-LEVEL=0" suppresses the default level 4. If it turns out that replica requests are fired too often even though the original is found, and there are therefore too many cancellations, we could introduce a new header called SWARM-REPLICA-FALLBACK-INTERVAL which would wait before replicas belonging to a higher level are requested. Or maybe, even better, strategies similar to the ones for erasure codes could be used here:

  • NONE (meaning the same as SWARM-REDUNDANCY-LEVEL=0)
  • STEP: retrieve replicas belonging to higher levels (up to the one set by the level header) only as a fallback, after some delay, if the previous one does not succeed
  • PROX: first retrieve the replica closest to the node address
  • RACE: attempt to retrieve replicas simultaneously (up to the level set in the level header), cancelling the rest after the first one succeeds (see the sketch after this list)
  • SAME: retrieve all replicas of a SOC simultaneously (up to the level set in the level header) and check that all have the same payload; error if not, with the error listing all versions. [This is needed for the uniqueness check functionality, to be used for instance by swap channel networks for soft channel deposit allocation tables]; see the sw3 paper: https://www.overleaf.com/1452913241cqmzrpfpjkym#32d5d4
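To make the RACE idea concrete, here is a minimal, runnable sketch, assuming each replica can be fetched independently; raceRetrieve and fetch are hypothetical names for illustration, not bee functions:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"math/rand"
	"time"
)

// raceRetrieve sketches the proposed RACE strategy: request all replicas
// up to the configured level concurrently, return the first success, and
// cancel the rest. fetch is a stand-in for chunk retrieval.
func raceRetrieve(ctx context.Context, addrs []string,
	fetch func(context.Context, string) ([]byte, error)) ([]byte, error) {

	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // cancels the losing fetches once a winner returns

	type result struct {
		data []byte
		err  error
	}
	results := make(chan result, len(addrs)) // buffered so losers don't block
	for _, a := range addrs {
		go func(a string) {
			d, err := fetch(ctx, a)
			results <- result{d, err}
		}(a)
	}
	var lastErr error
	for range addrs {
		r := <-results
		if r.err == nil {
			return r.data, nil // first success wins
		}
		lastErr = r.err
	}
	return nil, lastErr
}

func main() {
	// Simulated fetch: each replica responds after a random delay unless
	// the race has already been decided and the context was cancelled.
	fetch := func(ctx context.Context, a string) ([]byte, error) {
		select {
		case <-time.After(time.Duration(rand.Intn(100)) * time.Millisecond):
			return []byte("payload from " + a), nil
		case <-ctx.Done():
			return nil, errors.New("cancelled: " + a)
		}
	}
	data, err := raceRetrieve(context.Background(),
		[]string{"replica-0", "replica-1", "replica-2"}, fetch)
	fmt.Println(string(data), err)
}
```

The cancellations this produces are exactly the "retrieval failed" noise mentioned above, which is why a fallback delay (STEP or SWARM-REPLICA-FALLBACK-INTERVAL) trades latency for quieter logs.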

zelig avatar Sep 11 '24 14:09 zelig