AVM icon indicating copy to clipboard operation
AVM copied to clipboard

Remove Blockchain.getStorage / Blockchain.putStorage

Open fulldecent opened this issue 6 years ago • 12 comments

The function signature for Blockchain.getStorage / Blockchain.putStorage encourages boilerplate code and this results in unsafe implementations.

  1. Assumption: every contract using Blockchain.putStorage are implementing hashmaps/arrays
  2. Assumption: every contract using Blockchain.putStorage has 2 or more hashmaps/arrays
  3. Now therefore, the following will be best practice for every contract:
private enum StorageSlots {
    NamesArray,
    BalancesHashMap
}

Blockchain.putStorage(
    Blockchain.blake2b(
        AionBuffer.allocate(SOMETHING_GOES_HERE)
                .putInt(StorageSlots.NamesArray.toString())
                // .put* PUT OTHER STUFF HERE
                .getArray();
    ),
    SOME_VALUE
);

Therefore, in every case, the boilerplate can be factored out at least to:

Blockchain.putStorage(
    STORGE_SLOT,
    SUB_SLOT,
    SOME_VALUE
);

where SUB_SLOT can be null or any value. And then this new function will apply Blockchain.blake2b to SUB_SLOT.

In every case it seems that this API would be safer, semantic and reduce verbosity, therefore I recommend that this replace the existing API.

References

  • --NO LINK YET-- Will's upcoming AIP

fulldecent avatar Jul 12 '19 18:07 fulldecent

Might need something a little more nuanced for the case where a contract inherits another contract that uses key-value storage on its own.

fulldecent avatar Jul 12 '19 18:07 fulldecent

This can be considered, subject to patterns we see being used, in the AVM 2.0 timeframe as it changes (or at least adds) API.

jeff-aion avatar Jul 15 '19 15:07 jeff-aion

Thanks for the feedback. An implementation is here

https://github.com/fulldecent/aion-aip010/blob/master/src/main/java/org/aion/AIP010KeyValueStorage.java

and the use site https://github.com/fulldecent/aion-aip010/blob/master/src/main/java/org/aion/AIP010.java#L81

My main concern now is finding if this approach will be considered best practice.

fulldecent avatar Jul 16 '19 03:07 fulldecent

Talking with people on this side, this sounds like something we would just add to the userlib, as it avoids duplication in a fairly common-case. The only real question of how to best approach this is in what is passed in as the namespace key versus the element key. The easiest answer is byte[] but the example you demonstrated with an int from an Enum ordinal is also a compelling idea.

We can proceed with a userlib modification (which doesn't require a hard-fork - it is just something packaged with the contract) if we can figure out the right type and null (or equivalent) behaviour.

jeff-aion avatar Jul 19 '19 20:07 jeff-aion

Thanks for checking into this. At the most general case we are addressing the key-value store using a variable-length tuple (i.e. vector).

These tuples should be fully qualified. In other words we must encourage code reuse patterns where methods AND STORAGE KEYS are defined in multiple files. Below is an all-encompassing use case which explains every type of storage situation Aion will ever encounter in a key-value store as a basis for defining how the user lib should work.

Here is a great key for storing your token name, which is used once for the whole contract, and which is an implementation detail of the reference implementation of aipXXX:

{"org.aion.aipXXXimplementation.storageSlots.tokenName"}

Here is a great key for storing your token balance for each account:

{"org.aion.aipXXXimplementation.storageSlots.balance", account}

Here is a great key for storing permission from each account to an authorized operator (a bool for a tuple of account and operator):

{"org.aion.aipXXXImplementation.storageSlots.operatorPermission", account, operator}

Then let's say your application reuses code from the aipXXX token reference implementation (e.g. subclasses, inherits, whatever) and defines its own variable for knowing who most recently received a token. (Hey it's their application, they can do whatever they want.) Here is a great key for that:

{"com.example.appThatTracksMostRecentTokenMoves.storageSlots.mostRecentTokenMover"}

Now after that, on a Saturday, before launch week, the CTO decides to switch from aipXXX to aipYYY. So, hopefully, they just swap out one line of code any everything should work, right? Well, the chosen aipYYY implementation also just happens to define a storage slot mostRecentTokenMover for some reason. Will this break anything? No! Because com.example.appThatTracksMostRecentTokenMoves.storageSlots.mostRecentTokenMover is fully qualified and distinct from org.aion.aipYYYSomeImplementaation.storageSlots.mostRecentTokenMover.


In Java, an Enum can be passed around as an opaque type (even if it's private, crazy) and can be resolved to a fully qualified description (someEnum.describeConstable().orElseThrow().toString()). So the key-value store interface could be:

public void putStorage(byte[] value, Enum domain, byte[]... keyTuple);
public void getStorage(Enum domain, byte[]... keyTuple);

The implementation of putStorage and getStorage MUST NOT naively use AionBuffer + Blake2k. This is because an AionBuffer is a stream, not a tuple!

Using AionBuffer would cause the following vulnerability:

Blockchain.putStorage("Red".getBytes, storageSlots.FavoriteColor, "stringA".getBytes(), "stringB".getBytes);
Blockchain.putStorage("Red".getBytes, storageSlots.FavoriteColor, "stringAstringB".getBytes());

The tuple {"stringA", "stringB"} does not equal the tuple ("stringAstringB") and therefore they should not be the same key.

fulldecent avatar Jul 20 '19 18:07 fulldecent

I suspect that using an Enum for the helper interface makes this most descriptive and serializing, using the ordinal as the namespace, probably does the right thing. I am generally wary of using strings for this since they aren't as specific (from a type perspective) and they grow the size of the deployment and object graph (since the constants are real user-space data).

jeff-aion avatar Jul 29 '19 17:07 jeff-aion

I would love to use integers if possible. Here is the use case:

code/OwnableContract.java code/OwnableContract-storage.java code/FungibleToken.java code/FungibleToken-storage.java code/MyStarshipGame.java code/MyStarshipGame-storage.java

The OwnableContract and FungibleToken code is reused (copy/pasted, imported/subclassed, etc) from an upstream vendor. Is it possible in Java for all three authors to guarantee that their enums will have different ints?

fulldecent avatar Jul 29 '19 23:07 fulldecent

It is a trade-off between how much of this should be managed as cut-and-paste, versus actual integration decisions. While things like ABI resolution require a very concrete answer (since the method namespace must be the same, no matter what else is included), this one has other potential options. Ideally, I would favour reduced deployment and transaction cost, even at the cost of a somewhat manual integration effort, during deployment. Any thoughts, or ideas around patterns for such integrations?

jeff-aion avatar Jul 30 '19 03:07 jeff-aion

We cannot choose whether developers will copy/paste and expect things to work. We can only choose whether there will be disastrous consequences when developers copy and paste.


At the moment, fully qualified Enum names using strings works, is safe, is composeable and is intuitive. Also it might be expensive.

Strawman approach #2 could be: each contract has exactly one enum (probably in a separate file) and they must concatenate the storage names from each code they are reusing into that file. So it uses an int (Enum.ordinal).

This is a tradeoff. Will somebody please be able to implement a full example using both approach and present the data on how much more wasteful using strings actually is?

fulldecent avatar Jul 30 '19 17:07 fulldecent

Thinking further about this, I suspect that the Enum.hashCode() might actually do what you want, just as an int. In AVM, identity hash code (which Enum uses) is just an incrementing counter. The Enum constants would be initialized once, during deployment, so they would all have a different identity hash code, and each would be stable due to object graph persistence. No matter how many different Enum classes were created, each instance would still have a different value (which is not true of ordinal - by design).

jeff-aion avatar Aug 02 '19 15:08 jeff-aion

Nice, great find.

Could you please link to documentation that shows the guarantees you are relying on?

Are these hash codes deterministic for off-chain analysis?

fulldecent avatar Aug 02 '19 19:08 fulldecent

Related #409 is a great implementation of this concept here. It adds a type safe API for the values too.

fulldecent avatar Aug 15 '19 18:08 fulldecent