snarkOS icon indicating copy to clipboard operation
snarkOS copied to clipboard

[Proposal] Change DataMap ownership management

Open IAvecilla opened this issue 3 years ago • 4 comments

Introduction

I was implementing a new get_ciphertexts function that returns an iterator over all ciphertexts in the ledger. To achieve this, I did the following PR with some of the first changes. I had to use cloned() and collect() functions to get that working and the performance seems to be worsened by this. Howard suggested a change in this comment. I couldn’t do that implementation because of ownership restrictions. Searching for a solution I realized that the values() function associated with the DataMap returns an iterator over the values but it takes ownership of the elements, so if I want to return an iterator with borrowed ciphertexts I can’t take ownership of the transition. The error that I’m getting is this one:

error[E0515]: cannot return value referencing local variable `transition`
    --> storage/src/state/ledger.rs:1646:44
     |
1646 |             .flat_map(|(_, _, transition)| transition.ciphertexts().collect::<Vec<&N::RecordCiphertext>>())
     |                                            ------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |                                            |
     |                                            returns a value referencing data owned by the current function
     |                                            `transition` is borrowed here

Also, this error happens with the Cow implementation too.

:boom: Proposal

For better performance in the DataMap behaviour, It might be a good idea to change the Values logic for returning a borrow of the values instead of taking ownership of them. But this could be a breaking change in all the other libs that use this DataMap structure.

IAvecilla avatar May 13 '22 20:05 IAvecilla

There's an issue with this proposal; the values that are returned by DataMap are already deserialized, so there is no benefit to borrowing them instead of taking them by value. The only way to obtain references to database values would be to return them in their "raw" version, i.e. plain bytes using https://docs.rs/rocksdb/latest/rocksdb/struct.DBWithThreadMode.html#method.get_pinned.

ljedrz avatar May 20 '22 07:05 ljedrz

That being said, I wouldn't mind extending the API to allow values to be returned without them being immediately deserialized (which is a much greater performance penalty than them being cloned) if @howardwu is ok with this - this would allow iteration over references.

ljedrz avatar May 20 '22 07:05 ljedrz

@ljedrz One option could be to introduce *_raw() -> Result<Raw<T>> methods that hold Bytes until the user calls deserialize. This would be similar to our delayed serialization approach in snarkOS messages.

howardwu avatar Jun 16 '23 03:06 howardwu

Yes, this can be arranged; would we want all the maps to support this kind of iteration?

ljedrz avatar Jun 19 '23 12:06 ljedrz