[Proposal] Change DataMap ownership management
Introduction
I was implementing a new get_ciphertexts function that returns an iterator over all ciphertexts in the ledger. To achieve this, I did the following PR with some of the first changes. I had to use cloned() and collect() functions to get that working and the performance seems to be worsened by this. Howard suggested a change in this comment. I couldn’t do that implementation because of ownership restrictions.
Searching for a solution I realized that the values() function associated with the DataMap returns an iterator over the values but it takes ownership of the elements, so if I want to return an iterator with borrowed ciphertexts I can’t take ownership of the transition. The error that I’m getting is this one:
error[E0515]: cannot return value referencing local variable `transition`
--> storage/src/state/ledger.rs:1646:44
|
1646 | .flat_map(|(_, _, transition)| transition.ciphertexts().collect::<Vec<&N::RecordCiphertext>>())
| ------------------------^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| returns a value referencing data owned by the current function
| `transition` is borrowed here
Also, this error happens with the Cow implementation too.
:boom: Proposal
For better performance in the DataMap behaviour, It might be a good idea to change the Values logic for returning a borrow of the values instead of taking ownership of them. But this could be a breaking change in all the other libs that use this DataMap structure.
There's an issue with this proposal; the values that are returned by DataMap are already deserialized, so there is no benefit to borrowing them instead of taking them by value. The only way to obtain references to database values would be to return them in their "raw" version, i.e. plain bytes using https://docs.rs/rocksdb/latest/rocksdb/struct.DBWithThreadMode.html#method.get_pinned.
That being said, I wouldn't mind extending the API to allow values to be returned without them being immediately deserialized (which is a much greater performance penalty than them being cloned) if @howardwu is ok with this - this would allow iteration over references.
@ljedrz One option could be to introduce *_raw() -> Result<Raw<T>> methods that hold Bytes until the user calls deserialize. This would be similar to our delayed serialization approach in snarkOS messages.
Yes, this can be arranged; would we want all the maps to support this kind of iteration?