Store Uuid as Bytes instead of String

Open bottiger opened this issue 5 years ago • 10 comments

💡 Store Uuid as bytes

The current Uuid support stores each Uuid as a hyphenated string. This encoding is verbose: the 36-character string is a little more than twice as large as the 128-bit value the Uuid represents, with no clear benefit. For a database with large documents this will be a non-issue, but for a large collection of small documents it might introduce significant overhead.
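To put rough numbers on that overhead, here is a minimal, standard-library-only sketch. The helper names `to_hyphenated` and `base64_len` are made up for illustration and are not part of dynomite or the `uuid` crate. It compares the raw 16 bytes with the hyphenated text form and with the base64 text the API would carry for a binary attribute.

```rust
/// Hypothetical helper: formats 16 raw bytes in the canonical
/// hyphenated layout (8-4-4-4-12 lowercase hex digits).
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..]
    )
}

/// Hypothetical helper: length of standard padded base64 for `n` input bytes.
fn base64_len(n: usize) -> usize {
    (n + 2) / 3 * 4
}
```

For any 16-byte value, `to_hyphenated` yields 36 characters and `base64_len(16)` is 24. The base64 form only matters on the wire; as far as I know, DynamoDB's item-size accounting counts the raw bytes of a binary attribute, so the stored value would be 16 bytes against 36 for the string.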

Backwards compatibility for existing users is an issue though.

💻 Basic example

Just save the Uuid as a binary blob (base64 encoded in the API) instead of a string.

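// A binary type for `Uuid`s, represented by the `B` AttributeValue type
#[cfg(feature = "uuid")]
impl Attribute for Uuid {
    fn into_attr(self: Self) -> AttributeValue {
        // Copy the 16 raw bytes and reuse the existing `Bytes` encoding
        Bytes::copy_from_slice(self.as_bytes()).into_attr()
    }
    // `from_attr` (decoding the bytes back into a `Uuid`) is omitted here
}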

PS: This code snippet requires Bytes 0.5 in contrast to Bytes 0.4 which dynomite currently uses.

bottiger avatar Feb 20 '20 22:02 bottiger

I like where you're going with this. What would be the suggested migration path for those already using a string attribute type for their keys?

softprops avatar Feb 22 '20 22:02 softprops

Unfortunately it is unclear to me what the best way forward is. It is not obvious to me how you upgrade an existing database (assuming you are using the Uuid as the partition key), so I think the best way would simply be to support both. Perhaps it can be implemented using a cargo feature where users can opt in to using the legacy string representation.

bottiger avatar Mar 11 '20 08:03 bottiger
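One way to picture "support both" on the read side is a decoder that accepts either representation. The sketch below uses stand-in types only: `AttributeValue` here is a two-variant toy enum, not the real dynomite/rusoto type, and `uuid_from_attr` and `parse_hyphenated` are hypothetical names. Note that this could only help for non-key attributes, since the type of a key attribute is fixed when the table is created.

```rust
/// Toy stand-in for the DynamoDB attribute value (assumption: only the
/// string and binary variants are modelled).
#[derive(Debug, Clone, PartialEq)]
enum AttributeValue {
    S(String),
    B(Vec<u8>),
}

/// Parses the 36-character hyphenated form into 16 raw bytes.
fn parse_hyphenated(text: &str) -> Option<[u8; 16]> {
    let chars = text.as_bytes();
    if chars.len() != 36 {
        return None;
    }
    let mut out = [0u8; 16];
    let mut hex_seen = 0;
    for (i, &c) in chars.iter().enumerate() {
        // Hyphens sit at fixed positions in the canonical layout.
        if matches!(i, 8 | 13 | 18 | 23) {
            if c != b'-' {
                return None;
            }
            continue;
        }
        let nibble = (c as char).to_digit(16)? as u8;
        out[hex_seen / 2] = (out[hex_seen / 2] << 4) | nibble;
        hex_seen += 1;
    }
    Some(out)
}

/// Accepts the legacy string form as well as the 16-byte binary form.
fn uuid_from_attr(value: &AttributeValue) -> Option<[u8; 16]> {
    match value {
        AttributeValue::B(bytes) if bytes.len() == 16 => {
            let mut raw = [0u8; 16];
            raw.copy_from_slice(bytes);
            Some(raw)
        }
        AttributeValue::S(text) => parse_hyphenated(text),
        _ => None,
    }
}
```

With a reader like this, items written in either format decode to the same 16 bytes, so only the write path would need to choose a representation.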

Perhaps it can be implemented using a cargo feature where users can opt in to using the legacy string representation.

That's an excellent idea. Would you be up for submitting a PR for this? I'd be happy to merge and publish a new version.

softprops avatar Mar 11 '20 22:03 softprops

@softprops I can try that. But my time is currently very limited, so it will be whenever I have a small gap.

bottiger avatar Mar 16 '20 13:03 bottiger

Perhaps it can be implemented using a cargo feature where users can opt in to using the legacy string representation.

Definitely don't force a particular storage representation on users. Some may not want bytes for whatever reason.

phrohdoh avatar May 22 '20 20:05 phrohdoh

I'd imagine the way to do this, if someone wants to pick this up, is with a cargo feature flag defaulting to the current format.

softprops avatar May 22 '20 21:05 softprops
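A rough sketch of what such a flag could look like, under stated assumptions: the feature name `uuid-bytes` is hypothetical (it would need a matching `uuid-bytes = []` entry under `[features]` in `Cargo.toml`), and `AttributeValue`, `to_hyphenated` and `uuid_into_attr` are standalone stand-ins rather than dynomite's real types. With the feature off, which is the default, the current string format is kept.

```rust
/// Toy stand-in for the DynamoDB attribute value (string and binary only).
#[derive(Debug, Clone, PartialEq)]
enum AttributeValue {
    S(String),
    B(Vec<u8>),
}

/// Hypothetical helper: canonical hyphenated form of 16 raw bytes.
fn to_hyphenated(bytes: &[u8; 16]) -> String {
    let hex: String = bytes.iter().map(|b| format!("{:02x}", b)).collect();
    format!(
        "{}-{}-{}-{}-{}",
        &hex[..8],
        &hex[8..12],
        &hex[12..16],
        &hex[16..20],
        &hex[20..]
    )
}

/// Opt-in: compact binary representation.
#[cfg(feature = "uuid-bytes")]
fn uuid_into_attr(id: [u8; 16]) -> AttributeValue {
    AttributeValue::B(id.to_vec())
}

/// Default: the existing hyphenated string representation.
#[cfg(not(feature = "uuid-bytes"))]
fn uuid_into_attr(id: [u8; 16]) -> AttributeValue {
    AttributeValue::S(to_hyphenated(&id))
}
```

Because exactly one of the two `uuid_into_attr` definitions is compiled, existing applications that never enable the flag would see no change in what gets written.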

@softprops, I'm learning Rust and would give it a try with some guidance from you. I do use UUIDs a lot and store them as the UUID type in Postgres, which is bytes. Is this the right place to make the change?

https://github.com/softprops/dynomite/blob/7667bf9de68fd3c52a91afd55d099d15b5c44952/dynomite/src/lib.rs#L279

I'll see if I can wrap my head around it. No pressure, though - you asked for help, not an apprentice :)

rimutaka avatar Jul 01 '20 04:07 rimutaka

Thanks!

You are off to a great start already. That's where I would be looking.

Ideally we'd want to do this with a cargo feature flag, with the current behavior as the default, to avoid breaking existing applications. Let me know if you get stuck or need more direction. Happy to help.

softprops avatar Jul 01 '20 05:07 softprops

This feature may cause more problems than it solves. The same document may have multiple UUID properties. Some can be stored as bytes to save space, some must be stored as strings, for example for front end scripts in JS. Having a "feature" to choose one or the other will not let us discriminate between the two within the same document.

There must be some other way, e.g. #[dynomite(as_bytes)].

rimutaka avatar Jul 01 '20 09:07 rimutaka
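A field attribute such as `#[dynomite(as_bytes)]` would need derive-macro support, but a similar per-field choice can be sketched with a plain newtype. Everything below is a stand-in for illustration: the `Attribute` trait is reduced to one method, `AttributeValue` and `Uuid` are toy types, and `UuidBytes` is a hypothetical wrapper name, not an existing dynomite type.

```rust
/// Toy stand-in for the DynamoDB attribute value (string and binary only).
#[derive(Debug, Clone, PartialEq)]
enum AttributeValue {
    S(String),
    B(Vec<u8>),
}

/// Reduced stand-in for an attribute conversion trait (write side only).
trait Attribute {
    fn into_attr(self) -> AttributeValue;
}

/// Toy stand-in for a 128-bit UUID.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Uuid([u8; 16]);

/// Hypothetical wrapper: a UUID field that opts in to binary storage.
#[derive(Debug, Clone, Copy, PartialEq)]
struct UuidBytes(Uuid);

impl Attribute for Uuid {
    /// Plain `Uuid` fields keep the hyphenated string form.
    fn into_attr(self) -> AttributeValue {
        let hex: String = self.0.iter().map(|b| format!("{:02x}", b)).collect();
        AttributeValue::S(format!(
            "{}-{}-{}-{}-{}",
            &hex[..8],
            &hex[8..12],
            &hex[12..16],
            &hex[16..20],
            &hex[20..]
        ))
    }
}

impl Attribute for UuidBytes {
    /// Wrapped fields are written as the 16 raw bytes.
    fn into_attr(self) -> AttributeValue {
        AttributeValue::B((self.0).0.to_vec())
    }
}
```

In a struct, one field could then be declared as `Uuid` and another as `UuidBytes`, so the representation is chosen per field by its type rather than crate-wide.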

The feature can be implemented inside the Attribute impl for Uuid, since that is the one place where uuids are serialized to and deserialized from DynamoDB attribute values for any given application, so a single item having multiple uuid fields is not a problem.

If you are using serde to serialize a struct to JavaScript you can use serde to serialize in the format your front end expects.

softprops avatar Jul 01 '20 21:07 softprops