edgedb-rust icon indicating copy to clipboard operation
edgedb-rust copied to clipboard

Use a more idiomatic edgedb-native Range

Open CodesInChaos opened this issue 1 year ago • 5 comments

The native EdgeDB type looks like this:

pub struct Range<T> {
    pub(crate) lower: Option<T>,
    pub(crate) upper: Option<T>,
    pub(crate) inc_lower: bool,
    pub(crate) inc_upper: bool,
    pub(crate) empty: bool,
}

I think a more idomatic way of representing that would be one of the following. All of these can still represent every valid edgedb value.

I don't think these are breaking changes, since the current struct has private fields. However the fields have corresponding public accessors, which I'd like to get rid of if breaking changes are allowed (or at least deprecate).

use core::ops::Bound;

pub enum Range<T> {
    Empty,
    NonEmpty(Bound<T>, Bound<T>),
}

or

use core::ops::Bound;

pub enum Range<T> {
    Empty,
    NonEmpty { lower: Bound<T>, upper: Bound<T> },
}

or

use core::ops::Bound;

pub struct BoundedRange<T> {
    lower_bound: Bound<T>,
    upper_bound: Bound<T>,
}

pub enum Range<T> {
    Empty,
    NonEmpty(BoundedRange<T>),
}

or

pub struct BoundedRange<T>; // as above

pub struct MaybeEmptyRange<R>(inner:Option<R>);

type Range<T> = MaybeEmptyRange<BoundedRange<T>>;

I like the last one best, since it NonEmptyRange could be useful to be used in the model if the user knows it's never empty. It could also come in handy to model multi-ranges as something like Vec<NonEmptyRange<T>> (I'm assuming multi-ranges can't contain individual empty ranges).

CodesInChaos avatar Aug 28 '24 12:08 CodesInChaos

I tend to be against breaking changes whose justification is "it makes more sense some other way". Fortunately for this issue, there is a good reason for the proposed change:

Current range definition allows the following to exist:

Range {
    lower: None,
    upper: None,
    inc_lower: true,
    inc_upper: true,
    empty: false,
}

... which is very strange, as not having a bound and it being an inclusive does not go together. It is also possible to set empty to true and set other properties, which is also confusing.


I like your option 3 the most. Feels most understandable to me, it does not use type aliases and it also support the "modelling when user knows the range not to be empty".

BoundedRange<T> could also be reused when implementing multi-ranges:

struct MultiRange<T>(Vec<BoundedRange<T>>)

aljazerzen avatar Aug 28 '24 14:08 aljazerzen

Option 1 to 3 have two problems that option 4 can fix:

  • non canonical empty ranges, like an exclusive range where start == end, or an inclusive range where start > end. These could be represented as NonEmpty(actually-empty)
  • When using the built in rust Range/RangeInclusive it will not be possible to deserialize empty ranges, since they only represent empty ranges via start > end/end >= start respectively. So a generic MaybeEmptyRange wrapper could be used with them as well.

The basic changes above aren't breaking changes by themselves. But removing the weird accessors is.

And adding a PartialOrd constraint on the contained value to test for emptiness is breaking too, though all the types supported by edgedb ranges implement that trait anyways.


I think for now I'll go for a mix of 3 and 4:

pub struct BoundedRange<T>; // as above

pub struct Range<T>(Option<BoundedRange<T>>);

impl<T> Range<T> {
    fn is_empty(&self) -> bool;
    fn non_empty(&self) -> &BoundedRange<T>;
}

This allows fixing the non-canonical empty problem, and defers handling of std::ops. Deferring might be a good idea for now, especially since Rust 2024 will bring some changes in that area.

CodesInChaos avatar Aug 28 '24 14:08 CodesInChaos

Another problematic feature is: #[cfg_attr(feature = "with-serde", derive(serde::Serialize, serde::Deserialize))] on the existing struct. This leaks the format of the private fields into json. I would suggest adding manual implementations of these, and make those equivalent to how <json>range(...) works in EdgeQL.

CodesInChaos avatar Aug 28 '24 14:08 CodesInChaos

If we use PartialOrd to limit BoundedRange to non-empty ranges, then making Range an enum works. That seems like the cleanest design, even if means that constructing a BoundedRange from std::ops ranges would be fallible.


edit: Looks like causes problems forRange<Box<Value>>, but they can probably be worked around

CodesInChaos avatar Aug 28 '24 15:08 CodesInChaos

Unfortunately there were some unexpected challenges:

NaN handling

Edgedb treats NaN as larger than any normal value and equal to itself, while rust treats it as incomparable. I solved this by introducing a private EdgedbOrd trait which follows EdgeDB's comparison rules.

I also implemented that trait for Value, and panic if the compared values don't have the same type. I don't think it's actually possible to hit the panic, since constructing a Value::Range starts with a strongly typed Range. This might need adjustments once I implement deserialization.

Canonicalizing Bounds

EdgeDB cononicalizes discrete ranges so the start is inclusive and the end exclusive. The constructors of the rust range types do the same. This is done via the RangeScalar trait, which serves both as marker for the traits EdgeDB ranges support, and to enable this functionality.

See PR #345 (doesn't compile yet)

CodesInChaos avatar Aug 28 '24 20:08 CodesInChaos