rescript-compiler icon indicating copy to clipboard operation
rescript-compiler copied to clipboard

Record representation is unstable

Open cometkim opened this issue 1 year ago • 9 comments

Since we added support for optional fields on records, a crucial assumption was broken: up until then, records can be expected to have a fixed layout, as its definition.

So records with the same values for the same keys always had the same hash.

However, by introducing optional fields, we can easily create values ​​that pass the equality check but have different hashes.

type t = {
  a: string,
  b?: string,
  c: string,
}

let v1 = {
  a: "a",
  c: "c",
}

let v2 = {
  ...v1,
  b: "b",
}

let v3 = {
  a: "a",
  b: "b",
  c: "c"
}

assert (Hashtbl.hash(v2) == Hashtbl.hash(v3))

https://rescript-lang.org/try?version=v11.1.4&code=C4TwDgpgBMULxQN4CgpQIYC4oGdgCcBLAOwHMAaVKAIwH5s8izK0BjBgki5AX2WQA2EWADcAjPCRUsUAETpZLKOzmtFvfkNEAmSSjQA6I+KXVss6ur6DhUEQGY908wtPnLSlbLWVr6HDgQ+MAAFAAS-gAWwNQCBpFRIeIAlPAIETjRsfGJDsnJyEA

v2 and v3 have the same values for the same keys but in different order.

  • Obj.entries(v2) => [['a', 'a'], ['c', 'c'], ['b', 'b']]
  • Obj.entries(v3) => [['a', 'a'], ['b', 'b'], ['c', 'c']]

This literally means that users can't use a record as a key in a hashtable or other data structure. In other words, it's not a record.

cometkim avatar Sep 10 '24 21:09 cometkim

One possible fix is to add entry sorting at runtime, which will add serious performance degradation wherever optional fields are used (e.g. JSXv4)

Another way is to initialize optional fields to None, but this will again change the JS semantic.

cometkim avatar Sep 10 '24 21:09 cometkim

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

Semi-unrelated, but one thing I've thought about before is whether one could have an annotation for a record definition type that disallows optional fields and/or coercion. Or one annotation for each thing. That way you could set up a "protected" record that you know will have the exact runtime shape you expect (unless it's from an external, but 🤷 ).

zth avatar Sep 11 '24 06:09 zth

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

I think we should mention this caveat in the docs for optional fields. I rarely use record as the key (or a complex type), but who knows.

shulhi avatar Sep 11 '24 13:09 shulhi

Actually that was never true because of FFI.

cristianoc avatar Sep 11 '24 14:09 cristianoc

Coercion for records also break this assumption (more fields can be on the object than what's represented by the record).

IMO it should have runtime effects, not just be a simple type assertion.

type a = {
  foo: string,
  bar: string,
}

type b = {
  bar: string,
}

a :> b

Output should be { bar: a.bar } rather than a. It also makes more sense to extend it to custom encoding/decoding later.

I know it has a tradeoff, but we need to care a little more for integrity, even if it needs runtime.

cometkim avatar Sep 12 '24 09:09 cometkim

Actually that was never true because of FFI.

At least when using FFI explicitly, users can be quite careful about it. But in pure ReScript-produced code, this behavior is somewhat surprising to me.

cometkim avatar Sep 12 '24 09:09 cometkim

Honestly I think these are just quirks we'll have to live with in some form if we want optional record fields (which we do) and coercion (which we also do).

If so, we should deprecate supporting comparison operations (==, >, <, >=, <=) on records. It's a general property when dealing with immutable data, but if we can't fully support it, it's just an unreliable footgun.

cometkim avatar Sep 12 '24 11:09 cometkim

If so, we should deprecate supporting comparison operations (==, >, <, >=, <=) on records. It's a general property when dealing with immutable data, but if we can't fully support it, it's just an unreliable footgun.

This is definitely a good idea. I wonder however if anyone is using them. That type of comparison could be quite powerful (== for records anyway) but I wonder whether it actually works as expected today?

zth avatar Sep 12 '24 12:09 zth

This is definitely a good idea. I wonder however if anyone is using them. That type of comparison could be quite powerful (== for records anyway) but I wonder whether it actually works as expected today?

OCaml std users may be using it, but there's no guarantee that it will work without bugs.

We're dropping OCaml std, and Belt is a little better because it makes users explicitly build their own Hashable and Comparable modules.

But still, using Hashtbl.hash or comparison operators as its implementation is a common usage pattern that as far as I know.

However, the advantage of module functions is that it makes it easy to replace the implementation. Users can get both correctness and performance by replacing in a custom implementation.

Then we could extend language support something like @deriving to support deriving hash and comparison, like Rust's derive macro, or Java's Lombok does.

cometkim avatar Sep 12 '24 14:09 cometkim

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Sep 08 '25 02:09 github-actions[bot]