test: fine-tune property tests
Draft disclaimer
i'm not married to anything here, and would appreciate any and all feedback
Description
the problem
allowing "blueprint"-style map schematics at the root level of a `list/1` schematic means there is a chance (a higher chance the more you run the property test) of generating a `oneof/1` schematic to cover a list containing a variety of maps, where one of the `map/1` schematics in that `oneof` is a highly permissive one, like `map(%{})`. that schematic will readily unify with any map and then drop all of the keys when the value is dumped back out.
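for illustration, here is a minimal sketch of that failure mode; the `{:ok, ...}` results shown are assumptions based on the behaviour described above, not verified output:

```elixir
import Schematic

# the highly permissive empty blueprint described above
permissive = map(%{})

# assumed behaviour: unification succeeds for any map...
{:ok, unified} = unify(permissive, %{"a" => 1, "b" => "foo"})

# ...but the unspecified keys do not survive the round trip
{:ok, dumped} = dump(permissive, unified)

dumped
#=> %{}  (per the behaviour described above)
```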
the solution?
see the discussion below. TL;DR: we're not so sure
an additional property (or several) can at least be added to describe the behaviours of map schematics, particularly when there are unspecified keys that will be lost during unification
Closes #22
Specific requests for feedback
- in our thread for #22 we discussed limiting the generation space for the `input |> unify |> dump == input` test; is this going too far? (a minimal sketch of the property is included after this list)
- related: i tried re-writing the generator to allow a specific option to be passed in to avoid lists with a root-level map schematic defined using a "blueprint", but the resulting options (not to mention the implementation) started to feel very cumbersome and confusing. it seems to me that having that level of granularity would be more trouble than it's worth
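for concreteness, a minimal sketch of the round-trip property under discussion, assuming ExUnit + StreamData (via ExUnitProperties); the module name and the deliberately narrow generator are hypothetical:

```elixir
defmodule SchematicRoundTripTest do
  use ExUnit.Case, async: true
  use ExUnitProperties

  property "input |> unify |> dump == input" do
    # a deliberately narrow generator: lists of string-keyed integer maps,
    # which should round-trip cleanly through a K-V map schematic
    check all input <-
                StreamData.list_of(
                  StreamData.map_of(StreamData.string(:alphanumeric), StreamData.integer())
                ) do
      schematic = Schematic.list(Schematic.map(keys: Schematic.str(), values: Schematic.int()))

      assert {:ok, unified} = Schematic.unify(schematic, input)
      assert {:ok, dumped} = Schematic.dump(schematic, unified)
      assert dumped == input
    end
  end
end
```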
TODOs
- [ ] add a follow-up property test to describe the behaviour of map blueprint schematics, i.e. "unspecified keys in a map blueprint are discarded when dumped"
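a rough sketch of what that follow-up property might look like (assuming the same `use ExUnitProperties` setup as the sketch above; the blueprint and the exact discarded-key expectation are assumptions based on the behaviour described in this thread):

```elixir
property "unspecified keys in a map blueprint are discarded when dumped" do
  check all extra_key <- StreamData.string(:alphanumeric, min_length: 1),
            extra_key != "name",
            extra_value <- StreamData.integer() do
    blueprint = Schematic.map(%{"name" => Schematic.str()})
    input = %{"name" => "schematic", extra_key => extra_value}

    assert {:ok, unified} = Schematic.unify(blueprint, input)
    assert {:ok, dumped} = Schematic.dump(blueprint, unified)

    # the unspecified key should not survive the round trip
    refute Map.has_key?(dumped, extra_key)
    assert dumped == %{"name" => "schematic"}
  end
end
```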
Did you try not generating maps with empty blueprints?
i believe i did, but i don’t see it in my notes. i’ll do it again just to be sure
i just tried ensuring that no empty-map blueprints are generated by matching on `%{}` and returning a K-V schematic like `Schematic.map(keys: Schematic.nullable(nil), values: Schematic.nullable(nil))`, but that fails the property very quickly. maybe that approach is too naive?
i'm struggling to come up with a K-V schematic that would match %{} and nothing else, because if it doesn't do that then you end up in the same scenario as before, right? i guess you could do a raw/1 and make it always fail to unify for any value... let me try that
EDIT: ah, right, it just ignores non-conforming keys and lets it unify.
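for reference, roughly what that experiment looked like; the `raw/1` usage and the results shown are assumptions reconstructed from the EDIT above, not verified output:

```elixir
import Schematic

# a keys/values schematic that refuses to unify with anything,
# hoping that only %{} would then unify
never = raw(fn _ -> false end, message: "never unifies")
only_empty = map(keys: never, values: never)

unify(only_empty, %{})
#=> {:ok, %{}}

# non-conforming keys are simply ignored, so this still unifies
# instead of failing, which is why the approach falls apart
unify(only_empty, %{"a" => 1})
#=> {:ok, %{}}
```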
i would just omit the empty blueprint from the generated input space.
there is probably an argument that an empty blueprint should be illegal, or should not unify if there are keys in the input.
maybe i'm not thinking straight at this point in my day, but it seems to me that the property just doesn't hold as soon as you mix oneof/1 with the semantics of map/1, since the latter will drop keys that don't fit the schematic but can still successfully unify
e.g. if we have the list:
```elixir
[%{"a" => 1}, %{"a" => 1, "b" => "foo"}]
```
then we might generate a schematic like:
```elixir
[
  map(keys: str(), values: int()),
  map(keys: str(), values: oneof([str(), int()]))
]
|> oneof()
|> list()
```
which would unify both of the values using the first schematic, with the extra K-V in the second map being ignored, and then the property would be violated
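spelling that out end to end (the dumped value shown is an assumption that follows the reasoning above, not verified output):

```elixir
import Schematic

schematic =
  [
    map(keys: str(), values: int()),
    map(keys: str(), values: oneof([str(), int()]))
  ]
  |> oneof()
  |> list()

input = [%{"a" => 1}, %{"a" => 1, "b" => "foo"}]

{:ok, unified} = unify(schematic, input)
{:ok, dumped} = dump(schematic, unified)

dumped
#=> [%{"a" => 1}, %{"a" => 1}]  -- "b" was dropped, so dumped != input
```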
I might have to noodle on this.
naively, i guess you could reduce over all of the schematics in a oneof and return the result of the unification that maintained the most data in the input? just spit-balling. definitely a performance hit there too
Yeah, when implementing the initial version I thought about how to determine that, and I'm not sure if that really is possible.
Unless you can somehow determine which schematic had the highest cardinality, sort by that, then attempt to unify.
Might be possible, but I'm also about to fall asleep so might be typing gibberish.
I feel like I should read a white paper on type systems.
hm, yeah. we might need a lot more info about the schematic than we currently have.
alternatively, once you call unify for one of the oneof schematics, and it's successful, and it's a map, couldn't we just hold onto that value and wait and see if another schematic in the oneof unifies, and then compare the size of the results via tree traversal? basically just "find the max" where the magnitude function is defined by a tree traversal
i think that whichever one is largest would have used the "most specific" schematic, and would thus be the best choice for unification.
but yeah, it's late. maybe Quinn or Keathley can suggest the appropriate paper(s)
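a rough sketch of that "find the max" idea (the `GreedyOneOf` module and the `size/1` magnitude function are entirely hypothetical, not the library's implementation):

```elixir
defmodule GreedyOneOf do
  @moduledoc "Hypothetical: pick the oneof result that preserved the most data."

  def unify(schematics, input) do
    schematics
    |> Enum.flat_map(fn schematic ->
      case Schematic.unify(schematic, input) do
        {:ok, value} -> [value]
        {:error, _} -> []
      end
    end)
    |> case do
      [] -> {:error, "no schematic unified"}
      results -> {:ok, Enum.max_by(results, &size/1)}
    end
  end

  # the magnitude function: count every key and leaf in the result tree
  defp size(map) when is_map(map) do
    map_size(map) + (map |> Map.values() |> Enum.map(&size/1) |> Enum.sum())
  end

  defp size(list) when is_list(list) do
    list |> Enum.map(&size/1) |> Enum.sum()
  end

  defp size(_leaf), do: 1
end
```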
i just went and implemented this and it does work as expected, and would handle a case like the one i laid out in a previous comment, but property testing can naturally still find the cases that break it.
at this point, i think i'm in favor of removing maps from that first property altogether, and coming up with a different set of properties that holds when maps are involved. unless you think that this property (or something very similar) should hold regardless of the data types involved, in which case we've probably got some reading to do 🤷