Multigraphs via an edge index and edge properties/metadata
An effort to implement #18 as well as #54 for some of my own use cases. Looks like some planning work went into these in the past so not sure if there's a more preferred approach for multigraphs and/or adjacency indexing.
Ran into a performance use case for multigraphs where to minimize enumeration on traversals for I wanted an index to trade space for time.
Current approach
An opt-in flag Graph.new(multigraph: true) with options for using a partitioned adjacency index in reflection APIs (out_edges, edges, in_edges, etc).
By default this option will maintain an adjacency index partitioned by the edge label. This is overrideable with the :partition_by option which accepts and edge and returns a partition. E.g. Graph.new(multigraph: true, partition_by: fn edge -> edge.weight end)
Reflection API options:
:by : a term or list of terms containing the partition keys.
:where: a filter function which accepts an edge and returns a boolean to include or exclude it from the result.
The edge_index is implemented as a nested map %{partition => %{vertex_id => Mapset(edge_keys)}} so the :by option can use map access time to get the set of adjacent edges for one or more partitions.
Edge Properties
For metadata / edge properties this PR changes the edge value from %{label => edge_weight} to
@type edge_properties :: %{
label: label,
weight: edge_weight,
properties: map
}
as well as adding the properties map to the %Edge{} struct.
Todos:
- [x] Option to enable multigraphs via edge indexing
- [x] Override-able indexing function defaulting to
fn %{label: label} -> label endto return the key - [x] Change index function to support partitioning an edge to more than one set
- [x] Traversal APIs with filter predicates to benefit from the indexing
- [x] Support edge properties/metadata
- [ ] More docs
- [ ] CI/CD chores
I've been using this PR inside https://github.com/zblanco/runic for some time now as a way to keep causal runtime edges produced during DAG executions from increasing the dataflow traversal costs. It hasn't need changes and tests pass for 1.18 and might be worth reviewing.
I developed this locally on Elixir 1.18+ which entailed some changes to doctests related to ordering of some results but this leaves older versions broken in CI. Not 100% on a preferred course of action - I would rather find a way to keep older versions compatible so that's what I'm looking into now.
This test discrepancy may be related to OTP updates as it's not related to the actual changes.
1) test sizeof/1 (Graph.UtilsTest)
Error: test/utils_test.exs:8
match (=) failed
code: assert 456 = sizeof(String.duplicate("bar", 128))
left: 456
right: 440
stacktrace:
test/utils_test.exs:10: (test)