Graphs.jl Question on neighbours functions returning a reference

Question - the Graphs.inneighbors, Graphs.outneighbors and consequently Graphs.neighbors specify that they return a reference to the current graph's internal structures, rather than a copy. Having fallen afoul of this, I was wondering why this is the case - as far as I can tell this would be trivial to have return a copy instead, so I'm assuming it's intentional, but the only use case I can see would be modifying a graph in a way that would presumably be better handled by add_edge!() and rem_edge!()?

Thanks for your time!

Nov 20 '23 15:11 jward0

The short answer is that copies are expensive because they allocate memory. So indeed you should not modify the output of neighbors and co unless you know what you are doing, because these functions often reference the actual fields of the graph object to save time.

Nov 20 '23 17:11 gdalle

Oh, of course - that should've occurred to me! Thanks for humouring my question.

Nov 21 '23 09:11 jward0

Anytime!

Nov 21 '23 09:11 gdalle

To add to this, this is indeed, because the adjancency structure used by SimpleGraph and SimpleDiGraph are indeed already the allocated vectors. In graph algorithms one often does something like

for u in vertices(g)
   for v in outneighbors(g, u)
      [...]
   end
end

which would be rather slow.

we could create some wrapper structure FrozenVector (similar to frozenset in Python) that wraps a vector but does not allow to modify the vector. One just has to be a bit careful so that propagating @inbounds works, as otherwise we will lose a lot of performance. There is at least one package that already creates something like that, but we might as well implement it ourselves in order to keep dependencies low: https://github.com/bkamins/ReadOnlyArrays.jl

Nov 21 '23 11:11 simonschoelly

Unrelated but does so much of Graphs.jl perf hinge on @inbounds? I genuinely have no clue

Nov 21 '23 12:11 gdalle

Unrelated but does so much of Graphs.jl perf hinge on @inbounds? I genuinely have no clue

I think in thight loops it can have quite an impact, on one hand we have to check the array bounds for each iteration, and on the other hand, this also prevents the compiler from doing more optimizations such as using simd instructions.

Nov 22 '23 02:11 simonschoelly

I made a PR with a suggestion for such a FrozenVector: #317

Nov 22 '23 02:11 simonschoelly

While I don't disagree with the principle of making neighbors immutable, I wonder whether this is necessary. Users may be a bit surprised by the type returned, so I would rather keep the existing behavior if possible. @jward0 in what kind of situation did this trick you?

Nov 22 '23 08:11 gdalle

I'm working with graphs containing "intermediate" nodes between "significant" ones, sometimes I need to consider only "significant" nodes when considering neighbours, so after calling neighbors() I then mutated it to step through any "intermediate" nodes in the vector to find the "significant" ones they led to. Definitely not a common use case, trivially easy to fix in my code once I realised what was going on, and the problem would've been entirely avoided if I'd read the docs a bit better!

Nov 22 '23 09:11 jward0