sanitize_graph function
I tried a minimal network with 5 edges joined at two nodes:

edges <- structure(list(from_id = c("vqO3WP", "SpeI4O", "p600f2", "SpeI4O",
"ZmNi0f"), to_id = c("SpeI4O", "p600f2", "b3oq5s", "gkZcTf",
"p600f2"), edge_id = 1:5, d = c(1, 1, 1, 1, 1)), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))
dodgr::dodgr_contract_graph(edges)
I get an empty contraction, but I'm expecting 3 pairings
-
gkZcTf -> vq03WP -
SpeI4O -> p600f2 -
ZmNi0f -> b3oq5s
Am I still thinking about this incorrectly?
The short answer is that that graph can't be contracted in the dodgr sense. There is nothing that lies between the termini and the junction vertices. This results in the contracted graph being identical to the original, so the dodgr_contract_graph$edge_map object is empty. But there's nevertheless something NQR here ... more to come shortly
Make current top-left (gkZcTf) redundant by inserting extra from("gcKcTf") - to("a")
> edges <- structure(list(from_id = c("vqO3WP", "SpeI4O", "p600f2", "SpeI4O",
"ZmNi0f", "gkZcTf"),
to_id = c("SpeI4O", "p600f2", "b3oq5s", "gkZcTf",
"p600f2", "a"),
edge_id = 1:6,
d = c(1, 1, 1, 1, 1, 1)),
row.names = c(NA, -6L),
class = c("tbl_df", "tbl", "data.frame"))
> dodgr::dodgr_contract_graph(edges)
$graph
# A tibble: 5 x 4
from_id to_id edge_id d
<chr> <chr> <chr> <dbl>
1 SpeI4O a a123461 2
2 vqO3WP SpeI4O 1 1
3 SpeI4O p600f2 2 1
4 p600f2 b3oq5s 3 1
5 ZmNi0f p600f2 5 1
$edge_map
edge_new edge_old
1 a123461 4
2 a123461 6
and the edge_old column refers to the original graph:
> edges
# A tibble: 6 x 4
from_id to_id edge_id d
<chr> <chr> <int> <dbl>
1 vqO3WP SpeI4O 1 1
2 SpeI4O p600f2 2 1
3 p600f2 b3oq5s 3 1
4 SpeI4O gkZcTf 4 1
5 ZmNi0f p600f2 5 1
6 gkZcTf a 6 1
which tells us that former edge_ids 4 & 6 have been contracted to new, single edge a123461. So everything is alright after all. Does that make sense? And if you're playing around here, some suggestions would be really appreciated in terms of how I might make this whole process more intelligible. The fact that the result of dodgr_contract_graph() requires reference to the original submitted object to interpret it has always troubled me, and I usually end up making a meta-structure that contains the original plus the output of dodgr_contract_graph(). BUT these graphs can be huge beasts and that duplication can come at a cost, so I've been loath to automate that. Any other thoughts, ideas, please!
I think I see, it means I need to find those simple cases first and add them to the contracted set.
It's worth it because all my dodgy (not dodgr) ARC code will disappear when we get this.
It just occurred to me to get into the habit of using igraph to plot these minimal cases, so maybe I bring scgraph into it and work up some direct igraph plots too. I had to think pretty hard about the shapes in the graph you created, but it was helpful!
Oh, while we're at it - where does the insertion of implicit vertices live here? Would dodgr_contract_graph do that or is it another (precursor?) step?
(as discussed earlier: https://github.com/hypertidy/silicate/issues/60#issuecomment-391221735)
That last point is really important: The way I see it, there would / should be a general need for an independent sanitize_graph() function to insert implicit vertices. This shouldn't live in dodgr_contract_graph(), because it might be just an important to insert implicit vertices in a full (non-contracted) graph. Equally, the function shouldn't be directly called with any kind of ..._to_dodgr() function (such as the current weight_streetnet()), because maybe such insertion will not be desired. Thus my current thought is having to have it as an optional function that can be run if desired (along the lines of osmdata::poly2line()).
Another argument against auto-insertion is that this could be a pretty computationally expensive procedure? It requires calculating pair-wise intersections between all edges. Can obviously be run in parallel, but may consume notable processing time for large graphs. As for your ultimate question of where this function lives? I don't know. I'd be happy to put it as a stand-alone in dodgr, but maybe it would be more generally useful in silicate?
Absolutely agree.
Whatever is easiest, but we do need to have though about where things go - I'm a bit wary of requiring igraph (for example).
I'm happy for it to go in silicate! It seems this sanitize_graph could live in its own package and be generally useable elsewhere. ( I also think graph contraction perhaps belongs in another separate package, perhaps they can live together?).
"Absolutely agree." The question is the underlying graph representation fed to sanitize_graph. And I guess and hope that the answer there is SC. So maybe for the moment I'll park it here, rename this issue accordingly, write a stand-alone C++ routine to do the job (via RcppParallel), and then once the whole osmdata_sc -> dodgr workflow has been nutted out, and silicate has been CRAN-ified, we can strategise subsequent packages, whatever.