`as_graphnel` duplicates undirected self loops
Describe the bug
Conversion of an igraph object, with undirected self-loops, to a graphNEL object duplicates the self loops. I think that this is a consequence of igraph::as_adj_list, which is used internally to generate the graphNEL object.
To reproduce
# Install packages to run the below example if not installed
if (!requireNamespace("igraph", quietly = TRUE)) install.packages("igraph")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
if (!requireNamespace("graph", quietly = TRUE)) BiocManager::install("graph")
# create a very simple adjacency matrix
mat <- matrix(c(1, 0.5, 0.5, 0), nrow = 2)
dimnames(mat) <- list(c("A", "B"), c("A", "B"))
# In this step the adjacency matrix is converted into an undirected igraph object
igr <- igraph::graph_from_adjacency_matrix(mat, mode = "undirected", weighted = T)
# Network as expected -----------------------------------------------------
igraph::E(igr)
#> + 2/2 edges from a1b7a6c (vertex names):
#> [1] A--A A--B
# Unexpected behaviour ----------------------------------------------------
grNEL <- igraph::as_graphnel(igr)
graph::edgeL(grNEL)
#> $A
#> $A$edges
#> [1] 1 1 2
#>
#>
#> $B
#> $B$edges
#> [1] 1
## Note that in the above output A has two self loops
igr_from_grNEL <- igraph::graph_from_graphnel(grNEL)
igraph::E(igr_from_grNEL)
#> + 3/3 edges from b242acf (vertex names):
#> [1] A--B A--A A--A
## Now the igraph network also has two self loops for A, which is not equal to
## the inital network
Version information
R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)
Matrix products: default
locale:
[1] LC_COLLATE=English_United Kingdom.utf8
[2] LC_CTYPE=English_United Kingdom.utf8
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C
[5] LC_TIME=English_United Kingdom.utf8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] devtools_2.4.3 usethis_2.1.6
loaded via a namespace (and not attached):
[1] magrittr_2.0.3 pkgload_1.2.4 R6_2.5.1 rlang_1.0.2
[5] fastmap_1.1.0 tools_4.2.0 pkgbuild_1.3.1 sessioninfo_1.2.2
[9] cli_3.3.0 withr_2.5.0 ellipsis_0.3.2 remotes_2.4.2
[13] rprojroot_2.0.3 lifecycle_1.0.1 crayon_1.5.1 brio_1.1.3
[17] processx_3.6.0 purrr_0.3.4 callr_3.7.0 fs_1.5.2
[21] ps_1.7.0 testthat_3.1.4 memoise_2.0.1 glue_1.6.2
[25] cachem_1.0.6 compiler_4.2.0 desc_1.4.1 prettyunits_1.1.1
We do have loops and multiple in C/igraph 0.9, so perhaps these can be exposed to R/igraph, which makes fixing this easier? https://igraph.org/c/html/0.9.10/igraph-Data-structures.html#igraph_adjlist_init
I've looked into this a bit. igraph::as_adj_list() calls C_R_igraph_get_adjlist(), which is a hand-written C function that essentially creates an R list of vectors, and then fills each vector with the output of igraph_neighbors(). So, unfortunately it has nothing to do with igraph_adjlist_t, and changing the invariant of as_adj_list() (namely that its vectors are consistent with repeated igraph_neighbors() calls) is probably a breaking change. We need to fix this in igraph::as_graphnel() instead.
Scheduling this for a future hackathon.
One more reason to add detailed loop / multiedge control to igraph_neighbors(), as I've been arguing for for a long time. https://github.com/igraph/igraph/issues/2065
@robert-koetsier Does graphNEL support self-loops at all? In the documentation it is stated that multi-edges are not supported:
Although this representation can support multi-edges, such support is not implemented and instances of graphNEL are assumed to be simple graphs with at most one edge between any pair of nodes.
There is no explicit statement made about self-loops.
I notice that it is possible to create graphNEL objects with multi-edges. There seems to be no check for it. Thus I assume that the above statement means that one cannot expect results to be correct when multi-edges are present. Is it perhaps the same with self-loops? Let's try:
> gn<-graphNEL(nodes=c('A','B'), edgeL=list(A=c('A', 'B'), B=c('A')), edgemode='undirected')
> graph::degree(gn)
A B
2 1
Oops! Vertex A should have degree 3, not degree 2. Since the result is incorrect, my conclusion is that graphNEL does not support self-loops either, or that the use case of self-loops was never considered (and thus graphs with loops can't be reliable handled by the package).
@ntamas Would it not make more sense to refuse to convert non-simple igraph graphs to graphNEL?
Since the result is incorrect, my conclusion is that graphNEL does not support self-loops either
It's not necessarily incorrect, it just treats loop edges as contributing only 1 to the degree of the node and not 2. This is not of igraph's concern. All that matters is that in theory it's possible to create such a graph in graphNEL so we should not explicitly prevent the user from converting an igraph graph with loop edges into a graphNEL object.
(We should probably not even do the validation on our end. If graphNEL cannot support certain types of graphs, like graphs with multiple edges between the same pair of nodes, it is not our job to validate this; it should be done in the graphNEL function).
I would like to at least get confirmation that this is intended, so we can be clear about what the correct representation of self-loops is in graphNEL objects. I asked here: https://support.bioconductor.org/p/9148438/
Counting self-loops only once would be a somewhat unusual definition of "degree" that violates identities such as the handshaking lemma.
The issue is now resolved by #602 (which also added a check that ensures that there are no multiple edges in the graph being converted).