rigraph icon indicating copy to clipboard operation
rigraph copied to clipboard

`as_graphnel` duplicates undirected self loops

Open robert-koetsier opened this issue 3 years ago • 1 comments

Describe the bug Conversion of an igraph object, with undirected self-loops, to a graphNEL object duplicates the self loops. I think that this is a consequence of igraph::as_adj_list, which is used internally to generate the graphNEL object.

To reproduce

# Install packages to run the below example if not installed
if (!requireNamespace("igraph", quietly = TRUE)) install.packages("igraph")
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager")
if (!requireNamespace("graph", quietly = TRUE)) BiocManager::install("graph")

# create a very simple adjacency matrix
mat <- matrix(c(1, 0.5, 0.5, 0), nrow = 2)
dimnames(mat) <- list(c("A", "B"), c("A", "B"))

# In this step the adjacency matrix is converted into an undirected igraph object
igr <- igraph::graph_from_adjacency_matrix(mat, mode = "undirected", weighted = T)


# Network as expected -----------------------------------------------------

igraph::E(igr)
#> + 2/2 edges from a1b7a6c (vertex names):
#>    [1] A--A A--B


# Unexpected behaviour ----------------------------------------------------

grNEL <- igraph::as_graphnel(igr) 
graph::edgeL(grNEL)
#> $A
#> $A$edges
#> [1] 1 1 2
#> 
#> 
#> $B
#> $B$edges
#> [1] 1

## Note that in the above output A has two self loops


igr_from_grNEL <- igraph::graph_from_graphnel(grNEL)
igraph::E(igr_from_grNEL)
#> + 3/3 edges from b242acf (vertex names):
#> [1] A--B A--A A--A

## Now the igraph network also has two self loops for A, which is not equal to 
##    the inital network

Version information

R version 4.2.0 (2022-04-22 ucrt)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19044)

Matrix products: default

locale:
[1] LC_COLLATE=English_United Kingdom.utf8 
[2] LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] devtools_2.4.3 usethis_2.1.6 

loaded via a namespace (and not attached):
 [1] magrittr_2.0.3    pkgload_1.2.4     R6_2.5.1          rlang_1.0.2      
 [5] fastmap_1.1.0     tools_4.2.0       pkgbuild_1.3.1    sessioninfo_1.2.2
 [9] cli_3.3.0         withr_2.5.0       ellipsis_0.3.2    remotes_2.4.2    
[13] rprojroot_2.0.3   lifecycle_1.0.1   crayon_1.5.1      brio_1.1.3       
[17] processx_3.6.0    purrr_0.3.4       callr_3.7.0       fs_1.5.2         
[21] ps_1.7.0          testthat_3.1.4    memoise_2.0.1     glue_1.6.2       
[25] cachem_1.0.6      compiler_4.2.0    desc_1.4.1        prettyunits_1.1.1

robert-koetsier avatar Sep 21 '22 14:09 robert-koetsier

We do have loops and multiple in C/igraph 0.9, so perhaps these can be exposed to R/igraph, which makes fixing this easier? https://igraph.org/c/html/0.9.10/igraph-Data-structures.html#igraph_adjlist_init

szhorvat avatar Sep 21 '22 16:09 szhorvat

I've looked into this a bit. igraph::as_adj_list() calls C_R_igraph_get_adjlist(), which is a hand-written C function that essentially creates an R list of vectors, and then fills each vector with the output of igraph_neighbors(). So, unfortunately it has nothing to do with igraph_adjlist_t, and changing the invariant of as_adj_list() (namely that its vectors are consistent with repeated igraph_neighbors() calls) is probably a breaking change. We need to fix this in igraph::as_graphnel() instead.

Scheduling this for a future hackathon.

ntamas avatar Nov 10 '22 22:11 ntamas

One more reason to add detailed loop / multiedge control to igraph_neighbors(), as I've been arguing for for a long time. https://github.com/igraph/igraph/issues/2065

szhorvat avatar Nov 28 '22 09:11 szhorvat

@robert-koetsier Does graphNEL support self-loops at all? In the documentation it is stated that multi-edges are not supported:

Although this representation can support multi-edges, such support is not implemented and instances of graphNEL are assumed to be simple graphs with at most one edge between any pair of nodes.

There is no explicit statement made about self-loops.

I notice that it is possible to create graphNEL objects with multi-edges. There seems to be no check for it. Thus I assume that the above statement means that one cannot expect results to be correct when multi-edges are present. Is it perhaps the same with self-loops? Let's try:

> gn<-graphNEL(nodes=c('A','B'), edgeL=list(A=c('A', 'B'), B=c('A')), edgemode='undirected')
> graph::degree(gn)
A B 
2 1 

Oops! Vertex A should have degree 3, not degree 2. Since the result is incorrect, my conclusion is that graphNEL does not support self-loops either, or that the use case of self-loops was never considered (and thus graphs with loops can't be reliable handled by the package).

@ntamas Would it not make more sense to refuse to convert non-simple igraph graphs to graphNEL?

szhorvat avatar Dec 15 '22 16:12 szhorvat

Since the result is incorrect, my conclusion is that graphNEL does not support self-loops either

It's not necessarily incorrect, it just treats loop edges as contributing only 1 to the degree of the node and not 2. This is not of igraph's concern. All that matters is that in theory it's possible to create such a graph in graphNEL so we should not explicitly prevent the user from converting an igraph graph with loop edges into a graphNEL object.

(We should probably not even do the validation on our end. If graphNEL cannot support certain types of graphs, like graphs with multiple edges between the same pair of nodes, it is not our job to validate this; it should be done in the graphNEL function).

ntamas avatar Dec 15 '22 17:12 ntamas

I would like to at least get confirmation that this is intended, so we can be clear about what the correct representation of self-loops is in graphNEL objects. I asked here: https://support.bioconductor.org/p/9148438/

Counting self-loops only once would be a somewhat unusual definition of "degree" that violates identities such as the handshaking lemma.

szhorvat avatar Dec 15 '22 17:12 szhorvat

The issue is now resolved by #602 (which also added a check that ensures that there are no multiple edges in the graph being converted).

ntamas avatar Dec 16 '22 13:12 ntamas