Graphs.jl icon indicating copy to clipboard operation
Graphs.jl copied to clipboard

[BUG] Eigenvector centrality for disconnected graphs

Open greimel opened this issue 1 year ago • 3 comments

Description of bug

In a disconnected graph, eigenvector_centrality sometimes delivers random results.

g = SimpleGraph(6)
add_edge!.(Ref(g), 1:3, (1:3)')
add_edge!.(Ref(g), 4:6, (4:6)')

eigenvector_centrality(g)

Run this multiple times, the numbers are always different.

Strictly speaking,

Eigenvector centrality is not well-defined for disconnected graphs since the centrality scores of the individual components are independent of each other

source on stackoverflow

Here's what Matlab does

[...] If there are several disconnected components, then the algorithm computes the eigenvector centrality individually for each component, then scales the scores according to the percentage of graph nodes in that component. The centrality score of disconnected nodes is 1/numnodes(G).

https://de.mathworks.com/help/matlab/ref/graph.centrality.html

Potential fixes

  • add a warning to the documentation
  • check if a graph is connected and warn/error if it isn't
  • emulate Matlab's behavior

greimel avatar Oct 01 '24 08:10 greimel

Hi, thanks for catching this! Which fix would you feel like implementing?

gdalle avatar Oct 08 '24 17:10 gdalle

I think the randomness comes from the solver ArnoldiMethod.jl that we are using here. Even if the graph is connected we can get slightly different results between each run:

julia> eigenvector_centrality(path_graph(2))
2-element Vector{Float64}:
 0.7071067811865475
 0.7071067811865475

julia> eigenvector_centrality(path_graph(2))
2-element Vector{Float64}:
 0.7071067811865476
 0.7071067811865476

But if the graph is disconnected the eigenvalues that we get from ArnoldiMethod.jl seem to be all over the place. Its been years since I had to deal with eigenvalues and vectors so I am not sure why this is the case. This is a general issue in JuliaGraphs that we unfortunately did not solve yet. For example we have also issues when we use spectral methods in GraphPlot.jl for disconnected graphs.

I like the idea of emulating matlabs behavior though - so perhaps for now we can just implement that and document it well.

We also should think how this functions works in case of only weakly connected directed graphs. Here is an example that shows that we also have issues in that case:

julia> gd = DiGraph([Edge(1,2)])
{2, 1} directed simple Int64 graph

julia> eigenvector_centrality(gd)
2-element Vector{Float64}:
 0.9502791434299723
 0.3113993409787471

julia> eigenvector_centrality(gd)
2-element Vector{Float64}:
 0.6899770074048572
 0.7238312850745245

simonschoelly avatar Nov 20 '24 12:11 simonschoelly

Hey! Just wanted to share a few thoughts around the use of Arnoldi iteration here.

Arnoldi Iteration is inherently an iterative method. In ArnoldiMethod.jl, if a starting vector isn’t provided, it defaults to a randomly sampled one. So if we're aiming for consistent results in eigenvector_centrality, it's important to manually define and pass a start vector to partialschur.

Also, by design, Arnoldi is great for finding extremal eigenvalues that are non-zero. But in the case of something like DiGraph([Edge(1,2)]), the adjacency matrix ends up being nilpotent, which means all eigenvalues are zero—something Arnoldi can’t really work with. One idea could be to check if the DiGraph contains any cycles and return eigenvalue = 0 if not. Though, that still wouldn’t solve situations where the largest eigenvalue is just very small (like 1e-5), which can also cause issues.

Interestingly, I noticed that The Matlab doc, mentions eigenvector_centrality is only defined for undirected graphs—maybe something for us to keep in mind as well.

Syuizen avatar Apr 09 '25 14:04 Syuizen