# Vertex removal in LightGraphs

Testing abstractions, their limits and leaks

In various graph-related algorithms, a graph is modified through successive operations, merging, creating and deleting vertices. That’s the case for the Blossom algorithm finding a best matching in a graph and using contractions of nodes. In such cases, it can be useful to remove only the vertex being contracted, and maintain the number of all other vertices.

LightGraphs.jl offers a set of abstractions, types and algorithms to get started with graphs. The claim of the abstraction is simple: whatever the underlying structure representing your graph, if it implements the `AbstractGraph` interface, it can be used out of the box with all algorithms built on LightGraphs.jl. The main concrete type presented by LightGraphs.jl is `SimpleGraph` and its directed counterpart `SimpleDiGraph`, only storing edges as adjacency lists, meaning vertices are just the integers from 1 to the length of the list. This means that in a graph with 6 vertices, deleting vertex 4 will re-label vertex 6 as 4. Hopefully, the interface should allow us to build a graph type on top of another graph, re-implementing only vertex removal.

## A simple vertex-safe implementation

First things first, we will build it as a struct, using LightGraphs:

``````import LightGraphs
const LG = LightGraphs

struct VSafeGraph{T, G<:LG.AbstractGraph{T}, V<:AbstractVector{Int}} <: LG.AbstractGraph{T}
g::G
deleted_vertices::V
VSafeGraph(g::G, v::V) where {T, G<:LG.AbstractGraph{T}, V<:AbstractVector{Int}} = new{T, G, V}(g, v)
end

VSafeGraph(g::G) where {G<:LG.AbstractGraph} = VSafeGraph(g, Vector{Int}())
VSafeGraph(nv::Integer) = VSafeGraph(LG.SimpleGraph(nv))``````

We added simple default constructors for convenience. The structure holds two elements:

• An inner abstract graph `g`
• A list of vertices already deleted: `deleted_vertices`.

The interface can now be implemented for our type, starting with the trivial parts:

``````LG.edges(g::VSafeGraph) = LG.edges(g.g)
LG.edgetype(g::VSafeGraph) = LG.edgetype(g.g)

LG.is_directed(g::VSafeGraph) = LG.is_directed(g.g)
LG.is_directed(::Type{<:VSafeGraph{T,G}}) where {T,G} = LG.is_directed(G)

LG.ne(g::VSafeGraph) = LG.ne(g.g)
LG.nv(g::VSafeGraph) = LG.nv(g.g) - length(g.deleted_vertices)
LG.vertices(g::VSafeGraph) = (v for v in LG.vertices(g.g) if !(v in g.deleted_vertices))

LG.outneighbors(g::VSafeGraph, v) = LG.outneighbors(g.g, v)
LG.inneighbors(g::VSafeGraph, v) = LG.inneighbors(g.g, v)
LG.has_vertex(g::VSafeGraph, v) = LG.has_vertex(g.g, v) && !(v in g.deleted_vertices)

LG.has_edge(g::VSafeGraph, e) = LG.has_edge(g.g, e)

LG.add_vertex!(g::VSafeGraph) = LG.add_vertex!(g.g)

LG.rem_edge!(g::VSafeGraph, v1, v2) = LG.rem_edge!(g.g, v1, v2)

Base.copy(g::VSafeGraph) = VSafeGraph(copy(g.g), copy(g.deleteed_vertices))``````

For most of these, we only re-call the method on the inner graph type. Only for `LG.nv`, which computes the number of vertices in the inner graph, minus the number of vertices in our removed list. Now the tricky parts, adding an edge and removing a vertex, which require a bit more verifications:

``````function LG.add_edge!(g::VSafeGraph, v1, v2)
if !LG.has_vertex(g, v1) || !LG.has_vertex(g, v2)
return false
end
LG.add_edge!(g.g, v1, v2)
end

function LG.rem_vertex!(g::VSafeGraph, v1)
if !LG.has_vertex(g, v1) || v1 in g.deleted_vertices
return false
end
for v2 in LG.outneighbors(g, v1)
LG.rem_edge!(g, v1, v2)
end
for v2 in LG.inneighbors(g, v1)
LG.rem_edge!(g, v2, v1)
end
push!(g.deleted_vertices, v1)
return true
end``````

Instead of removing the vertex `v1` from the inner graph, the function removes all edges pointing to and from `v1`, and then adds it to the removed list.

## Specific and generic tests

So far so good, we can add some basic tests to check our type behaves as expected:

``````@testset "Graph construction and basic interface" begin
nv = 20
g1 = VSafeGraph(nv)
@test LG.nv(g1) == nv
@test LG.nv(g1.g) == nv

g2_inner = LG.CompleteGraph(nv)
g2 = VSafeGraph(g2_inner)
@test LG.nv(g2) == LG.nv(g2_inner)
@test LG.ne(g2) == LG.ne(g2_inner)

@test all(sort(collect(LG.vertices(g2))) .== sort(collect(LG.vertices(g2_inner))))

g3 = VSafeGraph(LG.CompleteDiGraph(30))
@test LG.is_directed(g3)
@test !LG.is_directed(g2)
end

@testset "Vertex deletion" begin
Random.seed!(33)
nv = 45
inner = LG.CompleteGraph(nv)
g = VSafeGraph(inner)
@test LG.ne(inner) == LG.ne(g)
@test LG.nv(inner) == LG.nv(g)
nrm = 0
for _ in 1:15
removed_ok = LG.rem_vertex!(g, rand(1:nv))
if !removed_ok
continue
end
nrm += 1
@test LG.nv(inner) == nv
@test LG.nv(g) == nv - nrm
@test length(g.deleted_vertices) == nrm

@test LG.ne(inner) == LG.ne(g)
end
end``````

So far so good. Now, with the promise of generic graphs and the AbstractGraph interface, we should be able to use any algorithm in LightGraphs.jl, let us try to compute a page rank and a Kruskal minimum spanning tree:

``````nv = 45
inner = LG.CompleteGraph(nv)
g = VSafeGraph(inner)
removed_ok = LG.rem_vertex!(g, rand(1:nv))
@test removed_ok
# LG broken here
@test_throws BoundsError LG.pagerank(g)
@test_throws BoundsError LG.kruskal_mst(g)``````

Yikes, what’s happening here? Many parts of LightGraphs.jl use vertices computed from `vertices(g)` as indices for structures indexed by them. So if you remove vertex 4 in a 6-vertex graph, vertices will be `{1,2,3,5,6}`, and the rank algorithm will try to access the 6th rank, even though only 5 exist.

## Fixes and proposal

It would be too bad to throw the interface altogether, but we need to do something for the broken behaviour. The underlying assumption here is that vertices behave like indices for anything vertex-related. So the way we implement this interface for `VSafeGraph` is correct, but the implicit contract is not, the way it is used in algorithms such as pagerank and Kruskal leak the underlying implementation for `SimpleGraph`: a contiguous list of integers from 1 to the number of vertices. It reminds me of this great talk on paying attention to the contract of an interface in Go, the type is telling you what to expect in and out, but not how it is supposed or will be used.

The first fix is to make `vertices` return `1:nv(g)` for VSafeGraph, but if you think about it, it means it needs to do such with any graph type, which means the `vertices` function is redundant with other functions of the interface and should not be mandatory. The other option is to fix breaking code to really use the interface signalled and documented and not the leaked implementation.

We still have some good news though:

• Changing the code here is strictly non-breaking, since we would just remove the assumption that vertices are indices.
• If we want to keep this assumption for some pieces of code, it means these pieces are not generic but specialized, something we can handle well using either dispatch on types or traits, which LightGraphs.jl already does. There is a `IsDirected` trait associated with the fact that a graph is directed or not, there could also be a `HasContiguousVertices` trait signalling whether this assumption is validated for a type.

## Edit: refined proposal

Following some discussions with fellow LightGraphs.jl developers and users, a softer transition could be:

1. Add the functions `vertex_indices(g)` and `vertex_values(g)` to the interface, `vertex_values` could default to `vertex_indices`, which could itself default on `1:nv(g)`.
2. Deprecate `vertices(g)`, with a fallback to `vertex_indices`.
3. Replace all calls to `vertex` with either `vertex_indices` or `vertex_values` depending on which makes sense for the use case.

This change is non-breaking and only deprecating `vertices`, making the interface more explicit. By keeping the two functions, we avoid having to use `enumerate(vertices_values(g))` every time we need indices.

## Edit 2: Corrections to the functions

I have corrected various functions following Pankaj’s much needed Pull Request on the corresponding repository, thanks!

## Edit 3

Seth Bromberger spotted an error in my assumptions, Swap-and-pop is used for vertex removal, so the last vertex will take the place of the removed one in the re-labelling.

##### Mathieu Besançon
###### Researcher in mathematical optimization

Mathematical optimization, scientific programming and related.