Efficient Graph Search

Note: This was originally published as an Observable notebook.

Many textbook implementations of breadth-first search contain a flaw: they are needlessly slow.

Terence Kelly's Drill Bits article describes the flaw and how to fix it.

It’s fun to see how this plays out visually, so here are a few pictures of the problem and how it can be solved.

First, let's take a look at a textbook BFS implementation. The code below is adapted from Algorithms by Dasgupta, Papadimitriou, and Vazirani:

function bfs(G, s) {
  // For all vertices `v` reachable from `s`, dist[v] is set
  // to the distance from `s` to `v`. 
  const V = G.numVertices;                // Assume the number of vertices is known
  let dist = new Array(V).fill(Infinity); // Init distances to ∞
  dist[s] = 0;                            // Init the source distance to zero
  let Q = [s];                            // Init a queue containing just s 
  while (Q.length > 0) {                  // While the queue is nonempty:
    let u = Q.shift();                    //   Pop the next vertex from the queue
    for (let v of G.neighbors[u]) {       //   For all edges (u, v):
      if (dist[v] == Infinity) {          //     If we haven't seen v before:
        dist[v] = dist[u] + 1;            //       Set its distance
        Q.push(v);                        //       Push v onto the queue    
      }
    }
  }
  return dist;
}

This algorithm is correct, but in some cases does extra work proportional to the number of vertices squared.

We can see the issue by looking at how this algorithm processes a complete graph, which has an edge between every pair of vertices.

BFS on a complete graph starts at a source vertex and immediately visits all of the others, since each is just one hop away.

Once those vertices has been visited, all of the useful work is done — we now know the distance to every vertex from the source. But the textbook BFS is far from finished!

Slide the slider below to watch the textbook algorithm play out.

The red vertex is the one being processed, the outlined vertex is a neighbor, and vertices turn black once they've been found.

By the th step all of the vertices have been found and processed, at which point we're done.

But because the neighbors of the first vertex were all enqueued, the algorithm proceeds to visit them, along with their neighbors, which results in the inspection of extra edges.

Fortunately, we can fix this with just a few lines of code:

function bfsImproved(G, s) {
  const V = G.numVertices;                // Assume the number of vertices is known
  let dist = new Array(V).fill(Infinity); // Init distances to ∞
  dist[s] = 0;                            // Init the source distance to zero
  let Q = [s];                            // Init a queue containing just s 
  let f = 1;                              // 🆕 Initialize the number of seen vertices
  while (Q.length > 0) {                  // While the queue is not empty:
    let u = Q.shift();                    //   Pop the next vertex from the queue
    for (let v of G.neighbors[u]) {       //   For all edges (u, v):
      if (dist[v] == Infinity) {          //     If we haven't seen v before:
        dist[v] = dist[u] + 1;            //       Set its distance
        if (++f == V) return dist;        // 🆕    Return if we've seen all V vertices
        Q.push(v);                        //       Push v onto the queue    
      }
    }
  }
  return dist;
}

The added code tracks the number of vertices that have been found and finalized, and returns as soon as we've seen them all..

In the case of our complete graph this simple check takes the total number of steps down from to :

We can get a visceral sense for the amount of extra work by laying out all of the steps side-by-side.

The improved algorithm is finished by step :

The textbook algorithm continues on, churning through more edges before returning:

Hopefully this primes your intuition for the problem and its solution. If you’d like to learn more, go read Terence Kelly's article.