Graphs

A graph is a structure composed of vertices connected by edges. Edges in a graph can be undirected or directed. A directed edge between vertices u and v allows you to travel from u to v, but not from v back to u. To travel directly from v back to u the graph would require a second directed edge running from v to u. Undirected edges can be traversed in either direction.

The picture below shows an example of a directed graph.

In most applications of graphs the vertices have identifiers - these can be numbers, letters, or names. Edges can then be described by listing the identifiers of the two vertices the edge connects. In some applications edges can also have weights, which are usually numerical values attached to the edges.

One application of graphs is in representing road networks. The picture below shows the interstate highway system in the United States.

This road network can be represented as a graph where the vertices are the cities where interstate highways terminate or intersect, and the edges are the road segments that connect vertices. The edges in this graph would be undirected, since road segments can be traversed in either direction. Most commonly the edges would be given weights equal to the length of the road segment that the edge represents.

I have prepared a data file that represents a significant portion of the interstate highway system. The file contains a list of edges, where each edge appears as a pair of city names and a distance in miles between them.

Representing a graph in a Java program

A graph is relatively easy to represent in a Java program. We can create simple objects that represent vertices and the edges between them, and then represent a graph as a collection of vertices and edges.

I have written a simple example program that can load and work with the graph that represents the interstate highway system. The example program contains the code for three simple classes.

The first class is a Vertex class, which represents a city and the road segments that run outward from that city.

public class Vertex {
    private String city;
    private ArrayList<Edge> edges;
    
    public Vertex(String name) {
        this.city = name;
        edges = new ArrayList<Edge>();
    }
    
    public void addEdge(Vertex destination,int weight) {
        Edge newEdge = new Edge(destination,weight);
        edges.add(newEdge);
    }
    
    public String getName() { return city; }
    
    public Edge[] getEdges() {
        Edge[] result = new Edge[edges.size()];
        return edges.toArray(result);
    }
}

The second class is a simple Edge class that represents edges. In this application Edges are directed. Since Edges are stored in Vertices, we will always know what city the Edge comes from. The only thing the Edge itself needs to represent is the Vertex the Edge goes to and the length of the road segment that this Edge represents.

public class Edge {
    private Vertex destination;
    private int weight;
    
    public Edge(Vertex destination,int weight) {
        this.destination = destination;
        this.weight = weight;
    }
    
    public Vertex getDestination() { return destination; }
    public int getWeight() { return weight; }
}

Finally, the graph is represented as a collection of Vertices. Since we will very often need to look up Vertices by their city names, the Vertices are actually stored in a Map that uses the city names as keys and the Vertices themselves as values.

public class Graph {

    private TreeMap<String, Vertex> vertices;

    public Graph() {
        vertices = new TreeMap<String, Vertex>();
    }

    public void readFrom(String fileName) {
        Scanner input = null;
        try {
            input = new Scanner(new File(fileName));
        } catch (Exception ex) {
            ex.printStackTrace();
            System.exit(-1);
        }

        while (input.hasNext()) {
            String from = input.next();
            String to = input.next();
            int distance = input.nextInt();

            Vertex fromV = null;
            if (!vertices.containsKey(from)) {
                fromV = new Vertex(from);
                vertices.put(from, fromV);
            } else {
                fromV = vertices.get(from);
            }

            Vertex toV = null;
            if (!vertices.containsKey(to)) {
                toV = new Vertex(to);
                vertices.put(to, toV);
            } else {
                toV = vertices.get(to);
            }

            fromV.addEdge(toV, distance);
            toV.addEdge(fromV, distance);
        }
    }

    public Edge[] getEdgesFrom(String name) {
        Vertex v = vertices.get(name);
        if (v != null) {
            return v.getEdges();
        } else {
            return new Edge[0];
        }
    }

    public static void main(String[] args) {
        Graph G = new Graph();
        G.readFrom("highways.txt");

        Scanner input = new Scanner(System.in);
        System.out.print("Enter a city name or quit to quit: ");
        String name = input.next();
        while (!name.equals("quit")) {
            Edge[] testEdges = G.getEdgesFrom(name);

            System.out.println("You can get from " + name + " to:");
            for (Edge e : testEdges) {
                System.out.println(e.getDestination().getName());
            }
            System.out.print("Enter a city name or quit to quit: ");
            name = input.next();
        }
    }

}

For convenience, the Graph class contains a method that reads the graph in from a text file containing a list of edges. Since each line in the input file will contain a pair of city names and a distance, the readFrom() method will read those two city names and try to loop up the corresponding vertices in the map. If no Vertex currently exists for a city, readFrom() will create the Vertex and add it to the map under that city name. Once readFrom() has Vertex objects for each of the two cities for the edge, it will add an Edge object to each of the two Vertex objects that makes it possible to travel between the two cities.

The Graph class also contains a method getEdgesFrom() that allows us to look up the Vertex for a particular city and get a list of all of the Edges running outward from that city. The main method uses this method to allow the user to enter a city name and see a list of all of the cities that are directly connected to that city.

The NetBeans project containing this example code and the highways.txt data file is available here. You can try running the project to see how easy or difficult it is to find a path between two cities. For example, see if you can find your way from Boston to Phoenix, or from Miami to Seattle.

The Dijkstra shortest path algorithm

An important problem in computer science is the problem of finding the shortest path between two vertices in a directed, weighted graph. There are several algorithms to solve this problem. In the notes below I am going to describe the Dijkstra algorithm, which is a widely-used algorithm for finding shortest paths in weighted, directed graphs where all of the weights are positive.

The Dijkstra algorithm actually does more than just finding a shortest path from a start vertex to a destination vertex. The algorithm actually computes the shortest paths from the start vertex to every other vertex in the graph. This may seem like overkill, but finding the shortest path from the start to the destination usually requires a systematic search of all paths in the graph that run from the source to the destination, so the algorithm has to consider every possible vertex in the graph as it conducts that search.

The first step in the Dijkstra algorithm is to add two new member variables to the Vertex class, d and previous. d is used to record the length of the shortest path found so far to that vertex, while previous is the last Vertex on the path leading to the given Vertex. At the start of the algorithm, we set the d value for the start vertex to 0 and the previous value for the start vertex to the start vertex itself. For every other vertex in the graph, we set the d value to a very large number and the previous value to null.

The last part of the initialization phase of the algorithm is to place every vertex in the graph in a set, called the pending set.

The basic step in the algorithm is an operation called relaxation that gets applied to edges. Suppose we have an edge running from vertex u to vertex v and the weight of the edge is w. The relaxation operation does the following:

if(u.d + w < v.d) {
  v.d = u.d + w;
  v.previous = u;
}

Relaxation tries to determine whether or not the edge (u,v) makes it possible to get to v more cheaply. u.d tells us that we have already found a way to get from the start vertex to u at a cost of u.d. If we add the edge (u,v) to that path, we would be able to get to v at a total cost of u.d + w. If this is cheaper than the current value of v.d, we set v.d to this new, cheaper cost and record the fact that the new cheapest path to v runs from u by setting v.previous to u.

After the initialization step, the algorithm runs the following loop:

while(the pending set is not empty) {
  let v = the vertex in the pending set with the smallest d value
  remove v from the pending set
  for every edge e running from v:
    if e runs to a vertex that is still in the pending set:
      relax e
}

At the end of the loop the algorithm will have computed the lowest possible d value for every vertex in the graph. The set of previous links discovered by the algorithm will form a network of shortest paths running backward from every vertex in the graph to the source vertex. To find a shortest path to any given destination, we follow the previous links from that destination back to the source, making a record of all the vertices we pass along the way. When we reach the source vertex, we can reverse that list to see the list of vertices you would have to travel through to get from the source to the destination along the shortest path.

An example

Here is a simple example of the Dijkstra algorithm in practice. Here is a simple graph with weighted edges.

In this example we will run the algorithm using S as the start vertex. The table below shows how the d and previous values for each vertex in the graph evolve over each round of the algorithm. Once a vertex is selected to be removed from the pending set, its d and previous values stop being updated.

Vertex12345678910
S0,S
P∞,∅2,S
U∞,∅3,S3,S
X∞,∅∞,∅6,P4,U
Q∞,∅∞,∅7,P7,P7,P7,P
Y∞,∅∞,∅∞,∅∞,∅10,X10,X10,X10,X
V∞,∅∞,∅∞,∅6,U6,U
R∞,∅∞,∅∞,∅∞,∅∞,∅∞,∅9,Q
W∞,∅∞,∅∞,∅∞,∅∞,∅10,V10,V10,V10,V
T∞,∅∞,∅∞,∅∞,∅∞,∅∞,∅∞,∅15,R15,R15,R

From the table we can read backward from vertex T to find the shortest path from S to T:

TRQPS

Reversing this gives the desired path, and the path has a total weight of 15 as predicted by the algorithm.