**Efficient Real-World Route Searching**

In this assignment, students implement a real-world route searching algorithm for HuskyMaps, a derivative of the street mapping application Bear Maps (Nifty Assignment 2018). The assignment applies several topics commonly taught in CS2, including priority queues, graphs, shortest paths algorithms, binary trees, tree traversals, testing, and debugging large programs. After building HuskyMaps in a previous homework, this homework challenges students to improve upon the relatively simple A* route search algorithm by implementing contraction hierarchies, a 2008 algorithm for efficient routing based on the idea that many shortest paths queries share the same major roadways. This can be done in 3 phases: (1) compute an importance heuristic for each vertex; (2) add new shortcut edges by contracting each vertex in the order given by the importance heuristic; and (3) precompute the shortest paths on the augmented graph from each vertex to increasingly important roadways. Afterwards, a source-to-destination route search query can be answered by finding the common most-important roadways (midpoints) and reconstructing the best path from source to midpoint to destination. Reconstructing the path is an application of inorder tree traversal since each shortcut edge contains two smaller edges, both of which can also be shortcut edges. Each part of this assignment can be scaffolded. Unit tests on a small, manually-computable graph provide immediate, formative feedback to students as they develop their implementation. Randomized route search queries serve as a stress test for the correctness and efficiency of their implementation, which they can compare against their earlier A* search algorithm implementation.

Summary | Students implement a real-world route search algorithm by preprocessing the graph, precomputing partial shortest paths, and responding to queries using the partial shortest paths as input. They must implement 2 new classes and integrate their optimized data structures and algorithms into a large, existing code base. |

Topics | priority queues, graphs, shortest paths algorithms, binary trees, tree traversals, testing, debugging |

Audience | Late CS2 |

Difficulty | Medium or hard depending on scaffolding. |

Strengths | Algorithm refinement. Students previously solved route search with A* search. In this assignment, students implement a new approach to single-pair shortest paths that combines several ideas from across CS2. Instead of relying on an instructor-provisioned autograder, students can use their own A* search as a correctness oracle.Crisp algorithm with a clean implementation. In order to get high performance, some graph algorithms can be very complicated to learn or implement. Contraction hierarchies arise from a simple insight: many of our shortest paths queries share the same major roadways. Despite implementing a new algorithm in a large code base, the suggested implementation is cleanly modularized from the rest of the project and requires no API hacking.Extending a large code base. It can be difficult for instructors to rationalize why students should care about software engineering concerns, such as code style or composition, so long as their code gets the job done. Because students need to extend the functionality of a previous assignment, students can realize some of the benefits of these principles.Progressive testing and debugging process. The assignment is scaffolded with three levels of testing: a set of test cases on a small, manually-computable graph that students can solve by hand; a set of local integration tests that check for problems in a limited number of scenarios; and random testing with their A* search oracle. |

Weaknesses | Computationally-intensive. While the overall memory usage for working with the augmented graph is only a constant factor increase, graph preprocessing time can take up to 30 minutes on the provided `seattle` graph. This can be addressed by running on smaller slices, e.g. the provided `seattle-small` graph takes less than 5 minutes. However, the smaller graph obviates some of the route search query runtime improvements. This also poses a problem for limited autograding resources.Depends on HuskyMaps. Adopting this assignment is a large undertaking because it extends HuskyMaps. It is possible to implement contraction hierarchies without the HuskyMaps project, but the loss of interactivity may reduce student interest and motivation. |

Libraries | Spark, Gson, WebSocket, algs4, JUnit, Hamcrest, Apache Commons, SLF4J. All required dependencies are included. |

Variants | Improved priority heuristic. The Deleted Neighbors heuristic is a simple and effective heuristic at encouraging spatial diversity, but can be improved with ideas from graph theory to produce better contraction orderings.Directed edges and travel speed. All edges in the graph are inserted in both the forward and backward directions regardless of the actual roadway direction. Edge weight is given as a function of distance alone even though road speed is available in the dataset.Parallelism. Different subproblems in the algorithm can be parallelized to different extents: some are embarrassingly parallel, others have some concurrency issues, and yet others require going back to the drawing board entirely.Swap student implementations. Instead of implementing contraction hierarchies on top of their own code base, have students swap projects to highlight the usefulness of a cleanly-designed API. |

While many problems in graph theory can be solved with a combination of simple algorithms such as DFS, BFS, and Dijkstra’s algorithm—contraction hierarchies included—it’s sometimes the case that faster solutions require specialized graph augmentations or data structure modifications. For example, it’s possible to solve in two different but closely related ways the special case of single-source shortest paths in an undirected, weighted graph *G = (V, E)* where each positive integer edge weight *w* is bounded by a constant *k*.

**Graph augmentation**. Preprocess*G*and expand each edge of weight*w*into*w*edges each of weight 1. Run BFS on the resulting graph in*O(k|E|) = O(|E|)*time.**Data structure modification**. Run Dijkstra’s algorithm on*G*but optimize the priority queue by using a circular counter and an array of*k + 1*sets, each set index*i*representing vertices with a total path weight*d*such that*d % (k + 1) = i*. Process all of the items in the current set before incrementing the counter. The next vertex added to the priority queue will be at most*k + d*. Each priority queue operation takes constant time, so Dijkstra’s algorithm improves to*O(|E|)*time.

On a 2015 CS2 final exam, significantly fewer students used the first approach despite its simplicity.

Our CS2 course spends relatively more time on data structure refinement but comparatively less time on graph algorithm optimization strategies. While class time is spent on iteratively improving tree data structures from ordered linked lists to binary search trees to balanced search trees, it is rare for students to see an equivalent process for graph algorithm refinement since our curriculum introduces different graph algorithms as (typically) solving different graph problems.

Iterative refinement also enables more robust, student-driven testing of their own programs. By relying on their A* search algorithm as a correctness oracle, students can compare the two implementations to evaluate correctness and query runtime.

While the graph preprocessing step is a tractable problem, it is annoyingly slow on large graphs. The full `seattle`

graph, for instance, can take nearly 30 minutes to preprocess on a single thread. Continuing with the theme of the HuskyMaps project and the themes of CS2 as a whole, we’d like the algorithm to run in a reasonable amount of time on planet-scale graphs as well.

The graph preprocessing step can be parallelized but only in certain places where the result of an operation does not depend on the results of other operations. Improving parallel graph preprocessing runtime beyond a certain point requires a different approach to graph contraction, e.g. an approach based on computing independent level sets in the graph. Parallelizing contraction hierarchies involves judicious application of parallel threads and identification of concurrency bugs. The ceiling for runtime optimization depends on how much graph theory the instructor would like to introduce.

See Contraction Hierarchies briefly explained for a primer on the contraction hierarchies algorithm. What follows is a high-level implementation outline for the assignment without discussion of idea or correctness.

Suppose we have an undirected graph *G = (V, E)* with non-negative edge weights representing the real-world distance between neighboring vertices. A *shortcut edge* through vertex *v* is a pair of edges *(u, v)* and *(v, w)* where *(u, v, w)* is the shortest path from *u* to *w*. Shortcut edges are a recursive, binary tree-like structure as either edge can also be a shortcut edge.

**Compute an importance heuristic for each vertex**with the Deleted Neighbors heuristic (Geisberger 2008). The priority for a vertex is given by the number of neighbors plus the number of already-contracted neighbors (zero when computing the initial heuristic) minus the number of potential shortcuts that would be added by contracting the vertex.**Add new shortcut edges by contracting each vertex in the order given by the importance heuristic**. The result of contracting a vertex will affect the importance heuristic of neighboring vertices, so we perform a lazy update before contraction and reorder immediate neighbors after contraction (Geisberger 2008). This step typically adds to the graph about*|E|*additional edges (assuming real-world data) to yield the augmented graph*G’*. Record the vertex contraction order: the first-contracted vertex is considered the least important vertex while the last-contracted vertex is considered the most important.**Precompute the shortest paths on the augmented graph from every vertex to increasingly important roadways**. Run a modified Dijkstra’s single-source shortest paths search from each vertex to every other vertex: “when pulling a node*v*from the priority queue, only consider edges*(w, v)*with*level(w) > level(v)*” (Funke 2017) where the*level*function is given by the contraction order. In other words, only explore more important roadways.

At query time, lookup the precomputed shortest paths for the source and destination. Find the set intersection of the most-important roadways. Then, identify the most-important roadway (midpoint) with the lowest total distance from source to midpoint to destination. Finally, reconstruct the shortest path from source to destination.

- Josh Hug. Bear Maps: Creating Routes with Open Street Maps. 2018. In Nifty Assignments 2018. http://nifty.stanford.edu/2018/hug-bear-maps.html
- Robert Geisberger, Peter Sanders, Dominik Schultes, and Daniel Delling. Contraction Hierarchies: Faster and Simpler Hierarchical Routing in Road Networks. 2008. In WEA’08 Proceedings of the 7th International Conference on Experimental Algorithms. https://doi.org/10.1007/978-3-540-68552-4_24
- Stefan Funke. Contraction Hierarchies briefly explained. 2017. https://www.fmi.uni-stuttgart.de/files/alg/teaching/s15/alg/CH.pdf

© Mapbox, © OpenStreetMap, Improve this map. Data available under the Open Database License.

Components include the JavaSpark web framework and Google Gson library. Alan Yao for creating the original version of this project. Colby Yuan and Eli Lipsitz for substantial refactoring of the front end. Rahul Nangia for substantial refactoring of the back end.