Map Routing, a la Google Maps?
I've always been intrigued by Map Routing, but I've never found any good introductory (or even advanced!) level tutorials on it. Does anybody have any pointers, hints, etc?
Update: I'm primarily looking for pointers as to how a map system is implemented (data structures, algorithms, etc).
Take a look at the open street map project to see how this sort of thing is being tackled in a truely free software project using only user supplied and licensed data and have a wiki containing stuff you might find interesting.
A few years back the guys involved where pretty easy going and answered lots of questions I had so I see no reason why they still aren't a nice bunch.
Barry Brumitt, one of the engineers of Google maps route finding feature, wrote a post on the topic that may be of interest:
The road to better path-finding 11/06/2007 03:47:00 PM
A* is actually far closer to production mapping algorithms. It requires quite a bit less exploration compared to Dijikstra's original algorithm.
By Map Routing, you mean finding the shortest path along a street network?
Dijkstra shortest-path algorithm is the best known. Wikipedia has not a bad intro: http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm
There's a Java applet here where you can see it in action: http://www.dgp.toronto.edu/people/JamesStewart/270/9798s/Laffra/DijkstraApplet.html and Google you lead you to source code in just about any language.
Any real implementation for generating driving routes will include quite a bit of data on the street network that describes the costs associate with traversing links and nodes—road network hierarchy, average speed, intersection priority, traffic signal linking, banned turns etc.
Instead of learning APIs to each map service provider ( like Gmaps, Ymaps api) Its good to learn Mapstraction
I would suggest you go to the URL and learn a general API. There is good amount of How-Tos too.
I've yet to find a good tutorial on routing but there are lots of code to read:
There are GPL routing applications that use Openstreetmap data, e.g. Gosmore which works on Windows (+ mobile) and Linux. There are a number of interesting [applications using the same data, but gosmore has some cool uses e.g. interface with websites.
The biggest problem with routing is bad data, and you never get good enough data. So if you want to try it keep your test very local so you can control the data better.
From a conceptual point of view, imagine dropping a stone into a pond and watching the ripples. The routes would represent the pond and the stone your starting position.
Of course the algorithm would have to search some proportion of n^2 paths as the distance n increases. You would take you starting position and check all available paths from that point. Then recursively call for the points at the end of those paths and so on.
You can increase performance, by not double-backing on a path, by not re-checking the routes at a point if it has already been covered and by giving up on paths that are taking too long.
An alternative way is to use the ant pheromone approach, where ants crawl randomly from a start point and leave a scent trail, which builds up the more ants cross over a given path. If you send (enough) ants from both the start point and the end points then eventually the path with the strongest scent will be the shortest. This is because the shortest path will have been visited more times in a given time period, given that the ants walk at a uniform pace.
EDIT @ Spikie
As a further explanation of how to implement the pond algorithm - potential data structures needed are highlighted:
You'll need to store the map as a network. This is simply a set of nodes and edges between them. A set of nodes constitute a route. An edge joins two nodes (possibly both the same node), and has an associated cost such as distance or time to traverse the edge. An edge can either either be bi-directional or uni-directional. Probably simplest to just have uni-directional ones and double up for two way travel between nodes (i.e. one edge from A to B and a different one for B to A).
By way of example imagine three railway stations arranged in an equilateral triangle pointing upwards. There are also a further three stations each halfway between them. Edges join all adjacent stations together, the final diagram will have an inverted triangle sitting inside the larger triangle.
Label nodes starting from bottom left, going left to right and up, as A,B,C,D,E,F (F at the top).
Assume the edges can be traversed in either direction. Each edge has a cost of 1 km.
Ok, so we wish to route from the bottom left A to the top station F. There are many possible routes, including those that double back on themselves, e.g. ABCEBDEF.
We have a routine say, NextNode, that accepts a node and a cost and calls itself for each node it can travel to.
Clearly if we let this routine run it will eventually discover all routes, including ones that are potentially infinite in length (eg ABABABAB etc). We stop this from happening by checking against the cost. Whenever we visit a node that hasn't been visited before, we put both the cost and the node we came from against that node. If a node has been visited before we check against the existing cost and if we're cheaper then we update the node and carry on (recursing). If we're more expensive, then we skip the node. If all nodes are skipped then we exit the routine.
If we hit our target node then we exit the routine too.
This way all viable routes are checked, but crucially only those with the lowest cost. By the end of the process each node will have the lowest cost for getting to that node, including our target node.
To get the route we work backwards from our target node. Since we stored the node we came from along with the cost, we just hop backwards building up the route. For our example we would end up with something like:
Node A - (Total) Cost 0 - From Node None Node B - Cost 1 - From Node A Node C - Cost 2 - From Node B Node D - Cost 1 - From Node A Node E - Cost 2 - From Node D / Cost 2 - From Node B (this is an exception as there is equal cost) Node F - Cost 2 - From Node D
So the shortest route is ADF.
From my experience of working in this field, A* does the job very well. It is (as mentioned above) faster than Dijkstra's algorithm, but is still simple enough for an ordinarily competent programmer to implement and understand.
Building the route network is the hardest part, but that can be broken down into a series of simple steps: get all the roads; sort the points into order; make groups of identical points on different roads into intersections (nodes); add arcs in both directions where nodes connect (or in one direction only for a one-way road).
The A* algorithm itself is well documented on Wikipedia. The key place to optimise is the selection of the best node from the open list, for which you need a high-performance priority queue. If you're using C++ you can use the STL priority_queue adapter.
Customising the algorithm to route over different parts of the network (e.g., pedestrian, car, public transport, etc.) of favour speed, distance or other criteria is quite easy. You do that by writing filters to control which route segments are available, when building the network, and which weight is assigned to each one.
Another thought occurs to me regarding the cost of each traversal, but would increase the time and processing power required to compute.
Example: There are 3 ways I can take (where I live) to go from point A to B, according to the GoogleMaps. Garmin units offer each of these 3 paths in the Quickest route calculation. After traversing each of these routes many times and averaging (obviously there will be errors depending on the time of day, amount of caffeine etc.), I feel the algorithms could take into account the number of bends in the road for high level of accuracy, e.g. straight road of 1 mile will be quicker than a 1 mile road with sharp bends in it. Not a practical suggestion but certainly one I use to improve the result set of my daily commute.