Routing Multiple Clusters
With our new rout calculating algorithms, we decided to try routing larger clusters as well as routing many multiple clusters simultaneously.
Using simulated annealing allows us to route larger clusters than before:
We actually were able to route the cluster above twice at the same time, no problem. In fact, calculating routes is no longer the bottleneck for Pathfinder.
We can calculate the route shown above twice without significant loss of speed. However, once we try routing it three times, we reach the query limit for the distance matrix API.
The Distance Matrix API allows us to calculate 1000 elements a second and 625 elements per request. Each element corresponds to a pair of coordinates that we request the distance between. Each transport adds 1 coordinate, its start location, while each commodity contributes up to two coordinates, one for its start location and another for its drop off location. We can calculate a rough estimate for the number of elements routing a cluster uses by squaring the number of coordinates. The actual value is lower since we do not calculate the distance for each and every pair of coordinates (such as the distance from a commodity's drop off to its pick up), and each routing of a cluster uses multiple Distance Matrix requests.
The cluster above has 12 coordinates, which leads to less than 144 elements requested. Doing this 3 times in parallel leads to 432 elements, which is less than the 1000 elements per second we are given. As such, it appears that we are not given as many calculations as expected, but we make our requests really quickly. It could be the case that exceeding the limit over say a quarter of a second results in exceeding the query limit. We tried routing multiple smaller clusters each with 7 coordinates.
It appears that the calculation time stays constant since we use a hard limit of a 10 seconds for routers to calculate routes. It could be the case that we are sacrificing route quality instead of time though, but the resulting routes end up looking reasonable like the one shown above. We suspect that our linear programming router is always taking up the full 10 seconds while the other routing servers finish early, even with the extra load from calculating multiple routes at once. When removing the linear programming solution, routing the cluster only took about 5 seconds. It may make sense then to remove the linear programming server then, or at least limit the size of cluster it is used for. Another possibility is to return a result once a super majority of cluster routers each return a route instead of waiting for them all.
It is clear that in order to further improve routing to allow larger clusters and improved scalability, that we need to deal with our usage of the Distance Matrix API. Options we are considering are caching more of the distance data per cluster, breaking up the Distance Matrix Requests for large clusters into smaller requests, and filtering out more unused elements.
Changelog
