2DPathfinder documentation


Theory

Finding paths between locations is one interesting problem that can be solved with the “informed search” theory.

A “general search” consists in expanding the start node, placing its successors in a queue, and then examining the next node in the queue.


function general_search(start_state, target_state, push_function())
        start_node <- make_node(start_state)
        queue <- start_node
        loop
                if empty(queue) then
                        return no_solution
                current_node <- pop(queue)
                mark_closed(current_node)
                if state(current_node) = target_state then
                        return solution
                for each node in successors(current_node) do
                        if not closed(node) then
                                queue <- push_function(node)
        end

Many different algorithms can be obtained by defining the push_function().

The successors() function depends on the structure that represents the space of the states (a graph, a 2D map...).

The “ramification factor” (b) is the number of states that can succeed to each state; d is the total number of expansion levels.

The complexity of the search is, in the worst case, O(bd).

The mark_closed() function is used to avoid to re-consider a node that has already been analyzed.

While there are different ways to implement every single search strategy, one of them is always derived from a general search.

So the different strategies are determined by how the nodes' data structures and algorithms are implemented.

For example, a “best-first search” is obtained by maintaining the open nodes sorted by a function that evaluates their states.

To estimate the “goodness” of a node, a heuristic function can be used (h()). This is a “greedy” strategy.


function best_first_search(start_state, target_state, eval_function())
        push_function() <- a function that sorts the nodes using eval_function()
        return general_search(start_state, target_state, push_function())

A greedy search is generally very efficient, but minimizes only the estimated cost.

Another strategy is the “uniform-cost search”, which minimizes the current path cost: it is optimal but very little efficient on large data.

The function that returns the current (and effective) path cost is usually called g().


function greedy_search(start_state, target_state)
        return best_first_search(start_state, target_state, h())

Such a search can also be implemented with “dynamic programming” or with “memoization” (see Dijkstra's algorithm).

The A* (pronounced “A-star”) search combines the advantages of the greedy and uniform-cost searches: it minimizes the total search cost by using the sum of the two functions as the evaluation function (f()):


f = g + h


In other words, the f cost of a node is the sum of the path generated to get there from the start node and the path estimated to reach the target.


function A_star_search(start_state, target_state)
        return best_first_search(start_state, target_state, f())

In a 2D map, typical heuristics can be the Euclidean distance, the “Manhattan” distance or the Diagonal distance.

After a few benchmarks, I realized that the Diagonal distance performs very well and is good for finding paths where diagonal movement is allowed.


Diagonal = sqrt(2) * min(∆x, ∆y) + (∆x + ∆y - 2 * min(∆x, ∆y))


I used integer values to store the costs to make the computations faster, so sqrt(2) is approximated to 1.4 and all costs are multiplied by 10.

Another improvement I made is the POS() macro, that makes it possible to represent a state (a position in the map) with only one 32bit integer instead of x and y explicitly. This halves the memory that should be allocated, and also halves the number of comparisons to compute.

But the most significant choice is the data structure used for the open list: a simple array of booleans would require O(1) to access to an element, but also O(n) to find the high priority node, and it would require (n * size of one node) memory; a sorted array would require at least O(n log n) at each loop to keep it sorted, and O(1) to find the high priority node; a linked list would decrease the amount of memory to be allocated, but the complexity would be the same as that of a simple array.

My choice was a “binary heap”, which requires the same memory as a linked list (the actual number of nodes in the open list), and takes O(1) to get the high priority node, O(log n) to pop it from the heap and O(log n) to insert a node.

An array of “flags” was used to know in O(1) if a node is already in the heap.

The choice of a binary heap dramatically increased the efficiency of the algorithm, allowing to make it work on larger data.

Since an array is used to store the nodes of the heap, it is necessary to reallocate all the memory every time a node is added to the heap;

I avoided this problem by setting a default block of memory to be allocated instead of one single node.

Anyway, if possible (i.e. if the map isn't too big), it is better to allocate all the memory at one time.


Tutorial

The Unix makefile included in the package can be used to automatically build the library and the demonstration binaries.

Open a shell and type:


make


This will compile all the C sources in object files, then archive them in a static library (lib/lib2dp.a), and link the demo to the library.

If you want to compile them separately, the make targets are “lib2dp” for the library, “demo” for the demo binary and “genmap” for the map generator used by the demo.


make clean

will remove all the object files from the source directory.

To compile an application linking it statically with the library (let's assume that “lib2dp.a” is in the same directory of “MyApp.c”):


gcc -o MyApp MyApp.c -L. -l2dp


If you use a different OS (like Windows), follow the instructions of your compiler/IDE to build the library, or simply add all the sources to your project.

Now, let's have a look at the demonstration binaries.

To generate a 500x500 random map (40% probability to have an unwalkable tile):


./genmap 500 500


A plain text file (“map.txt”) should be created: the “0”'s represent walkable tiles, the “1”'s represent unwalkable tiles.

Then run the demo and enter any key when asked:


./demo


If at least a path exists between (0;0) and (499,499), the program finds it and writes it on the same file: “S” is the starting point, “T” the target, and the path is made of “x”'s.

The cost of the path is also displayed by the sample program.

Now let's learn how to use the library:

First of all, “pathfinder_t.h” should be included in your code:


#include “pathfinder_t.h”


Remember that a position is compressed as follows:


int x, y;

int position = POS(x, y, width);


The cells are represented by an array of short integers:


short cells[width * height];

or

short *cells;

cells = calloc(width * height, sizeof(short));


An instance of map_t must be declared properly:


map_t map;

map.x_limit = width;

map.y_limit = height;

map.cells = cells; /* pointer to the first element of the array */


Then comes the pathfinder:


pathfinder_t *pathfinder;

pathfinder = pfAlloc(100, width * height);

pfSetMap(pathfinder, &map);


The first parameter of pfAlloc is the initial memory to be allocated for the open list (we used 100 as an example); the second parameter is the total number of states in the space of the states.

pfAlloc() allocates the memory for a pathfinder_t instance and returns a pointer to it.

Remember to call


pfFree(pathfinder);


when you don't need the pathfinder anymore, or you need to use it with a map of different dimension.

To perform a pathfinding, you must first allocate space for a list which will host the path:


list_t path;

path = lAlloc();


and two integers, one to store the result (SOLVED or NO_SOLUTION) and one to store the cost of the path (which is the number of tiles of the path).

Now you can call pfFindPath() (remember that positions are passed after being compressed):


int result, cost;

start = POS(start_x, start_y, width);

target = POS(target_x, target_y, width);

result = pfFindPath(pathfinder, start, target, path, &cost);


If the result is SOLVED, you can go through the path using a list_iterator_t:


list_iterator_t lit;

for (lit = path->front; lit != 0; lit = lit->next)
        <lit is a pointer to the current step of the path>

See “demo.c” for a complete and working code.


Further improvements

Even if it performs well, this software is only a little experiment.

When I have the time to work on it, I'm probably going to write a more comprehensive library in C++, using the STL library as much as possible, on informed search in general.

Using templates and inheritance, it should be easy to define custom search algorithms, heuristics, data structures and so on, which can be useful not only for pathfinding but for planning problems in general.

Also a dedicated file format (for graphs and maps) and precomputing capabilities could be some interesting features.


Copyright © 2005 Alessandro Presta