1
- # A\* Algorithm
1
+ # A\*
2
2
3
3
---
4
4
@@ -11,41 +11,59 @@ end node. If the heuristic never over-estimates the path length it is
11
11
said to be "admissible" and A\* is then guaranteed to find a shortest
12
12
path to the end node (assuming all edges have a positive weight).
13
13
With an inadmissible heuristic a path will still be found (and it
14
- may even be found more quickly) but it may not be the shortest. A\*
15
- is one of several algorithms that can be viewed as having a similar
16
- structure. Some of these can be used for both directed and undirected
17
- graphs; here we use undirected graphs for simplicity. The way paths are
18
- represented is for each node to point to the previous node in the path
19
- (so paths are actually reversed in this representation and essentially we
20
- have a tree with "parent" pointers and the start node at the root). This
21
- allows multiple nodes to each have a single path represented.
14
+ may even be found more quickly) but it may not be the shortest.
22
15
23
- As all these algorithms execute, we can classify nodes into three sets.
24
- They are the nodes for which the final parent node has been found (this
25
- is a region of the graph around the start node), "frontier" nodes that
26
- are not finalised but are connected to a finalised node by a single edge,
27
- and the rest of the nodes, which have not been seen yet. The frontier
28
- nodes are stored explicitly in some data structure and some algorithms
29
- also need some way to check if a node has been seen and/or finalised. The
30
- frontier initially contains just the start node. The algorithms repeatedly
31
- pick a frontier node, finalises the node (its current parent becomes
32
- its final parent) and updates information about neighbours of the node.
16
+ A\* is one of a related group of graph traversal
17
+ algorithms that can be viewed as having a similar structure.
18
+ Some of these algorithms work with weighted graphs
19
+ where the aim is to find the least cost path(s), while BFS and DFS
20
+ ignore edge weights and Prim's
21
+ algorithm finds a minimum spanning tree of the graph (the least cost
22
+ set of edges that connects all nodes, if the graph is connected).
33
23
34
- The A\* algorithm keeps track of the length of the shortest path found so
35
- far to each node (if any) and uses a priority queue (PQ) for the frontier
36
- nodes, ordered according to this length * plus* the heuristic value for
37
- the node. At each stage the node with minimum path length plus heuristic
38
- value is removed from the priority queue and finalised; its neighbours
39
- in the frontier may now have a shorter path to them so their costs need
40
- to be updated (and other neighbours must be added to the frontier).
24
+ These graph traversal algorithms can be used for both directed
25
+ and undirected graphs; in AIA we use undirected graphs for simplicity.
26
+ Paths are represented by having each node point to the previous
27
+ "parent" node in the path, so
28
+ we have a tree with "parent" pointers and the start node at the
29
+ root, that is, a tree of reversed paths.
30
+
31
+ As these algorithms execute, we can classify nodes into three sets.
32
+ These are:
33
+
34
+
35
+ - "Finalised" nodes, for which the shortest or least costly path back to the start node has already
36
+ been finalised, that is, the final parent node has been determined and is recorded;
37
+
38
+ - "Frontier" nodes, that are not finalised but are connected to a finalised node by a single edge; and
39
+
40
+ - The rest of the nodes, which have not been seen yet.
41
+
42
+ The frontier nodes are stored in a data structure.
43
+ Some of the algorithms also need a way to check if a node has already been seen and/or finalised.
44
+
45
+ The frontier initially contains just the start node. The algorithms repeatedly
46
+ pick a frontier node, finalise the node (its current parent becomes
47
+ its final parent) and update information about the neighbours of the node.
48
+ A\* uses a priority queue for the frontier nodes,
49
+ ordered on the shortest distance to the node found so far *plus* the
50
+ heuristic value of the node. At each
51
+ stage the node with the minimum cost
52
+ is removed for processing, and its neighbours have their information
53
+ updated if a shorter path has now been found.
54
+ Other algorithms use other data structures to keep track
55
+ of the frontier nodes.
41
56
42
57
In the presentation here, we do not give details of how the priority
43
58
queue is implemented, but just emphasise it is a collection of nodes
44
- with associated costs and the node with the minimum cost is selected
45
- each stage. When Length elements values disappear it means the element
46
- has been removed from the PQ (the length and heuristic values are not
47
- used again). The pseudo-code is simpler if nodes that are yet to be
48
- seen are also put in the PQ, with infinite cost, which we do here.
59
+ with associated path lengths plus heuristic values and the node with the
60
+ minimum total is selected at each
61
+ stage. When elements disappear from the length and heuristic arrays
62
+ it means the element
63
+ has been removed from the priority queue (the value is not used again).
64
+ The pseudo-code is simpler if nodes that are yet to be seen are also
65
+ put in the PQ, with infinite cost, which we do here. The frontier is the
66
+ set of nodes with a finite path length shown.
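
The loop described above can be sketched in Python as follows. This is a minimal illustration, not the animation's own code: the names `a_star`, `graph`, `coords` and `euclidean` are hypothetical, and instead of placing every node in the priority queue with infinite cost it uses `heapq` with "lazy deletion" (stale entries are skipped when popped), which is a common alternative.

```python
import heapq
import math


def euclidean(a, b):
    """Straight line distance between two grid points, rounded up."""
    return math.ceil(math.hypot(a[0] - b[0], a[1] - b[1]))


def a_star(graph, coords, start, end, heuristic=euclidean):
    """Return (cost, path) from start to end, or (inf, []) if unreachable.

    graph  : dict mapping node -> list of (neighbour, weight) pairs
    coords : dict mapping node -> (x, y) grid position
    """
    length = {node: math.inf for node in graph}  # shortest length found so far
    parent = {node: None for node in graph}      # reversed-path "parent" pointers
    finalised = set()

    length[start] = 0
    # Frontier entries are (path length + heuristic value, node).
    frontier = [(heuristic(coords[start], coords[end]), start)]

    while frontier:
        _, node = heapq.heappop(frontier)
        if node in finalised:
            continue  # stale entry left behind by an earlier, longer path
        finalised.add(node)  # its current parent becomes its final parent
        if node == end:
            break
        for nbr, weight in graph[node]:
            new_length = length[node] + weight
            if nbr not in finalised and new_length < length[nbr]:
                length[nbr] = new_length
                parent[nbr] = node
                priority = new_length + heuristic(coords[nbr], coords[end])
                heapq.heappush(frontier, (priority, nbr))

    if math.isinf(length[end]):
        return math.inf, []

    # Follow parent pointers back from the end node, then reverse.
    path = []
    node = end
    while node is not None:
        path.append(node)
        node = parent[node]
    return length[end], path[::-1]


# Example: four corners of a 3 x 4 rectangle plus one isolated node.
coords = {0: (0, 0), 1: (3, 0), 2: (3, 4), 3: (0, 4), 4: (10, 10)}
graph = {
    0: [(1, 3), (2, 5), (3, 4)],
    1: [(0, 3), (2, 4)],
    2: [(0, 5), (1, 4), (3, 3)],
    3: [(0, 4), (2, 3)],
    4: [],
}
cost, path = a_star(graph, coords, 0, 2)
```

Pushing a fresh entry whenever a shorter path is found avoids needing a decrease-key operation, at the price of some stale entries in the heap.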
49
67
50
68
Here we number all nodes for simplicity so we can use arrays for the
51
69
graph representation, the parent pointers, etc. For many important
@@ -54,20 +72,21 @@ be huge and arrays are impractical for representing the graph so other
54
72
data structures are needed.
55
73
56
74
In this animation the layout of the graph nodes is important. All nodes
57
- are on a two-dimensional grid so each have (x,y) integer coordinates.
58
- Both the weight of each edge and heuristic values for each node are
59
- related to the "distance" between the two nodes. Two measures of
75
+ are on a two-dimensional grid so each has (x,y) integer coordinates.
76
+ Edge weights can be entered manually or computed automatically, based on
77
+ the "distance" between the two nodes. Two measures of
60
78
distance are provided: Euclidean and Manhattan. Euclidean distance is
61
79
the straight line distance; here we round it up to the next integer.
62
80
Manhattan distance is the difference in x coordinate values plus the
63
- difference in y coordinate values. You can choose which distance measure
64
- to use for both weights and heuristic values to explore behaviour of the
65
- algorithm. You can also manually input weights. Note that if Euclidean
66
- distance is used for weights and Manhattan distance is used for the
67
- heuristic, it is not admissible so the shortest path may not be the one
68
- returned (you may like to experiment with the default supplied graph
69
- and change the weight and heuristic settings). For other combinations
70
- of Manhattan/Euclidean the heuristic is admissible. You can also choose
71
- the start and end nodes and change the graph choice (see the instructions
81
+ difference in y coordinate values. You can choose the way weights are
82
+ decided, toggle between Euclidean and Manhattan for the heuristic
83
+ function, choose the
84
+ start and end nodes and change the graph choice (see the instructions
72
85
tab for more details).
73
86
87
+ Note that if Euclidean
88
+ distance is used for weights and Manhattan distance is used for the
89
+ heuristic, it is not admissible, so the shortest path may not be the one
90
+ returned. For other combinations
91
+ of Manhattan/Euclidean the heuristic is admissible.
92
+
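
To see why the Euclidean-weights/Manhattan-heuristic combination is not admissible, the two distance measures can be compared on a single edge. The sketch below uses hypothetical helper names (`euclidean`, `manhattan`) and assumes the rounding-up convention described above.

```python
import math


def euclidean(a, b):
    """Straight line distance, rounded up to the next integer."""
    return math.ceil(math.hypot(a[0] - b[0], a[1] - b[1]))


def manhattan(a, b):
    """Difference in x coordinates plus difference in y coordinates."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


a, b = (0, 0), (3, 4)
edge_weight = euclidean(a, b)  # true cost of a direct edge a-b
estimate = manhattan(a, b)     # heuristic value at a when b is the end node

# The heuristic over-estimates the real remaining cost here.
overestimates = estimate > edge_weight
```

For these two points the direct edge costs 5 while the Manhattan heuristic gives 7, so the heuristic over-estimates the remaining path length, which is exactly what admissibility forbids.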