Commit ed0fee8

Graph alg backgrounds (not simpler Prim's)
1 parent d07559c commit ed0fee8

10 files changed

+242
-183
lines changed

src/algorithms/controllers/BFS.js

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -92,7 +92,6 @@ export default {
9292
6,
9393
(vis, x) => {
9494
vis.array.set(x, 'BFS');
95-
9695
},
9796
[[displayedNodes, displayedParent, displayedVisited]]
9897
);
@@ -142,6 +141,7 @@ export default {
142141
9,
143142
(vis, x, y, z, Nodes) => {
144143
// Graph and array have been updated above
144+
145145
vis.array.setList(y); // updated Queue
146146
},
147147
[[displayedNodes, displayedParent, displayedVisited], displayedQueue, explored, Nodes]

src/algorithms/controllers/prim_old.js

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -52,6 +52,7 @@ export default {
5252
const closed = [];
5353
const pqCost = [];
5454
const prevNode = [];
55+
// XXX add finalCost array + compute total cost at end?
5556

5657
chunker.add(
5758
1,

src/algorithms/explanations/ASTExp.md

Lines changed: 62 additions & 43 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# A\* Algorithm
1+
# A\*
22

33
---
44

@@ -11,41 +11,59 @@ end node. If the heuristic never over-estimates the path length it is
1111
said to be "admissible" and A\* is then guaranteed to find a shortest
1212
path to the end node (assuming all edges have a positive weight).
1313
With an inadmissible heuristic a path will still be found (and it
14-
may even be found more quickly) but it may not be the shortest. A\*
15-
is one of several algorithms that can be viewed as having a similar
16-
structure. Some of these can be used for both directed and undirected
17-
graphs; here we use undirected graphs for simplicity. The way paths are
18-
represented is for each node to point to the previous node in the path
19-
(so paths are actually reversed in this representation and essentially we
20-
have a tree with "parent" pointers and the start node at the root). This
21-
allows multiple nodes to each have a single path represented.
14+
may even be found more quickly) but it may not be the shortest.
2215

23-
As all these algorithms execute, we can classify nodes into three sets.
24-
They are the nodes for which the final parent node has been found (this
25-
is a region of the graph around the start node), "frontier" nodes that
26-
are not finalised but are connected to a finalised node by a single edge,
27-
and the rest of the nodes, which have not been seen yet. The frontier
28-
nodes are stored explicitly in some data structure and some algorithms
29-
also need some way to check if a node has been seen and/or finalised. The
30-
frontier initially contains just the start node. The algorithms repeatedly
31-
pick a frontier node, finalises the node (its current parent becomes
32-
its final parent) and updates information about neighbours of the node.
16+
A\* is one of a related group of graph traversal
17+
algorithms that can be viewed as having a similar structure.
18+
Others of these algorithms work with weighted graphs
19+
where the aim is to find the least cost path(s), while BFS and DFS
20+
ignore edge weights and Prim's
21+
algorithm finds a minimum spanning tree of the graph (the least cost
22+
set of edges that connects all nodes, if the graph is connected).
3323

34-
The A\* algorithm keeps track of the length of the shortest path found so
35-
far to each node (if any) and uses a priority queue (PQ) for the frontier
36-
nodes, ordered according to this length *plus* the heuristic value for
37-
the node. At each stage the node with minimum path length plus heuristic
38-
value is removed from the priority queue and finalised; its neighbours
39-
in the frontier may now have a shorter path to them so their costs need
40-
to be updated (and other neighbours must be added to the frontier).
24+
These graph traversal algorithms can be used for both directed
25+
and undirected graphs; in AIA we use undirected graphs for simplicity.
26+
Paths are represented by having each node point to the previous
27+
"parent" node in the path, so
28+
we have a tree with "parent" pointers and the start node at the
29+
root, that is a tree of reversed paths.
30+
31+
As these algorithms execute, we can classify nodes into three sets.
32+
These are:
33+
34+
35+
- "Finalised" nodes, for which the shortest or least costly path back to the start node has already
36+
been finalised, that is the final parent node has been determined and is recorded;
37+
38+
- "Frontier" nodes, that are not finalised but are connected to a finalised node by a single edge; and
39+
40+
- The rest of the nodes, which have not been seen yet.
41+
42+
The frontier nodes are stored in a data structure.
43+
Some of the algorithms also need a way to check if a node has already been seen and/or finalised.
44+
45+
The frontier initially contains just the start node. The algorithms repeatedly
46+
pick a frontier node, finalise the node (its current parent becomes
47+
its final parent) and update information about the neighbours of the node.
48+
A\* uses a priority queue for the frontier nodes,
49+
ordered on the shortest distance to the node found so far *plus* the
50+
heuristic value of the node. At each
51+
stage the node with the minimum cost
52+
is removed for processing, and its neighbours have their information
53+
updated if a shorter path has now been found.
54+
Other algorithms use other data structures to keep track
55+
of the frontier nodes.
4156

4257
In the presentation here, we do not give details of how the priority
4358
queue is implemented, but just emphasise it is a collection of nodes
44-
with associated costs and the node with the minimum cost is selected
45-
each stage. When Length elements values disappear it means the element
46-
has been removed from the PQ (the length and heuristic values are not
47-
used again). The pseudo-code is simpler if nodes that are yet to be
48-
seen are also put in the PQ, with infinite cost, which we do here.
59+
with associated path lengths plus heuristic values and the node with the
60+
minimum total is selected each
61+
stage. When elements disappear from the length and heuristic arrays
62+
it means the element
63+
has been removed from the priority queue (the value is not used again).
64+
The pseudo-code is simpler if nodes that are yet to be seen are also
65+
put in the PQ, with infinite cost, which we do here. The frontier is the
66+
set of nodes with a finite path length shown.
4967

5068
Here we number all nodes for simplicity so we can use arrays for the
5169
graph representation, the parent pointers, etc. For many important
@@ -54,20 +72,21 @@ be huge and arrays are impractical for representing the graph so other
5472
data structures are needed.
5573

5674
In this animation the layout of the graph nodes is important. All nodes
57-
are on a two-dimensional grid so each have (x,y) integer coordinates.
58-
Both the weight of each edge and heuristic values for each node are
59-
related to the "distance" between the two nodes. Two measures of
75+
are on a two-dimensional grid so each has (x,y) integer coordinates.
76+
Edge weights can be entered manually or computed automatically, based on
77+
the "distance" between the two nodes. Two measures of
6078
distance are provided: Euclidean and Manhattan. Euclidean distance is
6179
the straight line distance; here we round it up to the next integer.
6280
Manhattan distance is the difference in x coordinate values plus the
63-
difference in y coordinate values. You can choose which distance measure
64-
to use for both weights and heuristic values to explore behaviour of the
65-
algorithm. You can also manually input weights. Note that if Euclidean
66-
distance is used for weights and Manhattan distance is used for the
67-
heuristic, it is not admissible so the shortest path may not be the one
68-
returned (you may like to experiment with the default supplied graph
69-
and change the weight and heuristic settings). For other combinations
70-
of Manhattan/Euclidean the heuristic is admissible. You can also choose
71-
the start and end nodes and change the graph choice (see the instructions
81+
difference in y coordinate values. You can choose the way weights are
82+
decided, toggle between Euclidean and Manhattan for the heuristic
83+
function, choose the
84+
start and end nodes and change the graph choice (see the instructions
7285
tab for more details).
7386

87+
Note that if Euclidean
88+
distance is used for weights and Manhattan distance is used for the
89+
heuristic, it is not admissible, so the shortest path may not be the one
90+
returned. For other combinations
91+
of Manhattan/Euclidean the heuristic is admissible.
92+

src/algorithms/explanations/BFSExp.md

Lines changed: 7 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,10 @@
1-
# Breadth First Search Algorithm
1+
# Breadth First Search
22
---
33

44
Breadth first search (BFS) for graphs can be used to find a path from
5-
a single start node to either a single end node; to one of several end
6-
nodes; or to all nodes that are connected to the start node, depending on the termination
7-
condition. BFS returns the path to this (these) node(s)
5+
a single start node to either a single end node, to one of several end
6+
nodes, or to all nodes that are connected to the start node (depending on the termination
7+
condition). BFS returns the path to this (these) node(s)
88
that can be reached with the minimum number of edges traversed, regardless of
99
edge weights.
1010

@@ -15,13 +15,13 @@ for all edges), where the aim is to find the least cost path(s), while Prim's
1515
algorithm finds a minimum spanning tree of the graph (the least cost
1616
set of edges that connects all nodes, if the graph is connected).
1717

18-
These graph search algorithms can be used for both directed
18+
These graph traversal algorithms can be used for both directed
1919
and undirected graphs; in AIA we use undirected graphs for simplicity.
2020
Paths are represented by having each node point to the previous
2121
"parent" node in the path, so
2222
we have a tree with "parent" pointers and the start node at the
2323
root, that is a tree of reversed paths. This allows these algorithms to return
24-
multiple end nodes that each have a single path from the start node.
24+
multiple nodes that each have a single path from the start node.
2525
BFS will find paths with
2626
the minimum number of edges.
2727

@@ -36,7 +36,7 @@ been finalised, that is the final parent node has been determined and is recorde
3636

3737
- The rest of the nodes, which have not been seen yet.
3838

39-
The frontier nodes are stored explicitly in a data structure.
39+
The frontier nodes are stored in a data structure.
4040
Some of the algorithms also need a way to check if a node has already been seen and/or finalised.
4141

4242
The frontier initially contains just the start node. The algorithms repeatedly

src/algorithms/explanations/BSTExp.md

Lines changed: 4 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -7,11 +7,8 @@ The binary search tree is built up by adding items one at a time. Since the aver
77

88
The biggest problem with the binary search tree is that its behavior degenerates when there is order in the input data. In the worst case, sorted or reverse sorted data items yield a linear tree, or "stick", the complexity of building the tree is `O(n^2)`, and the complexity of a search for a single item is `O(n)`.
99

10-
## Time Complexity
10+
## Complexity
1111

12-
Algorithm | Average | Worst Case
13-
--- | --- | ---
14-
Space | O(n) | O(n) |
15-
Search | O(log n) | O(n)
16-
Insert | O(log n) | O(n)
17-
Delete | O(log n) | O(n)
12+
Space complexity for building a tree is O(n). Time complexity for
13+
search, insert (and delete) is O(log n) on average and O(n) in the worst
14+
case.
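
To make the average and worst-case figures concrete, here is a small sketch (the node shape and function names are assumptions, not code from this repository) that inserts the same keys in two different orders and compares the resulting tree heights.

```javascript
// Insert a key into an unbalanced binary search tree; returns the root.
function insert(root, key) {
  const node = { key, left: null, right: null };
  if (root === null) return node;
  let cur = root;
  for (;;) {
    if (key < cur.key) {
      if (cur.left === null) { cur.left = node; break; }
      cur = cur.left;
    } else {
      if (cur.right === null) { cur.right = node; break; }
      cur = cur.right;
    }
  }
  return root;
}

// Number of nodes on the longest root-to-leaf path.
function height(root) {
  return root === null
    ? 0
    : 1 + Math.max(height(root.left), height(root.right));
}

const build = (keys) => keys.reduce((root, k) => insert(root, k), null);

const sortedHeight = build([1, 2, 3, 4, 5, 6, 7]); // sorted input
const mixedHeight = build([4, 2, 6, 1, 3, 5, 7]);  // well-mixed input
const heights = [height(sortedHeight), height(mixedHeight)];
```

Sorted input gives a "stick" of height 7 (one level per key), while the mixed order gives height 3, so a search visits at most 7 nodes in the first tree and at most 3 in the second.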

src/algorithms/explanations/DFSExp.md

Lines changed: 52 additions & 32 deletions
Original file line numberDiff line numberDiff line change
@@ -1,40 +1,60 @@
1-
# Depth First Search Algorithm
1+
# Depth First Search (iterative)
22
---
3+
34
Depth first search (DFS) for graphs can be used to find a path from
4-
a single start node to either a single end node, one of several end
5-
nodes, or all nodes that are connected (depending on the termination
6-
condition).
7-
It is one of several algorithms that can be viewed as having a similar
8-
structure. Some of these work with weighted graphs (with positive weights
9-
for all edges), where the aim is to find the shortest path(s) (or the
10-
minimum spanning tree in the case of Prim's algorithm) but DFS ignores
11-
weights. These graph search algorithms can be used for both directed
12-
and undirected graphs; here we use undirected graphs for simplicity.
13-
The way paths are represented is for each node to point to the previous
14-
node in the path, so paths are actually reversed in this representation
15-
and we have a tree with "parent" pointers and the start node at the
16-
root. This allows multiple nodes to each have a single path returned
17-
(or we can return a spanning tree).
18-
19-
As all these algorithms execute, we can classify nodes into three sets.
20-
They are the nodes for which the final parent node has been found (this
21-
is a region of the graph around the start node), "frontier" nodes that
22-
are not finalised but are connected to a finalised node by a single edge,
23-
and the rest of the nodes, which have not been seen yet. The frontier
24-
nodes are stored explicitly in some data structure and some algorithms
25-
also need some way to check if a node has been seen and/or finalised. The
26-
frontier initially contains just the start node. The algorithms repeatedly
27-
pick a frontier node, finalises the node (its current parent becomes
28-
its final parent) and updates information about neighbours of the node.
29-
30-
DFS can be coded recursively but here we give a rather more complex
31-
iterative version because illustrates the similarity with the other
32-
algorithms. DFS uses a stack of nodes that includes all the frontier
5+
a single start node to either a single end node, to one of several end
6+
nodes, or to all nodes that are connected to the start node (depending
7+
on the termination
8+
condition). DFS makes no attempt to find shortest paths and weights/costs
9+
of edges are ignored.
10+
11+
DFS can be coded recursively (this is presented elsewhere).
12+
Here we give a rather more complex
13+
iterative version to illustrate how DFS is
14+
one of a related group of graph traversal algorithms that can be viewed as having a similar
15+
structure.
16+
Others of these algorithms work with weighted graphs (with positive weights
17+
for all edges), where the aim is to find the least cost path(s), while Prim's
18+
algorithm finds a minimum spanning tree of the graph (the least cost
19+
set of edges that connects all nodes, if the graph is connected).
20+
21+
These graph traversal algorithms can be used for both directed
22+
and undirected graphs; in AIA we use undirected graphs for simplicity.
23+
Paths are represented by having each node point to the previous
24+
"parent" node in the path, so
25+
we have a tree with "parent" pointers and the start node at the
26+
root, that is a tree of reversed paths. This allows these algorithms to return
27+
multiple nodes that each have a single path from the start node.
28+
29+
As these algorithms execute, we can classify nodes into three sets.
30+
These are:
31+
32+
33+
- "Finalised" nodes, for which the shortest or least costly path back to the start node has already
34+
been finalised, that is the final parent node has been determined and is
35+
recorded (DFS is an exception in that path lengths/costs are ignored and
36+
finalised nodes can have very long paths to them);
37+
38+
- "Frontier" nodes, that are not finalised but are connected to a finalised node by a single edge; and
39+
40+
- The rest of the nodes, which have not been seen yet.
41+
42+
The frontier nodes are stored in a data structure.
43+
Some of the algorithms also need a way to check if a node has already been seen and/or finalised.
44+
45+
The frontier initially contains just the start node. The algorithms repeatedly
46+
pick a frontier node, finalise the node (its current parent becomes
47+
its final parent) and update information about the neighbours of the node.
48+
DFS uses a stack of nodes that includes all the frontier
3349
nodes plus some that may have been finalised already (in the recursive
3450
coding the stack is implicit). At each stage the top node is popped off
3551
the stack. If it has been finalised already it is ignored, otherwise it
3652
is finalised and its neighbours that have not been finalised are pushed
37-
onto the stack.
53+
onto the stack. Thus the frontier is represented by the stack plus the
54+
finalised status of each node. Other algorithms use other data structures to keep track
55+
of the frontier nodes.
56+
57+
3858

3959
Here we number all nodes for simplicity so we can use arrays for the
4060
graph representation, the parent pointers, etc. For many important
@@ -45,7 +65,7 @@ data structures are needed.
4565
For consistency with other algorithm animations, the layout of the
4666
graph is on a two-dimensional grid where each node has (x,y) integer
4767
coordinates. You can choose the start and end nodes and change the
48-
graph choice (see the instructions tab for more details). While eights of
68+
graph choice (see the instructions tab for more details). While weights of
4969
edges can be included in the text box input, DFS will ignore weights
5070
and positions of nodes. Only a single end node is supported; choosing
5171
0 results in finding paths to all connected nodes.
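
The iterative DFS described in this file (an explicit stack that may hold already-finalised nodes, plus a finalised flag per node) can be sketched as follows. This is an illustrative version only, not the animation's controller: the function name, the unweighted adjacency-list format and the sample graph are assumptions.

```javascript
// Iterative DFS sketch: adj[u] is a list of neighbour indices.
function dfsIterative(adj, start) {
  const n = adj.length;
  const parent = new Array(n).fill(null);     // tree of reversed paths
  const finalised = new Array(n).fill(false);
  const order = [];                           // order in which nodes are finalised
  const stack = [start];

  while (stack.length > 0) {
    const u = stack.pop();
    if (finalised[u]) continue; // stale stack entry: node already finalised
    finalised[u] = true;        // current parent becomes the final parent
    order.push(u);
    for (const v of adj[u]) {
      if (!finalised[v]) {
        parent[v] = u;          // may be overwritten by a later push of v
        stack.push(v);
      }
    }
  }
  return { parent, order };
}

// Sample undirected graph (an assumption): a 4-cycle 0-1-3-2-0.
const sampleAdj = [[1, 2], [0, 3], [0, 3], [1, 2]];
const dfsResult = dfsIterative(sampleAdj, 0);
```

On the sample graph the nodes are finalised in the order 0, 2, 3, 1 and the parent array is `[null, 3, 0, 2]`, so node 1 is reached by the path 0, 2, 3, 1 even though it is adjacent to the start node. This illustrates why finalised nodes in DFS can have long paths to them.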
