|
1 | 1 | # AVL Trees
|
2 | 2 |
|
3 |
| - |
4 |
| - |
5 |
| -An AVL tree is **self-balancing** **binary search tree**. |
6 |
| - |
7 |
| -Unlike the basic binary search tree, which can exhibit *O(n)* worst case behavior for certain inputs, the AVL tree is always balanced, so search will be *O(log n)* |
8 |
| -regardless of the order of the input data. |
9 |
| - |
10 |
| -### The Binary Search Tree Invariant |
11 |
| - |
12 |
| -A **binary search tree (BST)** is either empty (Empty) or else |
13 |
| -it has a root node and two subtrees (which are binary trees, and can also be empty). |
14 |
| -The root node t has a key t.key. Ordinarily every node would also |
15 |
| -hold other data (t.data), which the user would like to find by |
16 |
| -searching for the key, *e.g.* search for Student ID Number (key) to find street address (data). Since the |
17 |
| -data attribute has no impact on |
18 |
| -how insertion and search take place, we disregard it in this animation. |
19 |
| - |
20 |
| - |
21 |
| -Note that a newly inserted node |
22 |
| -will always appear as a leaf |
23 |
| -in the tree. |
24 |
| - |
25 |
| - |
26 |
| -In any binary search tree, the **BST invariant** is always maintained; that is, |
27 |
| -for each |
28 |
| -subtree t, with root key t.key, the left subtree, t.left, |
29 |
| -contains no node with key greater than t.key, and the right subtree, |
30 |
| -t.right, contains no node with key smaller than t.key. |
31 |
| - |
32 |
| - |
33 |
| - |
34 |
| -### Insertion and Tree Balancing |
35 |
| - |
36 |
| - |
37 |
| -**Insertion of a new item** into any binary tree (*e.g.* a BST or an AVL tree) requires |
38 |
| -(1) first the search to find the correct place to insert, then (2) the actual insertion. |
39 |
| - |
40 |
| -In the **recursive implementation** of the AVL shown in *AIA*, the first stage, |
41 |
| -determining the insertion point, takes place |
42 |
| -during the recursive *calls* to the insert function, which accumulate on the machine stack. |
43 |
| - |
44 |
| - |
45 |
| -The last recursive call *returns* when the new node has been inserted into the tree. |
46 |
| - |
47 |
| -In the AVL tree, the height of the subtree rooted at each node is stored in the |
48 |
| -node. During *each* successive *return* from the call to insert, |
49 |
| -the **height of the node is updated** and the **tree is checked for balance**. |
50 |
| -When a new node is inserted in an AVL tree, the tree may become *temporarily* unbalanced, that is |
51 |
| -the difference in heights of the left and right subtrees of *any* node is greater than 1. |
52 |
| - |
53 |
| - |
54 |
| - |
55 |
| -If the tree has become unbalanced, |
56 |
| -balance will be restored using one or two **rotation** |
57 |
| -operations, which reduce the height of the unbalanced subtree, while still maintaining the BST invariant. |
58 |
| - |
59 |
| -### Imbalance configurations and rotations |
60 |
| - |
61 |
| -The exact sequence of rotations depends on the configuration around the |
62 |
| -node where the imbalance has been detected. |
63 |
| - |
64 |
| -There are four possible configurations at the node where an imbalance has been detected: |
65 |
| -(1) *left-left*, where the child and grandchild nodes or subtrees of the unbalanced node are |
66 |
| -both left subtrees, and (2) its mirror image *right-right*, where the child and grandchild are both right |
67 |
| -subtrees; (3) *left-right*, where there is a left child, with a right subtree as the grandchild, and (4) its mirror image |
68 |
| -*right-left*, where the right child of the unbalanced node has a left (grand)child. |
69 |
| - |
70 |
| -The *left-left* imbalance is restored by a single rotation around the edge between the node |
71 |
| -where the imbalance is detected and its (left) child. Try inserting items 30, 20, then 10 into an AVL tree in *AIA*. You |
72 |
| -will see that the rotation restores balance, and at the same time maintains the binary search invariant. Similarly for |
73 |
| -the *right-right* configuration; try inserting 10, 20, then 30 into the *AIA* AVL tree. |
74 |
| - |
75 |
| - |
76 |
| - |
77 |
| - |
78 |
| -If you try a larger tree |
79 |
| -XXX Lee -- we should come up with a sample tree here, to show imbalance up the tree from the new insert |
80 |
| -XXX Lee -- we can put URLs here, as you suggested. Should we do that for the really simple ones above? |
81 |
| - |
82 |
| -As you can see, while |
83 |
| -**rotation** is a **local operation**, involving only 6 pointer reassignments, |
84 |
| -it can affect the balance of the tree overall. |
85 |
| - |
86 |
| -The *left-right* and *right-left* configurations require a double rotation. For the *left-right* configuration, |
87 |
| -first a left rotation at the edge between the child and grandchild of the node, and then a right rotation |
88 |
| -at the edge between the node and its now left child (previously grandchild). The *right-left* configuration requires |
89 |
| -first a right rotation between the child and grandchild, then a left rotation around the node and its (new) right child. |
90 |
| -You can see how these work by inserting 30,10, then 20 into AIA (*left-right*) and 10, 30, 20 (*right-left*). |
91 |
| - |
92 |
| - |
93 |
| - |
94 |
| - |
95 |
| -XXX Lee -- put these exercises here? or integrated into the text, as I've done for left-left above? |
96 |
| -Only one place, not both |
97 |
| - |
98 |
| -_**Suggested exercises in AIA**_ |
99 |
| -XXX Lee - I haven't reviewed these recently, waiting for us to agree on desired format first. |
100 |
| -XXX In any case, will be rewritten to obliterate zig-zag and friend |
101 |
| - |
102 |
| - |
103 |
| --For a left-left zig-zig configuration, enter 50, 40, then with the code expanded enter 30, step by step, to see the temporary left-left zig-zig imbalance, followed by a single rotation.\ |
104 |
| --For right-right zig-zig and single rotation, enter 30, 40, then slowly 50.\ |
105 |
| --For left-right zig-zag and double rotation, enter 50, 30, then slowly 40.\ |
106 |
| --For right-left zig-zag and double rotation enter 30, 50, then slowly 40. |
107 |
| - |
108 |
| -In the above exercises the imbalance takes place near the newly inserted node. To see how an imbalance can be quite remote, and how this is handled: |
109 |
| - |
110 |
| - |
111 |
| -For imbalance further up the tree from the newly inserted node: |
112 |
| --Input 60,40,80,20,50,70,90,15,25,45,55,10. Insert 60..55 quickly (use the speed bar and collapse the pseudocode), then expand the pseudocode and proceed step by step as 10 is inserted. |
113 |
| - |
114 |
| - |
115 |
| - |
116 |
| - |
117 |
| - |
118 |
| - |
119 |
| - |
120 |
| - |
121 |
| - |
122 |
| - |
123 |
| - |
| 3 | +An AVL tree is a kind of **binary search tree** that is |
| 4 | +**self-balancing**. |
| 5 | + |
| 6 | +Unlike a basic binary search tree, which can exhibit *O(n)* worst case |
| 7 | +behavior for both search and insert, an AVL tree is always balanced, so |
| 8 | +the worst case is *O(log n)*. The way balance is maintained is quite
| 9 | +complicated; we suggest experimenting with the examples at the end of |
| 10 | +this background. |
| 11 | + |
| 12 | +A **binary search tree (BST)** is either empty or else it is a root |
| 13 | +node containing a key and two subtrees, which are binary search trees. |
| 14 | +Binary search trees are ordered, so keys in the left subtree are smaller than (or
| 15 | +equal to) the key in the root and keys in the right subtree are greater than
| 16 | +(or equal to) it. Normally there is additional data in each node as well
| 17 | +as the key, disregarded here. For an **AVL tree**, each node also contains
| 18 | +the *height* of the subtree rooted at that node; this is used in the insertion algorithm to
| 19 | +ensure the tree is balanced. The two children of an AVL tree node have
| 20 | +*a height difference of at most one*. The height is ignored for search,
| 21 | +so AVL tree search is identical to BST search.
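The node layout and search described above can be sketched in Python as follows. This is only an illustration, not the AIA code: the names `Node` and `search` are hypothetical, and the height convention (empty tree = 0, single node = 1) is an assumption chosen to agree with "maximum height of the two subtrees plus 1".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def search(t: Optional[Node], key: int) -> Optional[Node]:
    """Plain BST search; the stored height is never consulted."""
    while t is not None and key != t.key:
        # Go left for smaller keys, right otherwise.
        t = t.left if key < t.key else t.right
    return t


# Hand-built tree with 60 at the root, 20 on the left and 70 on the right.
root = Node(60, Node(20), Node(70), height=2)
found = search(root, 70)
missing = search(root, 65)
```

Here `found` is the node holding 70 and `missing` is `None`, exactly as for an ordinary BST.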
| 22 | + |
| 23 | +## Insertion |
| 24 | + |
| 25 | +BST insertion traverses down the tree from the root (going left or |
| 26 | +right at each stage, depending on the comparison between the key in |
| 27 | +the node and the key to be inserted) then adds a new leaf containing the
| 28 | +inserted key. AVL tree insertion does the same, but additionally, the |
| 29 | +height information is updated and the tree may be adjusted to restore the |
| 30 | +balance. The coding here is recursive, with each recursive insert call |
| 31 | +going one step further down the tree. After each recursive call returns |
| 32 | +(implicitly traversing back up the tree to the root) the height adjustment |
| 33 | +and re-balancing are performed. The collapsed pseudocode is simply BST
| 34 | +insertion code containing recursive calls, with code for height update |
| 35 | +and re-balancing added at the end. The new height is simply the maximum |
| 36 | +height of the two subtrees plus 1. |
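A minimal sketch of the recursive structure described above, with the height update done after each recursive call returns. Re-balancing is deliberately left out here (only a comment marks where it would go), so this is BST insertion plus height bookkeeping, not a full AVL insert. The names are hypothetical, and sending equal keys to the right is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def height(t: Optional[Node]) -> int:
    return t.height if t is not None else 0


def insert(t: Optional[Node], key: int) -> Node:
    if t is None:
        return Node(key)  # the new key always arrives as a leaf
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)  # assumption: duplicates go right
    # Executed on the way back up, after the recursive call has returned.
    t.height = 1 + max(height(t.left), height(t.right))
    # An AVL insert would check the balance of t here and rotate if needed.
    return t


root = None
for k in [60, 20, 10]:
    root = insert(root, k)
```

After these three insertions the heights at the root differ by two between the left and right subtrees, which is the situation the re-balancing step has to repair.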
| 37 | + |
| 38 | +### Tree re-balancing |
| 39 | + |
| 40 | +If the difference between the heights of the left and right subtrees is |
| 41 | +more than one, the tree is considered unbalanced and must be rearranged |
| 42 | +so that it is balanced. This is done by **local** operations, called |
| 43 | +**rotations**, that simply re-assign several pointers in the vicinity |
| 44 | +of the node. A rotation will raise up one subtree and lower another; |
| 45 | +the AVL insertion algorithm chooses what rotations to perform so as |
| 46 | +to ensure the tree is balanced after the insertion operation has been |
| 47 | +completed (assuming it was balanced to begin with). |
| 48 | + |
| 49 | +An AVL tree node can only become unbalanced if insertion into one of the
| 50 | +"grandchildren" (sub-sub-trees) increased the height of that grandchild. There
| 51 | +are four grandchildren (called "left-left", "left-right", "right-left" |
| 52 | +and "right-right"), which must be handled separately. For the left-left |
| 53 | +and right-right cases a single rotation operation will restore balance. |
| 54 | +The other two cases each require two rotation operations. |
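One way to tell the four cases apart is to compare the stored heights and then compare the inserted key with the key of the taller child. The sketch below assumes distinct keys and that heights are already up to date; `imbalance_case` is a hypothetical helper, not part of the AIA pseudocode.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def height(t: Optional[Node]) -> int:
    return t.height if t is not None else 0


def imbalance_case(t: Node, key: int) -> Optional[str]:
    """Name the grandchild that received `key`, or None if t is balanced."""
    diff = height(t.left) - height(t.right)
    if diff > 1:  # left subtree is too tall
        return "left-left" if key < t.left.key else "left-right"
    if diff < -1:  # right subtree is too tall
        return "right-right" if key > t.right.key else "right-left"
    return None
```

For example, for a root 60 whose left child 20 has a left child 10, `imbalance_case(root, 10)` reports the left-left case.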
| 55 | + |
| 56 | +#### Single rotations |
| 57 | + |
| 58 | +A single rotation transforms the tree as shown in the diagram below, |
| 59 | +where t1, t4 and t7 are subtrees that may be of any size (in the "More" |
| 60 | +tab there is a W3Schools link that has an animation of these |
| 61 | +rotations). |
| 62 | +``` |
| 63 | + / / |
| 64 | + t6 t2 |
| 65 | + / \ Right Rotation / \ |
| 66 | + t2 t7 - - - - - - - > t1 t6 |
| 67 | + / \ < - - - - - - - / \ |
| 68 | + t1 t4 Left Rotation t4 t7 |
| 69 | +``` |
| 70 | + |
| 71 | +Going from left to right (a **right rotation** of t6), subtree t1 is |
| 72 | +raised and t7 is lowered, but the ordering is preserved. You can think of |
| 73 | +t6, t2 and the edge between them being rotated clockwise, so t2 becomes |
| 74 | +the parent and t6 becomes the right child. Additionally, the parent of t4 |
| 75 | +changes and the root of the tree changes (so the pointer from its parent |
| 76 | +changes). If t6 was unbalanced due to an insertion into t1 (the left-left |
| 77 | +case), this restores the balance. Similarly, the inverse operation, going |
| 78 | +from right to left (a **left rotation** of t2), restores the balance if |
| 79 | +the imbalance was caused by insertion into t7 (the right-right case). The AIA
| 80 | +pseudocode for rotation uses variable names consistent with this diagram.
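The two single rotations can be sketched as below, using the names t6, t2 and t4 from the diagram. This is an illustrative version under the same assumed height convention as before (empty tree = 0), not the AIA pseudocode itself. Each function returns the new subtree root, so the caller is responsible for re-linking the pointer from the parent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def height(t: Optional[Node]) -> int:
    return t.height if t is not None else 0


def update_height(t: Node) -> None:
    t.height = 1 + max(height(t.left), height(t.right))


def rotate_right(t6: Node) -> Node:
    t2 = t6.left
    t4 = t2.right
    t2.right = t6      # t2 becomes the parent, t6 its right child
    t6.left = t4       # t4 changes parent from t2 to t6
    update_height(t6)  # t6 is now the lower node, so update it first
    update_height(t2)
    return t2


def rotate_left(t2: Node) -> Node:
    t6 = t2.right
    t4 = t6.left
    t6.left = t2       # t6 becomes the parent, t2 its left child
    t2.right = t4      # t4 changes parent from t6 to t2
    update_height(t2)
    update_height(t6)
    return t6
```

Applying `rotate_right` and then `rotate_left` to the result gives back the original shape, which matches the two arrows in the diagram.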
| 81 | + |
| 82 | +#### Double rotations |
| 83 | + |
| 84 | +If the tree becomes unbalanced due to insertion into t4 (the left-right |
| 85 | +case), balance can be restored by performing a left rotation at t2 |
| 86 | +followed by a right rotation at t6, as shown in the following diagram:
| 87 | + |
| 88 | +``` |
| 89 | + / / / |
| 90 | + t6 Rotate t6 Rotate t4 |
| 91 | + / \ left at t2 / \ right at t6 / \ |
| 92 | + t2 t7 - - - - - > t4 t7 - - - - - > t2 t6 |
| 93 | + / \ / \ / \ / \ |
| 94 | +t1 t4 t2 t5 t1 t3 t5 t7 |
| 95 | + / \ / \ |
| 96 | + t3 t5 t1 t3 |
| 97 | +``` |
| 98 | + |
| 99 | +Note that subtree t4 is broken into three parts but all these parts are |
| 100 | +raised whereas t7 is lowered (and the order is preserved). The right-left |
| 101 | +case is the mirror image and balance can be restored with a right rotation |
| 102 | +followed by a left rotation (for brevity we omit the details). |
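Both double rotations can be written as two single rotations, as in the following sketch. It includes the right-left mirror case for completeness. The function names are hypothetical and the rotations use generic parameter names, since the roles of t2, t4 and t6 differ between the two cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def height(t: Optional[Node]) -> int:
    return t.height if t is not None else 0


def update_height(t: Node) -> None:
    t.height = 1 + max(height(t.left), height(t.right))


def rotate_right(t: Node) -> Node:
    new_root = t.left
    t.left = new_root.right
    new_root.right = t
    update_height(t)
    update_height(new_root)
    return new_root


def rotate_left(t: Node) -> Node:
    new_root = t.right
    t.right = new_root.left
    new_root.left = t
    update_height(t)
    update_height(new_root)
    return new_root


def rotate_left_right(t: Node) -> Node:
    """Left-right case: rotate left at the left child, then right at t."""
    t.left = rotate_left(t.left)
    return rotate_right(t)


def rotate_right_left(t: Node) -> Node:
    """Right-left case (mirror image): rotate right at the right child, then left at t."""
    t.right = rotate_right(t.right)
    return rotate_left(t)
```

In the left-right case the node that ends up at the root of the subtree is the former grandchild (t4 in the diagram), with the former child and the former root as its two children.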
| 103 | + |
| 104 | +## Examples |
| 105 | + |
| 106 | +The following examples of inputs result in rotation for the last key |
| 107 | +inserted. You can copy/paste these into AIA, use the progress bar to |
| 108 | +get to the point where the last key is inserted and step through the |
| 109 | +execution; expand the pseudocode to show details of the rotations being |
| 110 | +performed. |
| 111 | + |
| 112 | +### Left-left cases (right rotations) |
| 113 | + |
| 114 | +The following examples show the simplest case (where t4 is empty), a
| 115 | +case most like the first diagram above, a case where the rotation is |
| 116 | +not at the root and a case where insertion is several levels below the |
| 117 | +rotation point. |
| 118 | + |
| 119 | +``` |
| 120 | +60,20,10 |
| 121 | +60,20,70,10,40,15 |
| 122 | +60,20,70,10,5 |
| 123 | +60,20,70,10,40,80,30,5,15,12 |
| 124 | +``` |
| 125 | + |
| 126 | +### Right-right cases (left rotations) |
| 127 | + |
| 128 | +In the following examples the trees are mirror images of the ones above. |
| 129 | + |
| 130 | +``` |
| 131 | +20,60,70 |
| 132 | +20,10,60,40,70,65 |
| 133 | +20,10,60,40,35 |
| 134 | +20,10,60,5,40,70,50,65,80,67 |
| 135 | +``` |
| 136 | + |
| 137 | +### Left-right and right-left cases (double rotations) |
| 138 | + |
| 139 | +These cases require double rotations. |
| 140 | + |
| 141 | +``` |
| 142 | +60,20,40 |
| 143 | +60,20,70,40,10,30 |
| 144 | +20,60,40 |
| 145 | +20,10,60,40,70,30 |
| 146 | +``` |
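The example inputs in this section can also be checked outside AIA with a small stand-alone sketch of AVL insertion. This is an assumed implementation written for illustration (hypothetical names, equal keys sent to the right, empty tree = height 0); it is not the AIA pseudocode, and it only shows the final tree rather than the step-by-step animation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    height: int = 1  # assumed convention: empty tree = 0, single node = 1


def height(t: Optional[Node]) -> int:
    return t.height if t is not None else 0


def update_height(t: Node) -> None:
    t.height = 1 + max(height(t.left), height(t.right))


def rotate_right(t: Node) -> Node:
    new_root = t.left
    t.left = new_root.right
    new_root.right = t
    update_height(t)
    update_height(new_root)
    return new_root


def rotate_left(t: Node) -> Node:
    new_root = t.right
    t.right = new_root.left
    new_root.left = t
    update_height(t)
    update_height(new_root)
    return new_root


def insert(t: Optional[Node], key: int) -> Node:
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    else:
        t.right = insert(t.right, key)
    update_height(t)  # done after the recursive call returns
    diff = height(t.left) - height(t.right)
    if diff > 1:
        if key >= t.left.key:  # left-right: first rotate the child left
            t.left = rotate_left(t.left)
        return rotate_right(t)  # left-left, or second half of left-right
    if diff < -1:
        if key < t.right.key:  # right-left: first rotate the child right
            t.right = rotate_right(t.right)
        return rotate_left(t)  # right-right, or second half of right-left
    return t


def build(keys: List[int]) -> Optional[Node]:
    root = None
    for k in keys:
        root = insert(root, k)
    return root


def inorder(t: Optional[Node]) -> List[int]:
    return [] if t is None else inorder(t.left) + [t.key] + inorder(t.right)


def is_balanced(t: Optional[Node]) -> bool:
    if t is None:
        return True
    ok_here = abs(height(t.left) - height(t.right)) <= 1
    return ok_here and is_balanced(t.left) and is_balanced(t.right)


root = build([60, 20, 70, 10, 40, 15])
print(inorder(root), is_balanced(root))
```

For any of the inputs listed above, `inorder` should return the keys in sorted order and `is_balanced` should hold, since rotations preserve the ordering while restoring the height condition.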