Introduction
These notes are based on Abdul Bari’s DSA Course on Udemy. He also has a free, consolidated YouTube version.
Definition of the Data Structure and the Algorithm

What is a Data Structure?
 Can be defined as the arrangement of a collection of data items so that they can be used efficiently. It is all about data and the operations performed on that data.

What is an Algorithm?
 Can be defined as a procedure used for solving a problem or performing a computation. It is a finite set of instructions to execute a task.
Stack and Heap
What is the Stack?
In a stack, memory is allocated in blocks that are of fixed size. Accessing this data is fast, and the (de)allocation is handled by the compiler entirely. The stack is preferred when possible, as it is safer to work with (each thread has its own stack, so it is thread safe) and it costs less to use. Memory is always temporary in the stack. The con is that it isn’t dynamic, so everything that lives in the stack must be predefined in size.
What is the Heap?
In a heap, memory is allocated in arbitrary order and in dynamic sizes. In other words, it isn’t fixed, and the size of what is stored there can be changed. The heap is slow and costs a lot, but it is very flexible.
Physical and Logical Data Structures
Basic Physical Data Structures
Array
An array is a collection of memory locations, all of which are side by side. Arrays are static, so once established, their size cannot be adjusted. An array can be created in the stack or the heap. An array is chosen when the size of the collection is known.
Linked List
A linked list is also a collection of memory locations, but it is created differently. A linked list uses nodes, and these nodes contain data and a pointer pointing to the next node. In this way, Linked Lists are dynamic and must always be created in the heap. A linked list is chosen when the size of the collection is unknown.
Logical Data Structures
There are linear, nonlinear, and tabular Logical Data Structures. The linear ones are the Stack and the Queue. The nonlinear ones are the Tree and the Graph. The tabular one is the Hash Table. We will delve into these later, but it is important to understand that the Physical Data Structures are used to build these Logical Data Structures.
Abstract Data Type
A data type consists of its representation and its operations. The representation is what it actually is and how it is implemented. The operations are what can be done with the data. In other words, the representation is the technical details of a data type, and the operations are what we can actually do with it. An Abstract Data Type is a Data Type where the operations are known, but the representation is not. A real-life example is a book: a “book” is abstract, but a Telephone Book is a representation. We will look at all of these Logical Data Structures as ADTs first, so we can understand them, and then we can delve into how they work.
Time and Space Complexity
Time Complexity
What is the purpose of the Computer? To compute, obviously. It is to automate a task we’d do on pen and paper. Time complexity is a measurement that describes how the number of statements executed grows with the input size. If we are analyzing an algorithm and we decide that it takes, at worst, $n + 1$ operations, we just state that it is $O(n)$. We drop constants. Similarly, if we determine that it takes $\frac{n^2 - n}{2}$ operations, we just state that it is $O(n^2)$. So the largest degree is chosen. If we determine that an algorithm is constantly halving the work, then we state that it is $O(\log_2 n)$.
Let’s look at some examples:


In this example, no matter our input size, the number of steps is constant: 3 steps. Thus, the time complexity for this function is $O(1)$. Next example:


So in this example, we have $2n + 3$ steps, but we drop constants so the time complexity is $O(n)$. Next example:


So in this example, we have $2n^2 + 2n + 1$ steps, but we drop constants so the time complexity is $O(n^2)$.
Space Complexity
Space complexity is similar to time complexity in the way that it describes something with respect to the input size. As the name suggests, though, space complexity states the total space taken by an algorithm.
Recursion
Introduction
Given the function fun:


We can see that the function consists of an if-else statement, where if the input is greater than 1, we return the value of fun(n-1)*2, and if not, we just return n. In this way, we can simplify the function to just be:


This is an example of a recursive function. A recursive function consists of two parts: a descending portion and an ascending portion. We descend by making recursive calls, and ascend by returning values once we reach our base case, in this case once $n \le 1$. The algorithm returns $2^{n-1}$.
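A minimal sketch of such a function (assuming the course’s fun matches the description above):

```cpp
// Doubles the result of the recursive call while n > 1; returns n at the base case.
int fun(int n) {
    if (n > 1) {
        return fun(n - 1) * 2;
    }
    return n;  // base case: n <= 1
}
```

fun(4) evaluates as fun(3)\*2 = fun(2)\*4 = fun(1)\*8 = 8, i.e. $2^{4-1}$.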
Tail Recursion
Tail recursion is defined as recursion in which the last statement is the recursive call, for example:


So all the processing is done before the recursive call.
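A sketch of tail recursion, collecting the visit order into a string so it is easy to check (the accumulator parameter is an assumption for illustration):

```cpp
#include <string>

// Tail recursion: the processing (recording n) happens before the
// recursive call, and the recursive call is the very last statement.
void tail(int n, std::string &out) {
    if (n > 0) {
        out += std::to_string(n) + " ";  // process first...
        tail(n - 1, out);                // ...then recurse as the final statement
    }
}
```

Calling tail(3, s) leaves s holding "3 2 1 ".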
Head Recursion
Head recursion is defined as recursion in which the first statement is the recursive call, for example:


So all the processing is done after the recursive call.
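A sketch of head recursion in the same style (the accumulator parameter is an assumption for illustration):

```cpp
#include <string>

// Head recursion: the recursive call comes first, so the processing
// happens on the way back up, after the call returns.
void head(int n, std::string &out) {
    if (n > 0) {
        head(n - 1, out);                // recurse first...
        out += std::to_string(n) + " ";  // ...process afterwards
    }
}
```

Calling head(3, s) leaves s holding "1 2 3 ", the reverse order of the tail version.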
Linear Recursion vs. Tree Recursion
Linear Recursion is recursion in which there is only one branch of recursion, in other words it maintains a straight line, for example:


Tree Recursion is recursion in which there is more than one branch of recursion, kind of like a tree, for example:


An example of tree recursion can be found here:


This will give the output 3 2 1 1 2 1 1. This is because we call the function with the input 3, which prints 3 and then calls fun(3-1). That call prints 2 and then calls fun(2-1), which prints 1, and then calls fun(2-1) again, which prints 1. Then we return back up, and the second fun(3-1) call runs, which repeats the same process over again one last time.
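The walkthrough above can be sketched as follows, collecting the printed values into a string for easy checking:

```cpp
#include <string>

// Tree recursion: two recursive calls per invocation, so the calls branch out.
void fun(int n, std::string &out) {
    if (n > 0) {
        out += std::to_string(n) + " ";
        fun(n - 1, out);  // first branch
        fun(n - 1, out);  // second branch
    }
}
```

Calling fun(3, s) leaves s holding "3 2 1 1 2 1 1 ", matching the output described above.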
Indirect Recursion
Indirect recursion is recursion that happens between two or more functions, which makes a sort of circular pattern, for example:


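One common shape for this (this particular funA/funB pair is an assumed illustration, not necessarily the course’s exact code):

```cpp
#include <string>

void funB(int n, std::string &out);  // forward declaration so funA can call funB

// Indirect recursion: funA calls funB, and funB calls funA, forming a cycle.
void funA(int n, std::string &out) {
    if (n > 0) {
        out += std::to_string(n) + " ";
        funB(n - 1, out);
    }
}

void funB(int n, std::string &out) {
    if (n > 1) {
        out += std::to_string(n) + " ";
        funA(n / 2, out);
    }
}
```

Calling funA(20, s) leaves s holding "20 19 9 8 4 3 1 ": the two functions hand control back and forth until funB’s condition fails.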
Nested Recursion
Nested Recursion is recursion within recursion, in other words a recursive call within a recursive call, for example:


In this way, we evaluate the parameter of the recursive call first, and then pass whatever that result is into the outer recursive call.
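A classic shape for this (the well-known McCarthy 91 function, used here as an assumed illustration):

```cpp
// Nested recursion: the argument of the recursive call is itself a recursive
// call, so the inner fun(n + 11) must be evaluated first.
int fun(int n) {
    if (n > 100) {
        return n - 10;
    }
    return fun(fun(n + 11));  // inner result becomes the outer argument
}
```

For every n <= 100 this returns 91, e.g. fun(95) == 91.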
Examples of Recursion
Fibonacci Sequence
The Fibonacci Sequence is defined as $F_n = F_{n-1} + F_{n-2}$, where $F_0 = 0$ and $F_1 = 1$. We can use Tree Recursion to write an algorithm to calculate the nth term of the sequence:


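A sketch of the tree-recursive algorithm:

```cpp
// F(0) = 0, F(1) = 1, F(n) = F(n-1) + F(n-2)
int fib(int n) {
    if (n <= 1) {
        return n;  // base cases F(0) and F(1)
    }
    return fib(n - 1) + fib(n - 2);  // two branches -> tree recursion
}
```

The sequence runs 0, 1, 1, 2, 3, 5, 8, 13, so fib(7) == 13.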
Sum of Natural Numbers
The sum of all natural numbers up to n is equal to the sum of the natural numbers up to $(n-1)$, plus $n$. In other words, sum(n) = sum(n-1) + n


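That recurrence translates directly into a sketch:

```cpp
// sum(n) = sum(n - 1) + n, with sum(0) = 0 as the base case
int sum(int n) {
    if (n == 0) {
        return 0;
    }
    return sum(n - 1) + n;
}
```

For example, sum(5) = 1 + 2 + 3 + 4 + 5 = 15.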
Arrays
As spoken about in the introduction, an array is a collection of memory locations, all of which are side by side. The basic way of declaring an array is as follows:


For instance:


This is an array of integers, with length 5. This is a static array, which means that once declared, the size cannot be changed, and it lives on the stack. A dynamic array is as follows:


This is an unsafe way of doing it, but is how you’d do it with new. Using a smart pointer, you could do this in C++14, which skips all the crazy new[] and delete[] stuff:


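A sketch contrasting the two approaches (the function name is invented for illustration):

```cpp
#include <cstddef>
#include <memory>

// A runtime-sized array with new[] (must be paired with delete[]) versus
// std::make_unique<int[]> (C++14), which frees itself automatically.
int first_element(std::size_t size) {
    int *raw = new int[size]{};              // raw heap array, value-initialized
    raw[0] = 10;
    int value = raw[0];
    delete[] raw;                            // forgetting this leaks memory

    auto p = std::make_unique<int[]>(size);  // freed when p goes out of scope
    p[0] = value;
    return p[0];
}
```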
A bit pedantic, just something that should be kept in mind when actually doing this in practice. You can also declare arrays that are two dimensional, eg.
int A[3][4];
This means there are three integer arrays, each of length four.
A Low Level Look At Arrays


Full output:
0x7ffdc44f1460
0x7ffdc44f1460
0
1
0x7ffdc44f1464
1
1
The integer type consists of 4 bytes, so when we increment the pointer, we are looking at the memory 4 bytes to the right.
Array ADT
The Data of an Array consists of:

Array Space
 In C++ this is done with
int A[10];

Size
 In C++ this is done with
int *A; A = new int[size];

Length (Number of Elements)
 When we initialize an array, it has length 0. When we add an element, it is length 1, etc.
Operations an array should support:

Display()

Append(x)

Insert(index, x)

Delete(index)

Search(x)

Get(index)

Set(index, x)

max()/min()

Reverse()

Shift()
Initial Creation


As we can see, we have created the class Array, which consists of a pointer, the size of the array, and the length, the number of elements in the array. What we need to create are the following:

Create()

Display()


A note on ~Array: that is the destructor, which runs whenever the variable goes out of scope. In this case, we just delete the pointer.
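A sketch of what that class might look like (member and method names are assumptions based on the description above):

```cpp
#include <string>

// Pointer to the space, total capacity, and current number of elements.
class Array {
private:
    int *A;
    int size;
    int length;
public:
    Array(int sz) : A(new int[sz]), size(sz), length(0) {}
    ~Array() { delete[] A; }  // destructor: frees the space when the variable goes out of scope

    void Append(int x) {
        if (length < size) A[length++] = x;
    }

    std::string Display() const {
        std::string out;
        for (int i = 0; i < length; i++) out += std::to_string(A[i]) + " ";
        return out;
    }
};
```

Display returns a string here (rather than printing) purely so the behavior is easy to check.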
Append and Insert
For Append(x):


If we’re not at max capacity, we can append.
For Insert(index, x):


Here we loop from the end of the array down to the index we are inserting at, shifting each element one position to the right, and then place the new value at the index.
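The shifting logic can be sketched as a standalone function, operating on a raw array and length for simplicity:

```cpp
// Inserts x at index, shifting A[index..length-1] one slot to the right.
// Assumes the array has capacity for at least length + 1 elements.
// Returns the new length.
int Insert(int A[], int length, int index, int x) {
    if (index < 0 || index > length) {
        return length;  // out of bounds: leave the array unchanged
    }
    for (int i = length; i > index; i--) {
        A[i] = A[i - 1];  // shift right, starting from the end
    }
    A[index] = x;
    return length + 1;
}
```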
Delete


This is all pretty self-explanatory.
LinearSearch and BinarySearch
For LinearSearch(key):


The idea behind do_swap is that if we are searching for an element, we might search for it again. If we have an unsorted array, it can be advantageous to swap the found element toward the front, since we might search for the same value again. A linear search has a time complexity of $O(n)$.
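A sketch of that idea (the signature is an assumption):

```cpp
// Linear search with an optional transposition: when do_swap is true, the found
// element is swapped one step toward the front, so repeated searches get faster.
int LinearSearch(int A[], int length, int key, bool do_swap) {
    for (int i = 0; i < length; i++) {
        if (A[i] == key) {
            if (do_swap && i > 0) {
                int tmp = A[i - 1];
                A[i - 1] = A[i];
                A[i] = tmp;
                return i - 1;  // the key now lives one slot earlier
            }
            return i;
        }
    }
    return -1;  // not found
}
```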
For BinarySearch(key):


For this to work, the array has to be sorted. A binary search has a time complexity of $O(\log_2 n)$.
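A sketch of the halving logic (the signature is an assumption):

```cpp
// Binary search on a sorted array: halve the search range each step.
int BinarySearch(int A[], int length, int key) {
    int low = 0, high = length - 1;
    while (low <= high) {
        int mid = (low + high) / 2;
        if (key == A[mid]) {
            return mid;
        } else if (key < A[mid]) {
            high = mid - 1;  // key can only be in the left half
        } else {
            low = mid + 1;   // key can only be in the right half
        }
    }
    return -1;  // not found
}
```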
Get, Set, Max, and Min
For Get(index):


For Set(index, x):


For Max():


For Min():


Reverse


This reverses the array, and its time complexity is $O(n)$.
isSorted


Merge


Strings
Introduction
The char is a data type that represents a character. It can take values from -128 to 127, and each number represents an ASCII character. Strings can thus be constructed by “stringing” together characters. For instance:


When analyzing this declaration, it becomes apparent we have a problem. This is because we are declaring an array with size 10, which means that:
[J] [o] [h] [n] [0] [0] [0]
[0] [1] [2] [3] [4] [5] ... [9]
Each of these sections of the array technically could include a character. This means we need a specific character to denote the termination of a string, and this is known as the “Null” character, or '\0'. So this:


is equivalent to this:


Length
To determine the length of a string, we loop over it until we find a \0:


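A sketch of that loop:

```cpp
// Walk the string until the null terminator, counting characters.
int length(const char *S) {
    int i = 0;
    while (S[i] != '\0') {
        i++;
    }
    return i;
}
```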
Change Case
The ASCII character code for capital A is 65, and the ASCII character code for lowercase a is 97. Then, Z and z are 90 and 122, respectively. This means that the absolute value of the difference between the uppercase and lowercase versions of any letter (A and a, for example) is 32. We can use this to our advantage to change the case of a string:


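A sketch of the 32-offset trick:

```cpp
// Flip the case of every letter by adding or subtracting 32,
// the ASCII distance between 'A' (65) and 'a' (97).
void changeCase(char *S) {
    for (int i = 0; S[i] != '\0'; i++) {
        if (S[i] >= 'A' && S[i] <= 'Z') {
            S[i] += 32;  // uppercase -> lowercase
        } else if (S[i] >= 'a' && S[i] <= 'z') {
            S[i] -= 32;  // lowercase -> uppercase
        }
    }
}
```

"Hello" becomes "hELLO"; non-letter characters are left alone.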
Reversing a String
To reverse a string, we use the following method:


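A sketch of the usual two-index method:

```cpp
// Swap characters from both ends, moving inward until the indices cross.
void reverse(char *S, int len) {
    for (int i = 0, j = len - 1; i < j; i++, j--) {
        char t = S[i];
        S[i] = S[j];
        S[j] = t;
    }
}
```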
Duplicates in a String


Matrices
Diagonal Matrix
A Diagonal Matrix is like the following:
1 2 3 4 ...
1  a 0 0 0
2  0 b 0 0
3  0 0 c 0
4  0 0 0 d
...
Which can be defined as $M[i, j] = 0 \text{ if } i \ne j$
This matrix will be the framework for later types of matrices:


Returns:
1 0 0 0 0
0 2 0 0 0
0 0 3 0 0
0 0 0 4 0
0 0 0 0 5
Lower Triangle Matrix
A Lower Triangle Matrix is like the following:
1 2 3 4 ...
1  a 0 0 0
2  b c 0 0
3  d e f 0
4  g h i j
...
Which can be defined as: $M[i, j] = 0 \text{ if } i < j$
$M[i, j] \ne 0 \text{ if } i \ge j$
When observing the space this matrix takes up, we can observe that each row adds one more cell of space than the last, eg. 1 + 2 + 3, etc. etc. This means the number of non-zero elements is given by $\frac{n(n+1)}{2}$, and the number of zero elements is given by $\frac{n(n-1)}{2}$.
Though this is the case, there are two methods to represent this as an array, the first is row major mapping:
A  0 1 2 3 4 5 6 7 8 9
a b c d e f g h i j
This corresponds to the rows in the first diagram. So index 0 is the first row, 1 and 2 the second, 3-5 the third, and 6-9 the fourth.
The second method is column major mapping:
1 2 3 4 ...
1  a 0 0 0
2  b e 0 0
3  c f h 0
4  d g i j
...
So looking at our array now, indices 0-3 represent the first column, 4-6 the second column, 7 and 8 the third, and 9 the fourth.
Here is an implementation of Row Major mapping:


Returns:
1 0 0 0 0
2 2 0 0 0
3 3 3 0 0
4 4 4 4 0
5 5 5 5 5
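The index arithmetic behind row-major mapping can be sketched like this (class and method names are assumptions):

```cpp
// Row-major mapping for a lower triangular matrix: with 1-based indices,
// row i starts after 1 + 2 + ... + (i - 1) = i*(i-1)/2 stored elements.
struct LowerTriangular {
    static const int N = 4;
    int A[N * (N + 1) / 2] = {};  // only the non-zero elements are stored

    void Set(int i, int j, int x) {
        if (i >= j) A[i * (i - 1) / 2 + (j - 1)] = x;
    }
    int Get(int i, int j) const {
        if (i >= j) return A[i * (i - 1) / 2 + (j - 1)];
        return 0;  // everything above the diagonal is zero
    }
};
```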
Upper Triangle Matrix
An Upper Triangle Matrix is represented as:
1 2 3 4 ...
1  a b c d
2  0 e f g
3  0 0 h i
4  0 0 0 j
...
Which can be defined as: $M[i, j] = 0 \text{ if } i > j$
$M[i, j] \ne 0 \text{ if } i \le j$
When observing the space this matrix takes up, we observe the same counts as before: the number of non-zero elements is given by $\frac{n(n+1)}{2}$, and the number of zero elements is given by $\frac{n(n-1)}{2}$.
Like the Lower Triangle Matrix, there also exists a Column-Major Mapping, but we won’t get into that as the concept is basically the same. The implementation of this matrix is the same as the last; just adjust the comparisons on lines 19, 24, and 25 from i >= j to i <= j.
Symmetric Matrix
A Symmetric Matrix is represented as:
1 2 3 4 ...
1  a a a a
2  a b b b
3  a b c c
4  a b c d
...
Which can be defined as:
$M[i,j] = M[j, i]$
This looks intimidating, but we have already implemented it: if we draw a diagonal line down the center, anything on one side is mirrored by the corresponding element on the other. Eg. we can represent this as either a Lower Triangle or an Upper Triangle, and just mirror it.
Tridiagonal Matrix
A Tridiagonal Matrix is represented as:
1 2 3 4 5 ...
1  a b 0 0 0
2  c d e 0 0
3  0 f g h 0
4  0 0 i j k
5  0 0 0 l m
...
Which can be defined as:
$M[i, j] \ne 0 \text{ if } |i - j| \le 1$
Here is an implementation:


Result:
1 1 0 0 0 0 0 0 0
2 2 2 0 0 0 0 0 0
0 3 3 3 0 0 0 0 0
0 0 4 4 4 0 0 0 0
0 0 0 5 5 5 0 0 0
0 0 0 0 6 6 6 0 0
0 0 0 0 0 7 7 7 0
0 0 0 0 0 0 8 8 8
0 0 0 0 0 0 0 9 9
The formula for the space is technically $3n - 2$, but the difference is basically ignorable so I just set it to $3n$.
Sparse Matrix
A sparse matrix is a matrix where nonzero elements can be anywhere in the matrix, such as:
1 2 3 4 5 ...
1  a 0 0 0 0
2  0 0 b 0 c
3  d 0 e 0 0
4  0 f 0 0 0
5  0 0 0 g 0
...
As we can see, the elements are dispersed about the matrix, with no pattern. As a result, we have to store our elements using coordinates, namely their i and j coordinates. To do so, we can define the class Element to facilitate this:


Next, we need the actual Matrix, so we will use the class Sparse:


Next, we need a way to set and get the matrix, for this we are going to use the istream and ostream operators.


Last but not least, let’s add a method to allow addition of two sparse matrices.


Putting it all together:


Linked List
Introduction
An array works off the principle of a bench. A bench’s seats are right next to each other, and they are of a fixed size. I can’t call chairs that are 10 feet apart a bench, and I can’t go up to a bench and extend or shrink it. Similarly, an array is defined as pieces of memory that are next to each other, and they are fixed in size. Linked Lists are a different way of creating collections of data. An array is formed like the following:


Each index has a corresponding element, and each element corresponds to an address. A linked list is constructed like this:
[Element][Pointer] -> [Element 2][Pointer] -> ... -> [Element n][NULL]
A linked list is constructed as a series of nodes, each consisting of an element and a pointer pointing to the next node. This creates a list of links, until we reach the end node, which has a NULL pointer.
Writing
A Linked List can be represented as:


The Node class contains the data and a pointer to another Node. We can work with the Linked List a little bit with just this:


This outputs:
1 -> 2 -> 3 -> 4 -> 5 ->
Another way to display a linked list is with a recursive function:


These notes don’t use a Class implementation of Linked List, instead passing the linked list into each function. This was helpful when I was learning how Linked Lists work, but ideally, for C++, use a class instead.
Counting Nodes
Counting Nodes is pretty similar to Display, we just loop and count until we reach the end:


As you can see, we just loop over it until we reach the null pointer, with each iteration we are just increasing the num integer.
Recursive Implementation of Count
We can also count Nodes recursively:


Sum of all Elements
Just like counting, sum is pretty similar, just we have to access the data as well:


Recursive Implementation of Sum
We can also sum the Nodes recursively:


Max of Elements
Just like counting, max is pretty similar:


Searching
A linked list supports linear search only:


Inserting
An insert for a linked list is pretty simple:


If we are out of bounds, we return null. We define two new node pointers, one (p) which is equal to the head node, and another (t) which is a fresh node. In the case that we are inserting at the 0th index, we just set t->next to head, and then set head equal to t. If not, we advance p rightward index - 1 times. Afterwards, we set t->next equal to p->next, and then set p->next to equal t. This places t in between, right where we want it to be.
Check If Linked List Loops
So a linked list consists of Nodes as we have seen, but what if the pointer from a node points back to an earlier node in the list, in other words it forms a looping linked list:
[Element][Pointer] -> [Element 2][Pointer] -> [Element n][Pointer]
       ^                                              |
       +----------------------------------------------+
In this way, it loops back around. Well, we can detect this using two pointers that increment over the list. One increments by one step, and the other by two. If they meet, then the linked list must loop:


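This two-pointer idea (commonly known as Floyd’s cycle detection) can be sketched as follows, assuming the Node layout used in this section:

```cpp
struct Node {
    int data;
    Node *next;
};

// Two pointers walk the list at different speeds; if they ever meet,
// the list must loop, since the fast one can never pass the slow one
// inside a cycle without landing on it.
bool hasLoop(Node *head) {
    Node *slow = head, *fast = head;
    while (fast != nullptr && fast->next != nullptr) {
        slow = slow->next;               // one step
        fast = fast->next->next;         // two steps
        if (slow == fast) return true;   // met again -> the list loops
    }
    return false;  // fast reached the end -> no loop
}
```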
Doubly Linked List
A Linked List has one pointer, pointing to the next node. A Doubly Linked List is a Linked List with two pointers, one pointing to the next node and the second pointing to the node before it. To begin, let’s modify our Node object:


As we can see, we added a new variable called prev, which as you can guess, points to the previous element. Unlike our Linked List, we are also going to code a class for this to make things easier to develop, now that we have an understanding of the Linked List:


Next we can code our main function:


We need to add a display function, and we will basically recycle our old code from the Single Linked List:


Inserting
The insert for a doubly linked list is very similar to the normal linked list, except we also have to point to the previous element.


Deleting
Much like a normal linked list, the same principle applies, but we have to account for two pointers now:


Stack
The stack is a way of representing data, and it operates on the LIFO principle: the last in is the first to be pulled out. Recursion uses a stack, but this is handled by the compiler. It is known that every recursion can be converted to an iterative algorithm and vice versa, but some recursive algorithms, once converted to an iterative process, need a programmer-designed stack. A stack consists of two parts of data:

Space for storing elements

Top pointer
And the following operations:

push(x)

pop()

peek(index)

stackTop()

isEmpty()

isFull()
Let’s look at some implementations of the stack.
Implementation with Array
For this implementation, we have 3 pieces of data: the actual array, a pointer for the top-most element, and the size of the array.


The question arises, though: when we push a value to the stack, do we push from the left or the right? Well, if we pushed from the left, eg. push to 0, then the old 0 becomes 1, and the old 1 becomes 2, etc. etc., and our time complexity becomes $O(n)$. But since we know the top-most value, if we push from the right, our time complexity is $O(1)$.
isFull and isEmpty


The operations are all $O(1)$, and this is because of our top and size variables.
peek


push and pop




Display the Stack


stackTop


Implementation with Linked List
The advantage of a linked list over an array is that the stack isn’t a fixed size on creation; instead, we can dynamically grow or shrink it with the linked list. Due to the nature of linked lists, this implementation requires fewer class variables:


isFull and isEmpty


stackTop


peek


push and pop


Applications of Stack
Parenthesis Matching
Imagine we have the expression $(a(a+b) \times (c-d+e))$. How can we determine if the expression has all matching parentheses, or in other words, if the parentheses are closed? To do this, let’s build off of our Array implementation of the stack:


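The idea can be sketched with the STL stack for brevity (the course builds it on the array stack above):

```cpp
#include <stack>
#include <string>

// Push on '(', pop on ')': the expression is balanced iff no ')' ever
// arrives with an empty stack and the stack is empty at the end.
bool isBalanced(const std::string &expr) {
    std::stack<char> st;
    for (char c : expr) {
        if (c == '(') {
            st.push(c);
        } else if (c == ')') {
            if (st.empty()) return false;  // ')' with no matching '('
            st.pop();
        }
    }
    return st.empty();  // leftover '(' means unmatched
}
```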
Queues
The stack works off of the LIFO principle; a queue works off of the FIFO principle. As a result, everything that goes into the data structure enters from one end and then exits from the other. This is in contrast to the stack, where everything enters and exits from the same end. A queue consists of the following parts of data:

Space for storing elements

Front for deletion
And consists of the following methods:

enqueue(x)

dequeue()

isEmpty()

isFull()

first()

last()
Queue using Single Pointer
Imagine a queue with an array such as:
Q [0][1][2][3][4][5]
Rear = -1
Where rear points to the newest element. So:
Q [A][B][C][D][4][5]
Rear = 3
In this case, our rear-most element is D, so Rear is set to 3. The issue is when we want to dequeue, or take out the front-most element. Our front-most element is A, so when we pull out A, B has to take A’s spot, and C has to take B’s spot, etc. etc. This means that every time we dequeue, the time complexity is $O(n)$. This is not very good, and can be improved by using two pointers.
Queue Using Two Pointers
Imagine we have the same queue as earlier but this time with an additional pointer:
Q [0][1][2][3][4][5]
Front = Rear = -1
Now, lets look at the same example:
Q [A][B][C][D][4][5]
Front = -1
Rear = 3
Insert is still $O(1)$, but what about dequeuing? Well, let’s say we want to take out A. Once we take out A, we just increment the Front pointer. This shows us where the front is, and as a result we don’t have to move everything over. This makes the dequeue process also $O(1)$!
So, when looking at this structure, the queue is empty when the front and the rear are equal to each other, and it is full when the rear is equal to size - 1.
Implementing with Array
Creating the class:


Display:


isEmpty and isFull


enqueue and dequeue


Problems with Array Queue
This current implementation is cool and all, but it has a fatal flaw: over time, the queue just outright fails. If you notice, the front and rear pointers just keep moving further and further to the right, so eventually the queue runs out of space even when it is nearly empty. We want to be able to use all of the space in the array effectively, and not let the queue die. We have two ways of approaching this:

Resetting Pointers

Circular Queue
The Circular Queue is the best way of approaching the issue, so we will look at that.
Circular Queue with Array
The Circular Queue works off the same principle as the two-pointer array queue, but it is really a more intelligent implementation. Our new implementation will initially look like this:


Now, the Front and the Rear equal 0. When we start to populate, the array looks like this:


Now, we can see that the Front pointer is at 0, and the Rear pointer is at 5, which is E. Now, when we dequeue, we increment front by 1 and then return Q[front], so it’d return A. In the old implementation, if we then wanted to queue an element, it’d say the queue is full, so now what do we do? Well, we just place an element at 0 and set rear to 0! We can’t have rear and front on the same point though, because that’d mean the queue is empty, so with this system, the queue can store size - 1 elements. Let’s take a look at the implementation.
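The wrap-around arithmetic can be sketched like this (struct and member names are assumptions):

```cpp
// Front points just before the first element; rear points at the last.
// Both wrap around with modulo, and one slot stays unused so that
// full ((rear + 1) % SIZE == front) and empty (rear == front) differ.
struct CircularQueue {
    static const int SIZE = 6;  // holds at most SIZE - 1 = 5 elements
    int Q[SIZE] = {};
    int front = 0;
    int rear = 0;

    bool isEmpty() const { return front == rear; }
    bool isFull() const { return (rear + 1) % SIZE == front; }

    bool enqueue(int x) {
        if (isFull()) return false;
        rear = (rear + 1) % SIZE;  // advance with wrap-around
        Q[rear] = x;
        return true;
    }

    int dequeue() {
        if (isEmpty()) return -1;  // sentinel for "queue is empty"
        front = (front + 1) % SIZE;
        return Q[front];
    }
};
```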
As before, the initial class looks identical:




isEmpty and isFull


enqueue and dequeue


Implementing with Linked List
A queue can be implemented fairly easily with a Linked List. Given a linked list like the following:
[1]->[2]->[3]->[4]->
We can use a pointer on 1 to be the front, and a pointer on 4 to be the rear. Usually with a linked list, we only have one pointer that stores the first node in the list, but with the second pointer we can append to the linked list in constant time.
Creating the class:


Display:


isEmpty


enqueue and dequeue


Trees
Let’s go over some tree terminology
[A]
[B] [C]
[D] [E] [F] [G]
[H] [I] [J] [K] [L] [M]
This is a tree; the node we call the root is A. The root node is made of disjoint subsets, for instance the B subset and the C subset, and then within the B subset, there is the D subset and the E subset. B is a parent of D and E, and D and E are children of B. D and E are siblings, and B and C are siblings. Descendants are the set of nodes that can be reached from another node: D, E, H, I, and J are all descendants of B. Inversely, an ancestor is the line of nodes reaching back to the root; for instance, the ancestors of I are E, B, and A. It is the direct line of ancestry. The degree of a node is the number of children it has: the degree of A is 2, the degree of B is 2, the degree of D is 1, and the degree of H is 0. Leaf Nodes are nodes where the degree is 0, such as H, I, J, K, etc. Internal Nodes are nodes where the degree is greater than 0, such as B, C, D, E, F, and G. The depth of a node is the number of edges from the node to the tree’s root node. A root node has a depth of 0. So A’s depth is 0, B’s depth is 1, E’s depth is 2, and J’s depth is 3. The height of a tree is the height of its root node, which is equal to the depth of the deepest node. So the height of A is equal to 3. Height doesn’t decrease uniformly, though: for instance, if we introduced another child of A named “Y” with a child node of “Z,” A’s height would still equal 3, but Y’s height would be 1. Something to keep in mind. The level of a node is where it is relative to the others, so A is at level 1, B at level 2, D at level 3, etc. etc.
Binary Tree
A binary tree must have a degree of at most 2, eg. every node can have, at maximum, 2 children. This allows us to do things like:
[A]
[B] [C]
[D] [E] [F] [G]
Because of this two-child setup, we can easily refer to each node’s children as the left child and the right child. A binary tree is full when the height of the tree would increase if we were to add an extra node. The tree above is full: the height is 2, and adding any extra node would result in the height increasing by one.
Catalan Number
The number of different binary trees that can be generated with a set of $n$ unlabeled nodes is called the Catalan number:
$\displaystyle C_n = \frac{1}{n+1}\binom{2n}{n} = \frac{(2n)!}{(n+1)!\,n!} = \prod_{k=2}^n \frac{n+k}{k}$
So, when the number of nodes $n$ is equal to 1, the Catalan number is 1; when it is 2, the Catalan number is 2; when it is 3, the Catalan number is 5, etc. etc. The number of trees generated with max height is equal to $2^{n-1}$. What about with labeled nodes? Well, we just build off of our formula: the number of individual combinations with labeled nodes is $C_n \times n!$
Height vs. Nodes
When given a height for a binary tree, what is the minimum and maximum number of nodes there can be? The minimum number of nodes is quite easy to find: for height $h$, the minimum is $n=h+1$. The maximum number of nodes is $n=2^{h+1}-1$. The minimum height comes from solving for $h$ in the maximum-nodes formula, so for $n$ nodes, the minimum height is $h=\log_2(n+1)-1$; the maximum height comes from solving for $h$ in the minimum-nodes formula, so the maximum height is $h=n-1$.
Determining no. Leaf Nodes
The number of leaf nodes/external nodes is $\text{deg}(0)=\text{deg}(2)+1$
Strict Binary Tree
A Strict Binary Tree, also known as a proper binary tree or 2-tree, is a binary tree where the degree of each node is either 0 or 2, never 1.
Height vs. Nodes
The minimum number of nodes for a strict binary tree of height $h$ is $n=2h+1$, and the max is the same as for a non-strict tree, $n=2^{h+1}-1$. The minimum height formula is also the same: $h=\log_2(n+1)-1$. The maximum height formula is different for a strict binary tree: it is $h=\frac{n-1}{2}$.
Determining no. Leaf Nodes
The number of leaf nodes/external nodes for a Strict Binary Tree is $\text{deg}(0)=\text{deg}(2)+1$
Tree Traversals
In a linear data structure, data can be traversed in two ways, front to back, or back to front. For a nonlinear data structure, like a tree, there are a lot more options at play as to how we can traverse the data structure. In this way, we will look over four methods to traverse the following trees:
[A]
[B] [C]
[A]
[B]
[A]
[B]
Preorder
Preorder is defined as:
 Visit the current Node
 Preorder(Left Child)
 Preorder(Right Child)
So for the first tree, Preorder(A) would return A, B, C. For the second tree, Preorder(A) would return A, B. For the third tree, Preorder(A) would return A, B.
Inorder
Inorder is defined as:
 Inorder(Left Child)
 Visit the current Node
 Inorder(Right Child)
So for the first tree, Inorder(A) would return B, A, C. For the second tree, Inorder(A) would return B, A. For the third tree, Inorder(A) would return A, B.
Postorder
Postorder is defined as:
 Postorder(Left Child)
 Postorder(Right Child)
 Visit the current Node
So for the first tree, Postorder(A) would return B, C, A. For the second tree, Postorder(A) would return B, A. For the third tree, Postorder(A) would return B, A.
Levelorder
Levelorder is defined as:
 Visit the current level
So for the first tree, Levelorder(A) would return A, B, C. For the second tree, Levelorder(A) would return A, B. For the third tree, Levelorder(A) would return A, B.
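The three depth-first orders can be sketched as string-building functions (the Node layout is assumed to match the linked representation discussed below):

```cpp
#include <string>

struct Node {
    char data;
    Node *lchild;
    Node *rchild;
};

// Visit, then left, then right.
std::string Preorder(Node *p) {
    if (p == nullptr) return "";
    return std::string(1, p->data) + Preorder(p->lchild) + Preorder(p->rchild);
}

// Left, then visit, then right.
std::string Inorder(Node *p) {
    if (p == nullptr) return "";
    return Inorder(p->lchild) + std::string(1, p->data) + Inorder(p->rchild);
}

// Left, then right, then visit.
std::string Postorder(Node *p) {
    if (p == nullptr) return "";
    return Postorder(p->lchild) + Postorder(p->rchild) + std::string(1, p->data);
}
```

For the first tree above (A with children B and C), these return "ABC", "BAC", and "BCA" respectively.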
Representing Binary Tree
Array Representation
Given a binary tree such as the following:
[A]
[B] [C]
[D] [E] [F] [G]
We can feed this data into an array like the following:
T  A B C D E F G
1 2 3 4 5 6 7
You’re probably saying: “Coleton, Arrays start at Index 0 why the heck did you put 1 as the first element.” Well, you’ll see. This representation means the following:
| Element | Index | LChild | RChild |
| --- | --- | --- | --- |
| A | 1 | 2 | 3 |
| B | 2 | 4 | 5 |
| C | 3 | 6 | 7 |
|  | i | 2i | 2i+1 |
This pattern showcases the usefulness of starting with 1. You can also determine the parent fast, because the parent is at $\lfloor i/2 \rfloor$. This concept is important for another concept, called completeness. The tree above is complete, but what about this tree:
[A]
[B] [C]
[D] [E]
An array representation of this would look like:
T  1 2 3 4 5 6 7
A B C D   E
Notice that there are spaces between D and E? Any space in an array representation of the tree means that the tree is not complete. This is a practical definition of a non-complete tree. A textbook definition of completeness is a tree such that the tree is full (left and right children for every node) down to level h - 1, and all nodes at level h are filled left to right; if the filling stops, no further nodes are present, such as in this tree:
[A]
[B] [C]
[D] [E] [F]
Here, the height is 2, so at h - 1 the nodes are filled completely, and at the last level, the nodes are filled left to right, with no breaks. Notice that the right child of C is not present; the tree is still complete without it.
Linked Representation
A linked representation of a Binary Tree can be like this:


Creating Binary Tree
This implementation uses the Queue. Instead of rewriting our current Queue implementation to support the Node class, we will just use the built-in queue in C++. To begin, let’s set up the Tree Class:


Next, we code the traversals; their implementations are pretty similar to how they are described above.
Preorder


Postorder


Inorder


Levelorder


Height
int Height(Node *p) {
    if (p == nullptr) {
        return 0;
    }
    int l = Height(p->lchild);
    int r = Height(p->rchild);
    if (l > r) {
        return l + 1;
    } else {
        return r + 1;
    }
}
n-ary Tree
An n-ary tree is a tree where the degree of any node is at maximum $n$, eg. a 3-ary tree can have degrees {0, 1, 2, 3}.
Strict n-ary Tree
A strict n-ary tree is a tree where the degree of any node is either 0 or $n$.
Height vs. Nodes
For a given n-ary tree (where the $n$ of n-ary is written as $m$) and height $h$, the minimum number of nodes is $n = mh+1$. The maximum number of nodes is $n=\frac{m^{h+1}-1}{m-1}$. Given the number of nodes $n$, the minimum height is $h=\log_m[n(m-1)+1]-1$, and the maximum height is $h=\frac{n-1}{m}$.
Search Trees
Binary Search Tree
[30]
[15] [50]
[10] [20] [40] [60]
This is an example of a Binary Search Tree. Notice that for every node, the values in its left subtree are smaller than the node, and the values in its right subtree are larger. Notice the following:
 No duplicates
 Inorder traversal would give a sorted list of the elements, least to greatest.
Implementation


Insert
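A sketch covering Insert, Inorder, and Search together, assuming a plain Node struct; the iterative style mirrors the walk-down described by the tree's ordering property:

```cpp
#include <vector>

struct Node {
    int data;
    Node *lchild = nullptr;
    Node *rchild = nullptr;
};

// Iterative insert: walk down remembering the parent, then attach.
Node *Insert(Node *root, int key) {
    Node *p = root, *parent = nullptr;
    while (p != nullptr) {
        parent = p;
        if (key == p->data) return root;   // no duplicates in a BST
        p = (key < p->data) ? p->lchild : p->rchild;
    }
    Node *n = new Node;
    n->data = key;
    if (parent == nullptr) return n;       // empty tree: new node is the root
    if (key < parent->data) parent->lchild = n;
    else parent->rchild = n;
    return root;
}

// Iterative search: follow the ordering property down the tree.
bool Search(Node *p, int key) {
    while (p != nullptr) {
        if (key == p->data) return true;
        p = (key < p->data) ? p->lchild : p->rchild;
    }
    return false;
}

// Inorder traversal yields the keys in sorted order.
void Inorder(Node *p, std::vector<int> &out) {
    if (p == nullptr) return;
    Inorder(p->lchild, out);
    out.push_back(p->data);
    Inorder(p->rchild, out);
}
```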


Inorder


Search


Recursive Insert
and Search


Delete


Create From Preorder


Drawbacks of Binary Search Tree
A drawback of a standard Binary Search Tree is that it can't control its own height. Ideally, the time complexity to search the tree is $O(\log n)$, but at worst, the time complexity is $O(n)$. This is the use of an AVL tree: it forces the height to stay efficient.
AVL Tree
Balance Factor
The balance factor of a node is equal to the height of the left subtree minus the height of the right subtree, or $bf = h_l - h_r$. A balanced node has $bf \in \{-1, 0, 1\}$, i.e. $|h_l - h_r| \le 1$. If the bf at any node is greater than 1 in absolute value, the tree is imbalanced. Let’s first learn to calculate the balance factor.
[A]
[B] [C]
[D] [E] [F]
The left child height of A is 2, and the right child height of A is 2, so 2 - 2 = 0, so the BF for A is 0. The left child height of B is 1, and the right child height of B is 1, so 1 - 1 = 0, so the BF for B is 0. The left child height of C is 1, and the right child height of C is 0, so 1 - 0 = 1, so the BF for C is 1. D, E, and F have no child nodes, so their BF is 0.
[A]
[B] [C]
[D] [E]
The left child height of A is 2, and the right child height of A is 2, so 2 - 2 = 0, so the BF for A is 0. The left child height of B is 1, and the right child height of B is 0, so 1 - 0 = 1, so the BF for B is 1. The left child height of C is 0, and the right child height of C is 1, so 0 - 1 = -1, so the BF for C is -1. D and E have no child nodes, so their BF is 0.
[A]
[B] [C]
[D]
[E]
The left child height of A is 3, and the right child height of A is 1, so 3 - 1 = 2, so the BF for A is 2. The left child height of B is 2, and the right child height of B is 0, so 2 - 0 = 2, so the BF for B is 2. The left child height of D is 1, and the right child height of D is 0, so 1 - 0 = 1, so the BF for D is 1. C and E have no child nodes, so their BF is 0.
[A]
[B] [C]
[D]
[E]
[F]
The left child height of A is 4, and the right child height of A is 1, so 4 - 1 = 3, so the BF for A is 3. The left child height of B is 3, and the right child height of B is 0, so the BF for B is 3. The left child height of D is 0, and the right child height of D is 2, so the BF for D is -2. The left child height of E is 1, and the right child height of E is 0, so the BF for E is 1. C and F have no child nodes, so their BF is 0.
Rotation
So this principle of balance begs the question: after we insert, we want our AVL Tree to remain balanced, so what method ensures balance? This is where the idea of rotation comes into play. There are 4 types of rotation:
 LL Rotation
 RR Rotation
 LR Rotation
 RL Rotation
We will go over the steps for all types of rotation. Note that rotation is done with 3 nodes only. The LL and RR rotations are single rotations, and LR and RL are double rotations.
LL
 Initial Creation
[30]
[20]
 After Inserting
Let’s say we insert 10:
[30]
[20]
[10]
Our BF is now 2, which makes this AVL tree invalid as it is imbalanced. Notice the position: it is Left-Left, so LL.
 Rotation
So we now have our tree, how do we perform the rotation? Imagine holding your finger on 30, and then pulling on the tree like it is a string. Thus the tree looks like this:
[20]
[10] [30]
Now, the BF for each node is 0.
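That pull on the string can be sketched in code; LLRotation is our name for a helper that returns the new subtree root, which the caller then re-links in place of the old one:

```cpp
struct Node {
    int data;
    Node *lchild = nullptr;
    Node *rchild = nullptr;
};

// LL rotation around p: p's left child pl becomes the new subtree root.
Node *LLRotation(Node *p) {
    Node *pl = p->lchild;
    p->lchild = pl->rchild;   // pl's right subtree is re-homed under p
    pl->rchild = p;           // p drops down to pl's right
    return pl;                // caller re-links pl in place of p
}
```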
RR
 Initial Creation
[10]
[20]
 After Inserting
Let’s say we insert 30:
[10]
[20]
[30]
Our BF is now -2, which makes this AVL tree invalid as it is imbalanced. Notice the position: it is Right-Right, so RR.
 Rotation
So we now have our tree, lets perform the rotation. For this one, we shift it counterclockwise:
[20]
[10] [30]
LR Rotation
 Initial Creation
[30]
[10]
 After Inserting
Let’s say we insert 20:
[30]
[10]
[20]
Our BF is now 2, which makes this AVL tree invalid as it is imbalanced. Notice the position: it is Left-Right, so LR.
 Rotation
[30]
[20]
[10]
[20]
[10] [30]
If we notice, this rotation was a two step process, first we rotated it into a LL position, then performed a LL rotation.
RL
 Initial Creation
[10]
[30]
 After Inserting
Let’s say we insert 20:
[10]
[30]
[20]
Our BF is now -2, which makes this AVL tree invalid as it is imbalanced. Notice the position: it is Right-Left, so RL.
 Rotation
[10]
[30]
[20]
[20]
[10] [30]
If we notice, this rotation was a two step process, first we rotated it into a RR position, then performed a RR rotation.
General Form For Rotation
Something that can be hard to determine is which rotation should be used, especially inside an algorithm. Note that any rotation involves just 3 nodes. Nodes can have children, but when we are analyzing an unbalanced node, we rebalance it using only 2 other nodes, and their positions determine the rotation. The general idea is that once we insert, we only care about the first two directions taken from the unbalanced node. E.g. if we insert a node and it goes left, then right, then left, then left again, we ignore the last two lefts entirely; the first two directions, left then right, tell us to perform an LR Rotation on the node that has become unbalanced.
Implementation


The AVL class is very similar to the normal Binary Tree class, just with some modifications.
NodeHeight


Rotations
RR


LL


LR


RL


Insert


Display


Delete


Binary Heap
A binary heap is a binary tree where duplicates are allowed. Furthermore, there are two types of binary heaps: a max heap and a min heap. In a max heap, each element is greater than or equal to its children. Naturally, in a min heap each element is less than or equal to its children. Binary heaps are almost always represented as arrays, and they are complete trees. In other words, there are no gaps in the array, and each level is filled left to right. This is an example of a max heap:
[30]
[20] [15]
[5] [10] [12] [6]
As you can see, each node’s value is greater than or equal to its children’s, e.g. 30 >= 20 & 15. Here is an example of a min heap:
[5]
[15] [12]
[20] [25] [30] [18]
The opposite is true here, where each node’s value is less than or equal to its children’s, e.g. 5 <= 15 & 12. The max heap is the most used form of binary heap. Let’s look at the array representation of the max heap.
B  30 20 15 5 10 12 6
1 2 3 4 5 6 7
Recall that a node at index $i$ has its left child at $2i$ and its right child at $2i+1$.
Insertion
A heap is cool, but because it has to stay complete, there can’t be any gaps in the array. Imagine we have the above array, and we want to insert the value 40. What do we do? Well, we just insert at the end of our array.
B  30 20 15 5 10 12 6 40
1 2 3 4 5 6 7 8
This is nice and easy, but there is one crucial problem: this makes 40 a child of 5. It cannot be a child of 5; that breaks the max heap property. How do we go about remedying this problem? We compare 40 with its parent; if it is greater than the parent element, we swap the parent with 40, and repeat. Let’s see how this operation takes place:
40 > 5, Swap:
B  30 20 15 40 10 12 6 5
1 2 3 4 5 6 7 8
40 > 20, Swap:
B  30 40 15 20 10 12 6 5
1 2 3 4 5 6 7 8
40 > 30, Swap:
B  40 30 15 20 10 12 6 5
1 2 3 4 5 6 7 8
Now the tree looks like:
[40]
[30] [15]
[20] [10] [12] [6]
[5]
Implementation with Array
Insert
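A sketch of the insert walked through above, assuming a 1-based array where the new key has already been placed at index n (the end of the heap); HeapInsert is our name for it:

```cpp
// Sift the newly appended key at arr[n] up past any smaller parents.
// The array is 1-based: the parent of index i is at i / 2.
void HeapInsert(int arr[], int n) {
    int temp = arr[n];   // the newly inserted key
    int i = n;
    while (i > 1 && temp > arr[i / 2]) {
        arr[i] = arr[i / 2];   // pull the smaller parent down
        i = i / 2;
    }
    arr[i] = temp;   // final resting place of the new key
}
```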


CreateHeap


Delete
Something of note about Delete is that it also sorts the heap. When we delete, we always remove the top element by swapping it with the last index in the heap array. Afterwards, the swapped value is sifted down to its proper place in the heap. Repeated deletions push each successive max value past the end of the shrinking heap, leaving them in sorted order. This is useful, as it basically creates a priority queue, which we will explore later.
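That delete could be sketched like this, assuming a 1-based max heap of n elements; HeapDelete is our name, and the deleted max is parked just past the shrunken heap:

```cpp
#include <utility>

// Delete the root of a max heap stored in arr[1..n]: swap arr[1] with
// arr[n], shrink the heap by one, then sift the new root down.
int HeapDelete(int arr[], int n) {
    int removed = arr[1];
    arr[1] = arr[n];
    arr[n] = removed;          // deleted max parked past the heap's end
    int i = 1, j = 2;          // j points at i's left child
    while (j < n) {            // children inside the shrunken heap [1..n-1]
        if (j + 1 < n && arr[j + 1] > arr[j]) j = j + 1;  // larger child
        if (arr[i] >= arr[j]) break;
        std::swap(arr[i], arr[j]);
        i = j;
        j = 2 * i;
    }
    return removed;
}
```

Calling this n times is exactly the heap-sort behavior the paragraph describes: the array ends up sorted least to greatest.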


Implementation with Vector
Insert


CreateHeap
Takes an array of values and puts them into a vector


Print
To display the binary heap:


Heap As a Priority Queue
Due to the very nature of the binary heap, whenever we insert into it, the largest or smallest value takes precedence. This allows us to build a priority queue. Insert is $O(\log n)$ and Delete is $O(\log n)$. This makes a binary heap the best structure for a priority queue.
Sorting
The definition of sorting is pretty easy to understand: it is an algorithm that puts data in order. Some sorts are better for certain applications. The analysis of a sort will be based on these criteria to determine its situational efficiency:
 Number of comparisons
 Number of swaps
 Adaptive
 Stable
 Extra Memory
General Overview
Comparison Based Sorts
$O(n^2)$
 Bubble Sort
 Insertion Sort
 Selection Sort
$O(n \log n)$
 Heap Sort
 Merge Sort
 Quick Sort
 Tree Sort
$O(n^{\frac{3}{2}})$
 Shell Sort
Index Based Sorts
$O(n)$
 Count Sort
 Bucket/Bin Sort
 Radix Sort
Stable
Stable sorting algorithms preserve the relative order of equal elements. For example, imagine I have a list of student names with corresponding grades. Currently, the list is sorted alphabetically, but imagine I want to sort numerically. A stable sorting algorithm would not change the relative order of duplicates; thus a student whose first name starts with A and has a score of, say, 80 would stay ahead of a student whose first name starts with B and also has a score of 80. The use of a stable algorithm is efficiency with duplicate keys, as we don’t have to sort by another key when a duplicate is present.
Adaptive
Adaptive sorting algorithms take advantage of existing order in their input; they run faster by not reordering what is already ordered. Bubble Sort and Insertion Sort are examples of adaptive sorting algorithms.
Synopsis and Implementation of Each Sort
Bubble Sort
Synopsis
Bubble sort is the most intuitive sorting algorithm. It passes over the list, comparing each pair of adjacent elements and swapping them if they are out of order; each pass bubbles the largest remaining element to the end, and the passes repeat until no swaps occur.
Implementation
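A minimal sketch of an adaptive bubble sort; the early-exit flag is what makes it adaptive:

```cpp
#include <utility>

// Each pass bubbles the largest remaining element to the end of the
// unsorted region; the flag stops early once a pass makes no swaps.
void BubbleSort(int a[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        bool swapped = false;
        for (int j = 0; j < n - 1 - i; ++j) {
            if (a[j] > a[j + 1]) {
                std::swap(a[j], a[j + 1]);
                swapped = true;
            }
        }
        if (!swapped) break;   // already sorted: this makes it adaptive
    }
}
```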


Print
It can be assumed that this print method will be included with every implementation from here onwards:


swap
It can be assumed that this swap method will be included with every implementation from here onwards (when required):


Insertion Sort
Synopsis
Insertion is a pretty basic principle. Given the sorted array below:
A  2 4 8 16 32
0 1 2 3 4
We want to insert the value 9. To do this, we’d logically have to shift values rightward, but we can’t shift every element at once. The easiest way to do this is the following:
Is 9 > 32? No. Shift 32 rightward.
A  2 4 8 16 _ 32
0 1 2 3 4 5
Is 9 > 16? No. Shift 16 rightward.
A  2 4 8 _ 16 32
0 1 2 3 4 5
Is 9 > 8? Yes. Place at the empty slot:
A  2 4 8 9 16 32
0 1 2 3 4 5
In this way, we only shift the elements we need at any point, all in one scan. This principle of shifting with each insert is how insertion sort works. Given an unsorted array:
A  8 2 12 4 6 20
0 1 2 3 4 5
Each step in an insertion sort requires breaking the array into separate parts. The first step is breaking off index 0, which is sorted by definition as it is only one element. The second step is adding an extra spot and doing an insertion:
Is 2 > 8? No. Shift rightward.
I  2 8
0 1
A  [2] 12 4 6 20
0 1 2 3 4 5
Break off next element, 12:
Is 12 > 8? Yes. Place at empty.
I  2 8 12
0 1 2
A  [12] 4 6 20
0 1 2 3 4 5
Break off next element, 4:
Is 4 > 12? No. Shift rightward.
I  2 8 12
0 1 2 3
A  [4] 6 20
0 1 2 3 4 5
Is 4 > 8? No. Shift rightward.
I  2 8 12
0 1 2 3
A  [4] 6 20
0 1 2 3 4 5
Is 4 > 2? Yes. Place at empty.
I  2 4 8 12
0 1 2 3
A  [4] 6 20
0 1 2 3 4 5
We continue this process until the very end.
Implementation
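A minimal sketch of the shifting logic described above:

```cpp
// Grow a sorted prefix one element at a time, shifting larger elements
// rightward to open a slot for each new value.
void InsertionSort(int a[], int n) {
    for (int i = 1; i < n; ++i) {
        int x = a[i];          // element being inserted into the prefix
        int j = i - 1;
        while (j >= 0 && a[j] > x) {
            a[j + 1] = a[j];   // shift rightward
            --j;
        }
        a[j + 1] = x;          // place at the empty slot
    }
}
```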


Bubble Sort vs. Insertion Sort
| Comparison | Bubble Sort | Insertion Sort |
| --- | --- | --- |
| Min Comparisons | $O(n)$ | $O(n)$ |
| Max Comparisons | $O(n^2)$ | $O(n^2)$ |
| Min Swaps | $O(1)$ | $O(1)$ |
| Max Swaps | $O(n^2)$ | $O(n^2)$ |
| Adaptive | Yes | Yes |
| Stable | Yes | Yes |
| Works well on Linked Lists | No | Yes |
| Useful result after k passes | Yes | No |
Selection Sort
Synopsis
Selection sort utilizes three variables: i, which denotes the current position; k, which denotes the index of the smallest value found so far; and j, which denotes the current index being scanned. When j goes past the last index, the pass is complete. Each pass finds the minimum value and swaps it with the current position. Let’s see this in action:
A  8 6 3 2 5 4
0 1 2 3 4 5
i = 0, scan for smallest value:
i = 0
k = 3
j = 6
Swap:
A  2 6 3 8 5 4
0 1 2 3 4 5
i = 1, scan for smallest value:
i = 1
k = 2
j = 6
Swap:
A  2 3 6 8 5 4
0 1 2 3 4 5
i = 2, scan for smallest value:
i = 2
k = 5
j = 6
Swap:
A  2 3 4 8 5 6
0 1 2 3 4 5
We continue this process until the list is sorted. Selection sort is neither adaptive nor stable, but it has a low memory footprint and works exceptionally well on small sets of data.
Implementation
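A minimal sketch using the i, k, j roles described above:

```cpp
#include <utility>

// i is the current position, k tracks the index of the smallest value
// seen so far, and j scans the rest of the array. One swap per pass.
void SelectionSort(int a[], int n) {
    for (int i = 0; i < n - 1; ++i) {
        int k = i;
        for (int j = i + 1; j < n; ++j) {
            if (a[j] < a[k]) k = j;
        }
        std::swap(a[i], a[k]);
    }
}
```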


Quick Sort
Synopsis
Quick Sort can be confusing, but this is because the sort has a lot of steps that aren’t intuitive. Once you understand the steps, it becomes rather easy. To begin, let’s look at this array:
A  50 70 60 90 40 80 10 20 30
0 1 2 3 4 5 6 7 8
To begin, set the first element as the pivot element. The idea behind quicksort is that we determine a place where the pivot is sorted: each value left of the pivot is less than or equal to the pivot, and everything to the right of the pivot is greater than or equal to the pivot. So, i looks for elements greater than the pivot, and j looks for elements less than or equal to the pivot. i starts at 0, and j starts at the length of the array:
// First element greater than 50 is 70 at 1:
i = 1
// First element less than or equal to 50 is 30 at 8:
j = 8
Swap:
A  50 30 60 90 40 80 10 20 70
0 1 2 3 4 5 6 7 8
// Next element greater than 50 is 60 at 2:
i = 2
// Next element less than or equal to 50 is 20 at 7:
j = 7
Swap:
A  50 30 20 90 40 80 10 60 70
0 1 2 3 4 5 6 7 8
// Next element greater than 50 is 90 at 3:
i = 3
// Next element less than or equal to 50 is 10 at 6:
j = 6
Swap:
A  50 30 20 10 40 80 90 60 70
0 1 2 3 4 5 6 7 8
// Next element greater than 50 is 80 at 5:
i = 5
// Next element less than or equal to 50 is 40 at 4:
j = 4
j < i, swap pivot with j:
A  40 30 20 10 50 80 90 60 70
0 1 2 3 4 5 6 7 8
Now that we have swapped the pivot with j, the pivot is in its sorted position: everything to the left of 50 is less than or equal to 50, and everything to the right of 50 is greater than or equal to 50. This position is called the partitioning position. With it, we can recursively apply quick sort to the left and quick sort to the right. If the list is already sorted, Quick Sort hits its worst case, which is $O(n^2)$. The average and best case time complexity is $O(n \log n)$.
Implementation
Pivot First
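A sketch of the pivot-first partition walked through above, plus the recursive driver; Partition and QuickSort are our names for them:

```cpp
#include <utility>

// i scans right for elements greater than the pivot, j scans left for
// elements <= pivot; when the scans cross, the pivot is swapped into
// its final (partitioning) position, which is returned.
int Partition(int a[], int low, int high) {
    int pivot = a[low];
    int i = low, j = high + 1;
    while (true) {
        do { ++i; } while (i <= high && a[i] <= pivot);
        do { --j; } while (a[j] > pivot);
        if (i >= j) break;
        std::swap(a[i], a[j]);
    }
    std::swap(a[low], a[j]);   // j is the partitioning position
    return j;
}

void QuickSort(int a[], int low, int high) {
    if (low < high) {
        int j = Partition(a, low, high);
        QuickSort(a, low, j - 1);
        QuickSort(a, j + 1, high);
    }
}
```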


Pivot Last


Merge Sort
Synopsis
Ideas Behind Merging
Given two sorted arrays, we are going to increment over each and merge, e.g.:
n = 0
A  1 5 12
0 1 2
m = 0
B  3 4 9 10
0 1 2 3
New Array:
C 
0 1 2 3 4 5 6
n = 0
m = 0
A[n] = 1
B[m] = 3
A[n] < B[m]:
C  1
0 1 2 3 4 5 6
n++
n = 1
m = 0
A[n] = 5
B[m] = 3
B[m] < A[n]:
C  1 3
0 1 2 3 4 5 6
m++
n = 1
m = 1
A[n] = 5
B[m] = 4
B[m] < A[n]:
C  1 3 4
0 1 2 3 4 5 6
m++
n = 1
m = 2
A[n] = 5
B[m] = 9
A[n] < B[m]
C  1 3 4 5
0 1 2 3 4 5 6
n++
n = 2
m = 2
A[n] = 12
B[m] = 9
B[m] < A[n]
C  1 3 4 5 9
0 1 2 3 4 5 6
m++
n = 2
m = 3
A[n] = 12
B[m] = 10
B[m] < A[n]
C  1 3 4 5 9 10
0 1 2 3 4 5 6
m++
n = 2
m = 4
A[n] = 12
B[m] does not exist, so copy the rest of A:
C  1 3 4 5 9 10 12
0 1 2 3 4 5 6
Implementation of Merge Concepts
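The merge walkthrough above can be sketched like this, assuming the two inputs are already sorted:

```cpp
#include <cstddef>
#include <vector>

// Merge two sorted arrays A and B into one sorted array C, advancing
// whichever index points at the smaller value, then copying leftovers.
std::vector<int> Merge(const std::vector<int> &A, const std::vector<int> &B) {
    std::vector<int> C;
    C.reserve(A.size() + B.size());
    std::size_t n = 0, m = 0;
    while (n < A.size() && m < B.size()) {
        if (A[n] <= B[m]) C.push_back(A[n++]);
        else C.push_back(B[m++]);
    }
    while (n < A.size()) C.push_back(A[n++]);   // A's leftover tail
    while (m < B.size()) C.push_back(B[m++]);   // B's leftover tail
    return C;
}
```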


Implementation
Iterative Merge Sort


Recursive Merge Sort


Count Sort
Synopsis
First, find the largest value in your array. Create an array whose last index is equal to that value, e.g. a max value of 15 means an array of size 16. Initialize the array with zeros; this is important for what we are about to do. So with the given array, create another array with these directions:
A  6 3 9 10 15 6 8 12 3 6
0 1 2 3 4 5 6 7 8 9
15 is max value:
C  0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Next, scan the array and increment the count at each value’s index, e.g.:
A  6 3 9 10 15 6 8 12 3 6
0 1 2 3 4 5 6 7 8 9
C  0 0 0 2 0 0 3 0 1 1 1 1 0 0 0 1
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Notice that there are three 6 values in our A array, and only one 8. We record each count. Next, we clear the A array and scan the C array, copying each value back the number of times we counted it:
A  3 3 6 6 6 8 9 10 12 15
0 1 2 3 4 5 6 7 8 9
The time complexity for count sort is $O(n)$! But the space complexity is large, being $O(k)$, where $k$ is the largest value in our set. For small values, this sorting method is incredibly good, but for large value ranges it can be very memory intensive.
Implementation
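A minimal sketch of the counting steps above, assuming non-negative integer input:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Tally how many times each value occurs, then rewrite the array by
// walking the counts in order. Space is O(k) for max value k.
void CountSort(std::vector<int> &a) {
    if (a.empty()) return;
    int maxVal = *std::max_element(a.begin(), a.end());
    std::vector<int> count(maxVal + 1, 0);   // initialized with zeros
    for (int x : a) ++count[x];              // record each occurrence
    std::size_t i = 0;
    for (int v = 0; v <= maxVal; ++v)
        while (count[v]-- > 0) a[i++] = v;   // copy back count[v] times
}
```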


Bucket Sort
Synopsis
Bucket Sort is very similar to Count Sort, especially in the initial step, where we create an array the size of the max value. Each slot is initialized to null. The array is an array of linked lists: the “buckets” are linked lists, so at L[6], where L is the array of linked lists, there is a linked list holding every element equal to 6.
Implementation
Include the nodes:


Radix Sort
Synopsis
Radix sort also works on the bin principle, but instead of taking the max value as the size, it uses the base. So for the decimal system, base 10, we have 10 bins, 0-9. A value such as 278 goes in bin 8, its last digit. That alone isn’t sorted, so we empty the bins back out in order, set the bins to null again, and then distribute by the second digit, e.g. 257 goes in the 5 bin. Do this again for each digit place, and you’ll get a sorted array:
Implementation


Shell Sort
Synopsis
The grand finale. Quite honestly, this sort is definitely the weirdest, and probably the most complex. The basic principle is a modified insertion sort, where we form subarrays based on gaps, e.g.
A  2 1 3 5 6
0 1 2 3 4
Gap = 5 / 2 = 2
First gap set:
A  2 1 3 5 6
0 1 2 3 4
* * *
[2, 3, 6]
We perform insertion sort on each gap set, starting from the first pair of each gap and moving rightward; the gap is then halved and the process repeats until the gap is 1.
Implementation
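A minimal sketch using the halving gap sequence from the example:

```cpp
// Insertion sort over interleaved subarrays whose elements are gap
// apart, halving the gap until it reaches 1.
void ShellSort(int a[], int n) {
    for (int gap = n / 2; gap >= 1; gap /= 2) {
        for (int i = gap; i < n; ++i) {
            int x = a[i];
            int j = i - gap;
            while (j >= 0 && a[j] > x) {
                a[j + gap] = a[j];   // shift within the gap subarray
                j -= gap;
            }
            a[j + gap] = x;
        }
    }
}
```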


Hashing
So far we have looked at two searching algorithms, the Linear Search with a time complexity of $O(n)$ and the Binary Search with a time complexity of $O(\log n)$. These algorithms aren’t bad by any means, and in the case of Binary Search, relatively efficient. But, we can go further, we can do searching in $O(1)$. This is done with Hashing.
Ideal Hashing
Ideal Hashing is the most straightforward hashing method. There is a one-to-one correspondence between key value and key location. So, imagine we have this set of keys:
[1, 12, 3, 7, 10, 15]
Then, the hashtable will look like this:
/ 1 / 3 / / / 7 / / 10 / 12 / / 15
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Where there isn’t a value, it is null; otherwise the hashtable returns the value at that index. Thus, the mathematical representation for this hashtable is $h(x) = x$. This is cool, but what if one of our keys is 100? 1000? That would make so much wasted space. This is the largest drawback of ideal hashing: its memory footprint is massive.
Modulus Hashing
Instead of just returning $x$, let’s use the modulus operator: $h(x) = x \bmod 10$. This means that the space taken up by the hashtable is indices 0-9, 10 spaces. The drawback is that 15 % 10 = 5 and 25 % 10 = 5, which means that if both of these keys are present, there is a collision. It is not one-to-one, it is many-to-one. So we have to resolve these collisions.
Open Hashing (Chaining)
Remember, the modulus hash function is $h(x) = x \bmod 10$, but we are running into the problem of conflicts. An intuitive idea is to use a linked list for conflicts: we create an array of nodes, and each node heads a linked list of the elements that hash there. So into slot 5 we can shovel any value that ends with 5, and if we want to search for 25, we just search the linked list belonging to the 5th slot. So, what is the time complexity? First we have to determine what is called the loading factor, $\lambda = \frac{n}{\text{size}}$. The size of the hashtable is 10, so for example if we have 100 keys, then our loading factor $\lambda$ is equal to 10. The average successful search time complexity is $t = 1 + \frac{\lambda}{2}$. The average unsuccessful search time complexity is $t = 1 + \lambda$.
Implementation


Closed Hashing
Closed Hashing is the idea that the hashtable doesn’t take up any more memory than the space set out for it. With chaining, we grow outside of that space.
Linear Probing
Linear Probing is a way to resolve collisions in a finite space. To do this, we use our hash function $h(x) = x \bmod 10$, and a new function, $h'(x) = (h(x) + f(i)) \bmod 10$, where $f(i) = i$. So for instance, we have the following hashtable:
H  30 / / 23 / 45 26 / / /
0 1 2 3 4 5 6 7 8 9
We want to insert the value 25, but oh no we have a collision. Let’s use $h’(x)$. $h’(x) = (h(25) + f(0)) \bmod 10 = (5 + 0) \bmod 10 = 5$. Well here, index 5 isn’t available, so let’s try again. $h’(x) = (h(25) + f(1)) \bmod 10 = (5 + 1) \bmod 10 = 6$. Well here, index 6 isn’t available, so let’s try again. $h’(x) = (h(25) + f(2)) \bmod 10 = (5 + 2) \bmod 10 = 7$. Well here, index 7 is available, so we insert our new key here.
Implementation
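A minimal sketch of linear probing on a table of size 10, where -1 marks an empty slot and we assume the table is not full; Hash and ProbeInsert are our names:

```cpp
const int SIZE = 10;

// h(x) = x mod 10
int Hash(int key) { return key % SIZE; }

// On a collision, try (h(x) + i) mod 10 for i = 1, 2, ...
// Assumes at least one free slot exists, otherwise this loops forever.
void ProbeInsert(int H[], int key) {
    int i = 0;
    while (H[(Hash(key) + i) % SIZE] != -1) ++i;
    H[(Hash(key) + i) % SIZE] = key;
}
```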


Quadratic Probing
Quadratic Probing is just like linear probing, but $h'(x)$ is redefined as $h'(x) = (h(x) + f(i)) \bmod 10$, where $f(i) = i^2$.
Implementation


Double Hashing
Double Hashing uses two hash functions to resolve conflicts. So, $h_1(x) = x \bmod 10$ and $h_2(x) = R - (x \bmod R)$, where $R$ is the nearest prime number below the size of our hashtable. So for a size of 10, $R = 7$, so $h_2(x) = 7 - (x \bmod 7)$. $h_2$ never returns 0, and it covers all of the indices. Now, $h'(x) = (h_1(x) + i \cdot h_2(x)) \bmod 10$, where $i = 0, 1, 2, \ldots$
Implementation

