Category Archives: algorithms

algorithms and data structures

Algorithm to add a number to every element in an array

  • https://www.hackerrank.com/challenges/crush/forum/comments/255515
  • https://www.hackerrank.com/challenges/crush/forum
  • https://www.hackerrank.com/challenges/crush/problem

Problem

Given an array of ten elements [0,0,0,0,0,0,0,0,0,0].

Let’s say we apply an update to a certain range of indexes, say from a (index 2) to b (index 6), adding k (10) to every element in that range.

so

[0,0,0+10,0+10,0+10,0+10,0+10,0,0,0]

Let’s denote the number of elements from a to b as m; one such update takes O(m) time.
If we were to apply n updates, we’d end up with n * O(m), or O(m*n).
If m and n are huge, we’d have a performance problem, as it gets close to O(n^2).

How to Solve

What we’re trying to do is express the idea of adding a number k from a to b with a better running time. Our tactic involves two steps:

1) Assign indices O(x)
2) Calculate sum O(y)

Total running time is O(x+y)

Step One

Use +k and -k at indexes to denote the change

[0, 0, 0(+10), 0, 0, 0, 0(-10), 0, 0, 0]

With our original array, we use +k at index 2 and -k at index (5+1) to show that we want to add k (10) to all the numbers from index 2 to 5.

Assigning +k and -k is two operations, which is O(1).

But since there are x updates, we need to do this assignment for every update,
so the running time will be O(x) for x updates.

Step Two

Once all the indices have been assigned, we iterate and use a summation to assign all the numbers.

Starting with sum = 0, we accumulate a running sum as we go:

a[0] = 0;
a[1] = a[1] + a[0] = 0 + 0 = 0;
a[2] = a[2] + a[1] = 10 (k) + 0 = 10
a[3] = a[3] + a[2] = 0 + 10 = 10
a[4] = a[4] + a[3] = 0 + 10 = 10
a[5] = a[5] + a[4] = 0 + 10 = 10
a[6] = a[6] + a[5] = -10 (-k) + 10 = 0
a[7] = a[7] + a[6] = 0 + 0 = 0
a[8] = a[8] + a[7] = 0 + 0 = 0
a[9] = a[9] + a[8] = 0 + 0 = 0

[0, 0, 10, 10, 10, 10, 0, 0, 0, 0]

To do the sum, it’s O(y) where y is the number of elements, since we need to iterate from 0 to length – 1.

Thus, two steps of indices assignment at O(x) and then the sum at O(y) will give
O(x) + O(y) or O(x+y).

Example

One query:
[2, 5, 100]

initial array:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

step 1 – assign indices:
[0, 0, 100, 0, 0, 0, -100, 0, 0, 0]

step 2 – summation:

sum = 0,
a[0] = a[0] + sum = 0 + 0 = 0
a[1] = a[1] + a[0] = 0
a[2] = a[2] + a[1] = 100 + 0 = 100
a[3] = a[3] + a[2] = 0 + 100 = 100
a[4] = a[4] + a[3] = 0 + 100 = 100
a[5] = a[5] + a[4] = 0 + 100 = 100
a[6] = a[6] + a[5] = -100 + 100 = 0
a[7] = a[7] + a[6] = 0 + 0 = 0
a[8] = a[8] + a[7] = 0 + 0 = 0
a[9] = a[9] + a[8] = 0 + 0 = 0

[0, 0, 100, 100, 100, 100, 0, 0, 0, 0]

Two queries:
[2, 5, 100]
[3, 4, 50]

initial array:
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

Step 1 – assign indices:

[0, 0, 100, 0, 0, 0, -100, 0, 0, 0]
[0, 0, 0, 50, 0, -50, 0, 0, 0, 0]

We can add them together like so:

[0, 0, 100, 50, 0, -50, -100, 0, 0, 0]

Step 2 – summation:

sum = 0,
a[0] = a[0] + sum = 0 + 0 = 0
a[1] = a[1] + a[0] = 0
a[2] = a[2] + a[1] = 100 + 0 = 100
a[3] = a[3] + a[2] = 50 + 100 = 150
a[4] = a[4] + a[3] = 0 + 150 = 150
a[5] = a[5] + a[4] = -50 + 150 = 100
a[6] = a[6] + a[5] = -100 + 100 = 0
a[7] = a[7] + a[6] = 0 + 0 = 0
a[8] = a[8] + a[7] = 0 + 0 = 0
a[9] = a[9] + a[8] = 0 + 0 = 0

[0, 0, 100, 150, 150, 100, 0, 0, 0, 0]

Let’s verify by hand.

[0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
add 100 from 2 to 5
[0, 0, 100, 100, 100, 100, 0, 0, 0, 0]
add 50 from 3 to 4
[0, 0, 100, 150, 150, 100, 0, 0, 0, 0]

So our results match.

Let’s try one last example:

1) assign their indices first:

[0 0 0 100 0 -100 0 0 0] [3 4 100]
[0 0 100 0 0 0 -100 0 0] [2 5 100]
[0 100 0 -100 0 0 0 0 0] [1 2 100]

Because we have multiple operations, let’s group them together

[0 100 100 0 0 -100 -100 0 0]

2) calculate their summation

sum = 0,
a[0] = a[0] + sum = 0 + 0 = 0
a[1] = a[1] + a[0] = 100 + 0 = 100
a[2] = a[2] + a[1] = 100 + 100 = 200
a[3] = a[3] + a[2] = 0 + 200 = 200
a[4] = a[4] + a[3] = 0 + 200 = 200
a[5] = a[5] + a[4] = -100 + 200 = 100
a[6] = a[6] + a[5] = -100 + 100 = 0
a[7] = a[7] + a[6] = 0 + 0 = 0
a[8] = a[8] + a[7] = 0 + 0 = 0

[0 100 200 200 200 100 0 0 0]
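A minimal JavaScript sketch of the two-step approach, using 0-indexed [a, b, k] queries as in the examples above (the function name applyQueries is just illustrative):

function applyQueries(n, queries) {
  const arr = new Array(n).fill(0);

  // Step 1 – assign indices: +k at a, -k at b + 1, for every query. O(x)
  for (const [a, b, k] of queries) {
    arr[a] += k;
    if (b + 1 < n) arr[b + 1] -= k;
  }

  // Step 2 – summation: a running sum turns the markers into final values. O(y)
  for (let i = 1; i < n; i++) {
    arr[i] += arr[i - 1];
  }
  return arr;
}

console.log(applyQueries(10, [[2, 5, 100], [3, 4, 50]]));
// [0, 0, 100, 150, 150, 100, 0, 0, 0, 0]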

binary loop search

We use a loop to do binary search on a sorted array.

We start with two markers, one on each end: start and end.
We then calculate mid. We compare the value at index mid to the value we’re trying to insert.

If the value we want to insert is larger, we move start to mid + 1.
If the value we want to insert is smaller, we move end to mid – 1.

This way, we squeeze our start and end markers tighter and tighter.

Once start <= end becomes false (AKA start > end), mid is at the correct index. All we need to do is check whether the value to insert is larger or smaller than the value at index mid.

If larger, arr.splice(mid + 1, 0, value)
If smaller, arr.splice(mid, 0, value)

This works because splice(index, 0, value) inserts value at that index without deleting anything, pushing everything after it one position down.
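Here is a small sketch of the loop-based binary insert described above (insertSorted is a hypothetical name):

function insertSorted(arr, value) {
  let start = 0;
  let end = arr.length - 1;
  let mid = 0;

  // squeeze start and end tighter and tighter until they cross
  while (start <= end) {
    mid = Math.floor((start + end) / 2);
    if (arr[mid] < value) {
      start = mid + 1; // value belongs to the right of mid
    } else {
      end = mid - 1;   // value belongs to the left of mid
    }
  }

  // start > end now; insert on the correct side of mid
  if (arr.length === 0 || value > arr[mid]) {
    arr.splice(mid + 1, 0, value);
  } else {
    arr.splice(mid, 0, value);
  }
  return arr;
}

console.log(insertSorted([1, 3, 5, 9], 6)); // [1, 3, 5, 6, 9]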

Prim’s Algorithm

ref –

  • https://www.geeksforgeeks.org/travelling-salesman-problem-set-2-approximate-using-mst/
  • https://iq.opengenus.org/graph-representation-adjacency-matrix-adjacency-list/
  • https://www.gatevidyalay.com/prims-algorithm-prim-algorithm-example/
  • source – https://github.com/redmacdev1988/PrimsAlgorithm
  • https://www.tutorialspoint.com/Prim-s-algorithm-in-Javascript

What it solves:

Traveling Salesman Problem (Approximate using Minimum Spanning Tree)

Algorithm:
1) Let 1 be the starting and ending point for salesman.
2) Construct MST with 1 as root using Prim’s Algorithm.
3) List vertices visited in preorder walk of the constructed MST, and add 1 at the end.

Let us consider the following example. The first diagram is the given graph. The second diagram shows MST constructed with 1 as root. The preorder traversal of MST is 1-2-4-3. Adding 1 at the end gives 1-2-4-3-1 which is the output of this algorithm.

Given a problem like so, we use Prim’s algorithm to create a minimum spanning tree for a weighted undirected graph. It finds a subset of the edges that forms a tree, where the total weight of all the edges in the tree is minimized.

What is Minimum Spanning Tree?

Given a connected and undirected graph, a spanning tree of that graph is a subgraph that is a tree and connects all the vertices together. A single graph can have many different spanning trees. A minimum spanning tree (MST) or minimum weight spanning tree for a weighted, connected and undirected graph is a spanning tree with weight less than or equal to the weight of every other spanning tree. The weight of a spanning tree is the sum of weights given to each edge of the spanning tree.

Prim’s Algorithm

Given Problem

Adjacency List

We create an adjacency list. From the diagram, we see that A connects to B and C. So we create key A, and then have it reference a Linked List with nodes B and C.

We see B connects to C, E, and D. So in our adjacency list, we add key B, and have it reference a Linked List with nodes C, E, and D. Repeat until all edges and vertices have been added.

A -> [B] -> [C]
B -> [A] -> [C] -> [E] -> [D]
C -> [A] -> [B] -> [E]
D -> [B] -> [E] -> [F]
E -> [D] -> [F]

Picking only the minimum path and getting rid of the rest is accomplished by:

  • minimum priority queue
  • visited dictionary to keep track of all visited nodes

The Minimum Queue

We use the minimum priority queue to keep all the edges from nodes that we have visited. We group these edges into this queue, then take the minimum of it. That is how this algorithm picks the minimum path to travel.

Start

We start at node A so we draw it.

We need to process this node A in our min priority queue so we should add it to initialize our queue.


Min Queue: [A]
Visited: {}

Now that the queue has an item, we can start processing it.

We pop A from the queue to process it.


Min Queue: []
Visited: {}

Has it been visited? No, so we put it in Visited dictionary.
We see from the adjacency list that A connects to B (with distance of 2) and C (with distance of 3).
So we push B and C into our queue.


Min Queue: [ B(A, 2), C(A, 3) ]
Visited: { “A” }

This means that the only edges from A go to B and C. And from this group of edges, we use the min priority queue to get the absolute minimum path (which is 2), at node B.

Hence pop B from the queue.


Min Queue: [ C(A, 3) ]
Visited: { “A” }

Has B been visited? We look at the Visited dictionary. No. So we then add B to the visited dictionary. We draw our result:


So we get all its edges from the adjacency list. We see that its nodes are D(B, 3), C(B, 5), and E(B, 4).


Min Queue: [ C(A, 3), D(B, 3), C(B, 5), E(B, 4) ]
Visited: { “A”, “B” }

Now we see that the number of available paths has increased to four. We then get the minimum path, which is C(A, 3). Has C been visited? Nope, so we add it to the Visited dictionary. We draw the path in our diagram.

Then we process it by getting all its edges from the adjacency list. We see that it has node E(C, 4). We push it onto the Min Queue.
Now, we have 4 available paths from nodes A, B, and C.


Min Queue: [ D(B, 3), C(B, 5), E(B, 4), E(C, 4) ]
Visited: { “A”, “B”, “C” }

We get minimum of those paths which is D(B,3). Has D been visited? Nope, we add it to the visited dictionary. We draw the path in our diagram.

Then we process D by getting all its edges from the adjacency list. We see that it has nodes E(D, 2) and F(D, 3). We push them onto the Min Queue.


Min Queue: [ C(B, 5), E(B, 4), E(C, 4), E(D, 2), F(D,3) ]
Visited: { “A”, “B”, “C”, “D” }

We pop the minimum of the priority queue: E(D, 2). Has E been visited? Nope, we add it to the visited dictionary. We draw the path in our diagram.

Then we process E by getting all its edges from the adjacency list. We see that it has node F(E, 5). We push it onto the Min Queue.

Hence at this point, we have 5 available edges from our visited nodes.


Min Queue: [ C(B, 5), E(B, 4), E(C, 4), F(D, 3), F(E, 5) ]
Visited: { “A”, “B”, “C”, “D”, “E” }

We pop the minimum which is F(D, 3), and process it. Has F been visited? Nope, we add it to the visited dictionary. We update our answer diagram:

From our adjacency list, we see that F does not have a Linked list of edges. Hence, we DO NOT PUSH ANYTHING onto our Priority Queue:


Min Queue: [ C(B, 5), E(B, 4), E(C, 4), F(E, 5) ]
Visited: { “A”, “B”, “C”, “D”, “E”, “F” }

We keep processing. We pop from the minimum queue: E(B, 4) . Has it been visited? Yes, so we skip and continue dequeueing.


Min Queue: [ C(B, 5), E(C, 4), F(E, 5) ]
Visited: { “A”, “B”, “C”, “D”, “E”, “F” }

We pop from the minimum queue: E(C, 4) . Has it been visited? Yes, so we skip and continue dequeueing.


Min Queue: [ C(B, 5), F(E, 5) ]
Visited: { “A”, “B”, “C”, “D”, “E”, “F” }

We pop from the minimum queue: C(B, 5) . Has it been visited? Yes, so we skip and continue dequeueing.


Min Queue: [ F(E, 5) ]
Visited: { “A”, “B”, “C”, “D”, “E”, “F” }

We pop from the minimum queue: F(E, 5). Has F been visited? Yes, so we skip it and continue.

Now our queue is empty and we’re done.

The answer is the minimum spanning tree with edges A–B (2), A–C (3), B–D (3), D–E (2), and D–F (3), for a total weight of 13.
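The full implementation is in the source link above; the sketch below is just a compact illustration of the walkthrough, using a plain array that we scan for the minimum in place of a real min priority queue:

function primMST(adjList, startKey) {
  const visited = {};   // visited dictionary
  const queue = [];     // stands in for the min priority queue
  const mstEdges = [];

  queue.push({ node: startKey, from: null, weight: 0 });

  while (queue.length > 0) {
    // extract the entry with the minimum edge weight
    let minIndex = 0;
    for (let i = 1; i < queue.length; i++) {
      if (queue[i].weight < queue[minIndex].weight) minIndex = i;
    }
    const { node, from, weight } = queue.splice(minIndex, 1)[0];

    if (visited[node]) continue;   // already visited: skip and keep dequeueing
    visited[node] = true;
    if (from !== null) mstEdges.push({ from, to: node, weight });

    // push every edge from this node that leads to an unvisited node
    for (const edge of adjList[node] || []) {
      if (!visited[edge.node]) {
        queue.push({ node: edge.node, from: node, weight: edge.weight });
      }
    }
  }
  return mstEdges;
}

// the graph from the walkthrough above
const graph = {
  A: [{ node: 'B', weight: 2 }, { node: 'C', weight: 3 }],
  B: [{ node: 'A', weight: 2 }, { node: 'C', weight: 5 }, { node: 'E', weight: 4 }, { node: 'D', weight: 3 }],
  C: [{ node: 'A', weight: 3 }, { node: 'B', weight: 5 }, { node: 'E', weight: 4 }],
  D: [{ node: 'B', weight: 3 }, { node: 'E', weight: 2 }, { node: 'F', weight: 3 }],
  E: [{ node: 'D', weight: 2 }, { node: 'F', weight: 5 }],
  F: [],
};
console.log(primMST(graph, 'A'));
// edges A-B(2), A-C(3), B-D(3), D-E(2), D-F(3), total weight 13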

Dijkstra’s Algorithm

ref – https://www.tutorialspoint.com/Dijkstra-s-algorithm-in-Javascript

Dijkstra’s Algorithm is used for solving single source shortest path problems. In this algorithm, a single node is fixed as a source node and the shortest paths from this node to all other nodes in the graph are found.

A minimum priority queue is the most commonly used data structure for implementing Dijkstra’s Algorithm, because the operation it needs over and over (extract the node with the smallest tentative distance) is exactly what a minimum priority queue provides.

Time complexity of Dijkstra’s algorithm is O(N^2) when implemented with doubly nested loops over a distance table; the exact bound depends on how the table is manipulated (a binary min-heap implementation brings it down to O((V + E) log V)).

Dijkstra’s Algorithm cannot be applied to graphs with negative edge weights, because once a node’s cost from the source is settled it is assumed final, and a negative edge found later could invalidate that assumption.

Given nodes: A B C D E F G
and their edges:

“A”, “C”, 100
“A”, “B”, 3
“A”, “D”, 4
“D”, “C”, 3
“D”, “E”, 8
“E”, “F”, 10
“B”, “G”, 9
“E”, “G”, 50

We first draw a node graph diagram of it. Keep in mind that in our adjacency list, if two nodes each have an arrow pointing to the other, the edge is undirected. If the arrow only goes from one to the other, it’s directed. So an undirected graph stores double the number of edge entries of a directed one.

In our case, it is a directed graph.

Then we set up the node/distance/prev chart.
The distance means the shortest distance from the starting node to that node. Right now, we initialize all of them to Infinity.
The prev means the previous node that this node connects to on that shortest path. We initialize it to null.

Since our starting node is A, our distance is 0, and prev is A.

node distance prev
A 0 A
B Inf null
C Inf null
D Inf null
E Inf null
F Inf null
G Inf null

We then draw the adjacency list to keep track of all the neighbors of a specific node, as given by the edges.
Each edge creates a neighbor entry for a node, and we must diagram this in order to keep track.

Finally, we have a queue to get the minimum distance, as we shall see later.

Start

In our code, we create an instance of our Dijkstra’s Algorithm.

We have the queue ready, so we first push A onto the queue.
queue – [A]

We start by popping the A off the queue and processing it.
queue – []

We look at the unvisited neighbors of A.
Namely, they are C, B, and D.

Distance from A to C is 100. On our chart, the distance of A to C is Infinity. 100 < Infinity, so we replace Infinity with 100. Then we push C onto the queue.

queue – [C]

node distance prev
A 0 A
B Inf null
C 100 A

Distance from A to B is 3. On the chart, A to B’s distance is Infinity. Is 3 < Infinity? Yes! So we replace the values in the chart, then push node B onto the queue:

queue – [B, C]

node distance prev
A 0 A
B 3 A
C 100 A

Distance from A to D is 4. On the chart, A to D’s distance is Infinity. Is 4 < Infinity? Yes! So we replace it on the chart, and push node D onto the queue:

queue – [B, D, C]

node distance prev
A 0 A
B 3 A
C 100 A
D 4 A

We’ve processed all of node A’s neighbors (its Linked List). Once that’s done, since the queue is a minimum priority queue, we pop it to get the minimum-distance node, which is B.

Queue – [D, C]

B with priority of 3

At B, we process all its unvisited neighbors. In the Adjacency list, B is connected to G. B to G is 9. Distance from starting point A to B is 3.
So 3 + 9 = 12. Is 12 < Infinity? Yes, so let’s update the table.

node distance prev
A 0 A
B 3 A
C 100 A
D 4 A
G 12 B

Then we push G onto the queue.

Queue – [D, G, C]

Processing D, priority 4

Because D has the minimum distance at 4, we pop it from the queue.

Queue – [G, C]

D has edges to C and E.

Distance from D to C is 3. Distance from start to D is 4. Distance from start to C = 4 + 3 = 7. Is 7 < 100 (the current chart value for C)? Yes, so let’s update it. Insert C with priority 7 into the queue. Notice we already have C at priority 100 in the queue; we simply replace the 100 with 7, because we have found a shorter distance.

node distance prev
A 0 A
B 3 A
C 7 D
D 4 A
G 12 B

We then look at neighbor E. From D to E is 8. From start to D is 4. From start to E is 4 + 8 = 12. Is 12 < Infinity ? Yes, update table. Then push E with priority 12 onto the queue.

node distance prev
A 0 A
B 3 A
C 7 D
D 4 A
E 12 D
G 12 B

so we’re finished running through all edges for ‘D’.

Queue – [C(7), E(12), G(12)]

Process C with priority of 7

We pop C because it has the minimum priority.

Queue – [E(12), G(12)]

C has no edges in our Adjacency list, so we skip.

Process G with priority of 12

Queue – [E(12)]

Then we process G with priority of 12. G has no edges in our Adjacency list, so we skip.

Process E with priority of 12

So we then process node E. It has edges F and G.

E to F is 10. The min distance from start point A to E is 12.
Is 12 + 10 = 22 smaller than what’s on our chart (Infinity)? Yes! So we update the chart.

node distance prev
A 0 A
B 3 A
C 7 D
D 4 A
E 12 D
F 22 E
G 12 B

Then push F onto the queue.

Queue – [F(22)].

Let’s continue with E’s other neighbor, G.

E to G is 50. Minimum distance from start to E is 12.
Start to G is 50 + 12 = 62. From our Chart, A to G has 12. Is 62 < 12? No! So we don't update the chart. We do not push G onto the queue. We then pop the queue and continue...

Processing F at priority 22

No edges for F. The queue is empty. We’re now done.

The final answer is:

node distance prev
A 0 A
B 3 A
C 7 D
D 4 A
E 12 D
F 22 E
G 12 B

where the shortest distance from A to each node is listed under the distance column.

For example, say I want to know the shortest distance from A to G. It is 12, by way of A, B, G. The 12 is the total distance from A to G, and G’s previous is B. We look at B and its previous is A. So we know that the path goes A -> B -> G.

I want to know the shortest distance from A to C. Looking at the chart, A to C has a min distance of 7, by way of A, D, C.
C’s previous is D. And D’s previous is A.

Code Explanation

So we have a bunch of dictionaries to represent our chart data. We have everything from distance, prev, and nodes. We have _edges to represent the adjacency list, and _visited to show which ones have been visited.

We set up a minimum priority queue.

And initialize everything first. The start node’s distance is always 0.
We set the other distances to Infinity.
We set the previous nodes to null.
And finally, push the start node onto the queue.

So the basic idea here is that we start the algorithm by dequeueing the starting node.
We continue dequeueing from the priority queue until it is empty.

The point of processing each node as they come in is that we get the minimum priority (distance) first.
Then as we process them, we update the chart. We are done when the queue is empty.

As each node with the minimum priority (distance) comes off the queue, we process their edge list. This can be visually seen by the adjacency list. In our case, it is the _edges dictionary.

So if an edge has not been visited, we process it.
We then get the minimum distance from the start to the current node, and then add the distance of the current edge.
If it’s less than what we have in the chart, we update the chart.

This is to say that we found a shorter way. In this case, we push the current node into the queue along with its distance as the priority.
Due to the queue being a minimum priority queue, it will always process the shortest distance nodes.

If it’s not, then this way is longer, and we don’t care for it.
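As a compact illustration of the idea just described (not the post’s actual source, which is listed below), here is a sketch that scans a plain array for the minimum instead of using a real min priority queue:

function dijkstra(edgeList, start) {
  // build a directed adjacency list from [from, to, weight] triples
  const adj = {};
  for (const [from, to, weight] of edgeList) {
    if (!adj[from]) adj[from] = [];
    adj[from].push({ to, weight });
  }

  const distance = { [start]: 0 };  // shortest known distance from start
  const prev = { [start]: start };  // previous node on that shortest path
  const visited = {};
  const queue = [{ node: start, priority: 0 }];

  while (queue.length > 0) {
    // pop the entry with the minimum priority (distance)
    let minIndex = 0;
    for (let i = 1; i < queue.length; i++) {
      if (queue[i].priority < queue[minIndex].priority) minIndex = i;
    }
    const { node } = queue.splice(minIndex, 1)[0];
    if (visited[node]) continue;
    visited[node] = true;

    for (const { to, weight } of adj[node] || []) {
      if (visited[to]) continue;
      const candidate = distance[node] + weight;
      if (candidate < (distance[to] ?? Infinity)) {
        // found a shorter way: update the chart and enqueue it
        distance[to] = candidate;
        prev[to] = node;
        queue.push({ node: to, priority: candidate });
      }
    }
  }
  return { distance, prev };
}

// the example from the walkthrough
console.log(dijkstra([
  ['A', 'C', 100], ['A', 'B', 3], ['A', 'D', 4], ['D', 'C', 3],
  ['D', 'E', 8], ['E', 'F', 10], ['B', 'G', 9], ['E', 'G', 50],
], 'A').distance);
// { A: 0, B: 3, C: 7, D: 4, E: 12, F: 22, G: 12 }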

Source Code

Node, Linked List and Minimum Priority Queue

Dijkstra’s Algorithm

Graphs Problems (networks, routes/paths, scheduling…etc)

ref –

  • https://www.geeksforgeeks.org/depth-first-search-or-dfs-for-a-graph/
  • https://www.geeksforgeeks.org/breadth-first-search-or-bfs-for-a-graph/
  • https://stackoverflow.com/questions/14720691/whats-the-purpose-of-bfs-and-dfs

BFS and DFS are graph search algorithms that can be used for a variety of different purposes. Many problems in computer science can be thought of in terms of graphs. For example, analyzing networks, mapping routes, and scheduling are graph problems. Graph search algorithms like breadth-first search are useful for analyzing and solving graph problems.

One common application of the two search techniques is to identify all nodes that are reachable from a given starting node. For example, suppose that you have a collection of computers that each are networked to a handful of other computers. By running a BFS or DFS from a given node, you will discover all other computers in the network that the original computer is capable of directly or indirectly talking to. These are the computers that come back marked.

BFS specifically can be used to find the shortest path between two nodes in an unweighted graph. Suppose, for example, that you want to send a packet from one computer in a network to another, and that the computers aren’t directly connected to one another. Along what route should you send the packet to get it to arrive at the destination as quickly as possible? If you run a BFS and at each iteration have each node store a pointer to its “parent” node, you will end up finding a route from the start node to each other node in the graph that minimizes the number of links that have to be traversed to reach the destination computer.

DFS is often used as a subroutine in more complex algorithms. For example, Tarjan’s algorithm for computing strongly-connected components is based on depth-first search. Many optimizing compiler techniques run a DFS over an appropriately-constructed graph in order to determine in which order to apply a specific series of operations. Depth-first search can also be used in maze generation: by taking a grid of nodes and linking each node to its neighbors, you can construct a graph representing a grid. Running a random depth-first search over this graph then produces a maze that has exactly one solution.

This is by no means an exhaustive list. These algorithms have all sorts of applications, and as you start to explore more advanced algorithms you will often find yourself relying on DFS and BFS as building blocks. It’s similar to sorting – sorting by itself isn’t all that interesting, but being able to sort a list of values is enormously useful as a subroutine in more complex algorithms.

We use graphs to represent connected nodes.

A graph consists of nodes and edges.

Adjacent – two nodes are adjacent if they are directly connected. They are neighbors.

If it’s a directed graph, and John has an arrow towards Bob, then John is adjacent to Bob. But the opposite is not necessarily true.

Edges can also have a weight. If two people interact a lot, they can have a higher weighted edge.

A common problem is finding the shortest path between two nodes.

Adjacency List

An adjacency list is a collection of unordered lists used to represent a finite graph.

Each list describes the set of neighbors of a vertex in the graph.

Let’s do Depth First Search on our example graph

The idea is to go forward in depth while there is any such possibility. If not then backtrack.

Whenever we push an unvisited node, we print it.

Before we start, there are two assumptions:

– we start at node C
– nodes are inserted in alphabetical order.

We use a stack.

So we start at C. We push C onto our stack, and print it.

stack: C
output: C

C is connected to three nodes, A, B, and D. We do this alphabetically so we push A onto the stack, and print it.

stack: C A
output: C A

At A, we have two nodes, B and E. We push B onto our stack, and print it.

stack: C A B
output: C A B

B has one neighbor E. So we push E onto the stack, and print it.

stack: C A B E
output: C A B E

Since E has no connecting neighbors, we pop it.

stack: C A B
output: C A B E

We are now at B. B’s only neighbor E has already been visited, so we pop B.

stack: C A
output: C A B E

A’s neighbors B and E have all been visited. So we pop A.

stack: C
output: C A B E

Now we’re at C.

C’s neighbors A and B have been visited. But D is unvisited, so we push D onto the stack, then print it.

stack: C D
output: C A B E D

D has no unvisited nodes so we pop it.

stack: C
output: C A B E D

C has no unvisited nodes so we pop.

stack:
output: C A B E D

Stack is empty so we’re done.

Source Code

We create a list that has a tail pointing to the latest item we insert. That way, everything stays in order. Whatever you add first will be processed first, as head references it. Tail will always reference the very last item you add.

Create a Graph with x number of nodes.

We use recursion on the local stack frame in order to simulate a stack.

Each list that is referenced by an Adj[key] is pushed onto the stack. As we process each node down this list, we get more keys, say key2.

We then push list referenced by Adj[key2] onto the local stack frame by recursively calling the function. We keep doing this until we exhaust this list.

Then the function gets popped off the local stack frame.

We look at our graph, and add the neighbors that are connected to our node. We always do this before we run the DFS function.

After adding all the nodes, and specifying how many nodes we have in the constructor, we are ready to call this function.

print our adjacency list and the lists that they contain

print the output array
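The post’s actual source uses a Graph class with linked lists; the sketch below shows the same recursive idea with plain arrays as the adjacency lists, and the graph object is reconstructed from the walkthrough above (names are illustrative):

function depthFirstSearch(adj, startKey) {
  const visited = {};
  const output = [];

  function dfs(key) {
    visited[key] = true;
    output.push(key);   // "print" the node when we first reach it
    for (const neighbor of adj[key] || []) {
      if (!visited[neighbor]) dfs(neighbor);  // the call stack acts as our stack
    }
  }

  dfs(startKey);
  return output;
}

// graph from the walkthrough, neighbors in alphabetical order
const adjacency = {
  C: ['A', 'B', 'D'],
  A: ['B', 'E'],
  B: ['E'],
  D: ['E'],
  E: [],
};
console.log(depthFirstSearch(adjacency, 'C').join(' ')); // C A B E D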

Breadth First Search on the graph

The trick is that you visit a dequeued node. If that node is NOT visited, we add its neighbors to the queue. If that node HAS ALREADY BEEN visited, we keep dequeueing.

We start at C, so we add C to the queue.

queue: [C]

We remove C from the queue and visit it.

output: C
queue: []

We now must add C’s unvisited neighbors to the queue (in alphabetical order).

output: C
queue: [A, B, D]

We visit A, and remove it from the queue.

output: C A
queue: [B, D]

Since we visited A, we must add A’s unvisited neighbors to the queue. Those neighbors are B and E.

output: C A
queue: [B, D, B, E]

We visit B, and dequeue it.

output: C A B
queue: [D, B, E]

Since we visited B, let’s look at its neighbors. It has one neighbor E. Let’s add it to the queue.

output: C A B
queue: [D, B, E, E]

Now we visit D and print. Then dequeue it.

output: C A B D
queue: [B, E, E].

since we visited D, we must add its unvisited neighbors E.

output: C A B D
queue: [B, E, E, E]

Now we dequeue B, but we see that we have already visited it.

output: C A B D
queue: [E, E, E]

We then dequeue E. It has not been visited so we output it.

output: C A B D E
queue: [E, E]

E has no neighbors so we don’t add anything to the queue.

We then dequeue again. This E has already been visited, so we continue to dequeue.

output: C A B D E
queue: [E]

Finally, we dequeue the last E. It has already been printed. And it has no neighbors. The queue is now empty so we’re done.

output: C A B D E
queue: []

We first create a queue.

We use the Linked List from the previous example.

So with Breadth first traversal of the graph, the trick is that we start at the initial node.

We then take all the neighbors of that node and add them to the queue. Once we added all those neighbors, we pop the initial node, and then move on to process the next one in queue.

So as you can see, we give it an initial key to start from. In our case it is ‘c’. So fromNodeKey is ‘c’.
We then keep track of whether each node has been visited via a visited array.

Initially, we only have one node in the queue. So we start processing the while loop. We pop the ‘c’ node and then push all the nodes from its neighbors via an inner while loop.

In this inner loop, we check to see if the key has been processed by associative array access on the visited array. If it has not been visited, we mark it visited and then add it onto our queue to be processed later.
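A sketch of the breadth-first traversal, using a plain array as the queue and checking the visited flag on dequeue as in the walkthrough above (the graph object is the same reconstruction used in the DFS sketch):

function breadthFirstSearch(adj, fromNodeKey) {
  const visited = {};
  const output = [];
  const queue = [fromNodeKey];

  while (queue.length > 0) {
    const key = queue.shift();     // dequeue the front node
    if (visited[key]) continue;    // already visited: keep dequeueing
    visited[key] = true;
    output.push(key);              // visit (print) it
    for (const neighbor of adj[key] || []) {
      // duplicates may enter the queue; they get skipped when dequeued
      if (!visited[neighbor]) queue.push(neighbor);
    }
  }
  return output;
}

const adjacencyList = {
  C: ['A', 'B', 'D'],
  A: ['B', 'E'],
  B: ['E'],
  D: ['E'],
  E: [],
};
console.log(breadthFirstSearch(adjacencyList, 'C').join(' ')); // C A B D E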

Source Code

Depth First Graph Traversal

Breadth First Traversal for Graphs

Linked List used is same as above.

Mosh Trees Tutorial

Non Linear Data structures – trees

Tree is a data structure that stores element in a hierarchy.

Root – Top node of tree
Root is parent of its children.
Each node can have up to two children…this is called a binary tree.

Trees

– represent hierarchical data
– databases
– autocompletion
– compilers (syntax tree to parse expressions)
– compression (JPEG, MP3)

left < node < right

At every step of comparison, we throw away half of the tree. This is what we call Logarithmic Time.

Lookup
O(log n)

Inserting node
Finding a Node O(log n) + Inserting Node O(1) = O(log n)

Delete Node
O(log n)

Depth First Search (DFS)

pre-order – root, left, right (left of node)
in-order – left, root, right (bottom of node); visits a BST in ascending order
post-order – left, right, root (right of node)

Breadth First – Level by level (BFS)

Recursion

Iteration – loops.
Recursion – no loops.

Base condition terminates Recursion.

Depths and Height

Depths starts at the root

Depths of root is 0. As it goes deeper, it increases.
Depth of 10 is 1.
Depth of 21 is 2.
Depth of 8 is 3.

Depth – # of edges from root to target node.

Height is opposite of depth.

Height starts at the leaf

Height of leaves is 0.
As we go up, height increases. Longest path from target node to leaf.

Height of 6 is 1.

Height of 30 is 1.

Height of 10 is 2. We actually have 3 different paths. 3 to 10. 8 to 10. 21 to 10. Longest path is 2.

Height of 20 is 3, because it’s the longest path.

height = 1 + max(height(L), height(R))

Post-order traversal – visits children before their parent, so all leaves come first.

Find minimum value in BST

If it’s a standard BST, use recursion, and get leftmost leaf.

But if it’s NOT a BST, values are all over the place, so we have to visit every node and compare values.

Exercises

– height of tree. Return 1 + Math.max(height of left subtree, height of right subtree). That way, you get the height at the root.

– # of nodes at distance k. simple pre-order and print.

– isBST? pass in parameter min, max. Make sure node value stays within range. If not, false. Kick off with -Infinity and +Infinity.

Minimum – isBST? if yes, get leftmost. If not, then must have parameter reference and recurse down, do compare with every value.

Maximum – isBST? if yes, get rightmost. If not, then must have parameter reference and recurse down, do compare with every value.

– Are the two tree equal (Equality Check)?

– traverse BST (Level Order Traversal)

for i = 0 to height of tree
  let arr = getNodesAtDistance(i)
  for j = 0 to arr.length - 1
    log arr[j]
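A sketch of that pseudocode in JavaScript, assuming nodes shaped like { value, left, right } (hypothetical helper names):

function height(node) {
  if (node === null) return -1;  // empty subtree, so a leaf ends up with height 0
  return 1 + Math.max(height(node.left), height(node.right));
}

function getNodesAtDistance(node, k, out = []) {
  if (node === null) return out;
  if (k === 0) {
    out.push(node.value);        // distance reached: collect this node
    return out;
  }
  getNodesAtDistance(node.left, k - 1, out);
  getNodesAtDistance(node.right, k - 1, out);
  return out;
}

function traverseLevelOrder(root) {
  for (let i = 0; i <= height(root); i++) {
    for (const value of getNodesAtDistance(root, i)) {
      console.log(value);        // prints the tree level by level
    }
  }
}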

Additional exercises

1- Implement a method to calculate the size of a binary tree. Solution: Tree.size()

2- Implement a method to count the number of leaves in a binary tree. Solution: Tree.countLeaves()

4- Implement a method to check for the existence of a value in a binary tree using recursion. Compare this method with the find() method. The find() method does the same job using iteration. Solution: Tree.contains()

5- Implement a method to check to see if two values are siblings in a binary tree. Solution: Tree.areSibling()

6- Implement a method to return the ancestors of a value in a List. Solution: Tree.getAncestors()

SOLUTION

Test

HackerRank

Simple Array Sum

reads in input like so:

6 // the first # represents how many #s there are
1 2 3 4 10 11 // your output should be 31

use Array’s reduce.

output:
0 1
1 2
2 3
3 4
4 5
15

answer
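The embedded answer isn’t shown here; a minimal sketch with reduce might look like this:

function simpleArraySum(ar) {
  // reduce folds the array into a running sum, starting from 0
  return ar.reduce((sum, value) => sum + value, 0);
}

console.log(simpleArraySum([1, 2, 3, 4, 10, 11])); // 31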

Compare the Triplets

Alice and Bob each created one problem for HackerRank. A reviewer rates the two challenges, awarding points on a scale from 1 to 100 for three categories: problem clarity, originality, and difficulty.

The rating for Alice’s challenge is the triplet a = (a[0], a[1], a[2]), and the rating for Bob’s challenge is the triplet b = (b[0], b[1], b[2]).

The task is to find their comparison points by comparing a[0] with b[0], a[1] with b[1], and a[2] with b[2].

If a[i] > b[i], then Alice is awarded 1 point.
If a[i] < b[i], then Bob is awarded 1 point.
If a[i] = b[i], then neither person receives a point.

The comparison points are the total points a person has earned. Given a and b, determine their respective comparison points.

Sample Input 1

17 28 30
99 16 8

bob wins on first compare 17 vs 99
alice wins on second compare 28 vs 16
alice wins on third compare 30 vs 8

Thus output should be alice’s score first, then bob’s score:
2 1
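A minimal sketch (compareTriplets is the usual HackerRank function name):

function compareTriplets(a, b) {
  let alice = 0;
  let bob = 0;
  for (let i = 0; i < a.length; i++) {
    if (a[i] > b[i]) alice++;      // Alice wins this category
    else if (a[i] < b[i]) bob++;   // Bob wins this category
  }
  return [alice, bob];
}

console.log(compareTriplets([17, 28, 30], [99, 16, 8])); // [ 2, 1 ]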

Diagonal Difference

Given a square matrix, calculate the absolute difference between the sums of its diagonals.

input:
3
11 2 4
4 5 6
10 8 -12

so diagonal starting from top left and going down right: 11 + 5 + -12 = 4
diagonal starting from bottom left and going up right: 10 + 5 + 4 = 19

output:
|19-4| = 15

Keep in mind that the matrix can vary in length.
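A minimal sketch:

function diagonalDifference(arr) {
  const n = arr.length;            // the matrix can vary in length
  let primary = 0;                 // top left going down right
  let secondary = 0;               // bottom left going up right
  for (let i = 0; i < n; i++) {
    primary += arr[i][i];
    secondary += arr[n - 1 - i][i];
  }
  return Math.abs(primary - secondary);
}

console.log(diagonalDifference([[11, 2, 4], [4, 5, 6], [10, 8, -12]])); // 15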

Plus Minus

See how many
positives,
negatives, and
0s

there are in an array.
Calculate their ratio according to the array’s length.
Make sure the results are printed to 6 decimal places.

Sample Input

6
-4 3 -9 0 4 1

Sample Output

3, 4, 1, so three positives. 3/6 = 0.500000
-4, -9, so two negatives. 2/6 = 0.333333
0, so 1 zero. 1/6 = 0.166667

0.500000
0.333333
0.166667

For rounding to the n-th decimal place we use toFixed(n).
For the rest, we simply keep counters and increment the right one for each element.
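A minimal sketch of the counter approach:

function plusMinus(arr) {
  let pos = 0, neg = 0, zero = 0;
  for (const n of arr) {
    if (n > 0) pos++;
    else if (n < 0) neg++;
    else zero++;
  }
  const len = arr.length;
  console.log((pos / len).toFixed(6));   // toFixed(6) formats to six decimal places
  console.log((neg / len).toFixed(6));
  console.log((zero / len).toFixed(6));
}

plusMinus([-4, 3, -9, 0, 4, 1]);
// 0.500000
// 0.333333
// 0.166667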

Another solution: evaluate each comparison (n > 0, n < 0, n === 0) as a boolean and use the result to increment the matching counter (posCounter, negCounter, zeros).

Staircase

input:
4

output:

   #
  ##
 ###
####

Note the spaces in front: each row is right-aligned so the #s appear at the end.

Our solution is to use recursion.

The approach is:

print backwards

print forward
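A small recursive sketch (staircase is a hypothetical name; the post’s two printing variants aren’t reproduced here):

function staircase(n, row = 1) {
  if (row > n) return;           // base condition terminates the recursion
  // (n - row) spaces followed by row #s, so the #s are right-aligned
  console.log(' '.repeat(n - row) + '#'.repeat(row));
  staircase(n, row + 1);
}

staircase(4);
//    #
//   ##
//  ###
// ####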

Min Max Sum

Given five numbers, find the minimum and maximum values that can be calculated by summing exactly four of the five.
Then print the min sum and the max sum.

Sample Input

1 2 3 4 5
Sample Output

10 14
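A minimal sketch: sum everything once, then drop the largest element for the min and the smallest for the max.

function miniMaxSum(arr) {
  const total = arr.reduce((sum, n) => sum + n, 0);
  const min = total - Math.max(...arr);   // leave out the largest element
  const max = total - Math.min(...arr);   // leave out the smallest element
  console.log(min, max);
}

miniMaxSum([1, 2, 3, 4, 5]); // 10 14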

Workers and jobs

Given an array of orders [10, 20, 30], where each value is the number of robots that order takes to complete, we have a starting # of robots, say 60.
Calculate how many orders can be completed.

We send 10 to do the first one. 20 to do the second one. We have 30 robots left. The last job takes 30 robots so works out perfectly.
In this case, the answer is (3) where all three orders can be completed.

Let’s say we have an array of orders [20, 20, 40, 10]. We only have 50 robots this time. The return answers should be 3. Because we can only afford to do 3 jobs. Job 1, 2, and 4.
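A greedy sketch, assuming we may pick any subset of orders: complete the cheapest orders first (hypothetical names).

function completableOrders(orders, robots) {
  const sorted = [...orders].sort((a, b) => a - b);  // cheapest orders first
  let completed = 0;
  let remaining = robots;
  for (const cost of sorted) {
    if (cost > remaining) break;   // can't afford this order (or any bigger one)
    remaining -= cost;
    completed++;
  }
  return completed;
}

console.log(completableOrders([10, 20, 30], 60));     // 3
console.log(completableOrders([20, 20, 40, 10], 50)); // 3 (jobs 1, 2, and 4)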

Knuth Morris Pratt

Longest Prefix Suffix (lps)

Longest Prefix Suffix indicates how many matching prefix/suffix strings there are.

In other words, it tells us how many prefix matches we have at the pattern’s current character, which acts as a suffix.

For example, in the example below:

01234 (index)
abcya (pattern)
00001 (lps)

at index 4, we have lps of 1. The prefix is ‘a’. The suffix is ‘a’.
[a]bcy[a]

going further, lps[5] is 2, because the prefix is ‘ab’. The suffix is ‘ab’.
[ab]cy[ab]

012345 (index)
abcyab (pattern)
000012 (lps)

at index 6, we have 3.

0123456 (index)
abcyabc (pattern)
0000123 (lps)

The prefix is ‘abc’. The suffix is ‘abc’.
[abc]y[abc]

No match situation

012 (index)
abc (pattern)
000 (lps)

lps[0] defaults to 0 because there is no suffix to compare with a prefix.

Then when we get ab, the ‘a’ is the prefix, and the ‘b’ is the suffix.
We compare them, they do not match. Thus, lps[1] is 0.

Then we compare a with c. Where ‘a’ is the prefix, and the ‘c’ is the suffix.
They do not match, thus, lps[2] = 0.

How to calculate the LPS array

The source code is like so:

Let’s go through it on a pattern of a a b a a.

We start off with j = 0, i = 1. We need to have a parallel matching scheme. Thus, both indexes will be going down the array in parallel.
Note that lps[0] is always 0 because at index 0, we only have 1 character.

When going down the array in parallel, we come to 2 situation:

1) Match – we have a match on the characters at index i and j. In this case,

1) we move j forward (j++),
2) set lps[i] = j,
3) then move i forward (i++).

The i++, j++ is to move the parallel matching forward. The lps[i] = j is so that our lps takes on the j’s index. The reason why we want our lps array to have j as values is because index j is an indicator of how many matches we currently have. i is simply the index of the substring we are checking.

2) Mismatch –

  • mismatch where j != 0
  • In this case, we assign j to lps[j-1]. Then compare values at j and i again.

  • mismatch where j == 0
  • Here, j has reached back to 0. There is a mismatch at i and thus, lps[i] takes on 0.
    We then move i forward and start the parallel matching all over again.

We continue on in the main while loop

until i has reached the end of the pattern.

Thus, the final result is:

0 1 2 3 4 (index)
a a b a a (pattern)
0 1 0 1 2 (lps)
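A sketch of the LPS construction just described (the post’s embedded source may differ slightly in variable names):

function computeLPS(pattern) {
  const lps = new Array(pattern.length).fill(0); // lps[0] is always 0
  let j = 0;  // length of the current prefix match (and index of the next prefix char)
  let i = 1;  // index of the character we are treating as the end of a suffix

  while (i < pattern.length) {
    if (pattern[i] === pattern[j]) {
      j++;             // match: the prefix match grows
      lps[i] = j;      // record how many prefix characters match here
      i++;
    } else if (j !== 0) {
      j = lps[j - 1];  // mismatch with j != 0: fall back, then compare i and j again
    } else {
      lps[i] = 0;      // mismatch with j == 0: no prefix matches at i
      i++;
    }
  }
  return lps;
}

console.log(computeLPS('aabaa'));   // [0, 1, 0, 1, 2]
console.log(computeLPS('abcyabc')); // [0, 0, 0, 0, 1, 2, 3]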

The KMP matching algorithm

Now that we have the lps array, we can process the KMP algorithm.

First let’s set it up:

Basically, we have i as the index of text, and j as index of pattern.
If the character of text at index i matches character of pattern at index j, we simply move forward both i and j.

This way, we iterate down both arrays and compare characters in such fashion.

In our example, notice we move down and the characters match up like so: “a b c x a b c..” However, you should think of it as characters match taking place for prefix “abc”, then a substring “x”, then suffix “abc”. Eventually, we hit a mismatch at “x” and “z” at index 7.

A mismatch is when the characters of pattern and text are not equal, AND i is still within the text.length.

Then we check to see if pattern index j is at 0 or not.
If j is at 0, it means we are at the very beginning of the pattern and cannot go backwards anymore. Thus, we simply have text index i move forward to go on with the next match.

However, if j is not 0, it means, it needs to jump back to a certain point on the pattern. The number of steps to jump back is decided by the lps array at j-1.

Now, in the standard Naive way, j comes back to index 0 (the pattern’s beginning). i goes to index 1, then they start comparing again. This is too slow and cumbersome.

The secret to KMP is that we have a Prefix Suffix array that tells us how far back j should jump on the pattern array.

The LPS array will tell j which index to jump so that we don’t have to start from the beginning of the pattern.

The LPS array will have us jump back to a certain niche between two matching substrings, the prefix and suffix.

The reason why we can jump in between the suffix and prefix substring is because on the previous match, we ran through:

prefix + substring1 + suffix
“abc” + “x” + “abc”

The suffix of the previous match (“abc”) matches our current prefix pattern (“abc”). Thus, we always start at the beginning of the substring (“x”).

Let’s continue with our example.

In our previous match, the prefix is “abc”. The substring is “x”. The suffix is “abc”.

Hence, for our next match, we don’t need to match up the prefix again. We already know that the suffix of our previous run is already a match with our current pattern’s prefix of “abc”. Hence we skip that part entirely.

Instead, j just jumps straight to the “x”. In other words, the number of steps to jump back is decided by the lps array at j-1. That is basically how the lps array helps us save steps.

After our index j jumps to index 3, we continue matching.
text[7] = x, pattern[3] = x √
text[8] = a, pattern[4] = a √

until our index j finally runs to end of the pattern. This signifies that we have a complete pattern match.

We want to see where in the text the match happened, so we simply do i – j: at the point just past the last matched character (index i), subtract the pattern length (j) to get the index in text where the match starts. In our case, it is i(12) – j(8) = 4.
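A sketch of the matching loop, reusing computeLPS from the sketch in the previous section (names are illustrative):

function kmpSearch(text, pattern) {
  const lps = computeLPS(pattern);
  const matches = [];
  let i = 0;  // index into text
  let j = 0;  // index into pattern

  while (i < text.length) {
    if (text[i] === pattern[j]) {
      i++;
      j++;
      if (j === pattern.length) {
        matches.push(i - j);   // full match; it starts at i - j
        j = lps[j - 1];        // keep going to find further (possibly overlapping) matches
      }
    } else if (j !== 0) {
      j = lps[j - 1];          // jump j back using the lps array; i stays put
    } else {
      i++;                     // mismatch at the start of the pattern: just advance i
    }
  }
  return matches;
}

console.log(kmpSearch('aabaacaabaa', 'aabaa')); // [0, 6]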

Boyer Moore (good suffix heuristic)

source code – https://github.com/redmacdev1988/boyerMoore/blob/master/index.js

The idea of prefix and suffix borders

A border is a substring which is both proper suffix and proper prefix.

For example, in the string ‘ccacc’, ‘cc’ is a border because it appears at both ends of the string, but ‘cca’ is not a border.

In the string ‘cacc’, ‘c’ is a border because it appears at both ends of the string. ‘ca’ is not a border.

Let’s look at example where our pattern is aacaa, and an already calculated Suffix Border Index array.

At pattern index 0, we have string aacaa. Its border is aa.
Hence, the prefix border is ‘aa’ at [0][1].
The suffix border is ‘aa’ at [3][4].

We write the beginning index of the suffix border (3) into our border array at index 0.

Thus, for the border array at index 0, we write 3. This means that for the string at pattern index 0, the suffix border starts at index 3.

Let’s move on to pattern index 1. We have string acaa. The prefix border is ‘a’ at index 1. The suffix border ‘a’ at index 4.
Hence, for the border array at index 1, we write 4. This means that for the string at pattern index 1, the suffix border starts at index 4.

At pattern index 2, we have string ‘caa’. There is no border string. Thus, for the value in our border array, we simply write 5, which represents DNE.

At pattern index 3, we have ‘aa’. The prefix border is ‘a’ at pattern index 3. The suffix border is ‘a’ at pattern index 4. This means that for the string at pattern index 3, the suffix border starts at index 4.

Finally, we come to the last character of pattern index 4 ‘a’. It is only 1 character and does not have any border. So we simply assign the length of the pattern to represent DNE.

How to calculate the Suffix Border Index table

1) to start off, we get the length of the pattern. This value will represent DNE for all the pattern substrings that do not have borders.

2) The idea is that we process it going backwards using two indexes i and j.
We simply have i go backwards. For each i iteration, we get index j and process it to check if the character at i and character at j matches.

3) If it matches, we record a suffix border index. If it doesn’t match, we simply assign it as length of the pattern to designate the suffix index DNE.

  • i is index of the pattern substring.
  • j is index of the suffix border.
  • _borderPosition represents an array that contains all suffix border positions

Let’s look at it in detail:

Given pattern ‘aacaa’.

i = 5, j = 6.
border position for 5 is 6, because index 5 does not exist.

0 1 2 3 4 5 (index)
a a c a a X (pattern)
0 0 0 0 0 6 (suffix border index)

We then start going backwards for index i.

i is 5, j is 6

(5 > 0) √
(6 <= 5) X

our j index of 6 is larger than pattern length, so we skip.
decrement i, j to 4, 5.

_borderPosition[4] = 5. This means at pattern index 4, the border index is 5, which denotes DNE. It means the substring at pattern index 4 does not have a border.

0 1 2 3 4 5 (index)
a a c a a X (pattern)
0 0 0 0 5 6 (suffix border index)

i is 4, j is 5

(4 > 0) √
(5 <= 5) √

p[4-1] “a”
p[5-1] “a”

There’s a match so we don’t process index j anymore. We jump out of the loop and decrement i, j to 3, 4.

_borderPosition[3] = 4. This means at pattern index 3, which is substring ‘aa’, the suffix border index is 4.

0 1 2 3 4 5 (index)
a a c a a X (pattern)
0 0 0 4 5 6 (suffix border index)

i is 3, j is 4

(3 > 0) √
(4 <= 5) √

p[3-1] “c”
p[4-1] “a”

There’s no match so we keep processing index j.
We keep processing by going backwards on the _borderPosition array.

j = _borderPosition[4] = 5.

(3 > 0) √
(5 <= 5) √

p[3-1] “c”
p[5-1] “a”

j = _borderPosition[5] = 6.

(3 > 0) √
(6 <= 5) X

skip out of inner while loop
i--; j--; so i is now 2, j is now 5.
_borderPosition[2] = 5.

0 1 2 3 4 5 (index)
a a c a a X (pattern)
0 0 5 4 5 6 (suffix border index)

The substring ‘caa’ at index 2 does not have a suffix border. So we give it the value 5 to denote DNE.

i is 2, j is 5

(2 > 0) √
(5 <= 5) √

Notice j starts over again from the end. The reason we do this is so that we try to find borders. We go back to trying to see if there’s a match for “a”.
if there is, then the current index i has a prefix border.

p[2-1] “a”
p[5-1] “a”

i--; j--; so i is now 1, j is now 4.
_borderPosition[1] = 4;

0 1 2 3 4 5 (index)
a a c a a X (pattern)
0 4 5 4 5 6 (suffix border index)

We then go back one more step to see if there are additional matches. For every match that happens, we record the border position of that match at j.

i is 1, j is 4

(1 > 0) √
(4 <= 5) √

p[1-1] = “a”
p[4-1] = “a”

_borderPosition[0] = 3

0 1 2 3 4 5 (index)
a a c a a X (pattern)
3 4 5 4 5 6 (suffix border index)

Why do we need Suffix borders?

The naive algorithm jumps 1 step at a time. But this is too slow.
Alternatively, we could jump pattern.length at a time. But then we would miss potential matches in the middle.

Thus the suffix border array is created so we know where the potential matches are in the middle. We calculate the suffix border so that we can jump to the correct prefix border for the next potential match.

Setting up the Shift array with Suffix Index Array

So given _borderPositions array as

0 1 2 3 4 5 (index)
a a c a a X (pattern)
3 4 5 4 5 6 (suffix border index)

This means that at index 0 of the pattern, 3 is the index in which the suffix border happens.

If a shift array value is 0, this means it was not assigned.
In this case, we give it the default of the first value of _borderPositions.

This is so that when we find a match, we know how many steps to skip down. For example, in our case, when a successful search is found, we always skip 3 steps (_shift[0], which is _borderPosition[0]).

This is so that we can successfully start at the index of the next potential match.

After processing, our shift array looks like this:

0 1 2 3 4 index
3 3 3 3 1 shift

What this means is that while trying to match our pattern chars to the text chars, if there is a mismatch anywhere from index 0 to index 3, we want our i to shift down 3 spaces. This is because 3 spaces down, there is potential for our pattern to match again due to the suffix border.

However, at index 4, if there’s a mismatch, we only shift i down 1. This is because our very first compared character is a mismatch, so we just need to check the next alignment. Also, in our shift array, _shift[4+1] is DNE, so we use the default step of 1.

If, say, our first character at index 4 matches, but then index 3 is a mismatch, we shift i down the text array 1 step, because _shift[3+1] = 1.

You can see it with an example like this:

0 1 2 3 4 5 6
b a a c a a b (text)
a a c a a (pattern)

we have a mismatch at index 3 because text[3] “c” != pattern[3] “a”.
Hence we take _shift[j+1] = _shift[4] = 1.

After we shift 1, we get

0 1 2 3 4 5 6
b a a c a a b (text)
_ a a c a a (pattern)

match!

The Search

We do the match going backwards:

  • i is the index we’re going to use for every substring of the text
  • j is the index on the pattern. We use this to compare the pattern char to the text char

If the chars at i and j keep matching, we continue down until j is -1. This means we have matched every character in the pattern against the text.

Thus, when j < 0, we have found a pattern match in the text. We then log i in order to let the user know where the match happens.

Now we move i down for the next substring to check for a match. So the issue here is: how many steps do we move i down?

Typically, we would say, hey, why not just shift i down 5 steps, because that’s the length of our pattern? We’ve compared our pattern already, so it’s safe to go down 5 steps, right? The answer is no. What we need to do is shift down 3 steps, to _shift[0]. This is the first of the suffix border indexes. We need to do this in order to find potential matches in overlapping substrings.
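The linked repo has the full implementation; below is a sketch of the good suffix heuristic in its standard two-pass form. On the pattern ‘aacaa’ it produces the border positions [3, 4, 5, 4, 5, 6] and shift values [3, 3, 3, 3, 1, …] walked through above (variable names are illustrative and may differ from the repo).

function preprocessBorders(pattern, borderPos, shift) {
  const m = pattern.length;
  let i = m;
  let j = m + 1;
  borderPos[i] = j;   // index m has no border: mark DNE
  while (i > 0) {
    // walk j back through border positions until pattern[i-1] matches pattern[j-1]
    while (j <= m && pattern[i - 1] !== pattern[j - 1]) {
      if (shift[j] === 0) shift[j] = j - i;
      j = borderPos[j];
    }
    i--; j--;
    borderPos[i] = j; // where the suffix border starts for the substring at i
  }
}

function preprocessWidestBorder(pattern, borderPos, shift) {
  const m = pattern.length;
  let j = borderPos[0];
  for (let i = 0; i <= m; i++) {
    if (shift[i] === 0) shift[i] = j; // unassigned entries default to the widest border
    if (i === j) j = borderPos[j];
  }
}

function boyerMooreGoodSuffix(text, pattern) {
  const n = text.length;
  const m = pattern.length;
  const borderPos = new Array(m + 1).fill(0);
  const shift = new Array(m + 1).fill(0);
  preprocessBorders(pattern, borderPos, shift);
  preprocessWidestBorder(pattern, borderPos, shift);

  const matches = [];
  let s = 0;   // current alignment of the pattern against the text
  while (s <= n - m) {
    let j = m - 1;
    while (j >= 0 && pattern[j] === text[s + j]) j--;  // compare backwards
    if (j < 0) {
      matches.push(s);       // full match at alignment s
      s += shift[0];         // skip ahead to the next potential match
    } else {
      s += shift[j + 1];     // mismatch at j: shift by the good suffix rule
    }
  }
  return matches;
}

console.log(boyerMooreGoodSuffix('baacaab', 'aacaa')); // [1]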