publicclassQuickFindUF {
privateint[] id;
publicQuickFindUF(int N) {
id =newint[N];
// Set id of each object to itselffor (int i = 0; i < N; i++) id[i]= i;
}
// Check whether p and q are in the same componentpublicbooleanconnected(int p, int q)
{ return id[p]== id[q]; }
publicvoidunion(int p, int q) {
int pid = id[p];
int qid = id[q];
// Change all entries with id[p] to id[q]for (int i = 0; i < id.length; i++)
if (id[i]== pid) id[i]= qid;
}
}
Union too expensive ($N$ array accesses).
Trees are flat, but too expensive to keep them flat.
publicclassQuickUnionUF {
privateint[] id;
publicQuickUnionUF(int N)
{ /* same as in quick-find */ }
privateintroot(int i) {
// Chase parent pointers until reach rootwhile (i != id[i]) i = id[i];
return i;
}
// Check if p and q have same rootpublicbooleanconnected(int p, int q)
{ return root(p) == root(q); }
// Change root of p to point to root of qpublicvoidunion(int p, int q) {
int i = root(p);
int j = root(q);
id[i]= j;
}
}
publicclassWeightedQuickUnionUF {
privateint[] id;
privateint[] sz;
publicWeightedQuickUnionUF(int N) {
id =newint[N];
sz =newint[N];
// Set id of each object to itselffor (int i = 0; i < N; i++) {
id[i]= i;
sz[i]= 1;
}
}
privateintroot(int i) {
while (i != id[i]) {
// Only one extra line of code to halve path length id[i]= id[id[i]];
i = id[i];
}
return i;
}
publicbooleanconnected(int p, int q)
{ /* same as in quick-union */ }
publicvoidunion(int p, int q) {
int i = root(p);
int j = root(q);
if (i == j) return;
// Link root of smaller tree to root of larger tree and update the sz[] arrayif (sz[i]< sz[j]) { id[i]= j; sz[j]+= sz[i]; }
else { id[j]= i; sz[i]+= sz[j]; }
}
}
Find takes time proportional to depth of $p$ and $q$.
System percolates iff top and bottom are connected by open sites.
Percolation
When $N$ is large, theory guarantees a sharp threshold $p^*$. What is the value of $p^*$?
Use Monte Carlo simulation:
Initialize $N$-by-$N$ whole grid to be blocked.
Declare random sites open until top connected bottom.
Vacancy percentage estimates $p^*$.
How to check whether an $N$-by-$N$ system percolates:
Create an object for each site and name them $0$ to $N^2 - 1$.
Sites are in same component if connected by open sites.
Percolates iff any site on bottom row is connected to site on top row (brute-force algorithm: $N^2$ calls to connected()).
Clever trick: introduce $2$ virtual sites connected to top and bottom separately and check if virtual top site is connected to virtual bottom site. (only $1$ call to connected())
Virtual Node
Virtual top/bottom backwash problem:
When the system percolates (virtual top is connected to virtual bottom), every node that is connected to the bottom is also considered to be connected to the top (even though it is not). This is the backwash problem.
A simple way to solve this problem is to use twoWeightedQuickUnionUF object, one with the virtual bottom and the other not. Use the first to decide whether the system percolates and the second to see whether a node is full (connected to the top).
A more memory-efficient way to solve this is to manually maintain two boolean arrays, contop[] and conbot[] to keep track whether the root of a node is connected to the top/bottom. When a node is connected to the top and the bottom at the same time, the system percolates.
publicclassArrayQueueOfStrings {
private String[] s;
privateint N, first, last;
publicArrayQueueOfStrings() {
s =new String[2];
N = 0;
first = 0;
last = 0;
}
publicbooleanisEmpty()
{ return N == 0; }
privatevoidresize(int capacity) {
String[] copy =new String[capacity];
for (int i = 0; i < N; i++)
// Wrap-around copy[i]= s[(first + i) % s.length];
s = copy;
// Reset first and last index first = 0;
last = N;
}
publicvoidenqueue(String item) {
// Double size the array when it is fullif (N == s.length) resize(2*s.length);
s[last++]= item;
// Wrap-aroundif (last == s.length) last = 0;
N++;
}
public String dequeue() {
String item = s[first];
s[first]=null;
N--;
first++;
// Wrap-aroundif (first == q.length) first = 0;
// Halve size the array when it is one-quarter fullif (N > 0 && N == s.length/4) resize(s.length/2);
return item;
}
}
Discover type mismatch errors at compile-time instead of run-time.
Client code can use generic stack for any type of data.
Take Stack as an example
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
publicclassStack<Item> {
private Item[] s;
privateint N = 0;
// Generic array creation is not allowed in Java// s = new Item[capacity];publicStack(int capacity)
{ s = (Item[]) new Object[capacity]; }
publicvoidpush(Item item)
{ s[N++]= item; }
public Item pop()
{ return s[--N]; }
}
import java.util.Iterator;
publicclassStack<Item>implements Iterable<Item> {
// ...// Iterable must have a method that returns an Iteratorpublic Iterator<Item>iterator()
{ returnnew ListIterator(); }
// Iterator must have methods hasNext() and next()privateclassListIteratorimplements Iterator<Item> {
private Node current = first;
publicbooleanhasNext() { return current !=null; }
publicvoidremove() { /* not supported */ }
public Item next() {
Item item = current.item;
current = current.next;
return item;
}
}
}
Dequeue: A double-ended queue or deque is a generalization of a stack and a queue that support adding and removing items from either the front or the back of the data structure. Deque.java
Randomized queue: A randomized queue is similar to a stack or queue, except that the item removed is chosen uniformly at random from items in the data structure. RandomizedQueue.java
Client: Write a client program that takes an integer $k$ as a command-line argument; reads in a sequence of strings from standard input using StdIn.readString(); and prints exactly $k$ of them, uniformly at random. Permutation.java
Use reservoir sampling to randomly choose $k$ samples from a list of $n$ elements in a single pass over the items:
Keep the first $k$ items
When the $i$-th item arrives
with probability $k/i$, discard an old item at random and keep the new one
// Only compare dates to other dates <Date>publicclassDateimplements Comparable<Date> {
privatefinalint month, day, year;
publicDate(int m, int d, int y) {
month = m;
day = d;
year = y;
}
publicintcompareTo(Date that) {
if (this.year< that.year ) return-1;
if (this.year> that.year ) return+1;
if (this.month< that.month) return-1;
if (this.month> that.month) return+1;
if (this.day< that.day ) return-1;
if (this.day> that.day ) return+1;
return 0;
}
}
Two useful sorting helper functions:
Less: is item $v$ less than $w$?
Exchange: swap item in array at index $i$ with the one at index $j$.
1
2
3
4
5
6
7
8
privatestaticbooleanless(Comparable v, Comparable w)
{ return v.compareTo(w) < 0; }
privatestaticvoidexch(Comparable[] a, int i, int j) {
Comparable swap = a[i];
a[i]= a[j];
a[j]= swap;
}
Find the minimum entry and exchange with $i$’s entry
Continue next scan from $[i+1, N]$
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
publicclassSelection {
publicstaticvoidsort(Comparable[] a) {
int n = a.length;
for (int i = 0; i < n; i++) {
int min = i;
for (int j = i + 1; j < n; j++)
if (less(a[j], a[min]))
min = j;
exch(a, i, min);
}
}
privatestaticbooleanless(Comparable v, Comparable w)
{ /* as before */ }
privatestaticvoidexch(Comparable[] a, int i, int j)
{ /* as before */ }
}
Scan $j$ from right to left $[i, 0]$, exchange a[j] with each larger entry to its left
Start next scan $[1, N]$
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
publicclassInsertion {
publicstaticvoidsort(Comparable[] a) {
int N = a.length;
for (int i = 0; i < N; i++)
for (int j = i; j > 0; j--)
if (less(a[j], a[j-1]))
exch(a, j, j-1);
elsebreak;
}
privatestaticbooleanless(Comparable v, Comparable w)
{ /* as before */ }
privatestaticvoidexch(Comparable[] a, int i, int j)
{ /* as before */ }
}
A $g$-sorted array remains $g$-sorted after $h$-sorting it.
Use a sequence of increment steps to shellsort the array.
Which increment sequence to use?
Power of two: $1, 2, 4, 8, 16, 32, \dots$ No
Power of two minus one: $1, 3, 7, 15, 31, 63, \dots$ Maybe
$3x + 1$: $1, 4, 13, 40, 121, 364, \dots$ OK. Easy to compute
Sedgewick: $1, 5, 19, 41, 109, 209, 505, 929, 2161, 3905, \dots$ Good. Tough to beat in empirical studies
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
publicclassShell {
publicstaticvoidsort(Comparable[] a) {
int N = a.length;
int h = 1;
while (h < N/3) h = 3*h + 1;
while (h >= 1) {
for (int i = h; i < N; i++)
for (int j = i; j >= h && less(a[j], a[j-h]); j -= h)
exch(a, j, j-h);
h = h/3;
}
}
privatestaticbooleanless(Comparable v, Comparable w)
{ /* as before */ }
privatestaticvoidexch(Comparable[] a, int i, int j)
{ /* as before */ }
}
In iteration $i$, pick integer $r$ between $0$ and $i$ uniformly at random.
Swap $a[i]$ and $a[r]$.
1
2
3
4
5
6
7
8
9
10
11
12
publicclassStdRandom {
// ...publicstaticvoidshuffle(Object[] a) {
int N = a.length;
for (int i = 0; i < N; i++) {
// Between 0 and iint r = StdRandom.uniform(i + 1);
exch(a, i, r);
}
}
}
Robot motion planning: Find shortest path in the plane from $s$ to $t$ to avoids a polygonal obstacle. Shortest path is either a straight line from $s$ to $t$ or it is one of two polygonal chains of convex hull.
Motion Planning
Farthest pair problem: Given $N$ points in the plane, find a pair of points with the largest Euclidean distance between them. Farthest pair of points are extreme points on convex hull.
Farthest Pair
Graham scan:
Choose point $p$ with smallest $y$-coordinate.
Sort points by polar angle with $p$.
Consider points in order; discard unless it create a counterclockwise turn.
Pass through array, merging subarrays of size $1$.
Repeat for subarrays of size $2, 4, 8, 16, \dots$
1
2
3
4
5
6
7
8
9
10
11
12
publicclassMergeBU {
privatestaticvoidmerge(...)
{ /* as before */ }
publicstaticvoidsort(Comparable[] a) {
int N = a.length;
Comparable[] aux =new Comparable[N];
for (int sz = 1; sz < N; sz = sz+sz)
for (int lo = 0; lo < N-sz; lo += sz+sz)
merge(a, aux, lo, lo+sz-1, Math.min(lo+sz+sz-1, N-1));
}
}
Given a set of $n$ distinct points in the plane, find every (maximal) line segment that connects a subset of $4$ or more of the points.
Point data type: Create an immutable data type Point that represents a point in the plane. Point.java
Brute force: Examines $4$ points at a time and checks whether they all lie on the same line segment (check the three slope between them are equal), returning all such line segments. BruteCollinearPoints.java
Fast, sorting-based solution: For each point $p$, calculate the slope it makes with every other point and sort. The 3 other collinear points must have equal slopes w.r.t. $p$. FastCollinearPoints.java
publicclassQuick {
privatestaticintpartition(Comparable[] a, int lo, int hi) {
int i = lo, j = hi+1;
while (true) {
// Scan i from left to right so long as a[i] < a[lo]while (less(a[++i], a[lo]))
if (i == hi) break;
// Scan j from right to left so long as a[j] > a[lo]while (less(a[lo], a[--j]))
if (j == lo) break;
if (i >= j) break;
// Exchange a[i] with a[j] exch(a, i, j);
}
// Exchange a[lo] with a[j] exch(a, lo, j);
return j;
}
publicstaticvoidsort(Comparable[] a) {
// Shuffle needed for performance guarantee StdRandom.shuffle(a);
sort(a, 0, a.length- 1);
}
privatestaticvoidsort(Comparable[] a, int lo, int hi) {
if (hi <= lo) return;
int j = partition(a, lo, hi);
sort(a, lo, j-1);
sort(a, j+1, hi);
}
}
Quicksort is an in-place sorting algorithm but is not stable.
publicclassQuick {
privatestaticfinalint CUTOFF = 10;
privatestaticintpartition(Comparable[] a, int lo, int hi)
{ /* as before */ }
publicstaticvoidsort(Comparable[] a)
{ /* as before */ }
privatestaticvoidsort(Comparable[] a, int lo, int hi) {
// Cutoff to insertion sortif (hi <= lo + CUTOFF - 1) {
Insertion.sort(a, lo, hi);
return;
}
// Choose the median item to be the pivotint m = medianOf3(a, lo, (hi+lo)/2, hi);
swap(a, lo, m);
int j = partition(a, lo, hi);
sort(a, lo, j-1);
sort(a, j+1, hi);
}
}
publicclassHeap {
publicstaticvoidsort(Comparable[] a) {
int N = a.length;
// Heap-orderedfor (int k = N/2; k >= 1; k--)
sink(a, k, N);
// Exchange the maximum with the bottom and sink it downwhile (N > 1) {
exch(a, 1, N);
sink(a, 1, --N);
}
}
privatestaticvoidsink(Comparable[] a, int k, int N)
{ /* as before */ }
privatestaticbooleanless(Comparable[] a, int i, int j)
{ /* as before */ }
privatestaticvoidexch(Comparable[] a, int i, int j)
{ /* as before */ }
}
We use the A search algorithm* to solve this problem. We require:
A search node of the game to be a board.
The number of moves made to reach the board.
The predecessor search node.
The search process:
Insert the initial search node (the initial board, $0$ moves, and a null predecessor search node) into a priority queue.
Delete min priority search node from the priority queue, and insert all neighboring search nodes back.
Repeat until the search node dequeued corresponds to a goal board.
Choices of priority function:
Hamming priority function: the number of blocks in the wrong position, plus the number of moves made so far to get to the search node.
Manhattan priority function: the sum of the Manhattan distances (sum of the vertical and horizontal distance) from the blocks to their goal position, plus the number of moves so far to get to the search node.
Optimization:
To avoid search node corresponding to the same board enqueued on the priority queue many times, don’t enqueue a neighbor if its board is the same as the board of the predecessor search node.
Pre-compute the Manhattan priority when constructing the search node and save it in an instance variable.
Board: Creat an immutable data type Board that models an $n$-by-$n$ board with sliding tiles. Board.java
publicclassST<Key, Value> {
ST() // create a symbol tablevoidput(Key key, Value val) // put key-value pair into the table// (remove key from table if value is null) Value get(Key key) // value paired with key// (null if key is absent)voiddelete(Key key) // remove key (and its value) from tablebooleancontains(Key key) // is there a value paired with key?booleanisEmpty() // is the table empty?intsize() // number of key-value pairs in the table Iterable<Key>keys() // all the keys in the table}
// Use final since it is typically unsafe to use equals() with inheritancepublicfinalclassDateimplements Comparable<Date> {
privatefinalint month;
privatefinalint day;
privatefinalint year;
// ...publicbooleanequals(Object y) {
// Optimize for true object equalityif (y ==this) returntrue;
// Check for nullif (y ==null) returnfalse;
// Objects must be in the same classif (y.getClass() !=this.getClass()) returnfalse;
// Cast is guaranteed to succeed Date that = (Date) y;
if (this.day!= that.day ) returnfalse;
if (this.month!= that.month) returnfalse;
if (this.year!= that.year ) returnfalse;
returntrue;
}
}
Return value corresponding to given key, or null if no such key.
1
2
3
4
5
6
7
8
9
10
11
public Value get(Key key) {
Node x = root;
while (x !=null) {
int cmp = key.compareTo(x.key);
if (cmp < 0) x = x.left;
elseif (cmp > 0) x = x.right;
// cmp == 0elsereturn x.val;
}
returnnull;
}
Minimum: smallest key in table. Keep search x.left() until it is null, then return x.
Maximum: Largest key in table. Keep search x.right() until it is null, then return x.
Floor: Largest key $\le$ a given key.
Ceiling: Smallest key $\ge$ a given key.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
public Key floor(Key key) {
Node x = floor(root, key);
if (x ==null) returnnull;
return x.key;
}
private Node floor(Node x, Key key) {
if (x ==null) returnnull;
int cmp = key.compareTo(x.key);
if (cmp == 0) return x;
if (cmp < 0) return floor(x.left, key);
// Could be on the right Node t = floor(x.right, key);
if (t !=null) return t;
elsereturn x;
}
Search is similar to binary search trees. Recursively compare search key against keys in node. The only difference is for 3-node, when a search key needs to be compared with these two keys.
B-tree and its variants are widely used for file systems and databases to reduce the time for probing.
Generalize 2-3 trees by allowing up to $M-1$ key-link pairs per node. Choose $M$ as large as possible so that $M$ links fit in a page.
At least 2 key-link pairs at root.
At least $M/2$ key-link pairs in other nodes.
External nodes contain client keys.
Internal nodes contain copies of keys to guide search (each key is a copy of min key in subtree).
A search or an insertion in a B-tree of order $M$ with $N$ keys requires between $\log_{M-1} N$ and $\log_{M/2} N$ probes. In practice, the number of probes is at most 4. e.g. $M=1024; N=62 \text{ billion}$, $\log_{M/2} N \le 4$.
Insert: add $(x,y)$ to list for corresponding square.
Range search: examine only squares that intersect 2d range query.
Grid size is the key to performance. If too small, it will waste space; if too large, there will be too many points per square. A reasonable choice is $\sqrt{N}$-by-$\sqrt{N}$. Grid implementation is fast and simple solution for evenly-distributed points but not for clustering.
BST but alternate using x and y coordinates as key.
Range search: Find all points in a query axis-aligned rectangle.
Check if point in node lies in given rectangle.
Recursively search trees (if any could fall in rectangle).
If query rectangle does not intersect the rectangle corresponding to a subtree, prune this subtree.
Nearest neighbor search: Find closest point to query point.
Recursively search trees.
If closest point discovered so far is closer than the distance between the query point and the rectangle corresponding to a subtree, prune this subtree.
publicclassSeparateChainingHashST<Key, Value> {
privateint M = 97; // number of chainsprivate Node[] st =new Node[M]; // array of chains ...
publicvoidput(Key key, Value val) {
int i = hash(key);
for (Node x = st[i]; x !=null; x = x.next)
if (key.equals(x.key)) {
x.val= val;
return;
}
st[i]=new Node(key, val, st[i]);
}
public Value get(Key key) {
int i = hash(key);
for (Node x = st[i]; x !=null; x = x.next)
if (key.equals(x.key))
return (Value) x.val;
returnnull;
}
}
publicclassSeparateChainingHashST<Key, Value> {
privateint M = 30001;
private Value[] vals = (Value[]) new Object[M];
private Key[] keys = (Key[]) new Object[M];
...
publicvoidput(Key key, Value val) {
int i;
for (i = hash(key); keys[i]!=null; i = (i+1) % M)
if (keys[i].equals(key))
break;
keys[i]= key;
vals[i]= val;
}
public Value get(Key key) {
for (int i = hash(key); keys[i]!=null; i = (i+1) % M)
if (keys[i].equals(key))
return vals[i];
returnnull;
}
}
Typical choice: $M \sim 2N$, mean displacement will be $\sim 3/2$