# Algorithm I

Review of Princeton Algorithms I course on Coursera.

My solution to the homework

**Course Overview**

topic | data structures and algorithms |
---|---|

data types | stack, queue, bag, union-find, priority queue |

sorting | quicksort, mergesort, heapsort |

searching | BST, red-black BST, hash table |

graphs | BFS, DFS, Prim, Kruskal, Dijkstra |

strings | radix sorts, tries, KMP, regexps, data compression |

advanced | B-tree, suffix array, maxflow |

# Union-Find

Given a set of N objects, we want to implement the following two function.

`Union`

: connect two objects.`Find`

: is there a path connecting the two objects?

initialize | union | connected | |
---|---|---|---|

Quick-find | \(N\) | \(N\) | \(1\) |

Quick-union | \(N\) | \(N\) | \(N\) |

Weighted QU | \(N\) | \(\lg N\) | \(\lg N\) |

## Quick-Find

`Initialization`

: integer array`id[]`

of length N`Union(p, q)`

: change all entries whose id equal`id[p]`

to`id[q]`

`Find`

: check if`p`

and`q`

have the same id.

1 | public class QuickFindUF { |

- Union too expensive (\(N\) array accesses).
- Trees are flat, but too expensive to keep them flat.

## Quick-Union

`Initialization`

: integer array`id[]`

of length N.`id[i]`

is parent of`i`

`Union(p, q)`

: change the root id of`p`

to the root id of`q`

`Find`

: check if`p`

and`q`

have the same root.

1 | public class QuickUnionUF { |

- Find too expensive (could be \(N\) array accesses).
- Trees can get tall.

## Weighted Quick-Union

`Initialization`

: addition array`sz[i]`

to count number of objects in the tree rooted at`i`

`Union(p, q)`

: change the root of smaller tree to root of larger tree and update`sz[]`

`Find`

: same as quick-union- Path compression: we can add extra line of code to make the tree more flat when computing the root of a node

1 | public class WeightedQuickUnionUF { |

- Find takes time proportional to depth of \(p\) and \(q\).
- Union takes constant time, given roots.
- Depth of any node is at most \(\lg N\).

## Applications

- Pixels in a digital photo.
- Computers in a network.
- Friends in a social network.
- Transistors in a computer chip.
- Elements in a mathematical set.
- Variable names in Fortran program.
- Metallic sites in a composite system.

## HW1 Percolation

- \(N\)-by-\(N\) grid of sites.
- Each site is open with probability \(p\).
- System percolates iff top and bottom are connected by open sites.

When \(N\) is large, theory guarantees a sharp threshold \(p^*\). What is the value of \(p^*\)?

Use Monte Carlo simulation:

- Initialize \(N\)-by-\(N\) whole grid to be blocked.
- Declare random sites open until top connected bottom.
- Vacancy percentage estimates \(p^*\).

How to check whether an \(N\)-by-\(N\) system percolates:

- Create an object for each site and name them \(0\) to \(N^2 - 1\).
- Sites are in same component if connected by open sites.
- Percolates iff any site on bottom row is connected to site on top row (brute-force algorithm: \(N^2\) calls to
`connected()`

). - Clever trick: introduce \(2\) virtual sites connected to top and bottom separately and check if virtual top site is connected to virtual bottom site. (only \(1\) call to
`connected()`

)

**Virtual top/bottom backwash problem:**

- When the system percolates (virtual top is connected to virtual bottom), every node that is connected to the bottom is also considered to be connected to the top (even though it is not). This is the backwash problem.
- A simple way to solve this problem is to use
**two**`WeightedQuickUnionUF`

object, one with the virtual bottom and the other not. Use the first to decide whether the system percolates and the second to see whether a node is full (connected to the top). - A more memory-efficient way to solve this is to manually maintain two boolean arrays,
`contop[]`

and`conbot[]`

to keep track whether the root of a node is connected to the top/bottom. When a node is connected to the top and the bottom at the same time, the system percolates.

Solution Percolation.java

# Bags, Queues, and Stacks

- Operations: insert, remove, iterate.
- Stack: LIFO (last in first out).
- Queue: FIFO (first in first out).
- Bag: Adding items to a collection and iterating (order doesn't matter)

## Stack

### Linked-List Implementation

1 | public class LinkedStackOfStrings { |

### Resizing-Array Implementation

1 | public class ArrayStackOfStrings { |

### Resizing-Array vs. Linked-List

**Linked-list implementation:**

- Every operation takes constant time in the worst case.
- Use extra time and space to deal with the links.

**Resizing-array implementation:**

- Every operation takes constant amortized time.
- Less wasted space.

## Queues

### Linked-List Implementation

1 | public class LinkedQueueOfStrings { |

### Resizing-Array Implementation

1 | public class ArrayQueueOfStrings { |

## Generics

- Avoid casting in client.
- Discover type mismatch errors at compile-time instead of run-time.
- Client code can use generic stack for any type of data.

Take Stack as an example

1 | public class Stack<Item> { |

## Iterators

- Support iteration over stack items by client, without revealing the internal representation of the stack.
- Implement the
`java.lang.Iterable`

interface supported by Java.

### Linked-List Implementation

1 | import java.util.Iterator; |

### Array Implementation

1 | import java.util.Iterator; |

### Java Support

Java supports elegant client code if `Iterable`

interface is implemented.

1 | // Shorthand code |

## Applications

- Parsing in a compiler.
- Java virtual machine.
- Undo in a word processor.
- Back button in a Web browser.
- PostScript language for printers.
- Implementing function calls in a compiler

### Two-Stack Algorithm

To evaluate infix expressions. \[ \displaylines{(\ 1\ +\ (\ (\ 2\ +\ 3\ )\ *\ (\ 4\ *\ 5\ )\ )\ ) } \]

- Value: push onto the value stack.
- Operator: push onto the operator stack.
- Left parenthesis: ignore.
- Right parenthesis: pop operator and two values; push the result of applying that operator to those values onto the operand stack.

1 | public class Evaluate { |

- Dijkstra's two-stack algorithm computes the same value if the operator occurs after the two values.

\[ \displaylines{(\ 1\ (\ (\ 2\ 3\ +\ )\ (\ 4\ 5\ *\ )\ *\ )\ +\ ) } \]

- All of the parentheses are redundant.

\[ \displaylines{1\ 2\ 3\ +\ 4\ 5\ *\ *\ + } \]

## HW2 Deques and Randomized Queues

**Dequeue:**A double-ended queue or deque is a generalization of a stack and a queue that support adding and removing items from**either the front or the back**of the data structure. Deque.java**Randomized queue:**A randomized queue is similar to a stack or queue, except that the item removed is chosen**uniformly at random**from items in the data structure. RandomizedQueue.java**Client:**Write a client program that takes an integer \(k\) as a command-line argument; reads in a sequence of strings from standard input using`StdIn.readString()`

; and prints exactly \(k\) of them, uniformly at random. Permutation.java

Use reservoir sampling to randomly choose \(k\) samples from a list of \(n\) elements in a single pass over the items:

- Keep the first \(k\) items
- When the \(i\)-th item arrives

- with probability \(k/i\), discard an old item at random and keep the new one
- with probability \(1 - k/i\), discard the new item

# Elementary Sorts

inplace? | stable? | worst | average | best | remarks | |
---|---|---|---|---|---|---|

selection | ✔ | \(N^2/2\) | \(N^2/2\) | \(N^2/2\) | \(N\) exchanges | |

insertion | ✔ | ✔ | \(N^2/2\) | \(N^2/4\) | \(N\) | use for small \(N\) or partially ordered |

shell | ✔ | \(N\) | tight code, subquadratic | |||

merge | ✔ | \(N\lg N\) | \(N\lg N\) | \(N\lg N\) | \(N\lg N\) guarantee, stable | |

quick | ✔ | \(N^2/2\) | \(2N\ln N\) | \(N\lg N\) | \(N\lg N\) probabilistic guarantee fastest in practice | |

3-way quick | ✔ | \(N^2/2\) | \(2N\ln N\) | \(N\) | improves quicksort in presence of duplicate keys | |

heap | ✔ | \(2N\lg N\) | \(2N\lg N\) | \(N\lg N\) | \(N\lg N\) guarantee, in-place |

## Comparable

**Interface:**

- Built-in comparable types: Integer, Double, String, Date, File, ...
- User-defined comparable types. Implement the
`Comparable`

interface supported by Java.

1 | // Only compare dates to other dates <Date> |

**Two useful sorting helper functions:**

- Less: is item \(v\) less than \(w\)?
- Exchange: swap item in array at index \(i\) with the one at index \(j\).

1 | private static boolean less(Comparable v, Comparable w) |

## Selection Sort

- Scan from left to right \([i, N]\)
- Find the minimum entry and exchange with \(i\)'s entry
- Continue next scan from \([i+1, N]\)

1 | public class Selection { |

## Insertion Sort

- Scan \(i\) from left to right \([0, N]\)
- Scan \(j\) from right to left \([i, 0]\), exchange
`a[j]`

with each larger entry to its left - Start next scan \([1, N]\)

1 | public class Insertion { |

## Shellsort

- Insertion sort with stride length \(h\).
- A \(g\)-sorted array remains \(g\)-sorted after \(h\)-sorting it.
- Use a sequence of increment steps to shellsort the array.

Which increment sequence to use?

- Power of two: \(1, 2, 4, 8, 16, 32, \dots\)
**No** - Power of two minus one: \(1, 3, 7, 15, 31, 63, \dots\)
**Maybe** - \(3x + 1\): \(1, 4, 13, 40, 121, 364, \dots\)
**OK. Easy to compute** **Sedgewick**: \(1, 5, 19, 41, 109, 209, 505, 929, 2161, 3905, \dots\)**Good. Tough to beat in empirical studies**

1 | public class Shell { |

## Shuffle

Rearrange array so that result is a uniform random permutation.

**Shuffle sort**

- Generate a random real number for each array entry.
- Sort the array.

**Knuth shuffle**

- In iteration \(i\), pick integer \(r\) between \(0\) and \(i\) uniformly at random.
- Swap \(a[i]\) and \(a[r]\).

1 | public class StdRandom { |

## Convex Hull

The convex hull of a set of \(N\) points is the smallest perimeter fence enclosing the points.

**Convex hull applications**

**Robot motion planning**: Find shortest path in the plane from \(s\) to \(t\) to avoids a polygonal obstacle. Shortest path is either a straight line from \(s\) to \(t\) or it is one of two polygonal chains of convex hull.

**Farthest pair problem**: Given \(N\) points in the plane, find a pair of points with the largest Euclidean distance between them. Farthest pair of points are extreme points on convex hull.

**Graham scan:**

- Choose point \(p\) with smallest \(y\)-coordinate.
- Sort points by polar angle with \(p\).
- Consider points in order; discard unless it create a counterclockwise turn.

# Mergesort

- Divide array into two halves.
- Recursively sort each half.
- Merge two halves.

1 | public class Merge { |

Java assert statement can be enable or disable at runtime

1 | java -ea Myprogram // enable assertions |

## Practical Improvements

- Use insertion sort for small subarrays since mergesort has too much overhead for tiny subarrays.
- Stop if the biggest item in first half \(\le\) smallest item in second half.
**Eliminate**the copy to the auxiliary array by switching the role of the input and auxiliary array in each recursive call.

1 | public class Merge { |

## Bottom-Up Mergesort

- Pass through array, merging subarrays of size \(1\).
- Repeat for subarrays of size \(2, 4, 8, 16, \dots\)

1 | public class MergeBU { |

## Comparator

**Interface:**

- Sorting using natural order by implementing the
`Comparable`

interface. - Sorting using an alternate order by implementing the
`Comparator`

interface. - Implement the
`compare()`

method.

1 | public class Student { |

**To use with Java system sort:**

- Create Comparator object.
- Pass as second argument to
`Arrays.sort()`

1 | String[] a; |

**To support comparators in our sort implementations:**

- Use
`Object`

instead of`Comparable`

. - Pass
`Comparator`

to`sort()`

and`less()`

and use it in`less()`

.

1 | public static void sort(Object[] a, Comparator comparator) { |

## Stability

A stable sort preserves the relative order of items with equal keys. The following is a counter example.

**Stable:**Insertion sort, Mergesort**Unstable:**Selection sort, Shellsort

## HW3 Collinear Points

Given a set of \(n\) distinct points in the plane, find every (maximal) line segment that connects a subset of \(4\) or more of the points.

**Point data type:** Create an immutable data type `Point`

that represents a point in the plane. Point.java

**Brute force:** Examines \(4\) points at a time and checks whether they all lie on the same line segment (check the three slope between them are equal), returning all such line segments. BruteCollinearPoints.java

**Fast, sorting-based solution:** For each point \(p\), calculate the slope it makes with every other point and sort. The 3 other collinear points must have equal slopes w.r.t. \(p\). FastCollinearPoints.java

# Quicksort

- Shuffle the array
- Partition so that, for some \(j\)
- entry \(a[j]\) is in place
- no larger entry to the left of \(j\)
- no smaller entry to the right of \(j\)

- Sort each piece recursively.

1 | public class Quick { |

Quicksort is an **in-place** sorting algorithm but is **not stable**.

## Practical Improvements

- Use insertion sort for small subarrays.
- Choose the median item as the pivot

1 | public class Quick { |

## Quick Select

Use `partition`

to find a \(k^{th}\) smallest item in a given array.

1 | public static Comparable select(Comparable[] a, int k) { |

Quick select takes **linear** time on average with the random shuffle provides a probabilistic guarantee.

## 3-Way Partitioning

When all keys equal, the quicksort needs ~\(1/2 N^2\) compares. We need to modify quicksort to deal with duplicate elements.

Partition array into \(3\) parts so that:

- Entries between \(lt\) and \(gt\) equal to partition item \(v\).
- No larger entries to left of \(lt\).
- No smaller entries to right of \(gt\).

1 | private static void sort(Comparable[] a, int lo, int hi) { |

# Priority Queues

Remove the largest (or smallest) item in deletion.

implementation | insert | del max | max |
---|---|---|---|

unordered array | \(1\) | \(N\) | \(N\) |

ordered array | \(N\) | \(1\) | \(1\) |

binary heap | \(\lg N\) | \(\lg N\) | \(1\) |

## Array Implementation

### Unordered Array Implementation

1 | public class UnorderedMaxPQ<Key extends Comparable<Key>> { |

### Ordered Array Implementation

1 | public class OrderedArrayMaxPQ<Key extends Comparable<Key>> { |

## Binary Heap

- Parent's key no smaller than children's keys.
- If child's key becomes larger than its parent's key, then repeatedly exchange child's key with parent's key.
- If parent's key becomes smaller than one (or both) of its children's, then repeatedly exchange parent's key with larger child's key.
- When
**insert**a key, add node at end, then swim it up. - When
**delete**the maximum, exchange root with node at end, then sink it down. - Index starts from \(1\).

1 | public class MaxPQ<Key extends Comparable<Key>> { |

## Heapsort

- Create max-heap with all \(N\) keys and repeatedly remove the maximum key

1 | public class Heap { |

## HW4 8 Puzzle

A \(3\)-by-\(3\) grid with \(8\) square blocks labeled \(1\) through \(8\) and a blank square. The goal is to rearrange the blocks so that they are in order, using as few moves as possible. You are permitted to slide blocks horizontally or vertically into the blank square. Here is an example: \[
\displaylines{\begin{matrix}
& 1 & 3 \\
4 & 2 & 5 \\
7 & 8 & 6
\end{matrix}
\quad
\Rightarrow
\quad
\begin{matrix}
1 & & 3 \\
4 & 2 & 5 \\
7 & 8 & 6
\end{matrix}
\quad
\Rightarrow
\quad\begin{matrix}
1 & 2 & 3 \\
4 & & 5 \\
7 & 8 & 6
\end{matrix}
\quad
\Rightarrow
\quad\begin{matrix}
1 & 2 & 3 \\
4 & 5 & \\
7 & 8 & 6
\end{matrix}
\quad
\Rightarrow
\quad\begin{matrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 &
\end{matrix}
}
\] We use the **A* search algorithm** to solve this problem. We require:

- A search node of the game to be a board.
- The number of moves made to reach the board.
- The predecessor search node.

The search process:

- Insert the initial search node (the initial board, \(0\) moves, and a null predecessor search node) into a priority queue.
- Delete min priority search node from the priority queue, and insert all neighboring search nodes back.
- Repeat until the search node dequeued corresponds to a goal board.

Choices of priority function:

**Hamming priority function**: the number of blocks in the wrong position, plus the number of moves made so far to get to the search node.**Manhattan priority function**: the sum of the Manhattan distances (sum of the vertical and horizontal distance) from the blocks to their goal position, plus the number of moves so far to get to the search node.

Optimization:

- To avoid search node corresponding to the same board enqueued on the priority queue many times, don't enqueue a neighbor if its board is the same as the board of the predecessor search node.
- Pre-compute the Manhattan priority when constructing the search node and save it in an instance variable.

**Board:** Creat an immutable data type `Board`

that models an \(n\)-by-\(n\) board with sliding tiles. Board.java

**Solver:** Solver.java

# Symbol Tables

**Key-value pair abstraction:**

- Insert a value with specified key.
- Search for the corresponding value with a given key.

## API

1 | public class ST<Key, Value> { |

## Implementing Equals for User-Defined Types

1 | // Use final since it is typically unsafe to use equals() with inheritance |

# Binary Search Trees

- A BST is a binary tree in symmetric order.
- A binary tree is either empty or two disjoint binary trees (left and right).
- Each node has a key and every node's key is
- larger than all keys in its left subtree.
- smaller than all keys in its right subtree.

## Basic Operations

**Search:**If less, go left; if greater, go right; if equal, search hit.**Insert:**If less, go left; if greater, go right; if null, insert.

### Node

A Node is comprised of four fields:

- A key and value
- A reference to the left and right subtree

1 | private class Node { |

### Get

Return value corresponding to given key, or `null`

if no such key.

1 | public Value get(Key key) { |

### Put

Search for key, then

- If key in tree, reset value.
- If not in tree, add new node.

1 | public void put(Key key, Value val) |

## Ordered Operations

### Max, Min, Floor, Ceiling

- Minimum: smallest key in table. Keep search
`x.left()`

until it is null, then return`x`

. - Maximum: Largest key in table. Keep search
`x.right()`

until it is null, then return`x`

. - Floor: Largest key \(\le\) a given key.
- Ceiling: Smallest key \(\ge\) a given key.

1 | public Key floor(Key key) { |

### Subtree Counts

In each node, we store the number of nodes in the subtree rooted at that node.

1 | private class Node { |

### Rank

The number of keys \(<\) k.

1 | public int rank(Key key) |

### Inorder Traversal

- Traverse left subtree.
- Enqueue key.
- Traverse right subtree.

1 | public Iterable<Key> keys() { |

## Deletion

### Deleting The Minimum

- Go left until finding a node with a null left link.
- Replace that node by its right link.
- Update subtree count.

1 | public void deleteMin() |

### Hibbard Deletion

Search for node \(t\) containing key \(k\).

- If no children: Delete \(t\) by setting parent link to null
- If one child: Delete \(t\) by replacing parent link
- If two children:
- Find the minimum in \(t\)'s right subtree \(x\).
- Delete \(x\) in \(t\)'s right subtree.
- Put \(x\) in \(t\)'s spot.

Update subtree counts.

1 | public void delete(Key key) |

# Balanced Search Trees

## 2-3 Tree

Allow 1 or 2 keys per node

- 2-node: one key, two children.
- 3-node: two key, three children.

Symmetric order. Inorder traversal yields keys in ascending order.

Perfect balance. Every path from root to null link has same length.

### Search

Search is similar to binary search trees. Recursively compare search key against keys in node. The only difference is for 3-node, when a search key needs to be compared with these two keys.

### Insertion

Insertion into a 2-node at bottom will change it to a 3-node.

Insertion into a 3-node at bottom:

- Add new key to 3-node to create temporary 4-node.
- Move middle key in 4-node into parent and split the other two keys.
- Repeat up the tree, as necessary.
- If you reach the root and it's a 4-node, split it into three 2-nodes.

This insertion maintains symmetric order and perfect balance.

### Implementation

Direct implementation is complicated, because:

- Maintaining multiple node types is cumbersome.
- Need multiple compares to move down tree.
- Need to move back up the tree to split 4-nodes.
- Large number of cases for splitting.

Instead, we choose a better way to do it, the red-black BST.

## Red-Black BST

Represent 2-3 tree as a BST by splitting 3-node to two 2-node with a red link connecting them.

- No node has two red links connected to it.
- Every path from root to null link has the same number of black links.
- Red links lean left.

### Representation

Each node is pointed to by precisely one link from its parent and can encode color of the link.

1 | private static final boolean RED = true; |

### Elementary Operations

**Left rotation**. Orient a right-leaning red link to lean left.

1 | private Node rotateLeft(Node h) { |

**Right rotation**. Orient a left-leaning red link to (temporarily) lean right (To solve two red links in a row).

1 | private Node rotateRight(Node h) { |

**Color flip**. Recolor to split a (temporary) 4-node.

1 | private void flipColors(Node h) { |

### Search

Search is the same as for elementary BST (ignore color) but runs faster because of better balance.

### Insertion

- Right child red, left child black: rotate left.
- Left child, left-left grandchild red: rotate right.
- Both children red: flip colors.

1 | private Node put(Node h, Key key, Value val) { |

## B-Tree

B-tree and its variants are widely used for file systems and databases to reduce the time for probing.

Generalize 2-3 trees by allowing up to \(M-1\) key-link pairs per node. Choose \(M\) as large as possible so that \(M\) links fit in a page.

- At least 2 key-link pairs at root.
- At least \(M/2\) key-link pairs in other nodes.
- External nodes contain client keys.
- Internal nodes contain copies of keys to guide search (each key is a copy of min key in subtree).

A search or an insertion in a B-tree of order \(M\) with \(N\) keys requires between \(\log_{M-1} N\) and \(\log_{M/2} N\) probes. In practice, the number of probes is at most 4. e.g. \(M=1024; N=62 \text{ billion}\), \(\log_{M/2} N \le 4\).

# Geometric Applications of BSTs

## 1d Range Search

Extension of ordered symbol table. Besides insertion, search and deletion, we require **Range count** and **Range search**.

Range count: number of keys between \(k_1\) and \(k_2\).

1 | public int size(Key lo, Key hi) { |

Range search: find all keys between \(k_1\) and \(k_2\).

- Recursively find all keys in left subtree (if any could fall in range).
- Check key in current node.
- Recursively find all keys in right subtree (if any could fall in range).

## Line Segment Intersection

Given \(N\) horizontal and vertical line segments, find all intersections.

Sweep-line algorithm: reduces 2d line intersection to 1d range search.

- x-coordinates define events.
- h-segment (left endpoint): insert y-coordinate into BST.
- h-segment (right endpoint): remove y-coordinate from BST.
- v-segment: range search for interval of y-endpoints.

## Kd Trees

Start from 2d trees. It is the extension of ordered symbol-table to 2d keys.

### Space-Partitioning Trees

- Grid: Divide space uniformly into squares.
- 2d tree: Recursively divide space into two halfplanes.
- Quadtree: Recursively divide space into four quadrants.
- BSP tree: Recursively divide space into two regions.

### Grid Implementation

- Divide space into \(M\times M\) grid of squares.
- Create list of points contained in each square.
- Use 2d array to directly index relevant square.
- Insert: add \((x,y)\) to list for corresponding square.
- Range search: examine only squares that intersect 2d range query.

Grid size is the key to performance. If too small, it will waste space; if too large, there will be too many points per square. A reasonable choice is \(\sqrt{N}\)-by-\(\sqrt{N}\). Grid implementation is fast and simple solution for evenly-distributed points but not for clustering.

### 2d Tree

BST but alternate using x and y coordinates as key.

**Range search:** Find all points in a query axis-aligned rectangle.

- Check if point in node lies in given rectangle.
- Recursively search trees (if any could fall in rectangle).
- If query rectangle does not intersect the rectangle corresponding to a subtree, prune this subtree.

**Nearest neighbor search:** Find closest point to query point.

- Recursively search trees.
- If closest point discovered so far is closer than the distance between the query point and the rectangle corresponding to a subtree, prune this subtree.

### Applications

- Ray tracing
**2d range search**- Flight simulators
- N-body simulation
- Collision detection
- Astronomical databases
**Nearest neighbor search**- Adaptive mesh generation
- Accelerate rendering in Doom
- Hidden surface removal and shadow casting

## Interval Search Trees

1d interval search:

- Insert an interval \((lo, hi)\).
- Search for an interval.
- Delete an interval.
- Given an interval, find all intervals in data structure that intersects it.

Non-degeneracy assumption: No two intervals have the same left endpoint.

Create BST, where each node stores an interval. Use left endpoint as BST key and store max endpoint in subtree rooted at node.

To insert an interval \((lo, hi)\), first insert into BST using \(lo\) as the key, then update max in each node on search path.

To find intersections for query interval \((lo, hi)\):

- If interval in node intersects query interval, return it.
- Else if left subtree is null, go right.
- Else if max endpoint in left subtree is less than \(lo\), go right.
- Else go left

## Rectangle Intersection

Find all intersections among a set of \(N\) orthogonal rectangles.

Non-degeneracy assumption: All x and y coordinates are distinct.

Sweep-line algorithm: reduces 2d orthogonal rectangle intersection search to 1d interval search.

- x-coordinates of left and right end points define events.
- Maintain set of rectangles that intersect the sweep line in an interval search tree (using y-intervals of rectangle).
- Left endpoint: interval search for y-interval of rectangle; insert y-interval.
- Right endpoint: remove y-interval

## HW5 Kd-Trees

Write a data type to represent a set of points in the unit square using a 2d-tree to support efficient range search and nearest-neighbor search.

Two immutable data types are provided, the `Point2D`

represents points in the plane and `RectHV`

represents axis-aligned rectangles.

**Brute-force implementation:** use red-black BST. PointSET.java

**2d-tree implementation**: KdTree.java

# Hash Table

Save items in a **key-indexed table** (index is a function of the key)

- Computing the hash function
- Equality test: Method for checking whether two keys are equal
- Collision resolution: Algorithm and data structure to handle two keys that hash to the same array index

## Hash Function

Idealist goal:

- Efficiently computable
- Each table index equally likely for each key

Java library implementations for primitive types: `Integer`

, `Boolean`

, `Double`

, `String`

1 | public final class Integer { |

**Hash code**: an`int`

between \(-2^{31}\) and \(2^{31} - 1\)**Hash function output**: an`int`

between \(0\) and \(M - 1\)

1 | private int hash (Key key) |

## Separate Chaining

Use an array of \(M < N\) linked lists to deal with collisions:

**Hash**: map key to integer \(i\) between \(0\) and \(M-1\)**Insert**: put at front of \(i\)-th chain (if not already there)**Search**: need to search only \(i\)-th chain

1 | public class SeparateChainingHashST<Key, Value> { |

Typical choice: \(M \sim N / 5\)

## Linear Probing

When a new key collides, find next empty slog, and put it there (\(M > N\))

**Hash**: map key to integer \(i\) between \(0\) and \(M-1\)**Insert**: put at table index \(i\) if free, if not try \(i+1\), \(i+2\), etc**Search**: search table index \(i\), if occupied but no match, try \(i+1\), \(i+2\), etc

1 | public class SeparateChainingHashST<Key, Value> { |

Typical choice: \(M \sim 2N\), mean displacement will be \(\sim 3/2\)