CS121

Week 9 - Monday

What did we talk about last time?

Practiced with red-black trees

AVL trees

Balanced add

Takes a potentially unbalanced tree and turns itinto a balanced tree

Step 1

Turn the tree into a degenerate tree (backbone orvine) by right rotating any nodes with left children

Step 2

Turn the degenerate tree into a balanced tree bydoing sequences of left rotations starting at the root

The analysis is not obvious, but it takes O(n)total rotations

How much time does it take to insert n itemswith:

Red-black or AVL tree

Balanced insertion method

Unbalanced insertion + DSW rebalance

How much space does it take to insert n itemswith:

Red-black or AVL tree

Balanced insertion method

Unbalanced insertion + DSW rebalance

Balanced binary trees ensure that accessingany element takes O(log n) time

But, maybe only a few elements are accessedrepeatedly

In this case, it may make more sense torestructure the tree so that these elementsare closer to the root

1.Single rotation

Rotate a child about its parent if an element in achild is accessed, unless it's the root

2.Moving to the root

Repeat the child-parent rotation until the elementbeing accessed is the root

A concept called splaying takes the rotate to the root idea further,doing special rotations depending on the relationship of the childR with its parent Q and grandparent P

Cases:

1.Node R's parent is the root

Do a single rotation to make R the root

2.Node R is the left child of Q and Q is the left child of P(respectively right and right)

Rotate Q around P

Rotate R around Q

3.Node R is the left child of Q and Q is the right child of P(respectively right and left)

Rotate R around Q

Rotate R around P

We are not going deeply into self-organizingtrees

It's an interesting concept, and rigorousmathematical analysis shows that splay treesshould perform well, in terms of Big O bounds

In practice, however, they usually do not

An AVL tree or a red-black tree is generally thebetter choice

If you come upon some special problem whereyou repeatedly access a particular element mostof the time, only then should you consider splaytrees

Student Lecture

We can define a symbol table ADT with a few essentialoperations:

put(Key key, Value value)

▪Put the key-value pair into the table

get(Key key):

▪Retrieve the value associated with key

delete(Key key)

▪Remove the value associated with key

contains(Key key)

▪See if the table contains a key

isEmpty()

size()

It's also useful to be able to iterate over all keys

We have been talking a lot about trees andother ways to keep ordered symbol tables

Ordered symbol tables are great, but we maynot always need that ordering

Keeping an unordered symbol table mightallow us to improve our running time

Balanced binary search trees give us:

O(log n) time to find a key

O(log n) time to do insertions and deletions

Can we do better?

What about:

O(1) time to find a key

O(1) to do an insertion or a deletion

We make a huge array, so big that we’ll havemore spaces in the array than we expect datavalues

We use a hashing function that maps keys toindexes in the array

Using the hashing function, we know whereto put each key but also where to look for aparticular key

Let’s make a hash table to store integer keys

Our hash table will be 13 elements long

Our hashing function will be simply moddingeach value by 13

Insert these keys: 3, 19, 7, 104, 89

104

Find these keys:

19

YES!

88

NO!

16

NO!

104

We are using a hash table for a space/timetradeoff

Lots of space means we can get down to O(1)

How much space do we need?

How do we pick a good hashing function?

What happens if two values collide (map tothe same location)

We want a function that will map data tobuckets in our hash table

Important characteristics:

Efficient: It must be quick to execute

Deterministic: The same data must always mapto the same bucket

Uniform: Data should be mapped evenly acrossall buckets

We want a function h(k) that computes a hashfor every key k

The simplest way of guaranteeing that we hashonly into legal locations is by setting h(k) to be:

h(k) = k mod N where N is the size of the hashtable

To avoid crowding the low indexes, N should beprime

If it is not feasible for N to be prime, we can addanother step using a prime p > N:

h(k) = (k mod p) mod N

Pros

Simple

Fast

Easy to do

Good if you know nothing about the data

Cons

Prime numbers are involved (What’s the nearestprime to the size you want?)

Uses no information about the data

If the data is strangely structured (multiples of p, forexample) it could all hash to the same location

Break the key into parts and combine thoseparts

Shift folding puts the parts together withouttransformations

SSN: 123-45-6789 is broken up and summed 123 +456 + 789 = 1,368, then modded by N, probably

Boundary folding puts the parts togetherreversing every other part of the key

SSN: 123-45-6789 is broken up and summed 123 +654 + 789 = 1,566, then modded by N, probably

Pros

Relatively Simple and Fast

Mixes up the data more than division

Points out a way to turn strings or other non-integer data into an integer that can be hashed

Transforms the numbers so that patterns in thedata are likely to be removed

Cons

Primes are still involved

Uses no special information about the data

Square the key, then take the “middle”numbers out of the result

Example: key = 3,121 then 3,1212 = 9,740,641and the hash value is 406

One nice thing about this method is that wecan make the table size be a power of 2

Then, we can take the log2 N middle bits outof the squared value using bitwise shifts andmasking

Pros

Randomizes the data a lot

Fast when implemented correctly

Primes are not necessary

Cons

Uses no special information about the data

Remove part of the key, especially if it isuseless

Example:

Many SSN numbers for Indianapolis residentsbegin with 313

Removing the first 3 digits will, therefore, notreduce the randomness very much, provided thatyou are looking at a list of SSN’s for Indianapolisresidents