In class today we discussed the path-compression implementation of the Union-Find data structure. The goal was to prove a theorem about average runtime, here where the runtime is averaged over all the calls involved. We wanted to prove the following theorem
It $T(m,n)$ is the worst-case time for at most $m$ Union or find operations over a universe of $n$ elements, then
\[
T(m,n) = O( (m+n)\alpha(n))
\]
where $\alpha(n)$ denotes the inverse Ackermann function
We followed Jeff Erickson's notes on the subject, available as a pdf here
While we didn't make it the whole way through the notes, we did discuss
- The main theorem
- That weaker version of the theorem can boostrap stronger versions, with the main theorem eventually proven by induction
- The perspective change of regarding a sequence of Union and Finds as a sequence of Unions followed by a sequence of Compress operations
- The view of fixing a rank threshold and dividing nodes into two piles based on high and low rank
- The perspective change that allows us to 'shatter' paths of low rank with impunity
- The central recurrence \[ T(F,C) \leq T(F_+,C_+) + T(F_-, C_-) + m_+ +n \] and how weaker version of the thereom come to play.
I hope this discussion orients you in Jeff's Notes on the subject.