
Data Structures for Disjoint Sets

Dr. Manjanna B

Assistant Professor
CSE, NITK Surathkal

Nov 14, 2024

MB (BITS Pilani, Hyderabad) Data Structures for Disjoint Sets May 7, 2021 1 / 49
Data Structures for Disjoint Sets

Problem: Want to maintain a dynamic collection of disjoint sets.

Objects: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Disjoint sets of objects: {0}, {1}, {2, 3, 9}, {5, 6}, {7}, {4, 8}
FIND query: are objects 2 and 9 in the same set?
  {0}, {1}, {2, 3, 9}, {5, 6}, {7}, {4, 8}
UNION command: merge the sets containing 3 and 8.
  {0}, {1}, {2, 3, 4, 8, 9}, {5, 6}, {7}
Data Structures for Disjoint Sets

Goal: Design an efficient data structure for UNION-FIND.

Each set is represented as a pointer-based data structure, with one
node per element.
Each set has a unique 'leader' element, which identifies the set.
We want to support the following operations.
  MAKESET(x): Create a new set {x} containing the single element x.
  The leader of the new set is obviously x.
  FIND(x): Find (the leader of) the set containing x.
  UNION(A, B): Replace two sets A and B in our collection with their
  union A ∪ B.
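A minimal sketch of the three operations, not from the slides: parent pointers are kept in a plain dictionary, and the leader is the node that points to itself. The function names mirror the slides' MAKESET, FIND, and UNION.

```python
# Minimal UNION-FIND sketch over parent pointers (illustrative only;
# the slides develop more efficient representations later).
parent = {}

def makeset(x):
    parent[x] = x              # x is the leader of its new singleton set

def find(x):
    while parent[x] != x:      # walk up to the root, i.e. the leader
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)  # one leader now points to the other

# Recreate the slide's collection: {0}, {1}, {2, 3, 9}, {5, 6}, {7}, {4, 8}
for i in range(10):
    makeset(i)
union(2, 3); union(3, 9); union(5, 6); union(4, 8)
```

A FIND query such as "are 2 and 9 in the same set?" becomes `find(2) == find(9)`, and the UNION command "merge the sets containing 3 and 8" becomes `union(3, 8)`.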
Data Structures for Disjoint Sets

Application: Maintaining the connected components of a graph as new
vertices and edges are added.

Connected component: set of mutually connected vertices.
Connected components: {a, b, c}, {d, e}, {f, g, h, i}

[Figure: graph G on vertices a, b, c, d, e, f, g, h, i]
Data Structures for Disjoint Sets

Application: Maintaining the connected components of a graph as new
vertices and edges are added.

Connected components: {a, b, c}, {d, e}, {f, g, h, i}
Goal: Given vertices f and i, determine whether they belong to the
same component.

[Figure: graph G on vertices a, b, c, d, e, f, g, h, i]
Data Structures for Disjoint Sets

Application: Maintaining the connected components of a graph as new
vertices and edges are added.

Given a new edge (b, d), if b and d don't belong to the same
component, then merge the two components:
  if FIND(b) ≠ FIND(d)
    then UNION(b, d)
Each UNION command reduces the number of components by 1.

[Figure: graph G on vertices a, b, c, d, e, f, g, h, i]
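The component-maintenance loop above can be sketched as follows. The edge list below is a guess consistent with the stated components {a, b, c}, {d, e}, {f, g, h, i}, since the figure itself did not survive extraction; everything else follows the slides' FIND/UNION recipe.

```python
# Sketch: maintaining connected components of a graph with UNION-FIND.
parent = {}

def makeset(v):
    parent[v] = v

def find(v):
    while parent[v] != v:
        v = parent[v]
    return v

def union(a, b):
    parent[find(a)] = find(b)

vertices = "abcdefghi"
# Hypothetical edges yielding the components {a,b,c}, {d,e}, {f,g,h,i}.
edges = [("a", "b"), ("b", "c"), ("d", "e"), ("f", "g"), ("g", "h"), ("h", "i")]

for v in vertices:
    makeset(v)

components = len(vertices)
for u, v in edges:
    if find(u) != find(v):   # a new edge joining two components...
        union(u, v)
        components -= 1      # ...reduces the component count by 1
```

The goal query "do f and i belong to the same component?" is then answered by comparing `find("f")` and `find("i")`.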
Data Structures for Disjoint Sets

Another Application:
Kruskal's minimum spanning tree algorithm relies on a UNION-FIND
data structure to maintain the components of the intermediate
spanning forest.

There are 3 methods:
  1. Reversed (unthreaded) trees: worst case O(log n)
  2. Shallow threaded trees: amortized O(log n)
  3. A combination of both the above ideas (path compression): almost constant
Reversed Trees (Method 1)

Sets are represented as trees, in which each node represents a single
element of the set.
e.g., {a, b, c, d}, {p, q, r}
Each node points to another node, called its parent, except for the
leader of each set, which points to itself and thus is the root of the
tree.
Reversed Trees

MAKESET is trivial and clearly takes Θ(1) time.
UNION requires only O(1) time in addition to the two FINDs.
The running time of FIND(x) is proportional to the depth of x in the
tree; in the worst case, FIND takes Θ(n) time.
Reversed Trees

An easy change to our UNION, called union by depth:
  Ensures trees always have logarithmic depth.
  When merging two trees, always make the root of the shallower tree a
  child of the deeper one.
  This requires us to also maintain the depth of each tree.
Reversed Trees

An easy change to our UNION, called union by depth:
  Prove by induction that for any set leader x, the size of x's set is at
  least 2^depth(x).
  Since there are only n elements altogether, 2^depth(x) ≤ n, so the
  maximum depth of any set is log n. We conclude that if we use union
  by depth, both FIND and UNION run in Θ(log n) time in the worst case.
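Union by depth can be sketched as below (illustrative code, not from the slides): alongside each root we store the depth of its tree, hang the shallower tree under the deeper root, and increase the depth only when the two depths are tied.

```python
# Sketch of reversed trees with union by depth.
parent, depth = {}, {}

def makeset(x):
    parent[x], depth[x] = x, 0

def find(x):
    while parent[x] != x:      # still a plain walk to the root
        x = parent[x]
    return x

def union(a, b):
    ra, rb = find(a), find(b)
    if ra == rb:
        return
    if depth[ra] < depth[rb]:  # hang the shallower tree under the deeper root
        parent[ra] = rb
    elif depth[rb] < depth[ra]:
        parent[rb] = ra
    else:                      # equal depths: the merged tree gets one deeper
        parent[rb] = ra
        depth[ra] += 1
```

Merging 8 singletons pairwise, for instance, yields a single tree of depth 3 = log 8, matching the 2^depth(x) ≤ n bound.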
Shallow Threaded Trees (Method 2)

Alternately, each set can be represented by a shallow tree, where the
leader is the root and all the other elements are its children.
MAKESET and FIND are completely trivial, and clearly run in Θ(1)
time.
The UNION algorithm sets all the leader pointers in one set to point to
the leader of the other set. (This requires us to 'thread' a linked list
through each set, starting at the set's leader.)
Shallow Threaded Trees

The worst-case running time of UNION is proportional to the size of
the larger set. Thus, if we merge a one-element set with another
n-element set, the running time can be Θ(n).
It is easy to come up with a sequence of n MAKESET and n − 1 UNION
operations that requires Θ(n^2) time to create the set {1, 2, . . . , n}
from scratch.
Shallow Threaded Trees

We can improve this
  - by reversing the order of arguments to UNION, or
  - by determining which set is smaller and always updating leader pointers
    in that set.
Shallow Threaded Trees

To fix this, we add a comparison inside the UNION algorithm to
determine which set is smaller. This requires us to maintain the size
of each set.
Shallow Threaded Trees

Theorem 1.
A sequence of m MAKEWEIGHTEDSET, WEIGHTEDUNION, and
FIND operations, n of which are MAKEWEIGHTEDSET, takes
O(m + n log n) time in the worst case.

Proof sketch: WEIGHTEDUNION rewrites the leader pointers of the
smaller set, so whenever the leader of an element x changes, the size
of the set containing x at least doubles. Hence, if x's leader has
changed k times, the set containing x has at least 2^k members. Since
the largest set contains at most n members, 2^k ≤ n, i.e., k ≤ log n.
Each of the m operations takes at least O(1) time, and there were n
MAKEWEIGHTEDSET calls, giving O(m + n log n) in total.

Then, we can see that each WEIGHTEDUNION has amortized cost
O(log n).
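The weighted-union scheme can be sketched as follows. For simplicity, the 'thread' through each set is kept as a Python list per leader, an illustrative stand-in for the linked list the slides describe; the names are mine.

```python
# Sketch of shallow threaded trees with weighted union: every element
# points directly at its leader, so FIND is one lookup, and UNION
# rewrites the leader pointers of the smaller set only.
leader, members = {}, {}

def make_weighted_set(x):
    leader[x] = x
    members[x] = [x]          # the 'thread' through x's set

def find(x):                  # O(1): one pointer lookup
    return leader[x]

def weighted_union(a, b):
    la, lb = find(a), find(b)
    if la == lb:
        return
    if len(members[la]) < len(members[lb]):
        la, lb = lb, la       # la now leads the larger set
    for x in members[lb]:     # rewrite pointers in the smaller set
        leader[x] = la
    members[la].extend(members[lb])
    del members[lb]
```

Since each rewritten element lands in a set at least twice as large, any element's leader pointer is rewritten at most log n times, which is exactly the doubling argument behind Theorem 1.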
Path Compression (Method 3)

Using unthreaded trees, FIND takes logarithmic time and everything
else is constant.
Using threaded trees, UNION takes logarithmic amortized time and
everything else is constant.
Is there a third method that allows us to get both of these operations
to run in almost constant time?
Path Compression

Key Observation
In any FIND operation (in the original unthreaded tree
representation), once we determine the leader of an object x, we can
speed up future FINDs by redirecting x's parent pointer directly to
that leader.
We can change the parent pointers of all the ancestors of x all the
way up to the root.
This modification to FIND is called path compression.
This is easiest if we use recursion for the initial traversal up the tree.
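The recursive FIND with path compression can be sketched directly (illustrative code): the recursion walks up to the leader, and on the way back every node on the find-path is re-pointed at that leader.

```python
# Sketch of FIND with path compression via recursion.
parent = {}

def makeset(x):
    parent[x] = x

def find(x):
    if parent[x] != x:
        parent[x] = find(parent[x])  # compress: point x straight at the leader
    return parent[x]

# Build a deliberately deep path 0 <- 1 <- 2 <- 3 <- 4 by hand.
for i in range(5):
    makeset(i)
for i in range(1, 5):
    parent[i] = i - 1

find(4)  # a single FIND flattens the whole path onto the leader 0
```

After this one call, every node on the path is a direct child of the leader, so repeating the same FIND returns in O(1) time.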
Path Compression

Union by Rank
Our earlier 'depth', used to keep the tree shallow, is no longer correct,
because path compression changes depths.
It still ensures that FIND runs in O(log n) time in the worst case.
So, we give it another name: rank. (A node's rank is what its depth
would have been if we had not applied path compression.)
Hence, the algorithm is called union by rank.
Path Compression

Union by Rank
But, we have good reason to suspect that the upper bound O(log n)
for FIND is no longer tight.
Our new algorithm memoizes the results of each FIND, so if we are
asked to FIND the same item twice in a row, the second call returns
in O(1) time.
Path Compression

Analysis of path compression

We prove that the amortized cost of FIND is bounded by the iterated
logarithm of n, denoted log* n, which is the number of times one
must take the logarithm of n before the value is less than 1:

  log* n = 1                 if n ≤ 2,
           1 + log*(log n)   otherwise.
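The recursive definition above can be transcribed directly (a small illustrative helper, not part of the slides):

```python
# Iterated logarithm, transcribed from the slide's recursive definition:
# log*(n) = 1 if n <= 2, and 1 + log*(log n) otherwise.
from math import log2

def log_star(n):
    if n <= 2:
        return 1
    return 1 + log_star(log2(n))
```

For example, log_star(65536) = 4, since 65536 → 16 → 4 → 2 takes only a few iterations of log; this is why the bound is 'almost constant' for all practical n.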
Analysis of disjoint-set forest with union by rank and path compression R. Inkulu

• Here we consider the disjoint-set forest with union by rank and path compression heuristics. We prove the
amortized time complexity of m number of make-set, union, find-set operations, in which n are make-set
operations, on an initially empty data structure is O(m lg∗ n).

• The rank of any node v of a disjoint-set forest is an upper bound on the height of v. For the sake of
completeness, the pseudocode from [CLRS] is listed at the end of this note. The following obvious
properties are derived from the pseudocode:

– None of the make-set, union, and find-set operations cause the rank of any node to decrease.
Only the link operation could change the rank of a node.
– If any node v becomes a child of another node, then from that point on, the rank of v won't change.
Hence, only ranks of tree roots could be modified by the link operation.
– The link operation increases the rank of T’s root by at most one. This increase is exactly one whenever
another tree T ′, whose root’s rank equals that of T ’s root, is linked as a child of T ’s root.
– For any node v, the ranks of nodes that occur along the simple path from v to root strictly increase.
– A node’s parent may change, or the parent’s rank may change: the former happens via a path
compression, whereas the latter occurs when the parent is a root and its rank got increased via a link
operation.
– Each union operation instantiates two find-set operations and at most one link operation. Hence, m
make-set, union, and find-set operations are effectively O(m) make-set, link, and find-set operations.

• Lemma 1: For any tree root v, the number of nodes in Tv is lower bounded by 2^{v.rank}.
- Proof is by induction on the number of link operations.

As part of the induction step, in linking a tree rooted at v ′ as a child of a tree rooted at v, there are two cases to consider: v ′.rank < v.rank
and v ′.rank = v.rank.

Lemma 2: If x is a non-root node in a tree rooted at v when v.rank is set to r, then from there on, x can
never be in a tree whose root’s rank gets set to r.
- If v is linked as a child of another root, then the new root’s rank is either already greater than r or is equal to r + 1 after linking.
When some other tree’s root is linked as a child of v, either v.rank remains the same or it increases by one. In the former case, the root of
x does not change, whereas in the latter, the root’s rank is greater than r.

That is, except for v, no root v ′ exists such that x is a descendant of v ′ and the rank of v ′ is r.

Theorem 1: In executing O(m) make-set, link, and find-set operations, in which n are make-set opera-
tions, for any non-negative integer r, there are at most n/2^r nodes of rank r.
- Suppose there are more than n/2^r nodes of rank r. Then, from Lemma 1 and Lemma 2, the total number of nodes in the disjoint-set
forest is at least (> n/2^r)(≥ 2^r), which is strictly greater than n.

Corollary: The rank of any node is at most ⌊lg n⌋.

- Substituting r′ > ⌊lg n⌋ in Theorem 1, the number of nodes of rank r′ is strictly less than 1.

• The iterated logarithm function: lg∗ n = min{i ≥ 0 : lg^(i)(n) ≤ 1} if n > 1, and 0 otherwise.

This function grows very slowly; among standard functions, only the inverse Ackermann function grows more slowly.

For any non-negative integer r, r is said to be in block-i whenever lg∗ r = i. A node v is in block-i if the
rank of v is in block-i. We say the block id of block-i is i. Since node ranks are integers in [0, ⌊lg n⌋],
block ids are integers in [0, (lg∗ n) − 1].

• It is immediate that the n make-set operations together take O(n) time, and the O(m) link operations
together take O(m) time. ——— (1a)

The find-set is essentially a find-path together with path compression. Since the time for path compression
can be charged to the number of nodes visited in a find-path, the time complexity of a find-set operation is
the number of nodes visited in the corresponding find-path. To analyze the amortized time complexity of all
the find-paths among the O(m) operations, we categorize nodes along any find-path P :

(i) root and its child on P (these are the nodes whose parents won’t change due to a find-path),
(ii) every node v on P whose parent belongs to a different block to v, and
(iii) every node v on P whose parent belongs to the same block as v.

Since there are O(m) find-paths and each such path has at most two nodes of type-(i), the amortized cost
of accessing all type-(i) nodes together is O(m). ——— (1b)
Since block ids are in [0, (lg∗ n) − 1] and since the ranks of nodes along any find-path increase, there are at
most lg∗ n nodes of type-(ii) along any find-path. Since there are O(m) find-paths, the amortized cost of
accessing all type-(ii) nodes together is O(m lg∗ n). ——— (1c)
From here on, we focus on upper bounding the total number of type-(iii) nodes visited due to O(m)
find-path operations.

• Once v is determined to be a type-(ii) node, it continues to be a type-(ii) node in subsequent find-
paths as well. This is because v’s parent’s rank either remains the same or increases; in both cases,
v’s parent stays in a different block from v.

Indeed, for a node v with its rank belonging to block-i, the worst case arises when the following two
events occur alternately: a find-path on v and linking root of the tree in which v resides as a child of
another root. Again, in the worst case, with each such find-path on v, v’s parent’s rank could increase.
Since v’s parent’s ranks strictly increase, eventually, the rank of parent of v could belong to a block that
is different from the block to which v belongs. From the definition of type-(ii) nodes, when this happens,
v becomes a node of type-(ii).

Since the number of type-(ii) nodes is upper bounded, it suffices to account for how many times any
type-(iii) node v could be visited among O(m) find-paths before v becomes a type-(ii) node.

• The total number of type-(iii) node visits, over all the O(m) find-paths with n make-sets and O(m) link
operations, equals

  Σ_{i=0}^{(lg∗ n)−1} (number of nodes whose ranks are in block-i) × (for any node v in block-i, the
  maximum number of times v’s parent’s rank is incremented by one while v’s parent’s rank continues to
  lie in block-i). ——— (2)

Let minr_i be the minimum rank possible in block-i, and let maxr_i be the maximum rank possible in
block-i. From Theorem 1, the first term of (2) is at most

  n/2^{minr_i} + n/2^{minr_i + 1} + . . . + n/2^{maxr_i} < n/2^{minr_i − 1} = n/maxr_i .

The last equality is due to the following: since the maximum rank possible in any block is a tower of
2s, 2^{maxr_{i−1}} = maxr_i ; however, minr_i = maxr_{i−1} + 1.

The second term of (2) is maximized if v has rank minr_i and its parent’s ranks increase amid find-paths
in increments of one, from minr_i + 1 to maxr_i .

Hence, (2) is at most Σ_{i=0}^{(lg∗ n)−1} (n/maxr_i) · (maxr_i − (minr_i + 1) + 1) = O(n lg∗ n).

• Combining (1a), (1b), (1c), with (2), the amortized time complexity of m make-set, union, find-set oper-
ations in which n are make-set operations is O(m lg∗ n).

References:
J. E. Hopcroft and J. D. Ullman. Set Merging Algorithms. SIAM Journal on Computing, 2(4):294–303,
1973. (This note only covers pages 7–8 of this paper.)

Appendix

make-set(x):
  set x as x’s parent
  initialize x.rank to 0

union(x′, y′):
  if (x ← find-set(x′)) ≠ (y ← find-set(y′)) then
    link(x, y)

link(x, y):
  if x.rank < y.rank then
    link x as a child of y
  else
    link y as a child of x
    if x.rank is equal to y.rank then
      increase the rank of x by one

find-set(x):
  foreach node x′ on the simple path from x to root v do
    // visiting nodes along this path is called a find-path on x
    make x′ a child of v   // called path compression
  end
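The appendix pseudocode can be rendered as runnable code. The following sketch follows its structure (union by rank plus path compression); the class and method names are mine, while the parent/rank fields mirror the note.

```python
# Disjoint-set forest with union by rank and path compression,
# following the appendix pseudocode.
class DisjointSetForest:
    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, x):
        self.parent[x] = x        # x is its own parent, i.e. a root
        self.rank[x] = 0

    def find_set(self, x):
        if self.parent[x] != x:
            # Path compression: repoint x directly at the root.
            self.parent[x] = self.find_set(self.parent[x])
        return self.parent[x]

    def link(self, x, y):
        if self.rank[x] < self.rank[y]:
            self.parent[x] = y    # x becomes a child of y
        else:
            self.parent[y] = x    # y becomes a child of x
            if self.rank[x] == self.rank[y]:
                self.rank[x] += 1 # tied ranks: the surviving root's rank grows

    def union(self, xp, yp):
        x, y = self.find_set(xp), self.find_set(yp)
        if x != y:
            self.link(x, y)
```

Note that only roots ever have their rank incremented, and each FIND flattens its entire find-path, which is exactly the structure the amortized O(m lg∗ n) analysis above relies on.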
