# Union-Find and Adapter Design Pattern Earlier today I finished reading chapter 1.5 Case Study: Union-Find in the Algorithms Fourth Edition book by Robert Sedgewick and Kevin Wayne published by Addison-Wesley. Not sure if the authors recommend reading the chapters in order because they tend to use previous material. Last week I was browsing the table of contents and decided to read chapter 4.1. It mentions contents of chapter 1.5. From now on, I will read the book in order ;o)

It is always good to work on a single topic. I am breaking that rule to some extent. I will cover two very similar implementations of the Union-Find algorithm from chapter 1.5. It does give an introductory flavor for what is to come in chapter 4.1 Undirected Graphs.

This chapter covers the data structure and associated algorithm to determine if a set of components are connected due to a union operation. Always start with an API and test code. The API for this algorithm is defined in page 219 and summarized as follows:

 Method Description UF(int N) Initialize N sites with integer names (0 to N – 1) void union(int p, int q) Add connection between p and q int find(int p) Find component identifier for p (0 to N – 1) boolean connected(int p, int q) Return true if p and q are in the same component int count() Number of components

When solving a problem always think on the algorithm and the data structure to use. They tend to go hand in hand as is illustrated in this chapter.

The chapter covers in incremental order, how the algorithm is refined and the data structure updated to support it. In this blog I skip the first implementation (quick-find) and jump to the second (quick-union) and third (weighted quick-union) ones. If interested, I strongly recommend getting and reading the book.

As I mentioned, I typically like to adhere to a single concept per blog post. What motivated me to divert from this approach was that after implementing quick-find I reused the code and implemented quick-union. At that point I decided to copy and paste it in the test code and implement the weighted quick-union. To compare them I decided to use the Adapter Design Pattern. Sorry about this. I will avoid it in the future ;o) I took a quick refresher at the Adapter design pattern on-line and read the associated topic (Adapter page 139 in chapter 4 Structural Patterns in Design Patterns – Elements of Reusable Object-Oriented Software by Erich Gamma et al). I like to keep handy this Design Patterns book. Following are the results of the two last implementations of Union-Find as displayed in the console of the Eclipse IDE:

main <<< N: 625

main <<< elapsedTime: 31 ms

main <<< count: 3 components

main <<< N: 625

main <<< elapsedTime: 10 ms

main <<< count: 3 components

The input file found in the associated Algorithms book web site contains 625 sites (numbers from 0 to 624). The first algorithm with the same data takes 31 ms to execute. The second algorithm takes 10 ms to process the same data. I understand that a single run should never be used, but in this case it seems to run about three times faster. In general, at a minimum run five passes. Drop the fastest and the slowest. Average the remaining three.

The following output is when a larger sample was used:

main <<< N: 1000000

main <<< elapsedTime: 136074 ms

main <<< count: 780805 components

main <<< N: 1000000

main <<< elapsedTime: 688 ms

main <<< count: 780805 components

The first algorithm takes over two minutes while the second one takes less than one second!!! That is a huge improvement. The book talks about the results and how to interpret them. It is worthwhile reading it to better understand the implications of algorithm development and associated data structures. Algorithms like this took many computer scientists many years to define and refine.

The Java code for the test follows:

package edu.princeton.cs.algs4;

import java.io.File;

import java.io.FileNotFoundException;

import java.io.IOException;

import java.util.Scanner;

public class Solution {

/*

* test module

*/

public static void main(String[] args) throws IOException {

// **** set the name for the file ****

//            final String fileName      = “c:\\temp\\tinyUF.txt”;

final String fileName      = “c:\\temp\\mediumUF.txt”;

//            final String fileName      = “c:\\temp\\largeUF.txt”;

// **** check if the file is present and is not a directory ****

File file = new File(fileName);

if (!file.exists() || file.isDirectory()) {

throw new FileNotFoundException(“main <<< fileName ==>” + fileName + “<==”);

}

// **** open the file  ****

Scanner sc    = new Scanner(file);

// **** read the number of components ****

int    N             = sc.nextInt();

System.out.println(“main <<< N: ” + N);

// **** ****

long startTime             = System.currentTimeMillis();

UnionFindQuick uf    = new UnionFindQuick(N);

while (sc.hasNext()) {

int p  = sc.nextInt();

int q  = sc.nextInt();

// **** if connected (skip) ****

if (uf.connectedQ(p,  q)) {

continue;

}

// **** combine components ****

uf.unionQ(p, q);

}

long endTime = System.currentTimeMillis();

// **** close the file ****

sc.close();

// **** ****

System.out.println(“main <<< elapsedTime: ” + (endTime – startTime) + ” ms”);

System.out.println(“main <<< count: ” + uf.countQ() + ” components”);

// **** open the file  ****

sc = new Scanner(file);

// **** read the number of components ****

N = sc.nextInt();

System.out.println(“main <<< N: ” + N);

// **** ****

startTime                         = System.currentTimeMillis();

UnionFindWeight ufw        = new UnionFindWeight(N);

while (sc.hasNext()) {

int p  = sc.nextInt();

int q  = sc.nextInt();

// **** if connected (skip) ****

if (ufa.connectedQ(p,  q)) {

continue;

}

// **** combine components ****

ufa.unionQ(p, q);

}

endTime = System.currentTimeMillis();

// **** close the file ****

sc.close();

// **** ****

System.out.println(“main <<< elapsedTime: ” + (endTime – startTime) + ” ms”);

System.out.println(“main <<< count: ” + ufa.countQ() + ” components”);

}

}

The data file is opened and close twice. In between, the same set of steps is performed.

The following file implements the Union-Find Quick algorithm and data structure:

package edu.princeton.cs.algs4;

public class UnionFindQuick {

/*

* members

*/

private int   id[];

private int   count;

/*

* constructor

*/

public UnionFindQuick(int N) {

this.count    = N;

this.id       = new int[N];

for (int i = 0; i < N; i++) {

this.id[i] = i;

}

}

/*

* add connection between p and q

*/

public void unionQ(int p, int q) {

int pID       = findQ(p);

int qID = findQ(q);

// **** check if p and q are in the same component ****

if (pID == qID) {

return;

}

// **** change value of component (arbitrary) ****

for (int i = 0; i < id.length; i++) {

if (id[i] == pID) {

id[i] = qID;

}

}

// **** decrement count of components ****

count–;

}

/*

* component identifier for p (0 to N – 1)

*/

public int findQ(int p) {

return this.id[p];

}

/*

* return true if p and q are in the same component

*/

public boolean connectedQ(int p, int q) {

return findQ(p) == findQ(q);

}

/*

* number of components

*/

public int countQ() {

return this.count;

}

}

The following source code implements the Weighted Union-Find algorithm:

package edu.princeton.cs.algs4;

public class UnionFindWeight {

/*

* members

*/

private int   id[];

private int   sz[];

private int   count;

/*

* constructor

*/

public UnionFindWeight(int N) {

this.count    = N;

this.id       = new int[N];

for (int i = 0; i < N; i++) {

this.id[i] = i;

}

this.sz              = new int[N];

for (int i = 0; i < N; i++) {

sz[i] = 1;

}

}

/*

* add connection between p and q

*/

public void unionW(int p, int q) {

int i  = findW(p);

int j = findW(q);

// **** check if p and q are in the same component ****

if (i == j) {

return;

}

// **** make smaller root point to larger one ****

if (sz[i] < sz[j]) {

id[i] = j;

sz[j] += sz[i];

} else {

id[j]  = i;

sz[i]  += sz[j];

}

// **** decrement count of components ****

count–;

}

/*

* component identifier for p (0 to N – 1)

*/

public int findW(int p) {

// **** go up the tree to the root ****

while (p != id[p]) {

p = id[p];

}

// **** return the root ****

return p;

}

/*

* return true if p and q are in the same component

*/

public boolean connectedW(int p, int q) {

return findW(p) == findW(q);

}

/*

* number of components

*/

public int countW() {

return this.count;

}

} Last but not least, here is the implementation for the UnionFindAdapter class:

package edu.princeton.cs.algs4;

// **** ****

UnionFindWeight ufw;

/*

*  constructor

*/

this.ufw = ufw;

}

/*

*

*/

public void unionQ(int p, int q) {

ufw.unionW(p, q);

}

/*

*

*/

public int findQ(int p) {

return ufw.findW(p);

}

/*

*

*/

public boolean connectedQ(int p, int q) {

return ufw.connectedW(p, q);

}

/*

*

*/

public int countQ() {

return ufw.countW();

}

}

I also put together the following diagram that illustrates the use and implementation of the Adapter Design Pattern: If you have comments or questions regarding this or any other post in this web site, please do not hesitate and send me an email message.

John

john.canessa@gmail.com

This site uses Akismet to reduce spam. Learn how your comment data is processed.