Design HashMap – 4 of 4

Today is the last Wednesday of 2020. New year 2021 will be here this Friday. My best wishes for the New Year. It seems that most of us will be receiving the COVID-19 vaccine in 2021 so we should be able to reach worldwide heard immunity. We need to learn and understand how the pandemic started and make sure we do not allow this to repeat. Millions human beings have died due to COVID-19 and the economies of the world have been severely impacted. Let’s make sure we work together to make it happen.

On a side note one of my wife sisters and family live in Israel. Israel has a population of about 9M people. She and her family got the first COVID-19 Pfizer vaccine earlier today. None experienced side effects. The Israeli government plans of having the entire population vaccinated in 10 days. They want to achieve herd immunity as soon as possible to get back to a normal life. Good for them!!!

I am working on some code for work, so after breakfast, will take a shower and will get the task completed today. It will be a long and stressful day.

I apologize if I have spent too much time and four posts (including this one) on the design and implementation of a hash map. To be honest with you, I was planning on one more pass but I had enough. Hopefully it should become obvious how to make that final pass on your own. I will just provide the steps and the reasoning behind.

This post covers the design and implementation of a HashMap using Java. The original problem comes courtesy of LeetCode 706. Design HashMap. I will use a Windows 10 computer and will develop the code using the VSCode IDE.

Design a HashMap without using any built-in hash table libraries.

To be specific, your design should include these functions:

o put(key, value) : Insert a (key, value) pair into the HashMap.
  If the value already exists in the HashMap, update the value.
o get(key): Returns the value to which the specified key is mapped, 
  or -1 if this map contains no mapping for the key.
o remove(key) : Remove the mapping for the value key if this map contains the mapping for the key.

Note:

o All keys and values will be in the range of [0, 1000000].
o The number of operations will be in the range of [1, 10000].
o Please do not use the built-in HashMap library.

We have seen the requirements several times. For this reason we will just continue.

main <<< hm: [c:4 e:1 l:0.25] (1,1) 
main <<< hm: [c:4 e:2 l:0.5] (1,1) (2,2)
main <<< hm.get(1): 1
main <<< hm.get(3): -1
main <<< hm: [c:4 e:2 l:0.5] (1,1) (2,1)
main <<< hm.get(2): 1
main <<< hm: [c:4 e:1 l:0.25] (1,1)
main <<< hm.get(2): -1
main <<< hm: [c:4 e:2 l:0.5] (1,1) (2,4)
main <<< hm: [c:8 e:3 l:0.375] (1,1) (2,4) (3,6)
main <<< hm: [c:8 e:4 l:0.5] (1,1) (2,4) (3,6) (4,8)
main <<< hm: [c:8 e:5 l:0.625] (1,1) (2,4) (3,6) (4,8) (5,10)
main <<< hm: [c:16 e:6 l:0.375] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12)
main <<< hm: [c:16 e:7 l:0.4375] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14)
main <<< hm: [c:16 e:8 l:0.5] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16)
main <<< hm: [c:16 e:9 l:0.5625] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18)
main <<< hm: [c:16 e:10 l:0.625] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20)
main <<< hm: [c:16 e:11 l:0.6875] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22)
main <<< hm: [c:32 e:12 l:0.375] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22) (12,24)
main <<< hm: [c:32 e:13 l:0.40625] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22) (12,24) (13,26)
main <<< hm: [c:32 e:14 l:0.4375] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22) (12,24) (13,26) (14,28)
main <<< hm: [c:32 e:15 l:0.46875] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22) (12,24) (13,26) (14,28) (15,30)        
main <<< hm: [c:32 e:16 l:0.5] (1,1) (2,4) (3,6) (4,8) (5,10) (6,12) (7,14) (8,16) (9,18) (10,20) (11,22) (12,24) (13,26) (14,28) (15,30) (16,32) 

This is not the output of the sample code provided by LeetCode. That said; the first few lines are and they seem to match the expected output.

After the last get() operation, we will enter a loop inserting key:value pairs into our HashMap. Note that when we display the contents we have additional information being displayed.

The ‘c’ is associated with the capacity of the HashMap. It seems we start with a value of 4 and when we get to the end of the test, the capacity has been increased to 32.

The ‘e’ is associated with the number of entries in our HashMap. Note that this value matches the number of key:value pairs in the HashMap.

The ‘l’ is associated with a decimal. It describes the load of the HashMap. As we increase the number of entries, the load increases. At some point the capacity increases causing the load factor to decrease.

At this time you might wish to take a piece of paper and pen and draw the array with the entries and figure out how it is growing. More important is to figure out how the entries in the map are placed after the map has grown in capacity. I will be here waiting for you …

… Glad that you came back. Now let’s take a look at the test scaffolding implemented in the following class:

/**
 * Design a hashmap in Java.
 * Fourth and last approach.
 * Please note that the related project names are NOT in ascending order.
 */
public class DesignHashMap3 {

    /**
     * Test scaffolding
     * 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {

        /*
        // !!!! experiment with the BST !!!!
        // **** open a buffered reader ****
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // **** create and populate an array with input line ****
        int arr[] = Arrays.stream(br.readLine().split(",")).mapToInt(Integer::parseInt).toArray();

        // ???? ????
        System.out.println("main <<< arr: " + Arrays.toString(arr));

        // **** close the buffered reader ****
        br.close();

        // **** shuffle the array ****
        Integer[] tmp       = Arrays.stream(arr).boxed().toArray(Integer[]::new);
        List<Integer> lst   = Arrays.asList(tmp);
        Collections.shuffle(lst);
        int[] sarr          = lst.stream().mapToInt(i -> i).toArray();

        // **** use array instead of sorted array ****
        // int[] sarr = Arrays.copyOf(arr, arr.length);

        // ???? ????
        System.out.println("main <<< sarr: " + Arrays.toString(sarr));

        // **** root for BST ****
        BST tree = new BST();

        // **** populate BST ****
        for (int key : sarr) {
            tree.insert(key, key * 2);
        }

        // **** in-order traversal of the BST ****
        System.out.print("main <<< inOrder: ");
        tree.inOrder();

        // **** display the depth of the BST ****
        System.out.println("main <<< depth: " + tree.depth());

        // **** delete all nodes from the BST ****
        for (int k = -1; k <= arr.length; k++) {

            // ???? ????
            System.out.println("main <<< k: " + k);

            // **** delete this node ****
            tree.delete(k);

            // **** in-order traversal of the BST ****
            System.out.print("main <<< inOrder: ");
            tree.inOrder();

            // **** display the depth of the BST ****
            System.out.println("main <<< depth: " + tree.depth());   
        }
                
        // !!!! experiment with the hashmap !!!!
        System.out.println();
        */


        // **** create my hash map ****
        MyHashMap hm = new MyHashMap();
     
        // **** put key:value pair ****
        hm.put(1, 1);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());

        // **** put key:value pair ****
        hm.put(2, 2);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(1): " + hm.get(1));
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(3): " + hm.get(3));
     
        // **** put key:value pair ****
        hm.put(2, 1);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(2): " + hm.get(2));
     
        // *** remove key:value pair for specified key ****
        hm.remove(2);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(2): " + hm.get(2));

        // **** put key:value pairs ****
        for (int key = 2; key <= 16; key++) {

            hm.put(key, key * 2);

            // ???? ????
            System.out.println("main <<< hm: " + hm.toString());
        }
    }
  }

Note that I have commented out the part that I was using in the previous associated posts that was responsible of loading integers into our HashMap. What I have left in are the sample operations that LeetCode provided to illustrate how our code will be tested.

After the provided test code, we generate a loop and insert some key:value pairs. Keep in mind that the main topic in this port is to grow our HashMap in order to keep get() and put() operations at O(1) execution time. This is not fully achievable because now and then we will need to grow and shrink the capacity of the HashMap.

In this post we will not deal with shrinking the capacity of the HashMap. I leave it as an exercise.

/**
 * Represents a node in the binary search tree (BST).
 */
public class BSTNode {

    // **** class members ****
    int     key;
    int     value;
    BSTNode left;
    BSTNode right;

    /**
     * Constructor
     */
    BSTNode() {
    }

    /**
     * Constructor
     */
    BSTNode(int key, int value) {
        this.key    = key;
        this.value  = value;
    }

    /**
     * Constructor
     */
    BSTNode(int key, int value, BSTNode left, BSTNode right) {
        this.key    = key;
        this.value  = value;
        this.left   = left;
        this.right  = right;
    }

    /**
     * Return a string representation
     */
    @Override
    public String toString() {
        return "(" + this.key + "," + this.value + ") ";
    }
}

I believe the BSTNode class is the same as we had in the previous post in this series. Of importance is that we keep the key:value pairs in this objects which are then inserted or removed in the buckets of the HashMap. Please keep the purpose of these nodes in mind as we move forward.

/**
 * BST class
 */
public class BST {

	// **** root of BST ****
	BSTNode	tree;

    /**
     * constructor
     */
	public BST() {
		this.tree = null;
	}
	
	/**
	 * constructor
	 */
	public BST(int key, int value) {
		this.tree = new BSTNode(key, value);
	}

	/**
	 * Insert value into BST.
	 * Entry point for recursive method.
	 */
	public BSTNode insert(int key, int value) {

		// **** check if tree is empty ****
		if (this.tree == null) {
			this.tree = new BSTNode(key, value);
			return this.tree;
		}

		// **** look for the place in the BST to insert the new node ****
		insert(this.tree, key, value);

		// **** return the BST ****
		return this.tree;
	}

	/**
	 * Insert value into BST.
     * Recursive call.
	 */
	private void insert(BSTNode root, int key, int value) {

		// **** add this node as the root of the BST ****
		if (root == null) {
			root = new BSTNode(key, value);
			return;
		}
		
		// **** deal with the left child ****
		if (key <= root.key) {
			if (root.left == null) {
				root.left = new BSTNode(key, value);
				return;
			} else {
				insert(root.left, key, value);
			}
		}
		
		// **** deal with the right child ****
		else {
			if (root.right == null) {
				root.right = new BSTNode(key, value);
				return;
			}
			else {
				insert(root.right, key, value);
			}
		}
	}

	/**
	 * In-order tree traversal.
	 * Entry point for recursive method.
	 */
	public void inOrder() {
		inOrder(this.tree);
		System.out.println();
	}

	/**
	 * In-order BST traversal.
	 * Recursive call
	 */
    private void inOrder(BSTNode node) {
        if (node != null) {
            inOrder(node.left);
            System.out.print(node.toString());
            inOrder(node.right);
        }
	}

	/**
	 * Find the depth of a BT.
	 * Entry point for recursive call
	 */
	public int depth() {
		return depth(this.tree);
	}
	
	/**
	 * Determine the depth of a BT.
	 * Recursive call
	 */
	private int depth(BSTNode root) {

		// **** end condition ****
		if (root == null)
			return 0;

		// **** ****
		return Math.max(depth(root.left) + 1, depth(root.right) + 1);
	}

	/**
	 * find the in order successor (min value)
	 */
	private BSTNode inOrderSuccessor(BSTNode root) {
		if (root.left == null)
			return root;
		else
			return inOrderSuccessor(root.left);
	}

	/**
	 * Search for node and delete it from the BST.
	 * Entry point for recursive call.
	 */
	public BSTNode delete(int key) {

		// **** ****
		BSTNode node = delete(this.tree, key);

		// **** update BST root (if needed) ****
		if (tree != null && tree.key == key)
			tree = node;

		// **** return BST root node ****
		return node;
	}

	/**
	 * Search for node and delete it from the BST
	 * Recursive call.
	 */
    private BSTNode delete(BSTNode root, int key) {
     
        // **** check for a null root ****
        if (root == null) {
            return root;
        }
         
        // **** search for the node on the left side of the BST ****
		if (key < root.key) {
			root.left = delete(root.left, key);
		}

		// **** search for the node on the right side of the BST **** 
		else if (key > root.key) {
            root.right = delete(root.right, key);
        }
         
        // **** found node to delete ****
        else {
                         
            // **** node to delete has both children ****
            if ((root.left != null) && (root.right != null)) {
                 
                // **** save the root (being deleted) ****
                BSTNode temp = root;        
                 
                // **** find the in order successor (min val on right tree) ****
                BSTNode node = inOrderSuccessor(temp.right);
 
                // **** replace the value of the root with one from the node ****
				root.key 	= node.key;
				root.value 	= node.value;
                     
                // **** delete the node ****
                root.right = delete(root.right, root.key);
            }
 
            // **** node to delete only has a right child ****
            else if (root.left == null) {
                return root = root.right;
            }
 
            // **** node to delete only has left child ****
            else if (root.right == null) {
                return root = root.left;
            }
        }
         
        // **** return root node ****
        return root;
	}
	

	/**
	 * In-order BST traversal.
	 * Collects in a list the nodes in the specified BST.
	 * Recursive call.
	 */
    public void toList(BSTNode node, List<BSTNode> lst) {
        if (node != null) {

			// **** traverse left sub tree ****
			toList(node.left, lst);
			
			// **** add node to list ****
			lst.add(node);
			
			// **** traverse right tree ****
            toList(node.right, lst);
        }
	}

	/**
	 * Look for the specified node in the BST.
	 * Entry point for recursive call.
	 */
	public BSTNode contains(int key) {

		// **** root of BST ****
		BSTNode node = this.tree;

		// **** check if key in BST ****
		return contains(node, key);
	}

	/**
	 * Look for the specified node in the BST.
	 * Recursive call.
	 */
	private BSTNode contains(BSTNode node, int key) {

		// **** end condition ****
		if (node == null)
			return null;

		// **** found node with key ****
		if (node.key == key)
			return node;

		// **** recurse left ****
		BSTNode f = contains(node.left, key);

		// **** node found ****
		if (f != null)
			return f;

		// **** recurse right ****
		f = contains(node.right, key);

		// **** return node (if found) ****
		return f;
	}
}

The BST class is used to implement the buckets of the map in the HashMap. In our last implementation we saw how by implementing buckets we can address key collisions. We could have done it via a linked list, but the traversal of a linked list is in O(n) and the traversal of a BST is O(log(n)) which is much faster but not O(1) as we would like it to be. Once again, it is important to understand the use of the BST as buckets in our HashMap implementation.

I do not recall making changes to this class either so let’s move on. If you need additional information regarding a specific method, please leave me a note bellow and I will respond as soon as possible.

/**
 * Designs and implements a HashMap without using 
 * any built-in hash table libraries.
 * 
 * Runtime: 13 ms, faster than 73.67% of Java online submissions
 * Memory Usage: 43.2 MB, less than 74.40% of Java online submissions
 */
class MyHashMap {
  
    // **** constant(s) ****
    final int INITIAL_CAPACITY      = 4;        // 4000 used in LeetCode
    final double LOAD_FACTOR        = 0.75;
    final double SHRINK_LOAD_FACTOR = 0.4;

    // **** class members ****
    private BST[]   map;
    public int      capacity;
    public int      entries;

    // **** constructor ****
    public MyHashMap() {
        this.map        = new BST[INITIAL_CAPACITY];
        this.capacity   = INITIAL_CAPACITY;
        this.entries    = 0;
    }

    // **** map a key to a bucket number ****
    private int hashFc(int key) {
        return key & (capacity - 1);
    }

    // **** computes load of hashmap ****
    private double load() {
        return (double)this.entries / (double)this.capacity;
    }

    // **** key will always be non-negative ****
    public void put(int key, int value) {
 
        // **** ****
        int entCnt = this.entries;

        // **** compute the bucket number ****
        int bucket = hashFc(key);

        // **** set the key value pair in the map ****
        if (this.map[bucket] == null) {
            this.map[bucket] = new BST(key, value);
            this.entries++;
        } else {

            // **** check if this bucket contains the key ****
            BSTNode f = this.map[bucket].contains(key);

            // **** update value associated with existing key - value pair ****
            if (f != null) {
                f.value = value;
            }

            // **** insert new key - value pair ****
            else {
                this.map[bucket].insert(key, value);
                this.entries++;
            }
        }

        // **** check the load of the map and increase capacity (if needed) ****
        if (this.entries != entCnt) {
            grow();
        }
    }

    // **** returns the value to which the specified key is mapped, 
    //      or -1 if this map contains no mapping for the key ****
    public int get(int key) {
 
        // **** compute the bucket number ****
        int bucket = hashFc(key);

        // **** return the associated value ****
        if (this.map[bucket] == null) {
            return -1;
        } else {

            // **** get node associated with specified key ****
            BSTNode n = this.map[bucket].contains(key);

            // **** key:value pair not in bucket ****
            if (n == null)
                return -1;

            // **** ****
            return n.value;
        }
    }

    // **** remove the mapping of the specified value key 
    //      if this map contains a mapping for the key ****
    public void remove(int key) {
 
        // **** compute the bucket number ****
        int bucket = hashFc(key);

        // **** check if no key:value pairs in this bucket ****
        if (this.map[bucket] == null)
            return;

        // **** remove key:value pair node from bucket (if contained) ****
        this.map[bucket].delete(key);

        // **** decrement count of entries ****
        this.entries--;

        // **** check and shrink the map (if needed) ****
        shrink();
    }

    // **** check the load of the map and increase the capacity (if needed) ****
    private void grow() {

        // **** determine if we need to grow the map ****
        double load = load();
        if (load < LOAD_FACTOR)
            return;

        // **** grow the map ****
        this.map = Arrays.copyOf(this.map, this.capacity *= 2);
    }

    // **** shrinking the capacity of the map left
    //      as an exercise for the ambitious reader ****
    private void shrink() {

        // **** determine if we need to shrink the map ****
        double load = load();
        if (this.capacity >= INITIAL_CAPACITY && load < SHRINK_LOAD_FACTOR)
            return;
    }

    // **** for testing purpose only ****
    @Override
    public String toString() {

        // **** initialization ****
        StringBuilder sb    = new StringBuilder();
        List<BSTNode> lst   = new LinkedList<BSTNode>();

        // **** collect hashmap info ****
        sb.append("[c:" + this.capacity);
        sb.append(" e:" + this.entries);
        sb.append(" l:" + load() + "] ");

        // **** traverse the map ****
        for (int bucket = 0; bucket < map.length; bucket++) {

            // **** skip BST if null ****
            if (this.map[bucket] == null)
                continue;

            // **** collect nodes into list ****
            this.map[bucket].toList(this.map[bucket].tree, lst);
        }

        // **** comparator by key ****
        Comparator<BSTNode> compareByKey = (BSTNode n1, BSTNode n2) -> {
            if (n1.key > n2.key)
                return 1;
            else if (n1.key < n2.key)
                return -1;
            else 
                return 0;
        };

        // **** sort list ****
        Collections.sort(lst, compareByKey);

        // **** build string ****
        lst.forEach( n -> {
            sb.append(n.toString());
        });

        // **** return string ****
        return sb.toString();
    }
}

This is where most of the changes were made. Note the INITIAL_CAPACITY is set to 4. This is done to force resizing in a short time period. The class members include a number of entries. Note that the map[] is of type BST. This has not been changed from the previous pass.

The hashFc() method has changed in order to take into account the capacity. This will become clear when we see how the HashMap grows.

The load() method is used to return the percentage of the map [] that is currently being used. Please take a moment to look at the ‘l’ values as we grow the map in the test output.

The put() method is similar to the previous version. Note that we now store the entry count and use it in the last statements to verify if a new entry was added or not. If so, we check and grow the map[] as needed.

The get() method has not changed from the previous version.

The remove() method has a new line to invoke the shrink() method. The shrink method does nothing at this time. It should check if we can reduce the capacity of the map[] array and if needed do so. I am leaving the implementation of the shrink() method as an exercise for you.

The grow() method is new. We first check if the current load in the map[] is big enough to increase the capacity of the map[] array. If needed, we do it in a single line of code.

At this time please reflect on how the grow() method works. Think about the use of the value returned by the load() method.

Moving on, the shrink() method is used to reduce the capacity of the map[]. I decided to leave it as an exercise. I have spent a lot of time and four posts on the HashMap problem. Enough is enough.

Not much to say about the toString() method. It is there so we can look at the contents of the HashMap. As you can see a lot of the work is done to process the buckets in the map[] array. Do you have any thoughts on this?

I guess that is it for HashMap implementations. I have something else to add. Years ago I implemented a HashMap using the C programming language for a product that is currently operational in the cloud. This was my personal motivation to implement a HashMap using the Java programming language.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to help out with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private e-mail message. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, become proficient, refresh your knowledge and enhance your developer toolset.

One last thing, many thanks to all 5,333 subscribers to this blog!!!

Keep safe during the COVID-19 pandemic and help restart the world economy. I believe we can all see the light at the end of the tunnel.

Regards;

John

john.canessa@gmail.com

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.