Design HashMap – Third approach

Good day and hopefully you had a nice long weekend.

It seems that COVID-19 vaccines have been inoculated to about two million people in the USA. Apparently the vaccine is being well received by recipients. The issue at this point seems to be able to determine the efficacy against different virus mutations and how long our bodies will maintain resistance to it. In my humble opinion these two factors will take time to determine and must be backed up the actual (non edited) and unbiased data.

My sister sent me a link to the article “History isn’t just for patriots” by Daniel Immerwahr published by The Washington Post. I enjoy reading (mostly computer science related books, papers and articles) but also enjoy reading different subjects in order to understand the world which is inhabited by humans. In general we are extremely complex and hard to understand.

Reading the article I detected a taste of political bias. If you are interested I encourage you to read the article and form your own opinion.

At some point the author makes a comparison between history and geometry, in specific to the Pythagorean Theorem. I asked my wife when was the last time she recalls using it, if any, and for what purpose. She had no recollection of ever using it outside the geometry class while attending middle school. On the other hand, I use it often due that I design and develop software and have interest in algorithms and data structures among other computer science subjects.

On the other hand, we both have read several history books from different countries and at different times. To me unbiased history, which is very hard to find, provides facts that we can process and formulate an idea of what had happened in a certain part of the world during a specific period in time. With that information I hope to be able to better understand opinions from others. It is a fact that the winners write history.

Now a day we have a different type of influence. It comes in different forms and sources. For example I read the article in question that was published by The Washington Post and was sent to me by my sister who currently is living and teaching in China. After understanding the sources I need to read the article in an unbiased manner attempting to understand the set of points the author is providing and to understand the logic behind them. I have made my opinion and hopefully if you have a chance to read the article, you will also be able to generate your own educated opinion.

OK, enough chitchat. Let’s get to the main subject for this post.

I have never before spent so much time on a topic in this blog. I have studied and worked with different HashMap implementations. We are dealing with the third one. I am sure there are many others with yet better approaches. Let’s take a look at this approach and later discuss the advantages and disadvantages. The idea is not to implement your own HashMap. In my humble opinion in most cases it is better to use the HashMap implementation provided by the programming language or by a well known library.

We will attempt a new pass on the LeetCode 706. Design HashMap problem. So far both of the previous implementations were accepted by LeetCode. If interested in them please take a look at Design HashMap and Design HashMap – Second approach posts in this blog.

Design a HashMap without using any built-in hash table libraries.

To be specific, your design should include these functions:

o put(key, value) : Insert a (key, value) pair into the HashMap.
  If the value already exists in the HashMap, update the value.
o get(key): Returns the value to which the specified key is mapped, 
  or -1 if this map contains no mapping for the key.
o remove(key) : Remove the mapping for the value key if this map contains the mapping for the key.

Note:

o All keys and values will be in the range of [0, 1000000].
o The number of operations will be in the range of [1, 10000].
o Please do not use the built-in HashMap library.

The requirements are simple. I would say that the design should state AT LEAST or AT A MINIMUM these functions. In this pass we will have additional methods which should help with the issue of collisions. More on this as we look at the code.

class MyHashMap {

    /** Initialize your data structure here. */
    public MyHashMap() {
        
    }
    
    /** value will always be non-negative. */
    public void put(int key, int value) {
        
    }
    
    /** Returns the value to which the specified key is mapped, or -1 if this map contains no mapping for the key */
    public int get(int key) {
        
    }
    
    /** Removes the mapping of the specified value key if this map contains a mapping for the key */
    public void remove(int key) {
        
    }
}

/**
 * Your MyHashMap object will be instantiated and called as such:
 * MyHashMap obj = new MyHashMap();
 * obj.put(key,value);
 * int param_2 = obj.get(key);
 * obj.remove(key);
 */

MyHashMap hashMap = new MyHashMap();
hashMap.put(1, 1);          
hashMap.put(2, 2);         
hashMap.get(1);            // returns 1
hashMap.get(3);            // returns -1 (not found)
hashMap.put(2, 1);          // update the existing value
hashMap.get(2);            // returns 1 
hashMap.remove(2);          // remove the mapping for 2
hashMap.get(2);            // returns -1 (not found)

We see the layout of the MyHashMap class. In addition we are provided with an overview of how our solution will be tested. The example set of calls does not illustrate the issue of collisions. Collisions are associated with the implementation approach and it is a function of the capacity of the HashMap versus the possible values for the keys. In the Notes section we are given a range of key values [0, 1000000]. Note that the number of operations is in the range [1, 10000]. If we allocate space for 1000000 keys, based on the number of operations, we can figure out that most of the entries in our map data structure will be empty (or null). If we make the space quite smaller, then we will get collisions. A collision is defined when two keys map to the same bucket.

0,1,2,3,4,5,6,7
main <<< arr: [0, 1, 2, 3, 4, 5, 6, 7]
main <<< sarr: [5, 6, 3, 2, 4, 0, 1, 7]
main <<< inOrder: (0,0) (1,10) (2,20) (3,30) (4,40) (5,50) (6,60) (7,70)
main <<< depth: 5
main <<< k: -1
main <<< inOrder: (0,0) (1,10) (2,20) (3,30) (4,40) (5,50) (6,60) (7,70)
main <<< depth: 5
main <<< k: 0
main <<< inOrder: (1,10) (2,20) (3,30) (4,40) (5,50) (6,60) (7,70)
main <<< depth: 4
main <<< k: 1
main <<< inOrder: (2,20) (3,30) (4,40) (5,50) (6,60) (7,70)
main <<< depth: 3
main <<< k: 2
main <<< inOrder: (3,30) (4,40) (5,50) (6,60) (7,70)
main <<< depth: 3
main <<< k: 3
main <<< inOrder: (4,40) (5,50) (6,60) (7,70)
main <<< depth: 3
main <<< k: 4
main <<< inOrder: (5,50) (6,60) (7,70)
main <<< depth: 3
main <<< k: 5
main <<< inOrder: (6,60) (7,70)
main <<< depth: 2
main <<< k: 6
main <<< inOrder: (7,70)
main <<< depth: 1
main <<< k: 7
main <<< inOrder:
main <<< depth: 0
main <<< k: 8
main <<< inOrder:
main <<< depth: 0

main <<< hm: (1,1) (2,2)
main <<< hm.get(1): 1
main <<< hm.get(3): -1
main <<< hm: (1,1) (2,1)
main <<< hm.get(2): 1
main <<< hm: (1,1)
main <<< hm.get(2): -1

This is a test for the HashMap we will be developing. We will develop the solution using the Java programming language on a Windows 10 computer using the VSCode IDE. Due to this fact we will need a test scaffolding. When we have a candidate for a solution we will have to copy it to the LeetCode site and run it. The test code IS NOT PART OF THE SOLUTION.

In this test we present a set of numbers in ascending order. We could have placed random numbers or better yet provide a limit and populate an array.

After reading the list of integers we populate and array and display it. So far all seems to be working fine.

The next line seems to display a different array with the values shuffled. If you present values in ascending or decreasing order to some data structures, some issues may come up. We will discuss them as we take a look at the code.

It seems that we load the shuffled values into some type of data structure (spoil alert; to a binary search tree). The associated value for a key appears to be set to the key multiplied by 10.

We then seem to perform an in order traversal of the BST. The depth of the BST is then displayed.

It seems that we enter a loop in which keys are specified in ascending order and we are removing the associated node from the BST. Note that we display the contents of the BST after removing the node with the specified key and then the current depth. All appears to be working fine. Note that we attempted deleting non existing nodes for the first and last passes of the loop.

We then repeat the tests suggested by LeetCode using our HashMap implementation. They seem to work as expected.

/**
 * Design a hashmap in Java.
 * Third approach.
 * Please note that the project names are NOT in ascending order.
 */
public class DesignHashMap1 {

    /**
     * Test scaffolding
     * 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {

        // !!!! experiment with the BST !!!!
        // **** open a buffered reader ****
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // **** create and populate an array with input line ****
        int arr[] = Arrays.stream(br.readLine().split(",")).mapToInt(Integer::parseInt).toArray();

        // ???? ????
        System.out.println("main <<< arr: " + Arrays.toString(arr));

        // **** close the buffered reader ****
        br.close();

        // **** shuffle the array ****
        Integer[] tmp       = Arrays.stream(arr).boxed().toArray(Integer[]::new);
        List<Integer> lst   = Arrays.asList(tmp);
        Collections.shuffle(lst);
        int[] sarr          = lst.stream().mapToInt(i -> i).toArray();

        // **** use array instead of sorted array ****
        // int[] sarr = Arrays.copyOf(arr, arr.length);

        // ???? ????
        System.out.println("main <<< sarr: " + Arrays.toString(sarr));

        // **** root for BST ****
        BST tree = new BST();

        // **** populate BST ****
        for (int key : sarr) {
            tree.insert(key, key * 10);
        }

        // **** in-order traversal of the BST ****
        System.out.print("main <<< inOrder: ");
        tree.inOrder();

        // **** display the depth of the BST ****
        System.out.println("main <<< depth: " + tree.depth());

        // **** delete all nodes from the BST ****
        for (int k = -1; k <= arr.length; k++) {

            // ???? ????
            System.out.println("main <<< k: " + k);

            // **** delete this node ****
            tree.delete(k);

            // **** in-order traversal of the BST ****
            System.out.print("main <<< inOrder: ");
            tree.inOrder();

            // **** display the depth of the BST ****
            System.out.println("main <<< depth: " + tree.depth());   
        }
  

        // !!!! experiment with the hashmap !!!!
        System.out.println();
        
        // **** create my hash map ****
        MyHashMap hm = new MyHashMap();
     
        // **** put key:value pair ****
        hm.put(1, 1);
     
        // **** put key:value pair ****
        hm.put(2, 2);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(1): " + hm.get(1));
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(3): " + hm.get(3));
     
        // **** put key:value pair ****
        hm.put(2, 1);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(2): " + hm.get(2));
     
        // *** remove key:value pair for specified key ****
        hm.remove(2);
     
        // ???? ????
        System.out.println("main <<< hm: " + hm.toString());
     
        // **** get value associated specified key ****
        System.out.println("main <<< hm.get(2): " + hm.get(2));
    }
  }

The reason we experiment with a BST is due to the fact that each bucket in our HashMap will be implemented using a BST. We could implement it with a balanced BST but we will not do so in this code. We want to experiment with approaches that might seem fine, but in reality are not as useful as they might seem.

In our test code we read the input line and shuffle the array. If we do not we would be inserting values in our HashMap in ascending order. If we implement the buckets using BSTs, then as collisions occur, we will be implementing a higher cost linked list. If you need time to better understand this statement, comment out the shuffling of the values in the array, enable the copy of the arr[] into the sarr[] arrays, and see what happens with the BST. The depth of the tree grows at the same rate as the number of elements entered. You can enter a list of integers in ascending order and the results will be similar. The only difference is that the BST will have a very long left sub tree.

We then create BST which we will use to experiment with.

We populate the BST with the values in the sarr[] array. Note that each node has a key and a value. The value is generated by multiplying the key by 10.

We then perform an in order tree traversal. This implies that the keys will be displayed in ascending order. The results seem to match our expectations.

We then compute and display the depth of the BST. If you have some trouble with the depth changing, my suggestion is to draw the results on a piece of paper while we are adding and then deleting nodes from the BST. The code will help to understand the operations. That is coming in a few more lines.

When done removing nodes in the BST we switch to the HashMap.

We start by creating our HashMap. We add a couple key:value pairs. Note that in these tests we use the same value for the key and the value (we do not multiply it by 10).

We then get and display values from the HashMap. In some cases we attempt to get a value for a non-existing key. At some point we remove a key from the HashMap and then we attempt to get it. Note that we always return a -1 for the value if the key:value pair is not available.

/**
 * Represents a node in the binary search tree (BST).
 */
public class BSTNode {

    // **** class members ****
    int     key;
    int     value;
    BSTNode left;
    BSTNode right;

    /**
     * Constructor
     */
    BSTNode() {
    }

    /**
     * Constructor
     */
    BSTNode(int key, int value) {
        this.key    = key;
        this.value  = value;
    }

    /**
     * Constructor
     */
    BSTNode(int key, int value, BSTNode left, BSTNode right) {
        this.key    = key;
        this.value  = value;
        this.left   = left;
        this.right  = right;
    }

    /**
     * Return a string representation
     */
    @Override
    public String toString() {
        return "(" + this.key + "," + this.value + ") ";
    }
}

As you have noted, we use the BSTNode class to keep track of the key:value pairs in our BSTs which implement the buckets in our HashMap.

The BSTNode class contains four members. We have three constructors. Not sure if in the current state of the code we make use of all constructors. Feel free to experiment using different constructors. At some point during development I did.

We also included a toString() method in order to display the contents of the nodes in an easy to follow format. I decided to put the class in a different file. In general, for production, it is better to have different classes in their own files. This simplifies maintenance and promotes reuse.

/**
 * BST class
 */
public class BST {

	// **** root of BST ****
	BSTNode	tree;

    /**
     * constructor
     */
	public BST() {
		this.tree = null;
	}
	
	/**
	 * constructor
	 */
	public BST(int key, int value) {
		this.tree = new BSTNode(key, value);
	}

	/**
	 * Insert value into BST.
	 * Entry point for recursive method.
	 */
	public BSTNode insert(int key, int value) {

		// **** check if tree is empty ****
		if (this.tree == null) {
			this.tree = new BSTNode(key, value);
			return this.tree;
		}

		// **** look for the place in the BST to insert the new node ****
		insert(this.tree, key, value);

		// **** return the BST ****
		return this.tree;
	}

	/**
	 * Insert value into BST.
     * Recursive call.
	 */
	private void insert(BSTNode root, int key, int value) {

		// **** add this node as the root of the BST ****
		if (root == null) {
			root = new BSTNode(key, value);
			return;
		}
		
		// **** deal with the left child ****
		if (key <= root.key) {
			if (root.left == null) {
				root.left = new BSTNode(key, value);
				return;
			} else {
				insert(root.left, key, value);
			}
		}
		
		// **** deal with the right child ****
		else {
			if (root.right == null) {
				root.right = new BSTNode(key, value);
				return;
			}
			else {
				insert(root.right, key, value);
			}
		}
	}

	/**
	 * In-order tree traversal.
	 * Entry point for recursive method.
	 */
	public void inOrder() {
		inOrder(this.tree);
		System.out.println();
	}

	/**
	 * In-order BST traversal.
	 * Recursive call
	 */
    private void inOrder(BSTNode node) {
        if (node != null) {
            inOrder(node.left);
            System.out.print(node.toString());
            inOrder(node.right);
        }
	}

	/**
	 * Find the depth of a BT.
	 * Entry point for recursive call
	 */
	public int depth() {
		return depth(this.tree);
	}
	
	/**
	 * Determine the depth of a BT.
	 * Recursive call
	 */
	private int depth(BSTNode root) {

		// **** end condition ****
		if (root == null)
			return 0;

		// **** ****
		return Math.max(depth(root.left) + 1, depth(root.right) + 1);
	}

	/**
	 * find the in order successor (min value)
	 */
	public BSTNode inOrderSuccessor(BSTNode root) {
		if (root.left == null)
			return root;
		else
			return inOrderSuccessor(root.left);
	}

	/**
	 * Search for node and delete it from the BST.
	 * Entry point for recursive call.
	 */
	public BSTNode delete(int key) {

		// **** ****
		BSTNode node = delete(this.tree, key);

		// **** update BST root (if needed) ****
		if (tree != null && tree.key == key)
			tree = node;

		// **** return BST root node ****
		return node;
	}

	/**
	 * Search for node and delete it from the BST
	 * Recursive call.
	 */
    private BSTNode delete(BSTNode root, int key) {
     
        // **** check for a null root ****
        if (root == null) {
            return root;
        }
         
        // **** search for the node on the left side of the BST ****
		if (key < root.key) {
			root.left = delete(root.left, key);
		}

		// **** search for the node on the right side of the BST **** 
		else if (key > root.key) {
            root.right = delete(root.right, key);
        }
         
        // **** found node to delete ****
        else {
                         
            // **** node to delete has both children ****
            if ((root.left != null) && (root.right != null)) {
                 
                // **** save the root (being deleted) ****
                BSTNode temp = root;        
                 
                // **** find the in order successor (min val on right tree) ****
                BSTNode node = inOrderSuccessor(temp.right);
 
                // **** replace the value of the root with one from the node ****
				root.key 	= node.key;
				root.value 	= node.value;
                     
                // **** delete the node ****
                root.right = delete(root.right, root.key);
            }
 
            // **** node to delete only has a right child ****
            else if (root.left == null) {
                return root = root.right;
            }
 
            // **** node to delete only has left child ****
            else if (root.right == null) {
                return root = root.left;
            }
        }
         
        // **** return root node ****
        return root;
	}
	
	/**
	 * In-order BST traversal.
	 * Collects in a list the nodes in the specified BST.
	 * Recursive call.
	 */
    public void toList(BSTNode node, List<BSTNode> lst) {
        if (node != null) {

			// **** traverse left sub tree ****
			toList(node.left, lst);
			
			// **** add node to list ****
			lst.add(node);
			
			// **** traverse right tree ****
            toList(node.right, lst);
        }
	}

	/**
	 * Look for the specified node in the BST.
	 * Entry point for recursive call.
	 */
	public BSTNode contains(int key) {

		// **** root of BST ****
		BSTNode node = this.tree;

		// **** check if key in BST ****
		return contains(node, key);
	}

	/**
	 * Look for the specified node in the BST.
	 * Recursive call.
	 */
	private BSTNode contains(BSTNode node, int key) {

		// **** end condition ****
		if (node == null)
			return null;

		// **** found node with key ****
		if (node.key == key)
			return node;

		// **** recurse left ****
		BSTNode f = contains(node.left, key);

		// **** node found ****
		if (f != null)
			return f;

		// **** recurse right ****
		f = contains(node.right, key);

		// **** return node (if found) ****
		return f;
	}
}

The BST class is used to implement a BST. We will be using a BST on each bucket of our HashMap to hold the key:value pairs. Each BST will handle the collisions in a bucket.

We have a couple constructors. A copy of the root of a tree is kept in the tree variable. Perhaps we could have named it root instead.

The insert operation is implemented using two methods. The first one is public and the second is private. Please feel free to experiment modifying the code to use one method only and reducing the arguments. Note that different approaches will in most cases exhibit different behaviors which may or may not show different values.

The code traverses the BST looking for a spot in which to insert the new node. Note that the BST may or may not be balanced. In future posts I will cover problems that require us to balance a BST using different algorithms.

The in order traversal is also implemented with a public and a private methods. We have used in order traversal in many posts in this blog.

The depth() method has also appeared in several posts. In this case it uses the public and private set of methods. We use this to experiment with the structure of BSTs when dealing with ordered values for the keys.

The inOrderSuccessor() method returns the next node in a BST that will accessed during an in order traversal.

The delete operation in a BST is also implemented with a public and a private method. The idea is to first locate the node in the BST and then delete it.

The toList() method returns a list generated by traversing the BST in order.

The contains() method is used to return the node with the specified key. For example, if we have one or more collision in a bucket of our HashMap, we will need to find the node holding the key:value pair of interest. This operation is also implemented using a public and a private method.

/**
 * Designs and implements a HashMap without using 
 * any built-in hash table libraries.
 * 
 * Runtime: 12 ms, faster than 85.67% of Java online submissions.
 * Memory Usage: 42.8 MB, less than 83.70% of Java online submissions.
 */
class MyHashMap {
  
    // **** constant(s) ****
    final int INITIAL_CAPACITY = 10000;

    // **** class members ****
    private BST[]   map;
    public int      capacity;

    // **** constructor ****
    public MyHashMap() {
        this.map        = new BST[INITIAL_CAPACITY];
        this.capacity   = INITIAL_CAPACITY;
    }

    // **** map a key to a bucket number ****
    private int hashFc(int key) {
        return key % capacity;
    }

   // **** key will always be non-negative ****
   public void put(int key, int value) {
 
        // **** compute the bucket number ****
        int bucket = hashFc(key);

        // **** set the key value pair in the map ****
        if (this.map[bucket] == null) {
            this.map[bucket] = new BST(key, value);
        } else {

            // **** check if this bucket contains the key ****
            BSTNode f = this.map[bucket].contains(key);

            // **** update value associated with existing key - value pair ****
            if (f != null) {
                f.value = value;
            }

            // **** insert new key - value pair ****
            else {
                this.map[bucket].insert(key, value);
            }
        }
    }

    // **** returns the value to which the specified key is mapped, 
    //      or -1 if this map contains no mapping for the key ****
    public int get(int key) {
 
        // **** compute the bucket number ****
        int bucket = hashFc(key);
 
        // **** return the associated value ****
        if (this.map[bucket] == null) {
            return -1;
        } else {

            // **** get node associated with specified key ****
            BSTNode n = this.map[bucket].contains(key);

            // **** key:value pair not in bucket ****
            if (n == null)
                return -1;

            // **** ****
            return n.value;
        }
    }

    // **** remove the mapping of the specified value key 
    //      if this map contains a mapping for the key ****
    public void remove(int key) {
 
        // **** compute the bucket number ****
        int bucket = hashFc(key);

        // **** check if no key:value pairs in this bucket ****
        if (this.map[bucket] == null)
            return;

        // **** remove key:value pair node from bucket (if contained) ****
        this.map[bucket].delete(key);
    }

    // **** for testing purpose only ****
    @Override
    public String toString() {

        // **** initialization ****
        StringBuilder sb    = new StringBuilder();
        List<BSTNode> lst   = new LinkedList<BSTNode>();

        // **** traverse the map ****
        for (int bucket = 0; bucket < map.length; bucket++) {

            // **** skip BST if null ****
            if (this.map[bucket] == null)
                continue;

            // **** collect nodes into list ****
            this.map[bucket].toList(this.map[bucket].tree, lst);
        }

        // **** comparator by key ****
        Comparator<BSTNode> compareByKey = (BSTNode n1, BSTNode n2) -> {
            if (n1.key > n2.key)
                return 1;
            else if (n1.key < n2.key)
                return -1;
            else 
                return 0;
        };

        // **** sort list ****
        Collections.sort(lst, compareByKey);

        // **** build string ****
        lst.forEach( n -> {
            sb.append(n.toString());
        });

        // **** return string ****
        return sb.toString();
    }
}

Finally we get to the implementation of the MyHashMap class.

We start by defining the initial capacity of our HashMap. Note that the capacity in our case is just the number of buckets in our map. Such number will not change in this implementation.

The constructor initializes the map array and the capacity. In this implementation the capacity will be constant and set to the value of the INITIAL_CAPACITY.

We then declare the hashFc() method. This function returns the bucket in which the key:value pair will be stored in our map. In this problem we know that our keys and in the range [0, 1000000]. If we would not have a hash function and a way to handle collisions, we would have to use a map of 1,000000 entries. We can now use a much smaller size in our map since we are able to handle collisions.

So why use buckets or better yet why do we use a HashMap instead of a BST? The reason is due to the execution time. A HashMap should find the associated value for any key in O(1). If we use a BST we are now experiencing O(log(n)) execution time.

I believe the different methods in this class are properly documented and there is not much we could add. That said; if you have a comment or question please let me know.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to help out with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private e-mail message. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, become proficient, refresh your knowledge and enhance your developer toolset.

One last thing, many thanks to all 5,295 subscribers to this blog!!!

Keep safe during the COVID-19 pandemic and help restart the world economy. I believe we can all see the light at the end of the tunnel.

Regards;

John

john.canessa@gmail.com

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.