Least Recently Used Cache – Java

It is a gloomy Sunday in the Twin Cities of Minneapolis and St. Paul. The current temperature is 23F. The forecast calls for snow. Looks like a perfect day to stay home, enjoy some good food, and watch a couple of movies. That is wishful thinking, though; at 09:15 AM my wife and I will be heading to Trader Joe’s in St. Paul to get some veggies and bread. Yesterday we made strawberry ice cream. My son requested a few servings for his family. We will be delivering it later this morning. Hopefully the roads will not be too slippery.

As you might already know, the presidential candidates debated last week. Yesterday morning my wife was checking the news and social networks as she does every morning. She told me that one of the candidates won the debate. What does that mean? He arrived first in a 5K run? He scored more hoops? He swam 10 laps faster? In this case winning is completely subjective. Some sources gave the upper hand to one of the candidates while others gave it to his opponent. Sorry, but I do not like politicians or politics. They are irrational.

Since we briefly touched on politics, please do not forget to vote in the upcoming presidential elections. I believe it is a civic duty. Just make sure you are informed and based on that cast your vote for the candidate you believe will be best for our country.

In this post we will extensively cover LeetCode problem 146 LRU Cache. Caches are used by most (never generalize) production software. Since most applications hold large amounts of data, caching it all is, in most cases, not doable.

I have been dealing with caches for a long time. I work on a storage server which has different types of caches. An LRU (Least Recently Used) cache is one that evicts objects using a particular algorithm. How the algorithm is implemented has an impact on the performance of the cache and the system using it.

In a nutshell, an LRU cache holds no more than a specified number of objects. When the cache is full, the algorithm must evict the least recently used object (hence the name) to make room for a new one. To keep track of usage, objects are moved to the front of a queue every time they are accessed; the object at the back of the queue is the least recently used one and is dropped when a new object arrives.
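To make that concrete, here is a quick sketch (not the approach used later in this post; the class name is just for illustration) that shows the eviction order using java.util.LinkedHashMap in access order:

import java.util.LinkedHashMap;
import java.util.Map;

public class LruSketch {
    public static void main(String[] args) {

        // **** LRU cache of capacity 2 built on LinkedHashMap (access order) ****
        LinkedHashMap<Integer, Integer> cache =
            new LinkedHashMap<Integer, Integer>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Integer> eldest) {
                    return size() > 2;              // evict the least recently used entry
                }
            };

        // **** exercise the cache ****
        cache.put(1, 1);                            // cache holds: 1
        cache.put(2, 2);                            // cache holds: 1, 2
        cache.get(1);                               // key 1 becomes the most recently used
        cache.put(3, 3);                            // key 2 (least recently used) is evicted

        // **** keys listed from least to most recently used ****
        System.out.println(cache.keySet());         // prints: [1, 3]
    }
}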

In my first data structures class at Cornell University, we had to implement a simulation with queues. The class was a lot of fun and I learned a lot. At work I have implemented a library that supports queues, stacks and other related objects. When I saw the problem in LeetCode I just had to spend time with it.

The library (among others) that I designed was, for performance reasons, implemented in C / C++. Some operating systems attach a queue data structure to some of their objects so they can easily be managed in queues. Since I decided to use Java for this problem, I knew it was going to be hard to get good performance out of the code. So be it.

Design a data structure that follows the constraints of a Least Recently Used (LRU) cache.

Implement the LRUCache class:

o LRUCache(int capacity) Initialize the LRU cache with positive size capacity.
o int get(int key) Return the value of the key if the key exists, otherwise return -1.
o void put(int key, int value) Update the value of the key if the key exists. 
  Otherwise, add the key-value pair to the cache.
  If the number of keys exceeds the capacity from this operation, evict the least recently used key.

Constraints:

1 <= capacity <= 3000
0 <= key <= 3000
0 <= value <= 10^4
At most 3 * 10^4 calls will be made to get and put.

Follow up:

Could you do get and put in O(1) time complexity?

The capacity is specified for each test and, per the constraints, it will not exceed 3000 objects. We are provided with a single, well explained example. Note that the challenge is to get the operations to perform with O(1) time complexity. I should go back and look at the solution(s) provided by LeetCode. I will do so after I am done with this post. As I mentioned earlier today, my wife and I have to go grocery shopping.

class LRUCache {

    public LRUCache(int capacity) {
        
    }
    
    public int get(int key) {
        
    }
    
    public void put(int key, int value) {
        
    }
}

/**
 * Your LRUCache object will be instantiated and called as such:
 * LRUCache obj = new LRUCache(capacity);
 * int param_1 = obj.get(key);
 * obj.put(key,value);
 */

This problem is different from most. We are not implementing a single function; we need to implement a class with at least three methods. LRUCache() is the constructor. It will be called with the capacity. Our LRU cache will hold at most the number of elements specified by the capacity.

The put() method is used to insert a key-value pair into our LRU cache. Note that when we insert an element, it has to be brought up to the top (most recently used position) of whatever mechanism we implement. Do not forget that if the LRU cache is full, we need to evict the least recently used object before we insert the new one.

The get() method is used to read the specified object from our LRU cache. If the object is found in the LRU cache, in addition to returning the associated value, we need to move the object to the top of our mechanism because it has just been used (it is now the most recently used).

For this problem I will use the Java programming language, the VSCode IDE and will develop the solution on a Windows 10 machine. You can develop your code directly on LeetCode or in your own computer using your own tools. Since I like to develop software on my computer, I also had to provide a test scaffolding.

"LRUCache","put","put","get","put","get","put","get","get","get"
[2],[1,1],[2,2],[1],[3,3],[2],[4,4],[1],[3],[4]

null
null
null
1
null
-1
null
-1
3
4


"LRUCache","put","put","get","put","put","get"
[2],[2,1],[2,2],[2],[1,1],[4,1],[2]

null
null
null
2
null
null
-1


"LRUCache","put","put","put","get","get","put","get","put","get","get","get","get"
[3],[1,1],[2,2],[3,3],[1],[2],[4,4],[2],[5,5],[1],[3],[4],[5]

null
null
null
null
1
2
null
2
null
-1
-1
4
5

The data for each test case is provided in two lines. The first specifies the operations (the constructor and the two methods we have to implement in our class) and the second line the parameters used for each associated call.

I would like to note that in practice you always want to get the most accurate requirements possible. Based on the provided documentation, I do not know if a key-value pair may contain keys that differ from the values. Perhaps that is made obvious by reading the constraints carefully.

After we perform one of the three operations, we need to display something to be able to track progress and verify our solution. That seems to be expected and easy to implement.

    /**
     * Test scaffolding.
     * Note: requires import java.util.Scanner;
     */
    public static void main(String[] args) {
        
        // **** open scanner ****
        Scanner sc = new Scanner(System.in);

        // **** read operation(s) ****
        String[] ops = sc.nextLine().trim().split(",");

        // **** read argument(s) ****
        String str = sc.nextLine().trim();

        // **** close scanner ****
        sc.close();

        // **** remove leading and trailing "s ****
        for (int i = 0; i < ops.length; i++) {
            ops[i] = ops[i].substring(1, ops[i].length() - 1);
        }

        // **** split the arguments ****
        String[] strArgs = str.split("\\],\\[");

        // **** take care of the first '[' ****
        if (strArgs[0].startsWith("["))
            strArgs[0] = strArgs[0].substring(1);

        // **** take care of the last ']'s ****
        String s = strArgs[strArgs.length - 1];
        while (s.endsWith("]"))
            s = s.substring(0, s.length() - 1);
        strArgs[strArgs.length - 1] = s;

        // **** display operations and arguments (for verification) ****
        for (int i = 0; i < ops.length; i++)
            System.out.println("main <<< i: " + i + 
                                " op ==>" + ops[i] + "<== arg ==>" + strArgs[i] + "<==");
        System.out.println();

        // **** LRU cache ****
        LRUCache lruCache = null;

        // **** loop performing all operations ****
        int i = -1;
        for (String op : ops) {

            // **** increment index ****
            i++;

            // **** create and initialize the LRU cache ****
            if (op.equals("LRUCache")) {
                lruCache = new LRUCache(Integer.parseInt(strArgs[i]));

                // **** inform 'create LRU cache' operation completed ****
                System.out.println(null + "");
            }

            // **** put the specified value into the LRU cache ****
            else if (op.equals("put")) {

                // **** ****
                String arg = strArgs[i];

                // **** extract key and value ****
                String[] argStr = arg.split(",");

                // **** put key and val in the LRU cache ****
                lruCache.put(Integer.parseInt(argStr[0]), Integer.parseInt(argStr[1]));

                // **** inform 'put' operation completed ****
                System.out.println(null + "");
            }

            // **** get the specified value from the LRU cache ****
            else if (op.equals("get")) {

                // **** get and display the value from the LRU cache ****
                System.out.println(lruCache.get(Integer.parseInt(strArgs[i])));
            }
        }
    }

The test scaffolding opens a scanner, reads in the operations and arguments, and then closes the scanner. For ease of processing, we split the arguments because we need to operate on them separately.

I guess I could have spent more time with regular expressions in order to process the values with just two expressions acting on the two arrays, but for simplicity, and to save time, I just processed them using string operations. Note that the test scaffolding is not part of the solution, so performance is not that important here.

Once we are ready, we loop processing each operation with its associated argument. Note that the returned value from each operation is displayed. It seems that the test scaffolding should work.

I knew that using some Java classes might be somewhat complicated and would not be too efficient. That said, I wanted to start from a brute force approach and work towards something more efficient. I will not cover my different attempts; they were all accepted but took too long to execute. If you are interested, you can modify the code in my GitHub repository and experiment with it.

Time ran out so I had to leave for groceries. We got back and took some ice cream, a case of soda pop and a couple kites that were in the garage to my son. We were planning on stopping at Target to get some sundries, but it was cold and snowing. We skipped that part of our outing.

OK, let’s continue.

The problem seems to be calling for some type of hash map and perhaps a queue. Java offers an ample set of classes in both categories. I decided to try some combinations. They all worked, but they were too slow.

After the execution time disappointments (you will see the execution times shortly), I decided to go with a custom doubly linked queue. At some point I experimented with an array and dropped it. I can bet that would have worked even faster. Perhaps I will get back to it and give it a try. That would be a task for another day. My wife and I are planning on wonton soup, ham and cheese sandwiches, green grapes, ice cream and a triple espresso for lunch. I can hardly wait.
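For reference, here is a rough sketch of one of those combinations: a HashMap for lookups plus a LinkedList to track recency. This is an illustration (the class name is made up), not the exact code I submitted, but the O(n) LinkedList.remove() calls are a good hint as to why such attempts are accepted yet slow:

import java.util.HashMap;
import java.util.LinkedList;

// **** illustrative only: HashMap + LinkedList LRU cache (slow due to O(n) removals) ****
class SlowLRUCache {

    // **** class members ****
    private final int capacity;
    private final HashMap<Integer, Integer> hm = new HashMap<>();
    private final LinkedList<Integer> ll = new LinkedList<>();  // head == most recently used

    // **** constructor ****
    public SlowLRUCache(int capacity) {
        this.capacity = capacity;
    }

    // **** get value associated with the specified key ****
    public int get(int key) {
        if (!hm.containsKey(key))
            return -1;
        ll.remove(Integer.valueOf(key));        // O(n) scan of the linked list
        ll.addFirst(key);                       // key is now the most recently used
        return hm.get(key);
    }

    // **** put key:value pair into the cache ****
    public void put(int key, int value) {
        if (hm.containsKey(key)) {
            ll.remove(Integer.valueOf(key));    // O(n) scan of the linked list
        } else if (hm.size() >= capacity) {
            hm.remove(ll.removeLast());         // evict the least recently used key
        }
        ll.addFirst(key);
        hm.put(key, value);
    }
}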

    /**
     * Supports the queue and the hash map used by the LRU cache.
     */
    class QNode {

        // **** class members ****
        int     capacity;
        int     count;
        QNode   next;
        QNode   prev;

        int     key;
        int     value;

        // **** constructor for the actual queue (sentinel node) ****
        public QNode(int capacity) {
            this.capacity   = capacity;
            this.count      = 0;

            this.next       = this;
            this.prev       = this;
        }

        // **** constructor for a node in the queue ****
        public QNode(int key, int value) {
            this.key    = key;
            this.value  = value;

            this.next   = null;
            this.prev   = null;
        }

        // **** string representation of a queue node ****
        @Override
        public String toString() {
            return "(" + key + "," + value + ")"; 
        }
    }

The QNode class is used to represent both the actual queue (a sentinel node) and the nodes in the doubly linked list. The technique is one I have used in the library and in projects written in C / C++.

The members are self-explanatory. The capacity and count are used when instantiating a queue, and the key and value are used when instantiating a node in the queue. The next and prev links are used in both situations.

We have two constructors. The first is used for the actual queue (the sentinel) while the second is for a node in the queue. We also have a toString() method to represent a queue node. It is not needed for the solution, but I used it while experimenting. When I replaced the hash map with an array, I left in debug code showing how I displayed the queue.
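As a minimal sketch of the sentinel technique (assuming the QNode class above), an empty queue is just the sentinel pointing to itself, and the head and tail are reached through its next and prev links:

        // **** empty queue: the sentinel points to itself ****
        QNode queue = new QNode(3);

        // **** empty queue invariant: queue.next == queue && queue.prev == queue ****

        // **** after enqueueing nodes (1,1) and then (2,2), the circular list looks like: ****
        //
        //      queue <-> (2,2) <-> (1,1) <-> queue
        //
        // **** queue.next is the head (2,2), the most recently used node ****
        // **** queue.prev is the tail (1,1), the least recently used node ****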

    // **** class members ****
    public int      capacity = 0;
    public int      count = 0;

    public QNode[]  arrHM = null;         
    public QNode    queue = null;

    
    // **** used in alternate (slower) implementations ****
    // public HashMap<Integer, Integer>        hm = null;
    // public LinkedList<Integer>              ll = null;
    // public LinkedHashMap<Integer, Integer>  lhm = null;

These are the class members for the LRU cache. The first two are used to specify the capacity and the actual count of nodes. The array replaces the use of a hash map, and the queue is used to keep track of the order of the nodes in the LRU cache. As you can see, we are using an array instead of a hash map. The rest of the commented lines are from passes that were accepted but turned out to be too slow for my taste.

    // **** hashmap capacity ****
    final int HM_SIZE = 3001;

Since we are using an array instead of a hash map, we need to define an array of a fixed size that can hold all possible key values. Per the constraints from LeetCode for this problem (0 <= key <= 3000), an array of 3001 entries covers every possible key.

    /**
     * Constructor
     * Initialize the LRU cache with a positive size capacity.
     */
    public LRUCache(int capacity) {

        // **** sanity check(s) ****
        if (capacity <= 0)
            return;

        // **** initialize class members ****
        this.capacity   = capacity;
        arrHM           = new QNode[HM_SIZE];
        queue           = new QNode(capacity);
    }

We need to create three methods in the LRUCache class. The LRUCache() constructor initializes the data structures. The capacity indicates the maximum number of objects in the LRU cache.

    /**
     * Enqueue node at the head of the queue.
     * If the queue is full delete the node at the tail of the queue.
     */
    public void enqueue(QNode node) {

        // **** check if the queue is full ****
        if (queue.count >= queue.capacity) {

            // **** remove node at the tail of the queue ****
            QNode n = remove(queue.prev);

            // **** remove node from the hashmap ****
            arrHM[n.key] = null;
        }

        // **** insert the node at the head of the queue ****
        node.next = queue.next;
        node.prev = queue;

        node.next.prev = node;
        queue.next = node;

        // **** increment the queue count ****
        queue.count++;
    }

We need to add new nodes when they are inserted, and we need to move (remove and re-enqueue) a node when its key is used. In other words, we need to bring the node from its current position in the linked list up to the head.

Before we insert an element into the queue, we need to check if the queue is full. If so, we need to remove the node at the tail of the queue (least recently used). We also need to flag that the node is no longer in the hash map.

We then insert the new node at the head of the queue because it is the most recently accessed node in our LRU cache. The last step is to increment the count of nodes in the queue.

    /**
     * Dequeue the node at the tail of the queue.
     * If the queue is empty return null.
     */
    public QNode dequeue() {

        // **** check if the queue is empty ****
        if (queue.count == 0)
            return null;

        // **** node being removed and returned ****
        QNode node = queue.prev;
        
        // **** ****
        queue.prev = node.prev;
        node.prev.next = queue;

        // **** decrement the queue count ****
        queue.count--;

        // **** return node ****
        return node;
    }

The dequeue() method is used to remove the node at the tail of the queue. A queue is a FIFO data structure, so enqueue() and dequeue() are the most commonly used methods. Note that in this solution dequeue() ends up not being called; eviction is handled inside enqueue() by removing the node at the tail.

We check if the queue is empty. If not, we point to the tail node in the queue, remove it, decrement the count, and return the removed node to the caller.

    /**
     * Remove specified node from the queue.
     */
    public QNode remove(QNode node) {

        // **** sanity check(s) ****
        if (queue.count == 0)
            return null;

        // **** update pointers ****
        node.prev.next = node.next;
        node.next.prev = node.prev;

        // **** decrement the queue count ****
        queue.count--;

        // **** return the node removed from the queue ****
        return node;
    }

The remove() method removes a node from the queue. The difference is that you need to pass it a reference to the node you wish to remove. After checking that the queue is not empty, we remove the node, decrement the count in the queue, and return the node.

    /**
     * Print the contents of the queue.
     * Traverses queue from left (tail) to right (head).
     * Traverses queue from right (head) to left (tail).
     */
    public void printQ() {
        
        // **** ****
        System.out.println("<<< queue: " + queue.count);

        // **** left (tail) to right (head) ****
        System.out.print("<<< tail => ");
        for (QNode p = queue.prev; p != queue; p = p.prev) {
            System.out.print("(" + p.key + "," + p.value + ") ");
            if (p.prev != queue)
                System.out.print("->");
            else
                System.out.println("<= head");
        }

        // **** right (head) to left (tail) ****
        System.out.print("<<< head => ");
        for (QNode p = queue.next; p != queue; p = p.next) {
            System.out.print("(" + p.key + "," + p.value + ") ");
            if (p.next != queue)
                System.out.print("->");
            else
                System.out.println("<= tail");
        }
    }

The printQ() method is used to print the queue. Note that we first display the count. We then traverse the queue twice. The first pass is from the tail to the head using the prev pointer. On the second pass we traverse it from the head to the tail using the next pointer. This is done to verify that all pointers are correct. This method is not part of the solution.
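For example (just an illustration), with two nodes in the queue, (3,3) at the head and (1,1) at the tail, printQ() would produce output along these lines:

<<< queue: 2
<<< tail => (1,1) ->(3,3) <= head
<<< head => (3,3) ->(1,1) <= tail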

    /**
     * Get value associated with the specified key from the LRU cache.
     */
    public int get(int key) {

        // **** check if key not in LRU cache ****
        if (arrHM[key] == null)
            return -1;

        // **** get the node ****
        QNode node = arrHM[key];

        // **** check if node NOT head of queue ****
        if (queue.next != node) {

            // **** remove node from the queue ****
            remove(node);

            // **** remove node from the hashmap ****
            arrHM[node.key] = null;

            // **** enqueue node ****
            enqueue(node);

            // **** put node back into the hashmap ****
            arrHM[node.key] = node;
        }

        // **** return the node value ****
        return node.value;
    }

Now that we have all the pieces in place, we can tackle the get() method. We check if the key is in the hash map (in this case actually an array). If it is not found, we return -1 and are done.

We get the node from the array. If the node is not at the head of the queue, we need to move it to the head. We first remove the node from the queue, flag that the value is not in the hash map, enqueue the node at the head of the queue, and insert the node back into the hash map. When done, the value of the node is returned.

    /**
     * Put key:value pair into the LRU cache.
     */
    public void put(int key, int value) {

        // **** key not in hashmap ****
        if (arrHM[key] == null) {

            // **** new node ****
            QNode node = new QNode(key, value);

            // **** enqueue node (node at head of the queue) ****
            enqueue(node);

            // **** put the node in hashmap ****
            arrHM[key] = node;
        }

        // **** key in hashmap ****
        else {

            // **** get the node from the hashmap ****
            QNode node = arrHM[key];

            // **** update the node value (if needed) ****
            if (node.value != value)
                node.value = value;
                
            // **** move the node to the head of the queue (if needed) ****
            if (queue.next != node) {
            
                // **** remove node from queue ****
                node = remove(node);        

                // **** enqueue node (node at head of queue) ****
                enqueue(node);
            }
        }
    }

The put() method is used to insert a new key-value pair into the LRU cache. We first check whether the key is in our hash map (implemented as an array). If it is not, we create a node, enqueue it (the enqueue() method evicts the least recently used node if the cache is full), and add the reference to the hash map.

If the key is already in the hash map, we get the node reference from the hash map and update its value in case it changed. If the node is not already at the head of the queue, we remove it from the queue and enqueue it again so it ends up at the head (most recently used).
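To tie the pieces together, here is a quick trace (based on the first test case shown earlier, capacity == 2) of how get() and put() keep the queue ordered from most to least recently used; the states in the comments are what the debug printQ() method would confirm:

        // **** create the LRU cache with capacity 2 ****
        LRUCache lruCache = new LRUCache(2);

        // **** exercise the LRU cache ****
        lruCache.put(1, 1);                 // head -> (1,1) <- tail
        lruCache.put(2, 2);                 // head -> (2,2) -> (1,1) <- tail
        lruCache.get(1);                    // returns 1;  head -> (1,1) -> (2,2) <- tail
        lruCache.put(3, 3);                 // evicts (2,2);  head -> (3,3) -> (1,1) <- tail
        lruCache.get(2);                    // returns -1 (key 2 was evicted)
        lruCache.put(4, 4);                 // evicts (1,1);  head -> (4,4) -> (3,3) <- tail
        lruCache.get(1);                    // returns -1 (key 1 was evicted)
        lruCache.get(3);                    // returns 3;  head -> (3,3) -> (4,4) <- tail
        lruCache.get(4);                    // returns 4;  head -> (4,4) -> (3,3) <- tail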

/**
 * First attempt:
 * Runtime: 457 ms, faster than 5.04% of Java online submissions.
 * Memory Usage: 47.6 MB, less than 13.73% of Java online submissions.
 * 
 * Second attempt:
 * Runtime: 197 ms, faster than 5.04% of Java online submissions.
 * Memory Usage: 48.1 MB, less than 13.73% of Java online submissions.
 * 
 * Third attempt:
 * Runtime: 78 ms, faster than 14.37% of Java online submissions.
 * Memory Usage: 47.2 MB, less than 13.73% of Java online submissions .
 * 
 * Fourth attempt:
 * Runtime: 10 ms, faster than 99.92% of Java online submissions.
 * Memory Usage: 47.8 MB, less than 12.48% of Java online submissions.
 */

Overall I went through four different sets of code. I spent a few minutes after each submission trying to reduce the execution times. With the final attempt, which is the one described in this post, it seems we got very close to the fastest submissions.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to help out with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private e-mail message. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, become proficient, refresh your knowledge and enhance your developer toolset!

One last thing, many thanks to all 3,501 subscribers to this blog!!!

Keep safe during the COVID-19 pandemic and help restart the world economy.

John

E-mail:  john.canessa@gmail.com
