Serialize and Deserialize BST

Good morning! I am back. A couple weeks ago I have a selective / optional procedure on my left knee. Once I decided to get it done I was not able to do so due to the COVID-19 pandemic. After two years it is done. I will not go into more details at this time. I need to get my right knee done in the next couple months. This whole thing started a few years ago with some incidents involving a car and later riding a motorcycle. Due to the fact that I will be having the same procedure in my right knee I decided not to get details from the surgeon or Google. After all is said and done I will get more details. That said, I chose the surgeon and facility by asking friends. That was a very good idea.

On a separate note, last week I read “Five things I have learned after solving 500 LeetCode questions” by Federico Mannucci. He is a Software Engineer in Italy, who loves traveling and cooking. Apparently we share some professional and personal interests.

The article contains five points that all software developers should keep in mind. I have purchased many technical books through the years. Have read and annotated them. One of the most important things is to practice solving problems and doing exercises. Reading a book or article is can take you so far. Experimenting with them takes you all the way.

I have only been working on work related stuff for the last two weeks. Staring last Saturday I decided to continue working on additional topics such as this post.

For this post I selected LeetCode 449 Serialize and Deserialize BST problem. The problem is very well described. We are given a binary search tree (BST) and have to implement two methods. The first serializes the BST producing a string. The second is to take the string and return the original BST. This approach is very common when transmitting a data structure over a network connection.

If you are interested please visit LeetCode and get the requirements from their site. Not sure is requirements may change somewhat with time.

Serialization is converting a data structure or object into a sequence of bits 
so that it can be stored in a file or memory buffer, 
or transmitted across a network connection link 
to be reconstructed later in the same or another computer environment.

Design an algorithm to serialize and deserialize a binary search tree. 
There is no restriction on how your serialization/deserialization algorithm should work. 
You need to ensure that a binary search tree can be serialized to a string, 
and this string can be deserialized to the original tree structure.

The encoded string should be as compact as possible.

Constraints:

o The number of nodes in the tree is in the range [0, 104].
o 0 <= Node.val <= 104
o The input tree is guaranteed to be a binary search tree.

The idea is to be able to serialize the BST in such a way that we can then take the sting representing the BST and then be able to reconstruct it.

When I read the requirements I thought of a tree traversal. We could use a depth first traversal or a depth first traversal. If you think about it, many (never generalize) binary tree problems in LeetCode provide the definition of the binary tree with a string in which the nodes are traversed in a depth first approach. Empty spaces (missing nodes) are represented by the “null” string. You can check out the methods in the BST.java code that I use to load a string representing a binary tree and return a binary tree (BST) containing TreeNode nodes. BTW, I have additional fields in my TreeNode data structure (class) which are not used to solve this problem.

So if we already have a mechanism to take a string and generate a BST which would represent the serialized version, we then would be missing a mechanism to get from a BST to a string representation. Further digging into such functionality in the BST.java code, we also have such a method. Let’s not copy and paste code.

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
public class Codec {

    // Encodes a tree to a single string.
    public String serialize(TreeNode root) {
        
    }

    // Decodes your encoded data to tree.
    public TreeNode deserialize(String data) {
        
    }
}

// Your Codec object will be instantiated and called as such:
// Codec ser = new Codec();
// Codec deser = new Codec();
// String tree = ser.serialize(root);
// TreeNode ans = deser.deserialize(tree);
// return ans;

This is the signature provided by LeetCode for our problem. We will use the TreeNode class to represent the nodes in the BST and will implement the serialize() and deserialize() methods from scratch.

This is a good time to think about how to proceed in detail. Once we get to some well and defined approach, we can start coding our solution. For inspiration check the LeetCode page with the problem description. I will wait until you are ready… …OK now that you are back let’s take a look at some test cases. Some were provided by LeetCode and others by researching and testing.

I decided to use Java and the VSCode IDE on a Windows 10 machine. Your best and simplest approach is to solve the problem directly on the LeetCode web site using the provided IDE. That way you will not have to generate a test scaffold to collect the input data, convert the data to the format expected by the problem, call the two methods in question and display results.

4,2,6,1,3,5,7
main <<< strArr.length: 7
main <<<        strArr: [4, 2, 6, 1, 3, 5, 7]
main <<< arr: [4, 2, 6, 1, 3, 5, 7]
main <<< DFT: 1 2 3 4 5 6 7
main <<< BFT:
4
2,6
1,3,5,7
main <<< preorder: (4) (2) (1) (3) (6) (5) (7)
main <<< tree ==>4 2 1 3 6 5 7<==
deserialize <<< s ==>4<==
deserialize <<< val: 4 start: -2147483648 end: 2147483647
deserialize <<< s ==>2<==
deserialize <<< val: 2 start: -2147483648 end: 4
deserialize <<< s ==>1<==
deserialize <<< val: 1 start: -2147483648 end: 2
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: -2147483648 end: 1
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: 1 end: 2
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: 2 end: 4
deserialize <<< s ==>6<==
deserialize <<< val: 6 start: 2 end: 3
deserialize <<< s ==>6<==
deserialize <<< val: 6 start: 3 end: 4
deserialize <<< s ==>6<==
deserialize <<< val: 6 start: 4 end: 2147483647
deserialize <<< s ==>5<==
deserialize <<< val: 5 start: 4 end: 6
deserialize <<< s ==>7<==
deserialize <<< val: 7 start: 4 end: 5
deserialize <<< s ==>7<==
deserialize <<< val: 7 start: 5 end: 6
deserialize <<< s ==>7<==
deserialize <<< val: 7 start: 6 end: 2147483647
main <<< DFT: 1 2 3 4 5 6 7
main <<< BFT:
4
2,6
1,3,5,7


2,1,3
main <<< strArr.length: 3
main <<<        strArr: [2, 1, 3]
main <<< arr: [2, 1, 3]
main <<< DFT: 1 2 3
main <<< BFT:
2
1,3
main <<< preorder: (2) (1) (3)
main <<< tree ==>2 1 3<==
deserialize <<< s ==>2<==
deserialize <<< val: 2 start: -2147483648 end: 2147483647
deserialize <<< s ==>1<==
deserialize <<< val: 1 start: -2147483648 end: 2
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: -2147483648 end: 1
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: 1 end: 2
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: 2 end: 2147483647
main <<< DFT: 1 2 3
main <<< BFT:
2
1,3


3,2,4,1,null,null,5
main <<< strArr.length: 7
main <<<        strArr: [3, 2, 4, 1, null, null, 5]
main <<< arr: [3, 2, 4, 1, null, null, 5]
main <<< DFT: 1 2 3 4 5
main <<< BFT:
3
2,4
1,5
main <<< preorder: (3) (2) (1) (4) (5)
main <<< tree ==>3 2 1 4 5<==
deserialize <<< s ==>3<==
deserialize <<< val: 3 start: -2147483648 end: 2147483647
deserialize <<< s ==>2<==
deserialize <<< val: 2 start: -2147483648 end: 3
deserialize <<< s ==>1<==
deserialize <<< val: 1 start: -2147483648 end: 2
deserialize <<< s ==>4<==
deserialize <<< val: 4 start: -2147483648 end: 1
deserialize <<< s ==>4<==
deserialize <<< val: 4 start: 1 end: 2
deserialize <<< s ==>4<==
deserialize <<< val: 4 start: 2 end: 3
deserialize <<< s ==>4<==
deserialize <<< val: 4 start: 3 end: 2147483647
deserialize <<< s ==>5<==
deserialize <<< val: 5 start: 3 end: 4
deserialize <<< s ==>5<==
deserialize <<< val: 5 start: 4 end: 2147483647
main <<< DFT: 1 2 3 4 5
main <<< BFT:
3
2,4
1,5



main <<< strArr.length: 1
main <<<        strArr: []
main <<< arr: [null]
main <<< DFT:
main <<< BFT:

main <<< preorder:
main <<< tree ==><==
main <<< DFT:
main <<< BFT:

The first line of each set of test cases contains the values of the nodes for the BST. Note that we are using a BST and not just a BT (binary tree). What this implies is that we might take advantage of some property to help us in the implementation.

After reading the input line our test code displays the length of the String[] array with the associated values. We then make an Integer[] with the same values. The contents of the Integer[] array are displayed.

It seems that we then must create and populate the specified BST and then display the results of a in order Depth First Traversal (DFT) of the BST. The values should appear in ascending order and they do.

The test code then displays the contents of the BST using a Breadth First Traversal (BFT). Each line represents a level. Note that in some of the test cases the “null” string is not displayed when a node does not have children. THIS CODE IS NOT PART OF THE SOLUTION it is just used to provide us some insights and how are we progressing testing our code as we implement it. So far all seems to be well with the test code.

We also make a call to some method and display the results of a DFT in preorder traversal for the specified BST. This order might help us choose the proper serialization approach. We can compare it with the in order traversal which might not be useful given the requirements. Take a few minutes to think about this… …OK let’s continue.

It seems that our code serializes the contents of the BST and returns the tree string. We could have chosen any separator but we will go with a “ “ blank.

I have implemented two deserialize() methods. We will look them in order, but I have no output for the first implementation since it was somewhat slower than the second. You could just add the code for both implementations and call them as needed. I just jumped directly into the second implementation. We will discuss each implementation shortly.

The bulk of the output that follows comes from the second implementation of the deserialize() method. We will talk about it in a few.

After the BST has been reconstructed from the tree sting we perform an in order DFT and a BFT on the BST. The results seem to match the input as expected.

    /**
     * Test scaffold
     * 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {

        // **** open buffered reader ****
        BufferedReader br = new BufferedReader(new java.io.InputStreamReader(System.in));
        
        // **** create String[] with values for the BST ****
        String[] strArr = br.readLine().trim().split(",");
 
        // **** close the buffered reader ****
        br.close();

        // ???? ????
        System.out.println("main &lt;&lt;&lt; strArr.length: " + strArr.length);
        System.out.println("main &lt;&lt;&lt;        strArr: " + Arrays.toString(strArr));

        // **** generate an Integer[] to build the BST ****
        Integer[] arr = new Integer[strArr.length];
        for (int i = 0; i &lt; strArr.length; i++) {
            if (strArr[i].equals("null") || strArr[i].isEmpty())
                arr[i] = null;
            else
                arr[i] = Integer.parseInt(strArr[i]);
        }
 
        // ???? ????
        System.out.println("main &lt;&lt;&lt; arr: " + Arrays.toString(arr));
 
        // **** create and populate the BST ****
        BST bst = new BST();
        bst.root = bst.populate(arr);

        // ???? ????
        System.out.println("main &lt;&lt;&lt; DFT: " + bst.inOrder(bst.root));

        // ???? ????
        System.out.println("main &lt;&lt;&lt; BFT: " + bst.levelOrder());

        // ???? ????
        System.out.print("main &lt;&lt;&lt; preorder: ");
        bst.preOrder(bst.root);
        System.out.println();

        // **** instantiate Codec to serialize ****
        Codec ser = new Codec();

        // **** instantiate Codec to deserialize ****
        Codec deser = new Codec();

        // **** serialize BST ****
        String tree = ser.serialize(bst.root);

        // ???? ????
        System.out.println("main &lt;&lt;&lt; tree ==&gt;" + tree + "&lt;==");

        // **** deserialize BST ****
        TreeNode ans = deser.deserialize(tree);

        // **** for ease of use ****
        bst = new BST(ans);
       
        // ???? ????
        System.out.println("main &lt;&lt;&lt; DFT: " + bst.inOrder(bst.root)); 

        // ???? ????
        System.out.println("main &lt;&lt;&lt; BFT: " + bst.levelOrder());
    }

This is the test scaffold for the problem. Note that THIS IS NOT PART OF THE SOLUTION.

We first read the input line and generate and populate an Integer[] array which is then used to generate the BST which will be needed to be presented as an argument to the serialize() method.

We then display the BST in three different ways.

We then follow the approach specified by LeetCode to test our code to be developed. We create a ser and deser instances which will be used to serialize the BST and produce a string. The String will then be passed to the deserialize() method which should return the same input BST.

The last couple lines are used to display the new BST. It seems to match the input BST in all cases.

    /**
     * Encodes a BST to a single string.
     */
    public String serialize(TreeNode root) {
        
        // **** sanity checks ****
        if (root == null)
            return "";

        // **** initialization ****
        StringBuilder sb = new StringBuilder();

        // **** recursive call (implements DFS) ****
        serialize(root, sb);

        // **** return trimmed string ****
        return sb.toString().trim();
    }

As I mentioned earlier, we first need to serialize the BST. This is the utility method that is used to call a recursive one to serialize the provided BST.

We perform a sanity check and then create a string builder. The string builder is used by the recursive method. The recursive method used a Depth First Search / Traversal to add node values to the string builder. After the recursion is completed, we have our serialized BST represented by the string in the string builder.

    /**
     * Auxiliary recursive call implements a preorder DFT.
     */
    private void serialize(TreeNode root, StringBuilder sb) {
        if (root != null) {

            // **** add node to string builder ****
            sb.append(root.val + " ");

            // **** traverse left subtree ****
            serialize(root.left, sb);

            // **** traverse right subtree ****
            serialize(root.right, sb);
        }
    }

This is the recursive method used for serialization. We check if the node is not null. We then append the value of the node followed by a “ “ space / blank (could be any other character). We have visited the node in the BST. We now recursively traverse the left sub tree followed by the right sub tree.

When done, the string builder will have the node values separated by a blank “ “.

    /**
     * Decodes your encoded data to tree.
     * 
     * Entry point for recursion.
     */
    public TreeNode deserialize(String tree) {
        
        // **** sanity checks ****
        if (tree.isEmpty())
            return null;

        // **** split the tree string ****
        String[] strs = tree.split(" ");

        // **** deserialise BST ****
        return deserialize(strs, 0, strs.length - 1);
    }

This and the following code snippet implement the first pass. Both are later replaced with the versions of the second pass.

We start by checking if we have an empty string signaling that the BST was blank (no nodes). There is an example that tests this condition.

We then split the sting into sub strings using the same delimiter we used in serialize. You can try using any other character (i.e., ‘*’) for both serialize and deserialize.

We now call a recursive method which will process all array entries which at this time contain the values of the BST nodes.

    /**
     * Recursive call.
     */
    private TreeNode deserialize(String[] strs, int start, int end) {

        // ???? ????
        System.out.println("deserialize &lt;&lt;&lt; start: " + start + " end: " + end);

        // **** end condition ****
        if (start &gt; end) {
            return null;
        }

        // **** root node ****
        TreeNode root = new TreeNode(Integer.parseInt(strs[start]));

        // ???? ????
        System.out.println("deserialize &lt;&lt;&lt;  root: " + root.val);

        // **** index of value &gt; start ****
        int ndx;

        // **** look for the proper node ****
        for (ndx = start; ndx &lt;= end; ndx++) {
            if (Integer.parseInt(strs[ndx]) &gt; Integer.parseInt(strs[start])) {
                break;
            }
        }

        // ???? ????
        System.out.println("deserialize &lt;&lt;&lt;   ndx: " + ndx);
 
        // **** process left subtree ****
        root.left = deserialize(strs, start + 1, ndx - 1);

        // **** process right sub tree ****
        root.right = deserialize(strs, ndx, end);

        // **** return BST ****
        return root;
    }

This is the recursive method used to deserialize the string for the BST.

We start by checking the end condition. Note that as we traverse the string[] we need to find the nodes that belong to the left or the right sub trees.

We create a node and use the index (ndx) to locate the proper node.

Once we get the index, we recursively traverse the left and right sub trees. Note how the arguments to the recursive call are set.

When all is set and done we return the current root for the BST. On the last call we should have the original BST.

If you are interested in the performance numbers for this implementation you can find them in the code in my GitHub repository. The first set is for the first pass we just covered. The second set for the pass we will visit next.

    /**
     * Decodes your encoded data to tree.
     * 
     * Entry for recursive call.
     */
    public TreeNode deserialize(String tree) {

        // **** sanity check ****
        if (tree.isEmpty())
            return null;

        // **** populate the queue with the serialized data ****
        Queue&lt;String&gt; q = new LinkedList&lt;&gt;(Arrays.asList(tree.split(" ")));

        // **** deserialize the BST ****
        return deserialize(q, Integer.MIN_VALUE, Integer.MAX_VALUE);
    }

This is the entry point for the second implementation of the entry point for the deserialize() method.

We perform a sanity check. This takes care of empty BST strings.

We split the input string and convert the resulting array to a list which we store into the queue.

We then call the recursive method associated with this entry point.

    /**
     * Recursive call.
     */
    private TreeNode deserialize(Queue<String> q, int start, int end) {

        // **** end condition ****
        if (q.isEmpty())
            return null;

        // **** get next string (node value) from the queue ****
        String s = q.peek();

        // ???? ????
        System.out.println("deserialize <<< s ==>" + s + "<==");

        // **** convert to integer ****
        int val = Integer.parseInt(s);

        // ???? ????
        System.out.println("deserialize <<< val: " + val + " start: " + start + " end: " + end);

        // **** skip this node (if needed) ****
        if (val < start || val > end)
            return null;

        // **** remove node from the queue ****
        q.poll();

        // **** generate node for this value ****
        TreeNode root = new TreeNode(val);

        // **** process left subtree ****
        root.left = deserialize(q, start, val);

        // **** process right subtree ****
        root.right = deserialize(q, val, end);

        // **** return root of BST ****
        return root;
    }

We start by checking the end condition.

If we are not there yet, we peek into the queue. We get the value for the next node.

We then check if the node does not belong to the BST at this time.

If it does, we create a node with the specified value.

We recursively visit the left and then the right sub trees. Note that the calls are performed with different ranges.

When all is said and done we return the original BST that was serialized into the associated string.

This method closely represents a Breadth First Search algorithm that we have seen in previous posts e.g., BST Search.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer toolset.

Thanks for reading this post and feel free to connect with me John Canessa at LinkedIn.

Regards;

John

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.