Permutation in String

Good morning! Hope you had a nice weekend. My wife and I did. We did some shopping, visited our son, and baked. For some reason last winter we did little baking. Yesterday morning after completing a two-hour block in my home office, I went up, prepared coffee and sat with my wife to chat. The subject of baking came up. Baking is both knowledge and art. In general when cooking or baking we tend not to follow recipes. Yesterday we collected the ingredients for raisin cinnamon bread around 09:00 AM. I got the dough ready around 10:00 AM. The dough proofed for a few hours and early afternoon, while my wife was preparing an avocado with tomatoes and roast beef salad, I buttered four bread molds, poured in the bread batter, waited for about 15 minutes (should be  an hour), and baked the cinnamon raisin bread. It turned out very good, but next week, I will let the bread proof for an hour in the molds. While the bread was still hot, we had two slices each with soft butter and grape jelly made by our daughter in law. It was delicious and to be honest with you, it does not require much work.

OK, let’s get down to the main topic for this post.

As you know I have been attempting to solve a series of problems at LeetCode. This morning I checked the Algorithm I link and all the problems seem to be the same, but my completion status has been erased. Apparently periodically the status is cleared. After this post, I will try to restore my previous state and continue from where I left over the weekend.

In this post we will attempt to solve the LeetCode 567 Permutation in String problem.

I spent time experimenting and moving towards an implementation using a window that would not time out. I had a good time working on this problem.

Given two strings s1 and s2, 
return true if s2 contains a permutation of s1, 
or false otherwise.

In other words, return true if one of s1's permutations is the substring of s2.

Constraints:

o 1 <= s1.length, s2.length <= 104
o s1 and s2 consist of lowercase English letters.

Related Topics:

o Hash Table
o Two Pointers
o String
o Sliding Window

We are given two strings. We need to determine if string `s2` contains a permutation of string `s1`. If it does, our function of interest should return `true`; otherwise it should return `false`.

Note that the contents of our strings are lowercase English characters.

We will attempt to solve this problem using the Java programming language and the VSCode IDE on a Windows computer. You can use Java on different platforms and come with similar solutions.

In our case, I will not be using the on-line IDE provided by LeetCode. Due to this fact, we will have to generate a simple test scaffold which will read the two input strings, populate the variables, call the function of interest and display the output. Note that such code IS NOT PART OF THE SOLUTION.

    public boolean checkInclusion(String s1, String s2) {
        
    }

The signature for the function of interest indicates that we are provided two strings and need to return a boolean. This perfectly matches the requirements for the problem.

In a nutshell we need to compare two strings at a time. The first will be `s1` and the second will be a substring of `s2` with the same size as `s1`. We need to cover all possibilities so our substring needs to be in a moving window that we will start on the left of `s2` and move it one character at a time to the end of the string.

The approach sounds good, but we also need to make sure that the substrings are anagrams of each other. Another way to put this is that the characters and their frequency should match.

With the idea in mind, we can try different data structures which should return valid outputs, will run faster than others, and may use more or less space. Time and space considerations are important when selecting an algorithm.

ab
eidbaooo
main <<< s1 ==>ab<==
main <<< s2 ==>eidbaooo<==
<<< freq: [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
main <<< output: true


ab
eidboaoo
main <<< s1 ==>ab<==
main <<< s2 ==>eidboaoo<==
<<< freq: [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
<<<  win: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
main <<< output: false

The two test cases were provided by LeetCode.

In each case, our test scaffold needs to read the two input lines and populate strings `s1` and `s2`. In addition we will display the contents of the variables to verify all is well so far.

Our test scaffold seems to call the function of interest which displays some information which should help us follow the algorithm at hand. Please note that such output IS NOT PART OF THE SOLUTION.

When all is said and done our test code displays the output. In the first case the answer is `true`. Since `s1` contains “ab” and `s2` contains the substring “ba” which is an anagram of `s1`, our function should return `true`.

In the second test case, `s1` holds the string “ab”, but in the string `s2`, which holds the characters “eidboaoo”, we cannot find either “ba” or “ab”, so we need to return a `false`.

    /**
     * Test scaffold.
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        
        // **** open buffered reader ****
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // **** read `s1` ****
        String s1 = br.readLine().trim();

        // **** read `s2` ****
        String s2 = br.readLine().trim();

        // **** close buffered reader ****
        br.close();

        // ???? ????
        System.out.println("main <<< s1 ==>" + s1 + "<==");
        System.out.println("main <<< s2 ==>" + s2 + "<==");

        // **** call function of interest and display output ****
        System.out.println("main <<< output: " + checkInclusion(s1, s2));
    }

Our test scaffold uses a buffered reader to read the two input lines and assign them respectively to `s1` and `s2`. The contents of both strings are then displayed.

The function of interest is then called, and the returned output is displayed.

I decided not to show the attempts that passed all test cases but were not accepted due to time constraints.

    /**
     * Given two strings s1 and s2, 
     * return true if s2 contains a permutation of s1, 
     * or false otherwise.
     * 
     * Using a HashMap window.
     * 
     * 106 / 106 test cases passed.
     * Status: Accepted
     * Runtime: 14 ms
     * Memory Usage: 40.1 MB
     */
    public static boolean checkInclusion0(String s1, String s2) {
        
        // **** sanity check(s) ****
        if (s1.length() > s2.length()) return false;

        // **** initialization ****
        HashMap<Character, Integer> freq    = new HashMap<>(26);
        HashMap<Character, Integer> win     = new HashMap<>(26);
        int len1                            = s1.length();
        int len2                            = s2.length();

        // **** populate with lowercase characters and counts of zero ****
        for (int i = 0; i < 26; i++) {
            freq.putIfAbsent(Character.valueOf((char)('a' + i)), 0);
            win.putIfAbsent(Character.valueOf((char)('a' + i)), 0);
        }

        // **** populate freq ****
        for (char ch : s1.toCharArray()) {
            Integer val = freq.get(ch);
            freq.put(ch, ++val);
        }

        // ???? ????
        System.out.println("<<< freq: " + freq.toString());
    
        // **** initialize win ****
        for (int i = 0; i < len1; i++) {
            Integer val = win.get(s2.charAt(i));
            win.put(s2.charAt(i), ++val);
        }

        // ???? ????
        System.out.println("<<<  win: " + win.toString());

        // **** move window right comparing hashmap contents ****
        for (int i = 0; i <= len2 - len1; i++) {

            // **** permutation found ****
            if (freq.equals(win)) return true;

            // **** update window (as needed) ****
            if (i < len2 - len1) {

                // **** remove character from win ****
                Integer val = win.get(s2.charAt(i));
                win.put(s2.charAt(i), --val);

                // **** add character to win ****
                val = win.get(s2.charAt(i + len1));
                win.put(s2.charAt(i + len1), ++val);

                // ???? ????
                System.out.println("<<<  win: " + win.toString());
            }
        }

        // **** permutation NOT found ****
        return false;
    }

This approach uses a sliding window which is implemented using a hashmap.

We start by making a sanity check. If the length of `s1` is greater than `s2` we return false.Since we are looking for permutations, `s2` has to be of equal or of larger length than `s1`.

There are 26 lowercase characters in the English language. We initialize two hash maps to a size of 26. The `freq` hashmap will be used to hold the target characters and their associated frequencies. The `win` hashmap will implement the sliding window. We will make sure we always have the same number of characters in the windows as there are in `s1`.

To make the code simpler we start by populating both hash maps with the 26 lowercase characters and assigning them a frequency of zero.

Once both hash maps are initialized, we populate the `freq` hashmap with the characters from the string `s1`. This hashmap will not change during the algorithm. It will be used to test against the contents of the sliding window.

We then initialize the start of the sliding window. Once this is done we start checking and sliding the window over the contents of `s2`.

Our first statement in the loop is to compare the contents of both hash maps. If they match we have found a permutation of `s1` and we can return `true`.

If that is not the case, we need to remove the left character from the sliding window and add the next character to the right. Please take a look at the following two sets of statements in the loop and make sure you follow the logic.

When all is said and done, if we do not find a permutation in the loop, our function of interest would return `false`.

The execution time is displayed in the comments section of this implementation.

    /**
     * Given two strings s1 and s2, 
     * return true if s2 contains a permutation of s1, 
     * or false otherwise.
     * 
     * Using an int[] window.
     * 
     * 106 / 106 test cases passed.
     * Status: Accepted
     * Runtime: 3 ms
     * Memory Usage: 39.2 MB
     * 
     * Runtime: 3 ms, faster than 98.79% of Java online submissions.
     * Memory Usage: 39.2 MB, less than 63.11% of Java online submissions.
     */
    public static boolean checkInclusion1(String s1, String s2) {

        // **** initialization ****
        int len1    = s1.length();
        int len2    = s2.length();
        int len     = len2 - len1;
        int[] freq  = new int[26];
        int[] win   = new int[26];

        // **** sanity check(s) ****
        if (len1 > len2) return false;

        // **** populate freq int[] ****
        for (char ch : s1.toCharArray())
            freq[ch - 'a']++;

        // ???? ????
        System.out.println("<<< freq: " + Arrays.toString(freq));

        // **** initialize window int[] - O(m) ****
        for (int i = 0; i < len1; i++)
            win[s2.charAt(i) - 'a']++;

        // ????? ????
        System.out.println("<<<  win: " + Arrays.toString(win));

        // **** check window contents against freq arrays - O(n) ****
        for (int i = 0; i <= len; i++) {

            // ***** compare win and freq arrays - O(m) ****
            int diff = Arrays.compare(win, freq);

            // **** permutation found ****
            if (diff == 0) return true;    

            // **** move window right (if needed) ****
            if (i < len) {
                win[s2.charAt(i) - 'a']--;
                win[s2.charAt(i + len1) - 'a']++;
            
                // ????? ????
                System.out.println("<<<  win: " + Arrays.toString(win));
            }
        }

        // **** permutation NOT found ****
        return false;
    }

The approach in this implementation of the function of interest uses the same sliding window approach. The difference is that instead of using a co9upe hash maps we will be using a couple of int[] arrays.

After the initialization and sanity check, we populate the `freq` with the frequencies of the characters in string `s1`.

We then populate the `win` array with the frequencies of the first set of characters that match the length of `s1`.

We are now ready to enter a loop that moves the sliding window left to right.

We first check if the arrays contain the same characters with the same frequencies. If so, we return `true`. If that is not the case, on each pass we remove from the `win` array the character on the left and add the character to the right from`s2`.

If we do not find a match, we exit the loop and return `false`.

This function behaves in a similar mode as the previous one. The only difference is that we are now using a couple of int[] arrays instead of a couple of hashmaps.

Performance information is shown in the comments section of the function.

    /**
     * Given two strings s1 and s2, 
     * return true if s2 contains a permutation of s1, 
     * or false otherwise.
     * 
     * Using an int[] window plus an auxiliary function.
     * 
     * 106 / 106 test cases passed.
     * Status: Accepted
     * Runtime: 3 ms
     * Memory Usage: 39.1 MB
     * 
     * Runtime: 3 ms, faster than 98.79% of Java online submissions.
     * Memory Usage: 39.1 MB, less than 77.36% of Java online submissions.
     */
    public static boolean checkInclusion(String s1, String s2) {

        // **** initialization ****
        int len1    = s1.length();
        int len2    = s2.length();
        int len     = len2 - len1;
        int[] freq  = new int[26];
        int[] win   = new int[26];

        // **** sanity check(s) ****
        if (len1 > len2) return false;

        // **** populate freq int[] ****
        for (char ch : s1.toCharArray())
            freq[ch - 'a']++;

        // ???? ????
        System.out.println("<<< freq: " + Arrays.toString(freq));

        // **** initialize window int[] - O(m) ****
        for (int i = 0; i < len1; i++)
            win[s2.charAt(i) - 'a']++;

        // ????? ????
        System.out.println("<<<  win: " + Arrays.toString(win));

        // **** check window contents against freq arrays - O(n) ****
        for (int i = 0; i <= len; i++) {

            // **** check if arrays are equal ****
            if (areEqual(win, freq)) return true;

            // **** move window right (if needed) ****
            if (i < len) {
                win[s2.charAt(i) - 'a']--;
                win[s2.charAt(i + len1) - 'a']++;
            
                // ????? ????
                System.out.println("<<<  win: " + Arrays.toString(win));
            }
        }

        // **** permutation NOT found ****
        return false;
    }

This last attempt is similar to the previous one. The only difference is that in the previous function we used an Arrays method to compare the two arrays, in this implementation we use our own auxiliary function.

/**
 * Auxiliary function.
 */
static boolean areEqual(int[] arr1, int[] arr2) {

    // **** sanity check(s) ****
    // int len1 = arr1.length;
    // int len2 = arr2.length;
    // if (len1 != len2) return false;

    // **** compare array contents ****
    for (int i = 0; i < arr1.length; i++)
        if (arr1[i] != arr2[i]) return false;

    // **** contents of arrays are equal ****
    return true;
}

This auxiliary function does not require sanity checks. It compares the contents of the two arrays. The idea was to see if the execution time would be somewhat shorter. Based on the execution information on the comment section of the caller, this change did not make much of an impact.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

Please note that the code here presented might not be the best possible solution. In most cases, after solving the problem, you can take a look at the best accepted solution provided by the different websites (i.e., HackerRank, LeetCode).

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer / engineering toolset.

Thanks for reading this post, feel free to connect with me John Canessa at LinkedIn.

Enjoy;

John

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.