Sam and Substrings

It is a sunny Tuesday in the Twin Cities of Minneapolis and St. Paul. The earlier forecast was calling for thunderstorms. I just check the revised forecast and it seems that we might have some rain after dark. That is good news because my wife and I are invited to our oldest son’s house today to have the next version of the burgers we had last Saturday as I commented on the Roads and Libraries post. Will let you know tomorrow how it goes.

As I mentioned in a previous post, a few weeks ago I had elective surgery performed on my left knee. All is going very well with the recovery process. So far most of the physical therapy sessions have been on Tuesdays and Thursdays towards the end of the day (04:30 PM or 05:00 PM). Today the session is at 01:30 PM. My wife changed our daily lunch schedule to accommodate it. We will be having lunch after therapy.

Yesterday we spoke over Skype with a couple friends of us that live in Peru. Things are not going well in Peru or Latin America in general. Seems that communism is creeping at alarming rates. In the next presidential elections to be held on June 06, 2021 there are two candidates. One candidate represents the right and the other the left. The left has mandated voters in different places to take a picture of their secret ballot to allow party representatives to verify that the votes are for the communist candidate. If voters refuse their lives would be in danger. Not much to say about a secret and fair election!

Of course the bottom line is always money. In this case some foreign nations will benefit by getting several mines at a much discounted price. It seems like a repeat of what transpired and is still going on in Venezuela. Such is life.

In this post we will take a stab at the Sam and Substrings problem by HackerRank.

Samantha and Sam are playing a numbers game.
Given a number as a string, no leading zeros, 
determine the sum of all integer values of substrings of the string.

Given an integer as a string, 
sum all of its substrings cast as integers. 
As the number may become large, return the value modulo 10^9 + 7.

Constraint:

o 1 <= ncastasinteger <= 2 * 10^5

The concept is simple. We are given a string of digits and we are supposed to return the sum of all substrings as integers.

If interested in this problem, I would suggest for you to read the current description on the HackerRank web site and go for it.

I will be solving the problem using the Java programming language and the VSCode IDE on a Windows 10 computer. It is a lot simpler to solve it directly on the HackerRank web site using the provided IDE. The nice thing about HackerRank is that they provide the test scaffold which we will use with minor changes.

    /*
     * Complete the 'substrings' function below.
     *
     * The function is expected to return an INTEGER.
     * The function accepts STRING n as parameter.
     */

    public static int substrings(String n) {
    // Write your code here

    }

The signature for the method in question is simple. We are provided with a string on digits and we have to return the sum as required.

1
substrings0 <<< str ==>1<==
substrings0 <<< callCounter: 2
substrings1 <<< a: [1]
substrings1 <<< b: [1]
substrings1 <<< i: 0 char: '1' ans: 1
substrings <<< ans: 1
main <<< output: 1
main <<< output: 1
main <<< output: 1


12
substrings0 <<< str ==>1<==
substrings0 <<< str ==>2<==
substrings0 <<< str ==>12<==
substrings0 <<< callCounter: 3
substrings1 <<< a: [1, 10]
substrings1 <<< b: [1, 11]
substrings1 <<< i: 0 char: '1' ans: 11
substrings1 <<< i: 1 char: '2' ans: 15
substrings <<< ans: 1
substrings <<< ans: 15
main <<< output: 15
main <<< output: 15
main <<< output: 15


123
substrings0 <<< str ==>1<==
substrings0 <<< str ==>2<==
substrings0 <<< str ==>3<==
substrings0 <<< str ==>12<==
substrings0 <<< str ==>23<==
substrings0 <<< str ==>123<==
substrings0 <<< callCounter: 4
substrings1 <<< a: [1, 10, 100]
substrings1 <<< b: [1, 11, 111]
substrings1 <<< i: 0 char: '1' ans: 111
substrings1 <<< i: 1 char: '2' ans: 155
substrings1 <<< i: 2 char: '3' ans: 164
substrings <<< ans: 1
substrings <<< ans: 15
substrings <<< ans: 164
main <<< output: 164
main <<< output: 164
main <<< output: 164


1234
substrings0 <<< str ==>1<==
substrings0 <<< str ==>2<==
substrings0 <<< str ==>3<==
substrings0 <<< str ==>4<==
substrings0 <<< str ==>12<==
substrings0 <<< str ==>23<==
substrings0 <<< str ==>34<==
substrings0 <<< str ==>123<==
substrings0 <<< str ==>234<==
substrings0 <<< str ==>1234<==
substrings0 <<< callCounter: 5
substrings1 <<< a: [1, 10, 100, 1000]
substrings1 <<< b: [1, 11, 111, 1111]
substrings1 <<< i: 0 char: '1' ans: 1111
substrings1 <<< i: 1 char: '2' ans: 1555
substrings1 <<< i: 2 char: '3' ans: 1654
substrings1 <<< i: 3 char: '4' ans: 1670
substrings <<< ans: 1
substrings <<< ans: 15
substrings <<< ans: 164
substrings <<< ans: 1670
main <<< output: 1670
main <<< output: 1670
main <<< output: 1670


5312
substrings0 <<< str ==>5<==
substrings0 <<< str ==>3<==
substrings0 <<< str ==>1<==
substrings0 <<< str ==>2<==
substrings0 <<< str ==>53<==
substrings0 <<< str ==>31<==
substrings0 <<< str ==>12<==
substrings0 <<< str ==>531<==
substrings0 <<< str ==>312<==
substrings0 <<< str ==>5312<==
substrings0 <<< callCounter: 5
substrings1 <<< a: [1, 10, 100, 1000]
substrings1 <<< b: [1, 11, 111, 1111]
substrings1 <<< i: 0 char: '5' ans: 5555
substrings1 <<< i: 1 char: '3' ans: 6221
substrings1 <<< i: 2 char: '1' ans: 6254
substrings1 <<< i: 3 char: '2' ans: 6262
substrings <<< ans: 5
substrings <<< ans: 61
substrings <<< ans: 624
substrings <<< ans: 6262
main <<< output: 6262
main <<< output: 6262
main <<< output: 6262

HackerRank provides two sample test cases.

In a nutshell we are provided with an input string with digits. We need to compute and return the answer.

This is a dynamic programming problem. If we use a brute force approach, we will get the proper answers but the site will reject our code due to timeouts. Our first approach was rejected.

The second and third approaches are based on the same observation which in my opinion is hard to come up with especially when you have a limited amount of time to get a working solution. Such is life.

Note that in all test cases our three implementations return the same answer and the answers are correct.

    /**
     * Test scaffold.
     * 
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {

        // **** open buffered reader and writer ****
        BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in));
        BufferedWriter bufferedWriter = new BufferedWriter(new OutputStreamWriter(System.out));

        // **** read input string ****
        String n = bufferedReader.readLine();

        // ???? display inut string ????
        // bufferedWriter.write("main <<< n ==>" + n + "<== len: " + n.length() + "\n");

        // **** compute result ****
        int result = substrings0(n);

        // **** display result ****
        bufferedWriter.write("main <<< output: " + String.valueOf(result));
        bufferedWriter.newLine();

        // **** compute result ****
        result = (int)substrings1(n);

        // **** display result ****
        bufferedWriter.write("main <<< output: " + String.valueOf(result));
        bufferedWriter.newLine();

        // **** compute result ****
        result = (int)substrings(n);

        // **** display result ****
        bufferedWriter.write("main <<< output: " + String.valueOf(result));
        bufferedWriter.newLine();

        // **** close buffered reader and writter ****
        bufferedReader.close();
        bufferedWriter.close();
    }

This is the slightly edited test scaffold provided by HackerRank.

In a nutshell we read the input string of digits and call three implementations of the solution. The first times out but produces the correct answers. The second is based on an observation and the third is an optimization of the second approach.

    // **** as required by problem ****
    public static final int MOD = 1000000007;

    // ???? call counter ????
    static int callCounter = 0;

To avoid overflows we need to obtain the modulus of values. We will use the constant MOD to represent the value described in the requirements.

The callCounter variable is NOT PART OF THE SOLUTION. It is used to track the number of calls made by a recursive call.

    /**
     * Complete the 'substrings' function below.
     *
     * The function is expected to return an INTEGER.
     * The function accepts STRING n as parameter.
     * 
     * Entry point.
     */
    public static int substrings0(String s) {

        // **** sanity check(s) ****
        if (s == null | s.length() == 0)
            return 0;

        // **** initialization ****
        int ans     = 0;
        callCounter = 0;

        // **** recursive call ****
        ans = substrings0(s, 1, s.length());

        // ???? display call counter ????
        System.out.println("substrings0 <<< callCounter: " + callCounter);

        // **** return answer ****
        return ans;
    }

This function performs some sanity checks and initializes a couple variables. It then calls a recursive function which returns the answer. When all is set and done we display the number of times the recursive function has been called and return the answer to the caller.

    /**
     * Complete the 'substrings' function below.
     *
     * The function is expected to return an INTEGER.
     * The function accepts STRING n as parameter.
     * 
     * Recursive call.
     * 
     * !!!! THIS APPROACH IS TOO SLOW !!!!
     */
    public static int substrings0(String s, int len, int maxLen) {

        // ???? count this call ????
        callCounter++;

        // **** end condition ****
        if (len > maxLen)
            return 0;

        // **** initialization ****
        long result = 0;

        // **** compute sum of substrings O(n) ****
        for (int i = 0; i <= maxLen - len; i++) {

            // **** build substring ****
            String str = s.substring(i, i + len);

            // ???? ????
            System.out.println("substrings0 <<< str ==>" + str + "<==");
          
            // **** add to result ****
            result += (Integer.parseInt(str) % MOD);
        }

        // **** recursive call ****
        result += substrings0(s, len + 1, s.length());

        // **** return result ****
        return (int)result;
    }

Note that we generate every single combination of digits with a single character. From then on combinations with two or more characters are generated recursively.

The code might not be as clear as it could be, but I tried a few modifications to see if I could get it to run as fast as possible to pass all test cases. That did not happen. I decided to do some reading and found that there is a linear mechanism that can be used by using the following formula:

                             n - 1
Value added by s[i]:  s[i] * SUM  10^i * (i + 1)
							 i = 0

As you can see we can generate the next value by using a previous one. This provides a nice linear approach. I will not get into much detail until we see the code for the next two implementations of the method in question.

    /**
     * The function is expected to return an INTEGER.
     * The function accepts STRING s as parameter.
     * 
     * Execution time: O(n)
     */
    public static int substrings1(String s) {

        // **** initialization ****
        int n       = s.length();
        long[] a    = new long[n];          // multiples of 10 
        long[] b    = new long[n];          // weighted sum
        b[0] = a[0] = 1;
        long ans    = 0;

        // **** populate arrays O(n) ****
        for (int i = 1; i < n; i++) {
            a[i] = (a[i - 1] * 10) % MOD;
            b[i] = (a[i] + b[i - 1]) % MOD;
        }

        // ???? ????
        System.out.println("substrings1 <<< a: " + Arrays.toString(a));
        System.out.println("substrings1 <<< b: " + Arrays.toString(b));

        // **** compute the value associated with the string O(n) ****
        for (int i = 0; i < n; i++) {

            // **** add the contribution of this digit in the string ****
            ans += ((s.charAt(i) - '0') * b[n - i - 1] * (i + 1));

            // **** to prevent overflow ****
            ans %= MOD;

            // ???? ????
            System.out.println("substrings1 <<< i: " + i + " char: '" + s.charAt(i) + "' ans: " + ans);
        }

        // **** return answer (as an integer) ****
        return (int)ans;
    }

We start by initializing some variables and a couple arrays. The values in the arrays are then displayed. I urge you to go back to the screen capture that shows the test cases.

We then enter a loop that traverses the characters in the input string. For each character we use the previous value to compute the current one. This implements a linear O(n) approach to compute the answer.

When done we return the answer casted as an integer.

    /**
     * The function is expected to return an INTEGER.
     * The function accepts STRING s as parameter.
     * 
     * Execution time: O(n)
     */
    public static int substrings(String s) {

        // **** initialization ****
        int len     = s.length();
        long ans    = s.charAt(0) - '0';

        // ???? ????
        System.out.println("substrings <<< ans: " + ans);

        long temp   = ans;

        // **** traverse the rest of the string O(n) ****
        for (int i = 1; i < len; i++) {
            temp = temp * 10 + (i + 1) * (s.charAt(i) - '0');
            temp %= MOD;
            ans = (ans + temp) % MOD;

            // ???? ????
            System.out.println("substrings <<< ans: " + ans);
        }

        // **** return answer ****
        return (int)ans;
    }

This is a more efficient implementation of the second approach. We start with some initializations. Note that the two arrays have been replaced by two variables.

The main loop is quite similar to the previous pass when we used arrays.

To read a complete description of the approach you can read the editorial for this problem from HackerRank.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer toolset.

Thanks for reading this post, feel free to connect with me John Canessa at LinkedIn.

Regards;

John

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.