String to Integer

Last evening I read the news from September 06, 2020 that Steve Jobs widow has donated $500,000 USD to the Joe Biden presidential campaign. In my opinion this is quite interesting. Allow me to describe my train of thought.

Donald Trump has made it clear that he wants to start taxing all companies (including Apple) that manufacture outside the USA (in particular in China). On the other hand, Joe Biden will let companies to continue to import good without paying tariffs.

Most (never say all) large companies do not pay tariffs when importing their good manufactured OUS. The claim by most (once again, never say all) companies is that they are trying to provide US consumers with the best possible product at the lowest possible price. This might sound good to some people. The question is “do you understand what is actually happening”?

Since the news refers to Apple, I will make my case with Apple. Please keep in

Official portrait of President Donald J. Trump, Friday, October 6, 2017. (Official White House photo by Shealah Craighead)

mind that most large companies follow the same pattern.

When Apple manufactures products (i.e., phones, laptops) in China, Chinese workers and the Chinese Communist Party benefits from the jobs and the intellectual property that Apple has to provide the communist party for the privilege of getting their products manufactured there. When Apple get the products into the USA and avoid tariffs, they are able to pocket the money and be able to build large amounts of cash on-hand. They also are able to pay their executives very well. In addition, stock holders get higher dividends. Everyone so far benefits from manufacturing OUS and avoiding having to pay tariffs. Sounds great so let’s keep this going!

What you are missing is that the USA has lost jobs and tariff income that would help educate and improve the lives of millions of people that need education, stay out of prisons, and give them jobs to allow them to live the American Dream. Instead they are rioting and protesting without understanding what is happening and what matters to make a change.

I am not into politics. I just like to collect diverse data and attempt to understand what is happening. If I missed something, please let me know. I like to debate and discuss when I have gathered enough data to make educated decisions. If I made a mistake or missed some credible and valid data, I will accept my mistake and edit this post.

OK, enough of the news and let’s get to the main subject of this post.

I selected problem 8. String to Integer (atoi) from LeetCode. The description is somewhat long and convoluted. The goal seems quite reasonable. Given a string of characters return the integer value associated with them. The problem is that the string may contain non-integer characters.

Implement atoi which converts a string to an integer.

The function first discards as many whitespace characters as necessary until the first non-whitespace character is found.
Then, starting from this character, takes an optional initial plus or minus sign followed by as many numerical digits as possible, 
and interprets them as a numerical value.

The string can contain additional characters after those that form the integral number, 
which are ignored and have no effect on the behavior of this function.

If the first sequence of non-whitespace characters in str is not a valid integral number, 
or if no such sequence exists because either str is empty or it contains only whitespace characters, 
no conversion is performed.

If no valid conversion could be performed, a zero value is returned.

Note:

Only the space character ' ' is considered as whitespace character.
Assume we are dealing with an environment which could only store integers within the 32-bit signed integer range: [−231,  231 − 1]. 
If the numerical value is out of the range of representable values, 
INT_MAX (231 − 1) or INT_MIN (−231) is returned.

If interested, my suggestion is to visit the LeetCode site and read about the problem making sure you take into account the examples.

I will solve the problem on my computer using Java and the VSCode IDE. My computer runs Windows 10. I will solve the problem and when satisfied will copy the contents of the required function into LeetCode.

42
main <<< str ==>42<==
main <<< myAtoi: 42
main <<<   atoi: 42


   -42
main <<< str ==>   -42<==
main <<< myAtoi: -42
main <<<   atoi: -42


4193 with words
main <<< str ==>4193 with words<==
main <<< myAtoi: 4193
main <<<   atoi: 4193


words and 987
main <<< str ==>words and 987<==
main <<< myAtoi: 0
main <<<   atoi: 0


-91283472332
main <<< str ==>-91283472332<==
main <<< myAtoi: -2147483648
main <<<   atoi: -2147483648


123abc
main <<< str ==>123abc<==
main <<< myAtoi: 123
main <<<   atoi: 123


abc123
main <<< str ==>abc123<==
main <<< myAtoi: 0
main <<<   atoi: 0

3.14159
main <<< str ==>3.14159<==
main <<< myAtoi: 3
main <<<   atoi: 3


12345678901234567890
main <<< str ==>12345678901234567890<==
main <<< myAtoi: 2147483647
main <<<   atoi: 2147483647


main <<< str ==><==
main <<< myAtoi: 0
main <<<   atoi: 0


+
main <<< str ==>+<==
main <<< myAtoi: 0
main <<<   atoi: 0


-
main <<< str ==>-<==
main <<< myAtoi: 0
main <<<   atoi: 0


+-2
main <<< str ==>+-2<==
main <<< myAtoi: 0
main <<<   atoi: 0


-   234
main <<< str ==>-   234<==
main <<< myAtoi: 0
main <<<   atoi: 0


1234
main <<< str ==>1234<==
main <<< myAtoi: 1234
main <<<   atoi: 1234

My test scaffolding seems to start by reading a string. The string is displayed between arrows to make sure non-printable characters are included. The second example appears to have some leading spaces.

The function we need to develop is named myAtoi(). It seems that our test scaffolding calls and displays the result. In addition it seems that a second function named atoi() implements a different version. One way or the other both functions seem to return the same value.

LeetCode provides the first five examples with associated results. The rest come from failed attempts or me just setting some strings with some unique combinations.

Sorry if I went too far with examples. I was experimenting with a couple functions. Both of them were accepted by LeetCode.

    /**
     * Test scaffolding.
     */
    public static void main(String[] args) {
        
        // **** open scanner ****
        Scanner sc = new Scanner(System.in);

        // **** read the string ****
        String str = sc.nextLine();

        // **** close scanner ****
        sc.close();

        // **** display the string ****
        System.out.println("main <<< str ==>" + str + "<==");

        // **** convert string to integer (if possible) and display value ****
        System.out.println("main <<< myAtoi: " + myAtoi(str));

        // **** convert string to integer (if possible) and display value ****
        System.out.println("main <<<   atoi: " + atoi(str));
    }

The test scaffolding is not part of the solution. I had to write a simple driver mechanism to collect the input string, pass it to the function under developments, and display the returned value.

    /**
     * Flags if an integer is positive or negative.
     */
    enum SIGN {
        POS, NEG, UNKNOWN
    }

This enum is used to flag if the value in the string is positive or negative. We could have used a Boolean variable in which false indicates positive and true signals negative. By the way, I did not use UNKNOWN. In general I like to initialize variable with an unknown value and let the code fill in the proper value, that way if a function returns an unknown something did not go as expected.

   /**
     * Implement atoi which converts a string to an integer.
     * 4 ms. faster than 29.18%
     */
    static int myAtoi(String str) {
        
        // **** list of digits ****
        String digits = "0123456789";

        // **** value to be returned ****
        int result = 0;

        // **** trim the string ****
        str = str.trim();

        // **** check initial string ****
        if (str.equals("") || str.equals("+") | str.equals("-"))
           return 0;

        // **** for starters ****
        SIGN sign = SIGN.POS;

        // **** check for leading sign ****
        if ((str.charAt(0) == '-') || (str.charAt(0) == '+')) {

            // **** ****
            if (str.charAt(0) == '-')
                sign = SIGN.NEG;
    
            // **** skip leading sign ****
            str = str.substring(1);

             // **** check if we have an empty string ****
            if (str.equals(""))
                return 0;
        }

        // **** split ****
        String[] words = str.split(" ");

        // **** for ease of use ****
        str = words[0];

        // **** check if we have an empty string ****
        if (str.equals(""))
            return 0;

        // **** check if we have something else but digits ****
        for (int i = 0; i < str.length(); i++) {

            // ***** ****
            char ch = str.charAt(i);

            // **** check if not a digit ****
            if (digits.indexOf(ch, 0) == -1) {

                // **** ****
                str = str.substring(0, i);

                // **** check if we have an empty string ****
                if (str.equals(""))
                    return 0;

                // **** ****
                break;
            }
        }

        // **** convert the integer string to binary (may throw exception) ****
        try {
            result = Integer.parseInt(str);
        } catch (Exception e) {
            if (sign == SIGN.POS)
                return Integer.MAX_VALUE;
            else
                return Integer.MIN_VALUE;
        }

        // **** return positive or negative result ****
        return sign == SIGN.POS ? result : -result;
    }

The solution starts by defining a string with the entire set of integers. We will use it to determine if characters in the input string are integers. We then initialize the result to 0 (zero). Clear the leading and trailing spaces from the string. We perform a set of quick checks and assign the positive value for the result.

We then check if we have a positive or a negative leading sign and skip it. We could have used an index and keep the original input string, but I decided that it would be more readable (and not as efficient) to update the input string.

Since the input string may have multiple words, we split the input string and assign the first word to the input variable. We then check if we have a blank string and return 0.

We enter a loop to check the characters in the string. If the character is not an integer, we extract the first part of the string which contains integers, check if we have a blank string and return a 0, or break out of the loop.

The next step is to parse the string of integers. The only problem would be to encounter an overflow. If we do, we use the sign to return a maximum or minimum integer value. If all is well so far, we have a binary representation without a sign. That is taken care in the last line when we check if the returned value should be positive or negative.

    /**
     * Implement atoi which converts a string to an integer.
     * 3 ms faster than 44.45%
     */
    static int atoi(String str) {

        // **** list of digits ****
        String digits = "0123456789";

        // **** for starters ****
        SIGN sign   = SIGN.POS;
        int result  = 0;

        // **** remove leading and trailing spaces ****
        str = str.trim();

        // **** check for blank strings ****
        if (str.equals(""))
            return 0;

        // **** take care of leading sign ****
        if ((str.charAt(0) == '-') || (str.charAt(0) == '+')) {

            // **** ****
            if (str.charAt(0) == '-')
                sign = SIGN.NEG;
    
            // **** skip leading sign ****
            str = str.substring(1);
        }

        // **** traverse the string ****
        for (int i = 0; i < str.length(); i++) {

            // **** get the current character ****
            char ch = str.charAt(i);

            // **** check if current character is NOT an integer ****
            if (digits.indexOf(ch, 0) == -1) {
                break;
            }

            // **** update the result (does NOT throw exception) ****
            // result *= 10;
            // result += ch - '0';

            // **** update the result (throws exception) ****
            try {
                result = Math.multiplyExact(result, 10);
                result = Math.addExact(result, ch - '0');
            } catch (Exception e) {
                if (sign == SIGN.POS)
                    return Integer.MAX_VALUE;
                else
                    return Integer.MIN_VALUE;
            }
        }

        // **** return positive or negative result ****
        return sign == SIGN.POS ? result : -result;
    }

This function implements a similar but different approach.

We start by defining a string on integers. We set the sign and the initial result. Then we trim the input string and check if we have an empty string. If we do, we return 0.

Next we take care of the sign and update the string.

We enter a loop in which we traverse the string. We process each character until we find one that is not an integer.

We have two ways to compute the result at each step. The first one has the problem that is does not throw an exception. We need to throw an exception to be able to return the maximum or minimum integer value allowed in Java. For this reason we commented out the first approach, and implemented the second one which is equivalent.

When done parsing the contiguous string of integers, we returned the signed value.

Like I said before, both versions were accepted by LeetCode. One seems to be slightly more efficient than the other.

The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address:  john.canessa@gmail.com. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, become proficient, refresh your knowledge and enhance your developer toolset!

One last thing, many thanks to all 2301 subscribers to this blog!!!

Keep safe during the COVID-19 pandemic and help restart the world economy.

John

Twitter:  @john_canessa

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.