First Unique Character in a String

It seems that time is passing faster and faster. The reason seems to be the lack of variety in our lives. The COVID-19 pandemic for most has imposed limiting factors for social activities. I do not mind getting on-line with coworkers, friends and family, but nothing beats in-person interactions. Hopefully things will get better in the next few months. It seems that several pharmaceutical companies are moving along with their tests for their vaccines.

I selected LeetCode problem 387 First Unique Character in a String. It is catalogued as an Easy problem. That said, like most things in life, there are different ways to skin a cat, but one is best given your specific circumstances. I get a kick when I read about competitive programming. I have a hard time understanding the concept. To me it is like giving a brush to Michelangelo and ask him to paint as fast as possible the ceiling of the Sistine Chapel in the Vatican City.

Computer Science is education, practice and an art. When you are faced with a task you need to address the explicit and implicit requirements in order to come up with the best and most elegant solution. Next time you go out looking to purchase a new automobile or house, chances are that you will not pick the one that was designed and built in the shortest possible time.

Given a string, find the first non-repeating character in it and return its index.
If it doesn't exist, return -1.

Note: You may assume the string contains only lowercase English letters.

The requirements seem to be clearly spelled out. That is half the battle. We need to return the index of the first non repeating character in a string. If interested, I suggest for you to go to the LeetCode page and get the requirements and examples first hand.

I always look at the requirements and spend some time generating my solution. In several cases I try an alternate because new approaches worth pursuing may come up. If you get stuck, then start reading about a solution and stop as soon as you get enlightened. Finish your solution and if it does not work, repeat until your solution works or you had to read the entire solution. Remember that you do not want to memorize but learn. And yes, practice makes perfect.

For this problem I will use Java, the VSCode IDE and a Windows 10 computer. All should work on different platforms and with different IDEs. Of course you can skip the test scaffolding and solve the problem directly on the LeetCode web site.

leetcode
main <<< s ==>leetcode<==
main <<< firstUniqChar0: 0
main <<<  firstUniqChar: 0


loveleetcode
main <<< s ==>loveleetcode<==
main <<< firstUniqChar0: 2
main <<<  firstUniqChar: 2


aabbcdd
main <<< s ==>aabbcdd<==
main <<< firstUniqChar0: 4
main <<<  firstUniqChar: 4


aabbccdd
main <<< s ==>aabbccdd<==
main <<< firstUniqChar0: -1
main <<<  firstUniqChar: -1

The idea is to read a line of text in lowercase characters, display the line to make sure all is well, call the function that we will develop and display the results. As you can see, in my case I implemented two approaches. One may be faster than the other, but there is a lot more to this simple metric (i.e., elegance, complexity, maintainability, etc.) that a software development professional must take into account.

BTW the first two examples were provided by LeetCode. The other two I made up while testing my code during development.

class Solution {
    public int firstUniqChar(String s) {
        
    }
}

We need to develop our solution as part of the firstUniqChar() function as required by LeetCode.

    /**
     * Test scaffolding
     */
    public static void main(String[] args) {
        
        // **** open scanner ****
        Scanner sc = new Scanner(System.in);

        // **** read string ****
        String s = sc.nextLine().trim();

        // **** close scanner ****
        sc.close();

        // ???? display string ????
        System.out.println("main <<< s ==>" + s + "<==");

        // **** generate and display result ****
        System.out.println("main <<< firstUniqChar0: " + firstUniqChar0(s));

        // **** generate and display result ****
        System.out.println("main <<<  firstUniqChar: " + firstUniqChar(s));
    }

The test scaffolding is not part of the solution. I tend to develop code on my machine and when satisfied, copy and paste it to the web site for checking.

The idea is quite simple. Open a scanner, read the input string, close the scanner, display the string to make sure all is well and then call the function in question displaying the results. In this case we have a second approach that seems to return the same results. Hopefully that is not a coincidence. Both approaches were accepted by LeetCode.

	/**
     * Given a string, find the first non-repeating character in it and return its index.
     * If it doesn't exist, return -1.
     * 
     * Runtime: 30 ms, faster than 24.99% of Java online submissions.
     * Memory Usage: 39.4 MB, less than 95.22% of Java online submissions .
     * 
     * Runtime:  O(2 * n)
     */
    static int firstUniqChar0(String s) {

        // **** ****
        LinkedHashMap<Character, Integer> freqs = new LinkedHashMap<Character, Integer>();

        // **** traverse string updating frequencies O(n) ****
        for (int i = 0; i < s.length(); i++) {

            // **** for ease of use ****
            char ch = s.charAt(i);

            // **** ****
            if (!freqs.containsKey(ch)) {
                freqs.put(ch, 1);
            } else {
                int count = freqs.get(ch);
                freqs.replace(ch, count, count + 1);
            }
        }

        // **** look for first non-repeating character in the string O(n) ****
        char ch = 0;

        // for (Entry<Character, Integer> entry : freqs.entrySet()) {
        //     if (entry.getValue() == 1) {
        //         ch = entry.getKey();
        //         break;
        //     }
        // }

        for (char c : freqs.keySet()) {
            if (freqs.get(c) == 1) {
                ch = c;
                break;
            }
        }

        // **** unique character NOT found ****
        if (ch == 0)
            return -1;

        // **** find ch in s ****
        return s.indexOf(ch, 0);
    }

A simple way to keep track of frequencies is to use some type of a hash map class. I decided to use the LinkedHashMap because it keeps the entries in the order we are reading them from the string. I thought that might be of interest for this problem.

We enter a loop reading the characters and updating the number of times (frequency) as we traverse the string.

We now enter a second loop in which we locate the first letter with a frequency of 1. That implies that the associated character is unique. Note that I have two variations. The net results seem to match.

After we exit the loop, the ch variable may be set to 0 or to a character. If we did not find what we were looking for in the second loop, then we need to return -1; otherwise we return the first index of the character in the input string.

    /**
     * Given a string, find the first non-repeating character in it and return its index.
     * If it doesn't exist, return -1.
     * 
     * Runtime: 8 ms, faster than 80.02% of Java online submissions.
     * Memory Usage: 39.5 MB, less than 91.98% of Java online submissions.
     * 
     * Runtime:  O(2 * n)
     */
    static int firstUniqChar(String s) {

        // **** initialization ****
        int[] freq      = new int[26];
        int[] firstPos  = new int[26];

        // **** populate arrays O(n) ****
        for (int i = 0; i < s.length(); i++) {

            // **** ****
            freq[s.charAt(i) - 'a']++;

            // **** ***
            if (firstPos[s.charAt(i) - 'a'] == 0)
                firstPos[s.charAt(i) - 'a'] = i + 1;
        }

        // **** look for the first unique character O(26) ****
        int pos = Integer.MAX_VALUE;
        for (int i = 0; i < freq.length; i++) {
            if ((freq[i] == 1) && (firstPos[i] < pos)) {
                pos = firstPos[i];
            }
        }

        // **** return the result ****
        return pos == Integer.MAX_VALUE ? -1 : pos - 1;
    }

This approach is different in that we do not make use of a hash map. We declare two arrays of 26 entries each. Each entry represents a lower case character in monotonically ascending order from ‘a’ to ‘z’.

In the first loop we traverse the input string and update the frequency and keep track of the first time we encounter each character.

In the second loop, we traverse the frequency array looking for an entry with value 1 (unique) and we update the pos variable holding the position in the string. We are looking for the first unique character.

After we are done with the second loop, we can return a -1 if we did not find a unique character or the position (index) of the unique character.

Note that we could have initialized the firstPos array to a different value (-1 instead of the default 0), but by adding one (i + 1) to the value in the firstPos array, we can get the desired results without the additional initialization.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address:  john.canessa@gmail.com. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, become proficient, refresh your knowledge and enhance your developer toolset!

One last thing, many thanks to all 2985 subscribers to this blog!!!

Keep safe during the COVID-19 pandemic and help restart the world economy.

John

Twitter:  @john_canessa

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.