Sherlock and Anagrams

The news is full of articles and posts about the corona virus. Before getting in panic mode, stop by the CDC “Coronavirus Disease 2019 (COVID-19)” web page and get informed.

My wife and I follow and watch several YouTube channels. We are mostly interested in foods to prepare and consume and places to visit on our next holiday. Yesterday we watched a new episode on one channel that deals with food and travel in Sicily, Italy. The channel is hosted by a couple of Americans that have roots in Europe and Italy.

They had some interesting facts about how to make a disinfectant using alcohol, aloe vera and some scents. I was not going to mention the specific video, but as I am going over this post, I feel that it is missing. The video is named “SPECIAL EDITION: Coronavirus update from Sicily!”. Quite appropriate for the times we are living in.

In the video it is mentioned that as the temperature start rising with the arrival of spring, the higher temperatures will reduce the life of the virus on surfaces. The implication was that the spread should reduce considerably. When I heard that I made a reference to my wife about the current cases in South America. The southern hemisphere is in summer as we speak. Most countries have seen hundreds of cases. Not sure how accurate the news might be or how easy is to diagnose the disease, but is happening in warm summer weather.

At that point I navigated to the CDC web page and got what I would like to consider the best possible advice and most accurate information. There is nothing about summer, winter or specific to temperature. I can see how people might assume that since the symptoms are flu like, here in the USA the cases of the common flu (influenza) diminish in summer and get to a maximum during winter. The CDC web site, at least when I visited yesterday, had no reference to temperature.

Now let’s get to the main subject for this post.

I visited the HackerRank web site and selected the “Sherlock and Anagrams” challenge. If you are interested please read the requirements and then keep on reading this post.

The challenge deals with anagrams not palindromes. Wikipedia describes it best:

An anagram is a word or phrase formed by rearranging the letters of a different word or phrase, typically using all the original letters exactly once.

1
mom

2


2
abba
abcd

4
0


2
ifailuhkqq
kkkk

3
10


1
cdcd

5

The test cases are included in the HackerRank web site. I did not create my own.

After reading the requirements for the challenge, I stopped by GitHub and created a repository for the solution. As usual, I created a simple README.md file which will contain the link to this post after it is published.

# **** clone my repo from GitHub ****
C:\Users\John\workspace8>git clone https://github.com/JohnCanessa/SherlockAnagrams.git
Cloning into 'SherlockAnagrams'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.

# **** get to the folder created by cloning the repo ****
C:\Users\John\workspace8>cd SherlockAnagrams

# **** list the contents of the new folder ****
C:\Users\John\workspace8\SherlockAnagrams>dir /A
03/09/2020  08:01 AM    <DIR>          .
03/09/2020  08:01 AM    <DIR>          ..
03/09/2020  08:01 AM    <DIR>          .git
03/09/2020  08:01 AM               175 README.md

# **** check status ****
C:\Users\John\workspace8\SherlockAnagrams>git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

# **** type the contents of the README.md file ****
C:\Users\John\workspace8\SherlockAnagrams>type README.md
# SherlockAnagrams
HackerRank problem named Sherlock and Anagrams solved using Java 8

To read about the approach please refer to the following link in my blog:

T.B.D.

I navigated in my Windows computer to the workspace in which I will be cloning the new repo. After cloning the repository I took a look at what was created. All seems as expected.

Using Visual Studio Code as my IDE I started typing my solution using Java 8.

  // **** open scanner ****
  private static Scanner sc = new Scanner(System.in);
  
  /**
   * Test scaffolding.
   */
  public static void main(String[] args) {
    
    // **** read the number of tests ****
    int q = Integer.parseInt(sc.nextLine());

    // **** loop reading the strings ****
    for (int i = 0; i < q; i++) {

      // **** read the next string ****
      String s = sc.nextLine();

      // **** process the string ****
      int result = sherlockAndAnagrams(s);

      // **** display the result ****
      System.out.println(result);
    }

    // **** close scanner ****
    sc.close();
  }

I replicated the test scaffolding making some changes.

The code looks simple. We get a count of strings and then enter a loop reading a string, processing it as needed and displaying the results. When all is done I close the scanner.

  /**
   * Complete the sherlockAndAnagrams function below.
   * The time complexity appears to be O(n^3).
   */
 static int sherlockAndAnagrams(String s) {

    // **** initialize count of anagrams ****
    int count = 0;

    // **** loop once per grouping of characters [1 : s.length() - 1] -> O(n) ****
    for (int g = 1; g < s.length(); g++) { // **** generate the base sub string -> O(n) ****
      for (int i = 0; i < s.length() - g; i++) { // **** starting string **** String bs = s.substring(i, i + g); // **** generate sub strings -> O(n) ****
        for (int j = i + 1; j <= s.length() - g; j++) { // **** generate current sub string **** String cs = s.substring(j, j + g); // **** check if anagram -> O(n log(n)) ****
          if (areAnagram(bs, cs)) {
            count++;
          }
        }
      }
    }

    // **** count of anagrams ****
    return count;
  }

We need to complete a function that given a string it returns the count of anagrams that can be generated from it.

After initializing the count, we loop to we proceed to execute three embedded loops. This algorithm is close to O(n^3). I figured that once the code is up and running I will submit it and if needed think about how to optimize it.

The purpose of the loops is to generate the two strings that we will be using to determine if they are anagrams of each other. If so, we increment the count. When done we return the count.

/**
   * Check if these strings are anagrams.
   * Seems like this method runs in O(n log(n))
   */
  static boolean areAnagram(String sa, String sb) {

    // **** populate character arrays -> O(n log(n)) ****
    char[] aa = sa.toCharArray();
    char[] ab = sb.toCharArray();

    // **** sort character arrays -> O(n log(n)) ****
    Arrays.sort(aa);
    Arrays.sort(ab);

    // **** traverse arrays checking for mismatches -> O(n) ****
    for (int i = 0; i < aa.length; i++) {
      if (aa[i] != ab[i]) {
        return false;
      }
    }

    // **** strings are anagrams ****
    return true;
  }

The auxiliary method is used to determine if the two strings are anagrams. We convert them to character arrays. We sort the arrays and then check that the characters match. I could have started this method by checking that the two strings are not null and have the same length, but in the context of this solution I did not seem such checks necessary. That said; if this would be a production method I would have performed such checks.

The code as is passed all the test cases. No need to optimize.

# **** get status ****
C:\Users\John\workspace8\SherlockAnagrams>git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        Solution.java

nothing added to commit but untracked files present (use "git add" to track)

# **** add SOlution.java to the next commit ****
C:\Users\John\workspace8\SherlockAnagrams>git add Solution.java

# **** check status ****
C:\Users\John\workspace8\SherlockAnagrams>git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   Solution.java

# **** commit the code ****
C:\Users\John\workspace8\SherlockAnagrams>git commit -m "first pass"
[master 28a754d] first pass
 1 file changed, 96 insertions(+)
 create mode 100644 Solution.java

# **** git status ****
C:\Users\John\workspace8\SherlockAnagrams>git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

# **** push the code to the GitHub repo ****
C:\Users\John\workspace8\SherlockAnagrams>git push origin master
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.03 KiB | 117.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/JohnCanessa/SherlockAnagrams.git
   25b2a7c..28a754d  master -> master

# **** get status ****
C:\Users\John\workspace8\SherlockAnagrams>git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

# **** check the log ****
C:\Users\John\workspace8\SherlockAnagrams>git log
commit 28a754da0a1b0cdfd4e491586c0a17aaf268b52e (HEAD -> master, origin/master, origin/HEAD)
Author: JohnCanessa <john.canessa@gmail.com>
Date:   Mon Mar 9 11:13:15 2020 -0500

    first pass

commit 25b2a7cf3760b32ba070cfc9334798628c65c1ce
Author: John Canessa <8019504+JohnCanessa@users.noreply.github.com>
Date:   Mon Mar 9 08:00:57 2020 -0500

    Create README.md

When done with the code we need to commit and push it to GitHub. I believe the comments in the screen capture are sufficient. I always like to make sure things are working as I go. That is the reason you see so many git status calls.

When done I visited my repo in GitHub to double check all was good.

The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address:  john.canessa@gmail.com. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!

One last thing, thanks to all 434 subscribers to my blog!!!

John

Twitter:  @john_canessa

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.