First Duplicate

This morning in the Twin Cities of Minneapolis and St. Paul we woke up with a temperature of -11 F. The central heating unit worked overtime last night. It was quite dry indoors. Not sure why this happens due to the fact that the central heating as a built in humidifier and air exchange sub systems.

Yesterday morning I read the article “Do You Solve Programming Problems or Complete Exercises? (The Difference Matters.)” by Amy Haddad. She makes a difference between problem-solving and solving exercises. I agree with her interpretation. If interested read her article and possibly comment in her or my post.

Yesterday evening, I was browsing YouTube and ran into the video “Google Coding Interview Question – firstDuplicate” by Nicholas White.

This morning I decided to solve the problem after understanding the requirements and before getting to the solution. To make sure you understand the requirements watch the video and stop it when Nicholas starts talking about the brute force approach in O(n^2). I did not considered the brute force and skipped to attempting two very similar implementations in O(n) and S(n). I used a hash map and then a hash set. They are quite similar.

After I was done I continued watching the video. Nicholas proposed a O(n^2) solution. Then he moved on to a O(n) & S(n) approach. Very similar to what I came up. I thought that was it, but then he moved on describing an approach that used O(n) and S(1); in other words it used the same array by altering values. Once again I stopped the video and attempted to implement the proposed idea. I will watch the end of the video as soon as I comment on my implementation.

1
1 3 2 4 5 3

3


1
2 1 3 5 3 2

3


1
1 2 3 4

-1


1
1 2 1 3 5

1


1
6 4 3 3 2 1

3


1
1 2 1 2 3 3

1


1
2 1 3 5 3 2

3


1
1 2 3 4 5 6

-1


5
1 2 3 7 3 7 1
9 3 5 8 2 9 3 4 1
6 5 4 3 2 1
2 1 3 5 3 2
1 3 2 4 5 3

3
9
-1
3
3

I implemented testing scaffolding similar to what HackerRank provides in their challenges.

  /**
   * Test scaffolding. The values in the array are in the range [1 : n].
   */
  public static void main(String[] args) {

    // **** open scanner ****
    Scanner sc = new Scanner(System.in);

    // **** get number of test cases ****
    int t = sc.nextInt();

    // ???? ????
    System.out.println("main <<< t: " + t);

    // **** get past the EOL ****
    sc.nextLine();

    // **** loop once per test case ****
    for (int i = 0; i < t; i++) {

      // **** read the string of numbers ****
      String s = sc.nextLine();

      // // ???? ????
      // System.out.println("main <<< s ==>" + s + "<==");

      // **** split the numbers in the stirng ****
      String sa[] = s.split(" ");

      // **** declare the array ****
      int[] arr = new int[sa.length];

      // **** populate array of integers ****
      for (int j = 0; j < sa.length; j++) {
        arr[j] = Integer.parseInt(sa[j]);
      }

      // ???? ????
      System.out.println("main <<< arr: " + Arrays.toString(arr));

      // **** get the first duplicate ****
      System.out.println("main <<< firstDuplicate: " + firstDuplicate(arr));
      System.out.println("main <<< firstDuplicate: " + firstDuplicate1(arr));
      System.out.println("main <<< firstDuplicate: " + firstDuplicate2(arr));
    }

    // **** close scanner ****
    sc.close();
  }

The number of tests is read. We then loop once per test. We read a line of integer values, split the string and populate an array on integers. We have three approaches. As I mentioned before, I did not implemented the O(n^2) approach. I was able to figure on my own the first two approaches. I had to listen to the description of the final approach before attempting it.

  /**
   * Find first duplicate in specified array. Use a hashmap. O(n) & S(n).
   */
  static int firstDuplicate(int[] arr) {

    // **** ****
    HashMap<Integer, Integer> hm = new HashMap<Integer, Integer>();

    // **** loop processing the array ****
    for (int i = 0; i < arr.length; i++) {

      // **** check if values in NOT in the hash map ****
      if (hm.containsKey(arr[i])) {
        return arr[i];
      }

      // **** put the value in the hash map ****
      hm.put(arr[i], 1);
    }

    // **** no duplicate found ****
    return -1;
  }

We first declare a hash map to hold the values that we encounter while reading the array. We will stop as soon as we run into a duplicate inside our loop. If we complete the loop then we return a -1 indicating a duplicate was not found.

  /**
   * Find first duplicate in specified array. Use a hash set. O(n) & S(n).
   */
  static int firstDuplicate1(int[] arr) {

    // **** ****
    HashSet<Integer> set = new HashSet<Integer>();

    // **** traverse the array ****
    for (int i = 0; i < arr.length; i++) {
      if (!set.add(arr[i])) {
        return arr[i];
      }
    }

    // **** no duplicate found ****
    return -1;
  }

I could have skipped the previous approach and just stick to this one. They are in essence the same. I just wanted to experiment to see if a light bulb would light up. It did not because the approach presented by Nicolas is not intuitive. Once you implemented it, it makes sense and is simple. Most important is O(n) and S(1).

  /**
   * Find first duplicate in specified array. Use a hash set. O(n) & S(1).
   */
  static int firstDuplicate2(int[] arr) {

    // **** traverse the array ****
    for (int i = 0; i < arr.length; i++) {

      // **** get the index for this value ****
      int j = Math.abs(arr[i]) - 1;

      // // ???? ????
      // System.out.println("firstDuplicate2 <<< j: " + j);

      // **** check if this value (as an index) has been seen ****
      if (arr[j] < 0) {
        return Math.abs(arr[i]);
      }

      // **** flag that the index has been seen ****
      arr[j] *= -1;

      // // ???? ????
      // System.out.println("firstDuplicate2 <<< arr: " + Arrays.toString(arr));

    }

    // **** no duplicate found ****
    return -1;
  }

Note that I left my comments commented out. You might want to run the code with the additional display to follow what is going on.

OK, now that I am almost done with this post will go back and finish watching the video.

I am back!

The solutions are quite similar since they use the same approach. If someone comes with such approach I would think that the developer had seen the solution before. I was very surprised with it. This is a new tool to think about which may apply to future problems.

The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address:  john.canessa@gmail.com. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!

John

Twitter:  @john_canessa

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.