Sum of Two

I received a notification via email that Nicholas White had put a new video in YouTube. If you are interested, the video is named Google Coding Interview Question – Sum of Two. I enjoy working on problems. Try to get a few each week.

Let’s skip the chit chat and go directly to this problem. The statement for the problem follows:

You have two integer arrays a and b, and an integer target value v.

Determine whether there is a pair of numbers, where one number is taken from a and the other from b, that can be added together to get a sum of v.

Return true if such a pair exists, otherwise return false.

By looking at the problem statement we can quickly determine that we could solve it using a brute force approach. Such approach (which we will look at shortly) is O(n^2). The reason is that we need to traverse the first array and for each value we need to traverse the other array.

I always attempt to solve the problem and then compare results. By peeking without an attempt the learning experience is too weak.

We could attempt to speed up such approach by maybe sorting the values in the arrays and then using indices to traverse them. I just wanted to explore the approach by finding out how Java sorts arrays. I found the article Why java.util.Arrays uses Two Sorting Algorithms. The article states that the class java.util.Arrays uses quicksort (actually dual pivot quicksort in the most recent version) for primitive types such as int and mergesort for objects that implement Comparable or use a Comparator.

I then went and look up the time order for Quicksort. I find easier to look up things rather than memorizing them and making a recall mistake. Memory is fragile. The article states:

Mathematical analysis of quicksort shows that, on average, the algorithm takes O(n log n) comparisons to sort n items.

In the worst case, it makes O(n2) comparisons, though this behavior is rare.

If we use a hash set instead we could get O(n) which is faster than O(n log n).

In a previous post I mentioned that moving forward I am going to use git for developing solutions in this blob. I do not think it is needed due to the fact that all the code is written and then posted. That said, perhaps I may find a better approach in the future and would like to update my GitHub repository. You can always skip the screen captures dedicated to git in this post.

I started by creating a GitHub repository and named it SumOfTwo.

https://github.com/JohnCanessa/SumOfTwo.git

I did create a README.md file which requires the URL of this post. When I am done with the post I will go back to GitHub and just add the URL for this post which I do not have at this time. I will get it after the post is completed in WordPress.

Back at the ranch, I am using one of my Windows 10 machines to develop a solution in Java 8 using the Visual Studio Code IDE.

# **** clone the repository we just created in GitHub ****
C:\Users\John\workspace7>git clone https://github.com/JohnCanessa/SumOfTwo.git
Cloning into 'SumOfTwo'...
remote: Enumerating objects: 3, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (2/2), done.
remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (3/3), done.

# **** check if the SumOfTwo folder was created ****
C:\Users\John\workspace7>dir
02/19/2020  08:02 AM    <DIR>          .
02/19/2020  08:02 AM    <DIR>          ..
01/30/2020  01:32 PM    <DIR>          .metadata
01/31/2020  09:02 AM    <DIR>          .vscode
01/05/2020  07:15 AM    <DIR>          3DSurfaceArea
01/06/2020  11:52 AM    <DIR>          AlmostSorted
02/06/2020  07:16 AM    <DIR>          AndProduct
01/24/2020  12:49 PM    <DIR>          AStarSearch
12/05/2019  01:23 PM    <DIR>          BiggerIsGreater
02/14/2020  03:00 PM    <DIR>          Cipher
12/22/2019  08:35 AM    <DIR>          DeterminingDNAHealth
01/08/2020  07:07 AM    <DIR>          EmasSupercomputer
02/12/2020  02:46 PM    <DIR>          FirstDuplicate
02/03/2020  09:07 AM    <DIR>          FlippingBits
02/16/2020  09:09 AM    <DIR>          GitHubTest
01/31/2020  02:57 PM    <DIR>          guava-mini
01/31/2020  10:44 AM    <DIR>          HelloMaven
02/17/2020  10:43 AM         2,339,925 ij.jar
02/17/2020  10:52 AM    <DIR>          ImageJ
02/04/2020  02:04 PM    <DIR>          JumpingOnTheClouds
01/23/2020  08:05 AM    <DIR>          LarrysArray
12/03/2019  04:26 PM    <DIR>          MatrixLayerRotation
02/03/2020  10:42 AM    <DIR>          MiniMaxSum
01/25/2020  08:07 AM    <DIR>          PriorityQ
01/30/2020  01:42 PM    <DIR>          rtree2
02/03/2020  01:44 PM    <DIR>          SansaXOR
01/19/2020  09:16 AM    <DIR>          StacksReverseOrder
02/19/2020  08:02 AM    <DIR>          SumOfTwo     <===
01/04/2020  07:51 AM    <DIR>          TheGridSearch
12/11/2019  10:09 AM    <DIR>          ToTrieOrNotToTrie

# **** change directory ****
C:\Users\John\workspace7>cd SumOfTwo

# **** check the contents of this folder ****
C:\Users\John\workspace7\SumOfTwo>dir /A
02/19/2020  08:02 AM    <DIR>          .
02/19/2020  08:02 AM    <DIR>          ..
02/19/2020  08:02 AM    <DIR>          .git
02/19/2020  08:02 AM               233 README.md

We move to the workspace7 folder and clone the GitHub repo we just created. The SumOfTwo folder was created. We move to the new folder and list its contents. As expected we only have the README.md file which will be updated when done with this post.

1 2 3
10 20 30 40
42

main <<< a: [1, 2, 3]
main <<< b: [10, 20, 30, 40]
main <<< v: 42
main <<< true
main <<< true


0 0 -5 30212
-10 40 -3 9
-8

main <<< a: [0, 0, -5, 30212]
main <<< b: [-10, 40, -3, 9]
main <<< v: -8
main <<< true
main <<< true

The video suggests two test cases. It seems appropriate due to the complexity of the problem.

We have the contents of the two integer arrays in two separate lines. The third line contains the value. Note that each pass has two results which should match. This will become clear as we look at the actual code.

 /**
   * Test scaffolding.
   */
  public static void main(String[] args) {

    // **** open scanner ****
    Scanner sc = new Scanner(System.in);

    // **** read the contents for array a ****
    String s = sc.nextLine();
    String[] ss = s.split(" ");
    int[] a = new int[ss.length];
    for (int i = 0; i < a.length; i++) {
      a[i] = Integer.parseInt(ss[i]);
    }

    // ???? ????
    System.out.println("main <<< a: " + Arrays.toString(a));

    // **** read the contents for array b ****
    s = sc.nextLine();
    ss = s.split(" ");
    int[] b = new int[ss.length];
    for (int i = 0; i < b.length; i++) {
      b[i] = Integer.parseInt(ss[i]);
    }

    // ???? ????
    System.out.println("main <<< b: " + Arrays.toString(b));

    // **** read the value ****
    int v = sc.nextInt();

    // ???? ????
    System.out.println("main <<< v: " + v);

    // **** ****
    System.out.println("main <<< " + sumOfTwo(a, b, v));
    System.out.println("main <<< " + sumOfTwo1(a, b, v));

    // **** close scanner ****
    sc.close();
  }

We start by opening a scanner. We read and display the contents of the first array. We repeat for the contents of the second array. Then we read the value. We are ready to process the input. Note that I pass the same arguments to two implementations of the solution. As we will see the first one is a brute force approach. The second uses an additional data structure resulting in a much better and efficient algorithm.

  /**
   * Determine wheater there is a pair of numbers, where one number is taken from
   * a and the other from b, that can be added together to get a sum of v. Return
   * true if such a pair exists, otherwise return false.
   * 
   * Brure force approach. O(n^2).
   */
  static boolean sumOfTwo(int[] a, int[] b, int v) {

    // **** traverse both arrays ****
    for (int i = 0; i < a.length; i++) {
      for (int j = 0; j < b.length; j++) {
        if (a[i] + b[j] == v) {
          return true;
        }
      }
    }

    // **** value not found ****
    return false;
  }

We have a double loop. The outer one traverses the values of the first array. For each value the inner loop traverses the second array. If a match is found we return true. If we get through the loops and a match is not found we return false. This is O(n^2) complexity.

Let’s now try a faster algorithm, Note that we skipped any sorting and went directly to the best approach.


  /**
   * Determine wheater there is a pair of numbers, where one number is taken from
   * a and the other from b, that can be added together to get a sum of v. Return
   * true if such a pair exists, otherwise return false.
   * 
   * O(n) approach.
   */
  static boolean sumOfTwo1(int[] a, int[] b, int v) {

    // **** values of array b ****
    HashSet<Integer> hs = new HashSet<Integer>();

    // **** add the values of array b into the hash set (O(n))****
    for (int i = 0; i < b.length; i++) {
      hs.add(b[i]);
    }

    // *** traverse array a looking for a value in the hash set (O(n)) ****
    for (int i = 0; i < a.length; i++) {

      // **** desired value ****
      int x = v - a[i];

      // **** look for it in the hash set ****
      if (hs.contains(x)) {
        return true;
      }
    }

    // **** value not found ****
    return false;
  }

In this approach we use a hash set. We put in all the values of the second array. This is O(n) complexity.

We then traverse the first array and compute the value we need to get from the second array to have a match. If we find the value in the hash map, we return true. This loop also runs in O(n) time.

If a pair is not found we return false as required.

# **** check the status ****
C:\Users\John\workspace7\SumOfTwo>git status
On branch master
Your branch is up to date with 'origin/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)

        Solution.java

nothing added to commit but untracked files present (use "git add" to track)

# **** add Solution.java so we can commit it ****
C:\Users\John\workspace7\SumOfTwo>git add Solution.java

# **** check status ****
C:\Users\John\workspace7\SumOfTwo>git status
On branch master
Your branch is up to date with 'origin/master'.

Changes to be committed:
  (use "git reset HEAD <file>..." to unstage)

        new file:   Solution.java

# **** commit the changes ****
C:\Users\John\workspace7\SumOfTwo>git commit -m "brute force and hash set approaches"
[master bf7a801] brute force and hash set approaches
 1 file changed, 113 insertions(+)
 create mode 100644 Solution.java

# **** check status ****
C:\Users\John\workspace7\SumOfTwo>git status
On branch master
Your branch is ahead of 'origin/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean

# **** push the changes to the GitHub repo ****
C:\Users\John\workspace7\SumOfTwo>git push origin master
Enumerating objects: 4, done.
Counting objects: 100% (4/4), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3/3), done.
Writing objects: 100% (3/3), 1.08 KiB | 555.00 KiB/s, done.
Total 3 (delta 0), reused 0 (delta 0)
To https://github.com/JohnCanessa/SumOfTwo.git
   ed19c10..bf7a801  master -> master

# **** check status ****
C:\Users\John\workspace7\SumOfTwo>git status
On branch master
Your branch is up to date with 'origin/master'.

nothing to commit, working tree clean

# **** check the log ****
C:\Users\John\workspace7\SumOfTwo>git log master
commit bf7a80196f0d915ffdf9138672db804a8d2e7bfc (HEAD -> master, origin/master, origin/HEAD)
Author: JohnCanessa <john.canessa@gmail.com>
Date:   Wed Feb 19 09:01:23 2020 -0600

    brute force and hash set approaches

commit ed19c108b6d1e2bb2c792ab4891102ef0a9db34e
Author: John Canessa <8019504+JohnCanessa@users.noreply.github.com>
Date:   Wed Feb 19 08:01:51 2020 -0600

    Create README.md

We are ready to push our code to the GitHub repo. As usual, I like to check status between commands. We need to add the java file in preparation for the commit operation. We commit with a message indicating what was changed. We are ready to push the changes to master. When done we check the log file. Our history matches what we did.

The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address:  john.canessa@gmail.com. I will reply as soon as possible.

Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!

All (297) | Administrator (1) | Subscriber (296)

John

Twitter:  @john_canessa

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.