306 Subscribers
Today I decided to solve a HackerRank problem. Randomly I selected The Full Counting Sort. If interested read the requirements. I read the requirements and decided to give it a try.
Based on my experience with this problem you might want to follow my advice. Work on the algorithm and make sure it passes the two sample test cases. Once you are done, submit your solution. If you have a valid approach then chances are that your solution will fail test #5, it will time out. I generated up to three different versions of the countSort() function. I could not get past test #5 because it would time out. I spent time reading the discussions and they did not make much sense. I even bought test #5 for some hackos. By the way, the test includes 1,000,000 strings which I could not download no matter how many times I tried. With this problem do not purchase test #5. You will not be able to run it.
I will just go over the steps I took to get to my solution which was accepted.
I went to GitHub, created a repository and named it JohnCanessa/FullCountingSort. Once created, I wrote a one liner README.md file which I have to update with the link to this post after it has been published.
# **** initial contents of my workspace **** C:\Users\John\workspace7>dir 02/19/2020 08:02 AM <DIR> . 02/19/2020 08:02 AM <DIR> .. 01/30/2020 01:32 PM <DIR> .metadata 01/31/2020 09:02 AM <DIR> .vscode 01/05/2020 07:15 AM <DIR> 3DSurfaceArea 01/06/2020 11:52 AM <DIR> AlmostSorted 02/06/2020 07:16 AM <DIR> AndProduct 01/24/2020 12:49 PM <DIR> AStarSearch 12/05/2019 01:23 PM <DIR> BiggerIsGreater 02/14/2020 03:00 PM <DIR> Cipher 12/22/2019 08:35 AM <DIR> DeterminingDNAHealth 01/08/2020 07:07 AM <DIR> EmasSupercomputer 02/12/2020 02:46 PM <DIR> FirstDuplicate 02/03/2020 09:07 AM <DIR> FlippingBits 02/16/2020 09:09 AM <DIR> GitHubTest 01/31/2020 02:57 PM <DIR> guava-mini 01/31/2020 10:44 AM <DIR> HelloMaven 02/17/2020 10:43 AM 2,339,925 ij.jar 02/17/2020 10:52 AM <DIR> ImageJ 02/04/2020 02:04 PM <DIR> JumpingOnTheClouds 01/23/2020 08:05 AM <DIR> LarrysArray 12/03/2019 04:26 PM <DIR> MatrixLayerRotation 02/03/2020 10:42 AM <DIR> MiniMaxSum 01/25/2020 08:07 AM <DIR> PriorityQ 01/30/2020 01:42 PM <DIR> rtree2 02/03/2020 01:44 PM <DIR> SansaXOR 01/19/2020 09:16 AM <DIR> StacksReverseOrder 02/19/2020 08:08 AM <DIR> SumOfTwo 01/04/2020 07:51 AM <DIR> TheGridSearch 12/11/2019 10:09 AM <DIR> ToTrieOrNotToTrie # **** after creating the repo in GitHub, clone it to my machine **** C:\Users\John\workspace7>git clone https://github.com/JohnCanessa/FullCountingSort.git Cloning into 'FullCountingSort'... remote: Enumerating objects: 3, done. remote: Counting objects: 100% (3/3), done. remote: Compressing objects: 100% (2/2), done. remote: Total 3 (delta 0), reused 0 (delta 0), pack-reused 0 Unpacking objects: 100% (3/3), done. # **** workspace folder after cloning **** C:\Users\John\workspace7>dir 02/20/2020 08:08 AM <DIR> . 02/20/2020 08:08 AM <DIR> .. 01/30/2020 01:32 PM <DIR> .metadata 01/31/2020 09:02 AM <DIR> .vscode 01/05/2020 07:15 AM <DIR> 3DSurfaceArea 01/06/2020 11:52 AM <DIR> AlmostSorted 02/06/2020 07:16 AM <DIR> AndProduct 01/24/2020 12:49 PM <DIR> AStarSearch 12/05/2019 01:23 PM <DIR> BiggerIsGreater 02/14/2020 03:00 PM <DIR> Cipher 12/22/2019 08:35 AM <DIR> DeterminingDNAHealth 01/08/2020 07:07 AM <DIR> EmasSupercomputer 02/12/2020 02:46 PM <DIR> FirstDuplicate 02/03/2020 09:07 AM <DIR> FlippingBits 02/20/2020 08:08 AM <DIR> FullCountingSort <=== 02/16/2020 09:09 AM <DIR> GitHubTest 01/31/2020 02:57 PM <DIR> guava-mini 01/31/2020 10:44 AM <DIR> HelloMaven 02/17/2020 10:43 AM 2,339,925 ij.jar 02/17/2020 10:52 AM <DIR> ImageJ 02/04/2020 02:04 PM <DIR> JumpingOnTheClouds 01/23/2020 08:05 AM <DIR> LarrysArray 12/03/2019 04:26 PM <DIR> MatrixLayerRotation 02/03/2020 10:42 AM <DIR> MiniMaxSum 01/25/2020 08:07 AM <DIR> PriorityQ 01/30/2020 01:42 PM <DIR> rtree2 02/03/2020 01:44 PM <DIR> SansaXOR 01/19/2020 09:16 AM <DIR> StacksReverseOrder 02/19/2020 08:08 AM <DIR> SumOfTwo 01/04/2020 07:51 AM <DIR> TheGridSearch 12/11/2019 10:09 AM <DIR> ToTrieOrNotToTrie # **** move to the new directory **** C:\Users\John\workspace7>cd FullCountingSort # **** dontents of the new folder **** C:\Users\John\workspace7\FullCountingSort>dir /A 02/20/2020 08:08 AM <DIR> . 02/20/2020 08:08 AM <DIR> .. 02/20/2020 08:08 AM <DIR> .git 02/20/2020 08:08 AM 147 README.md <=== # **** check on our remotes **** C:\Users\John\workspace7\FullCountingSort>git remote -v origin https://github.com/JohnCanessa/FullCountingSort.git (fetch) origin https://github.com/JohnCanessa/FullCountingSort.git (push)
I opened a command prompt in my Windows 10 machine and got to my workspace7 directory. I will put the code in a sub directory which will be created when I clone my new repository in GitHub.
I cloned my repository and looked at the contents of the workspace7 folder. We now have a new FullCountingSort folder. The new folder contains the README.md file that I created. As you can see we can fetch and push to the remote GitHub repo.
4 0 a 1 b 0 c 1 d - c - d 20 0 ab 6 cd 0 ef 6 gh 4 ij 0 ab 6 cd 0 ef 6 gh 0 ij 4 that 3 be 0 to 1 be 5 question 1 or 2 not 4 is 2 to 4 the - - - - - to be or not to be - that is the question - - - - 10 1 e 2 a 1 b 3 a 4 f 1 f 2 a 1 e 1 b 1 c - - f e b c - a - - 20 88 pe 36 uw 57 ng 58 lt 29 gv 44 be 22 bz 67 yd 20 xm 86 gy 99 dd 35 eg 19 qc 65 jm 6 po 76 xv 72 qm 24 vn 98 mr 11 ab po ab qc - - vn - eg - - - - jm - qm xv - - mr dd
The sample test cases contain a number of entries followed by the actual entries. Each entry consists of a number in the range 0 <= n < 100 followed by a string of length 1<= s <= 10. The caveat is that the first half of the strings we input need to be replaced by the string “-“.
0 -> ab ef ab ef ij to 1 -> be or 2 -> not to 3 -> be 4 -> ij that is the 5 -> question 6 -> cd gh cd gh - - - - - to be or not to be - that is the question - - - - 0 0 0 0 0 0 1 1 2 2 3 4 4 4 4 5 6 6 6 6
The second example with 20 strings is well explained in the requirements page at HackerRank. I believe the representation is a good match of the output of the expected solution.
I decided to implement the solution using the Java 8 programming language with the Visual Studio Code IDE.
The following recommendations came from the discussions which at the time I read them did not make much sense to me.
- StringBuilder, don’t underestimate how important it is when working with Strings.
- I/O is slow, minimize where possible.
- Really think about the aspects that contribute the most to overhead/performance and not try small things which take a lot of time and result in small / negligible gains.
- Scanner is slower than BufferedReader.
I could understand the comments about StringBuilder but all the input was done by the provided test scaffolding. Our task is to implement the countSort() function which does not have to perform input. It does have to output the results.
/** * Original test scaffolding. */ public static void main(String[] args) throws IOException { // **** **** BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in)); // **** **** int n = Integer.parseInt(bufferedReader.readLine().trim()); // **** **** List<List<String>> arr = new ArrayList<>(); // **** **** IntStream.range(0, n).forEach(i -> { try { arr.add(Stream.of(bufferedReader.readLine().replaceAll("\\s+$", "").split(" ")).collect(toList())); } catch (IOException ex) { throw new RuntimeException(ex); } }); // **** **** countSort(arr); // **** **** bufferedReader.close(); }
As usual I copy the test scaffolding provided by HackerRank and edit it adding comments in order to understand how the input is collected and presented to the function / method in which we should provide our code.
In this case the countSort() function, which is the place to implement our solution, receives an array of List<List<String>>. Each string contains a number in the range [ 0 : 99 ] followed a string with a length in the range [ 1 : 10 ]. When our function is called we receive the input data already packaged, at least that is the idea.
The problem is that no matter what I did, test #5 would time out. As I mentioned before, I read the comments in the discussion and it seemed that the only way to get past that test case was to ignore the test scaffolding and implement the solution there. In my humble opinion such approach is unacceptable. The implication is that you can no longer trust a test scaffolding in future problems.
/** * Complete the countSort function below. */ static void countSort(List<List<String>> arr) { // **** instantiate array of string builders **** StringBuilder[] sb = new StringBuilder[100]; for (int i = 0; i < 100; i++) { sb[i] = new StringBuilder(); } // **** to speed up String to Integer conversion **** HashMap<String, Integer> hm = new HashMap<String, Integer>(); for (int i = 0; i <= 99; i++) { String s = Integer.toString(i); hm.put(s, i); } // **** **** int index = 0; // **** set first half of the strings to "- " **** for (int i = 0; i < arr.size() / 2; i++) { index = hm.get(arr.get(i).get(0)); sb[index].append("- "); } // **** append strings with same index to respective string builder **** for (int i = arr.size() / 2; i < arr.size(); i++) { index = hm.get(arr.get(i).get(0)); sb[index].append(arr.get(i).get(1) + " "); } // **** build and output the result string **** StringBuilder result = new StringBuilder(); for (int i = 0; i < 100; i++) { result.append(sb[i].toString()); } System.out.println(result.toString()); }
The implementation of the countSort function passes all the initial test cases. Once you submit it only test #5 times out.
What I did is to edit the test scaffolding incorporating the code that took me hours to implement.
/** * Test scaffolding. UNACCEPTABLE having to replace the test scaffolding. */ public static void main(String[] args) throws IOException { // **** open buffered reader **** BufferedReader bufferedReader = new BufferedReader(new InputStreamReader(System.in)); // **** get the number of elements **** int n = Integer.parseInt(bufferedReader.readLine().trim()); // **** instantiate array of string builders **** StringBuilder[] sb = new StringBuilder[100]; for (int i = 0; i < 100; i++) { sb[i] = new StringBuilder(); } // **** set first half of the strings to "- " **** for (int i = 0; i < n / 2; i++) { // **** read the next line **** String s = bufferedReader.readLine(); // **** **** String[] split = s.split(" "); // **** **** int index = Integer.parseInt(split[0]); // **** **** sb[index].append("- "); } // **** append strings with same index to respective string builder **** for (int i = n / 2; i < n; i++) { // **** read the next line **** String s = bufferedReader.readLine(); // **** **** String[] split = s.split(" "); // **** **** int index = Integer.parseInt(split[0]); sb[index].append(split[1] + " "); } // **** build and output the result string **** StringBuilder result = new StringBuilder(); for (int i = 0; i < 100; i++) { result.append(sb[i].toString()); } System.out.println(result.toString()); // **** close buffered reader **** bufferedReader.close(); }
First we open a buffered reader to read the count of strings and the actual strings. We then instantiate an array of string builders. The array will be used to build the output strings associated with each number.
The next loop takes care of replacing the first half of the input strings with the minus (-) character. Given that we need to display the strings separated by a space, we use the “- “ string instead.
The next loop which is similar to the first takes into account the actual string. Note that in both loops we immediately append the string to the proper string builder.
At this point we are ready to print the results. We could have used a loop to print the string builders as strings, but for I/O reasons we concatenated the individual string builders into a larger one and then printed it in a single operation.
Guess what, this implementation passed all tests on the first submission!!!
# **** check status **** C:\Users\John\workspace7\FullCountingSort>git status On branch master Your branch is up to date with 'origin/master'. Untracked files: (use "git add <file>..." to include in what will be committed) Solution.java <=== nothing added to commit but untracked files present (use "git add" to track) # **** add to next commit **** C:\Users\John\workspace7\FullCountingSort>git add Solution.java # **** check status **** C:\Users\John\workspace7\FullCountingSort>git status On branch master Your branch is up to date with 'origin/master'. Changes to be committed: (use "git reset HEAD <file>..." to unstage) new file: Solution.java # **** commit current software **** C:\Users\John\workspace7\FullCountingSort>git commit -m "first version of a completed solution" [master 8797ee4] first version of a completed solution 1 file changed, 145 insertions(+) create mode 100644 Solution.java # **** push origin to master **** C:\Users\John\workspace7\FullCountingSort>git push origin master Enumerating objects: 4, done. Counting objects: 100% (4/4), done. Delta compression using up to 8 threads Compressing objects: 100% (3/3), done. Writing objects: 100% (3/3), 1.25 KiB | 1.25 MiB/s, done. Total 3 (delta 0), reused 0 (delta 0) To https://github.com/JohnCanessa/FullCountingSort.git 8dd3d15..8797ee4 master -> master # **** check the master (no branches at this time) **** C:\Users\John\workspace7\FullCountingSort>git log -a commit 8797ee4102d8f813c29b2ca891b91c2436ae7fd5 (HEAD -> master, origin/master, origin/HEAD) Author: JohnCanessa <john.canessa@gmail.com> Date: Fri Feb 21 14:05:01 2020 -0600 first version of a completed solution commit 8dd3d1544360eb500b7c790c687831aaf60d6312 Author: John Canessa <8019504+JohnCanessa@users.noreply.github.com> Date: Thu Feb 20 08:05:06 2020 -0600 Create README.md
We are done with the code for the solution. We need to update the GitHub repository.
As usual, we get status. We need to add Solution.java to the list of files to commit. We add Solution.java and check status. Looks like we are ready to commit.
We then commit with a message briefly describing what we committed. We then push origin to master. At this point our version and the one in our GitHub repo are in sync.
Finally we check the log file on master. We have two entries. The first when we created the README.md file and the second when we pushed the solution to the HackerRank problem.
I went to my GitHub repo and verified that the Solution.java file was in the repo and that it matched what we had in our computer.
OK, we are now done with this problem. I will make an entry in the discussions to mention that solution for problems should not be dependent with the test scaffolding. All solutions should be limited to functions and methods created outside of main().
The entire code for this project can be found in my GitHub repository.
If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a private message using the following address: john.canessa@gmail.com. I will reply as soon as possible.
Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!
John
Twitter: @john_canessa
Hi John,
Why the number in a given arr is assumed never greater than 99? I have seen similar approach used for other problems but don’t know what’s the logic behind. You never know how many elements will be in a given arr and the number associated with each element/string right?
Thanks
Hi Lin,
Thanks for the message.
That are the constraints for the problem.
I assume there are other approaches that might be hindered by the number of entries.
This requirement would allow one to use an array that has a fixed length.
Given that the requirement is present we may use it in our advantage.
Regards;
John
When back to read the problem agian. Found the logic there, 0<=x<100.
Hi Lin,
Thanks for the message.
The problem requirements specifies such constraint among others.
Regards;
John