I was going to have a different introduction (the regular chit chat) to this post, but based on the article “Malevolent Machine Learning” by Chris Edwards on Communications of the ACM 12/2019 Volume 62 Number 12 I changed my mind. Allow me to explain.
A couple years ago I took a few courses from Coursera. The general topic was big data and machine learning. The course of interest is “Neural Networks and Deep Learning” by Andrew Ng deeplearning.ai. It was very interesting class and I highly recommend it.
I tried locating my notes on those courses, but not sure what I did with them. Found my notes. I hoped to have stored the transcriptions to verify what I recall about a specific class. I do not have the transcriptionsTowards the end of the article, Chris Edwards describes the issue with DNN (Deep Neural Networks) and relates it to the “softmax”. There is also a reference to Nicolas Papernot, a research scientist at Google Brain stating that the softmax is not a suitable model for making predictions. If interested in this topic I suggest you read the article and come to your own conclusions.
I will send a message to Andrew Ng and hope to get a response. If I do, I will request permission to make it public in this blog.
OK, enough of ML models. Let’s get down to the purpose of this post. I visited the HackerRank web site and picked up the next challenge with MEDIUM difficulty. To be honest it took me more time than the last DIFFICULT challenge. If you are interested, take a look at the requirements for the Bigger is Better challenge at HackerRank.
The idea is to transform a word so we get a higher value alphabetical order. The caveat is that it should produce the smallest possible value. For example, for w = “abcd” the expected string would be “abdc”. That makes sense because we could return “dcba” by reordering the value in descending order. We would meet the first criteria described in the problem, but would fail for the second one.
I understand ASCII characters and their representation. But I found it easier to use small integers instead of their ASCII values. To experiment I would take w = “abcd” and map ‘a’ to 1, ‘b’ to 2, ‘c’ to 3 and ‘d’ to 4. Perhaps this string is not a good example because the letters happen to be in ascending order. The representation would be 1234, or 1,234 depending which is easier for you to follow.
The second criteria stops us from suggesting 4321 which is larger than 1234, but would not represent the smallest value. If we switch the lowest digits form 1234 to 1243, and map them back to characters, we would have “abdc” which is the expected result.
I thought of tackling the problem using a rolling window. Not sure if that idea lasted more than a few minutes.
I drifted through different approaches. Some seem to work with most of the examples (total of 5 + 6 = 11 words) but failed on others. Have to admit that I did not consult the discussions. I just used the examples.
I decided to use Java as a programming language to solve this problem. I also decided to use the Visual Studio Code IDE. When compared to Microsoft Visual Studio Enterprise 2019 or Eclipse, VS Code falls somewhat short. That said, to write small amounts of code, it is good enough!
As usual, I write my own test scaffolding. I like to use the essence of TDD. I do not spend time writing tests. I just work from the test scaffolding up to the solution. I find it easier to add statements to print variables as I go. In this case I did but towards the end I removed them.
From the HackerRank web site:
“Given a word, create a new word by swapping some or all of its characters.
This new word must meet two criteria:
It must be greater than the original word
It must be the smallest word that meets the first condition”
5 ab bb hefg dhck dkhc ba no answer hegf dhkc hcdk
In this set we have 5 strings. It is clear that the first string should produce “ba”. The second is not possible so the program requires one to print the string “no answer”. The fourth example is pretty much the same as we discussed with the string “abcd”. I have to admit that the descriptions for the 11 examples pretty much none exist. Just take a look at it on the HackerRank web page.
6 lmno dcba dcbb abdc abcd fedcbabcd lmon no answer no answer acbd abdc fedcbabdc
Somewhat more sophisticated strings. Note that the last string is somewhat longer than the rest.
/* * Test scaffolding. */ public static void main(String[] args) { // **** open a scanner **** Scanner sc = new Scanner(System.in); // **** read the number of test casses **** int t = sc.nextInt(); sc.nextLine(); // **** loop reading and processing words **** for (int i = 0; i < t; i++) { // **** read the next word **** String w = sc.next(); // ***** process the word **** System.out.println(biggerIsGreater(w)); } // **** close scanner **** sc.close(); }
For the test scaffolding we need to read the number of words followed by the actual words. The strings should be in lowercase. Once we have the number of words we enter a loop in which we read a word and call the function that process and display the associated new word. As usual, when done we close the scanner. In this case there is no need because we immediately exit the program.
/* * Complete the biggerIsGreater function below. */ static String biggerIsGreater(String w) { // **** check for single character **** if (w.length() <= 1) { return "no answer"; } // **** create a character bucket **** TreeMap<Character, Integer> bucket = new TreeMap<Character, Integer>(); // **** for ease of use **** StringBuilder sb = new StringBuilder(w); // **** go left looking for smaller character **** while (sb.length() > 1) { // **** move last character in string into the bucket **** lastCharToBucket(sb, bucket); // **** check if larger character in bucket **** Map.Entry<Character, Integer> entry = bucket.higherEntry(sb.charAt(sb.length() - 1)); if (entry != null) { break; } } // **** **** if (bucket.isEmpty()) return "no answer"; // **** look in bucket for the smallest character larger than this one **** Map.Entry<Character, Integer> entry = bucket.higherEntry(sb.charAt(sb.length() - 1)); // **** no character found **** if (entry == null) return "no answer"; // ***** for ease of use **** char ch = entry.getKey(); // **** move last character in string into the bucket **** lastCharToBucket(sb, bucket); // **** append character to string and remove from bucket **** appendCharRemoveFromBucket(ch, sb, bucket); // **** loop appending characters (in ascending order) from bucket **** while (!bucket.isEmpty()) { // **** get smallest character from bucket **** ch = bucket.firstKey(); // **** append character to string and remove from bucket **** appendCharRemoveFromBucket(ch, sb, bucket); } // **** return edited string *** return sb.toString(); }
We start by checking if the word contains a single character. If so we are done because such words do not meet requirements. I decided to traverse the word from right to left. Each time I remove the character and put it in a bucket. If the decision is made to stop traversing the string, then we pick up the characters from the bucket in ascending order. The reason is the second criteria. If the bucket contains the characters ‘d’, ‘a’, ‘c’ and ‘b’, we would append the string “abcd” which has a lower lexicographical order than “dcba”.
Note that I chose the TreeMap class because when we look for the next character to append in our string, we want to choose the one with lower lexicographical value. The map allows us to process duplicate characters i.e., w = “bb”.
The main loop is used to traverse the word (in a StringBuilder) from right to left. We pull the last character from the string and put it into the bucket. We check if we should continue removing the next character, or we are done and we need to start picking up characters from the bucket and appending them to the string. When all is done we return the new word.
/* * Remove the last character in the string and put it into the bucket. */ static void lastCharToBucket(StringBuilder sb, TreeMap<Character, Integer> bucket) { // **** put last character into bucket **** if (bucket.containsKey(sb.charAt(sb.length() - 1))) { int count = bucket.get(sb.charAt(sb.length() - 1)); count++; bucket.put(sb.charAt(sb.length() - 1), count); } else { bucket.put(sb.charAt(sb.length() - 1), 1); } // **** remove last character from string **** sb.deleteCharAt(sb.length() - 1); }
This function is used to remove the last character from the string and put it in the bucket. Remember that some characters may repeat so we need to keep track of their count.
/* * Append char to string and remove from bucket (if needed). */ static void appendCharRemoveFromBucket(char ch, StringBuilder sb, TreeMap<Character, Integer> bucket) { // **** append character to string **** sb.append(ch); // **** remove character from bucket or decrement count **** int count = bucket.get(ch); if (count == 1) bucket.remove(ch); else bucket.put(ch, --count); }
This last function is used to append the specified character to the word and then remove it from the bucket. We need to handle the case when characters in the word may repeat multiple times.
The entire code for this project can be found in my GitHub repository.
If you have comments or questions regarding this, or any other post in this blog, or if you would like for me to serve of assistance with any phase in the SDLC (Software Development Life Cycle) of a project associated with a product or service, please do not hesitate and leave me a note below. If you prefer, send me a message using the following address: john.canessa@gmail.com. All messages will remain private.
Keep on reading and experimenting. It is the best way to learn, refresh your knowledge and enhance your developer toolset!
John
Follow me on Twitter: @john_canessa