Have been somewhat busy in the past couple weeks. To add to it, last weekend I started an on-line “Spring” course. Classes run for three hours Saturdays and Sundays for one month. Need to get the assignments done in the next couple days.
A couple weeks ago I was talking with a manager about prefix searches. At the time I mentioned that I always take a quick look on the Internet to get a general idea on what to do. I did so and Tries came up as an initial candidate. I wrote a blog about it. I tend to be quite passionate with my work and computer science in general. I went and spent some time solving a challenge All Domains -> Data Structures- > Trie -> Contacts on HackerRank (https://www.hackerrank.com/challenges/contacts).
In a nutshell the challenge calls for solutions using a Trie. After an hour or so I had a basic solution which passed the sample test. The issue was that the challenge has fourteen test cases. To help with ideas you can check for free the Discussions tab. For a fee (in hackos) you can purchase one of the tests. So far I have not purchase a test for any challenge. In my opinion that is how one learns.
It seems that the issue was not on the Trie implementation but on the performance of the “find partial” query. I coded my solutions using Java. In my first approach I used a HashMap to implement the Trie. It passed two or three test cases. My implementation follows:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.Map.Entry;
class TrieNode {
final int initialCapacity = 26;
final float loadFactor = 1.0f;
// **** fields ****
char c;
HashMap<Character, TrieNode> children = new HashMap<Character, TrieNode>(initialCapacity, loadFactor);
boolean isLeaf;
// **** constructors ****
public TrieNode() {
this.c = ‘\0’;
}
public TrieNode (Character c) {
this.c = c;
}
}
class Trie {
// **** fields ****
private TrieNode root;
// **** constructor ****
public Trie() {
root = new TrieNode();
}
// **** methods ****
public void insert(String word) {
HashMap<Character, TrieNode> children = root.children;
for (int i = 0; i < word.length(); i++) {
TrieNode t = null;
char c = word.charAt(i);
if (children.containsKey(c)){
t = children.get(c);
} else {
t = new TrieNode(c);
children.put(c, t);
}
children = t.children;
// **** set leaf node ****
if (i == (word.length() – 1))
t.isLeaf = true;
}
}
public void find(String partial) {
int count = 0;
TrieNode t = searchNode(partial);
if (t == null) {
System.out.println(count);
} else {
count = leafNodes(t);
System.out.println(count);
}
}
private int leafNodes(TrieNode root) {
int count = 0;
if (root.isLeaf) {
count++;
}
Iterator<Entry<Character, TrieNode>> it = root.children.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<Character, TrieNode> pair = (Entry<Character, TrieNode>)it.next();
TrieNode t = pair.getValue();
count += leafNodes(t);
}
return count;
}
private TrieNode searchNode(String str){
HashMap<Character, TrieNode> children = root.children;
TrieNode t = this.root;
for (int i = 0; i < str.length(); i++) {
char c = str.charAt(i);
if (children.containsKey(c)){
t = children.get(c);
children = t.children;
} else {
return null;
}
}
return t;
}
public void traverse() {
traverseTrie(this.root, “”);
}
private void traverseTrie(TrieNode root, String s){
if (root.isLeaf) {
System.out.println(“<<< s ==>” + s + “<==”);
}
Iterator<Entry<Character, TrieNode>> it = root.children.entrySet().iterator();
while (it.hasNext()) {
Map.Entry<Character, TrieNode> pair = (Entry<Character, TrieNode>)it.next();
Character c = (Character)pair.getKey();
traverseTrie((TrieNode)pair.getValue(), s + c.toString());
}
}
}
public class Solution {
public static void main(String[] args) {
String buffer = “”;
int N = 0;
String op = “”;
/// **** open scanner ****
// Scanner sc = new Scanner(System.in);
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
// **** read the number of operations to perform on the trie ****
// N = sc.nextInt();
try {
buffer = in.readLine();
} catch (IOException e1) {
e1.printStackTrace();
}
N = Integer.parseInt(buffer);
// System.out.println(“<<< N: ” + N);
// **** create the root for the trie ****
Trie root = new Trie();
// **** loop reading and performing the specified operations ****
for (int n = 0; n < N; n++) {
// String op = sc.next();
try {
op = in.readLine();
} catch (IOException e) {
e.printStackTrace();
}
// System.out.println(“<<< op ==>” + op + “<==”);
String namePartial[] = op.split(” “);
// System.out.println(“<<< namePartial[0] ==>” + namePartial[0] + “<==”);
// System.out.println(“<<< namePartial[1] ==>” + namePartial[1] + “<==”);
switch (namePartial[0]) {
case “add”:
// String name = sc.next();
// root.insert(name);
root.insert(namePartial[1]);
break;
case “find”:
// String partial = sc.next();
// root.find(partial);
root.find(namePartial[1]);
break;
}
}
// **** close scanner ****
// sc.close();
try {
in.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
I decided to switch from Scanner to BufferedReader in hopes that the timeouts on some cases where due to the input throughput. Instead of replacing the Scanner I just commented out. I believe that addressed one or two cases at most.
On my third attempt I decided to switch from HashMap to Array. My code follows:
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
class TrieNode {
char letter;
TrieNode[] links;
boolean fullWord;
int count;
TrieNode(char letter) {
this.letter = letter;
this.links = new TrieNode[26];
this.fullWord = false;
this.count = 0;
}
}
public class Solution {
static TrieNode createTree()
{
return(new TrieNode(‘\0’));
}
static void insertWord(TrieNode root, String word)
{
int offset = 97;
int len = word.length();
char[] letters = word.toCharArray();
TrieNode curNode = root;
for (int i = 0; i < len; i++) {
if (curNode.links[letters[i] – offset] == null) {
curNode.links[letters[i] – offset] = new TrieNode(letters[i]);
}
curNode.count++;
curNode = curNode.links[letters[i] – offset];
}
curNode.fullWord = true;
curNode.count++;
}
static TrieNode find(TrieNode root, String word) {
char[] letters = word.toCharArray();
int len = letters.length;
int offset = 97;
TrieNode curNode = root;
int i;
for (i = 0; i < len; i++) {
if (curNode == null) {
return null;
}
curNode = curNode.links[letters[i] – offset];
}
if ((i == len) && (curNode == null)) {
return null;
}
if ((curNode != null) && !curNode.fullWord) {
return null;
}
return curNode;
}
static TrieNode findStartNode(TrieNode root, String partial) {
char[] letters = partial.toCharArray();
int len = letters.length;
int offset = 97;
TrieNode curNode = root;
for (int i = 0; i < len; i++) {
if (curNode == null) {
return null;
}
curNode = curNode.links[letters[i] – offset];
}
return curNode;
}
static int countContacts(TrieNode root) {
int count = 0;
if (root.fullWord) {
count++;
}
for (int i = 0; i < 26; i++) {
TrieNode curNode = root.links[i];
if (curNode != null) {
count += countContacts(curNode);
}
}
return count;
}
public static void main(String[] args) {
TrieNode t = null;
int N = 0;
String line = “”;
String op = “”;
String name = “”;
BufferedReader in = new BufferedReader(new InputStreamReader(System.in));
try {
N = Integer.parseInt(in.readLine());
} catch (NumberFormatException | IOException e) {
e.printStackTrace();
}
TrieNode root = createTree();
for (int n = 0; n < N; n++) {
try {
line = in.readLine();
} catch (IOException e) {
e.printStackTrace();
}
String[] tokens = line.split(” “);
op = tokens[0];
name = tokens[1];
if (op.equals(“add”)) {
insertWord(root, name);
} else {
t = findStartNode(root, name);
if (t != null) {
System.out.println(t.count);
} else {
System.out.println(0);
}
}
}
}
}
As you can tell, there is a new field (count) in the Trie class. Disregard it for the moment. The implementation took care of nine out of fourteen cases. I only had five to clear to complete the challenge (easy said than done). I spent a couple more hours looking for ways to simplify the code. No joy.
I decided to read comments one more time to get inspired. The issue seemed to be related not to the insertion but the lookup (find) operation. I tested loading a full English dictionary and all appeared to work well. The find required to look for end (leaf) nodes starting at some node in the tree. If we could just add a field to count the leaf nodes when inserting, the lookups could be reduced to O(1).
Go back and look at the definition of the TrieNode and the insertWord() method. The count field accomplishes exactly what is needed. The code passed all fourteen test cases. The countContacts() method is no longer used.
At this time, I was sure to be done experimenting with Tries. I did find a different approach which will be in the next blog entry.
If you have comments or questions, please do not hesitate to contact me.
John
john.canessa@gmail.com
I think you misspelled the word “accesible” on your website. If you want to keep errors off of your site we’ve successfully used a tool like SpellPros.com in the past for our websites. A nice customer pointed out our mistakes so I’m just paying it forward :).
Hi Rebecca,
Thanks for the message.
Tried looking for the typo on the post at no avail.
Will keep in mind your suggestion.
Regards;
John