Contacts with Trie

trieHave been somewhat busy in the past couple weeks. To add to it, last weekend I started an on-line “Spring” course. Classes run for three hours Saturdays and Sundays for one month. Need to get the assignments done in the next couple days.

A couple weeks ago I was talking with a manager about prefix searches. At the time I mentioned that I always take a quick look on the Internet to get a general idea on what to do. I did so and Tries came up as an initial candidate. I wrote a blog about it. I tend to be quite passionate with my work and computer science in general. I went and spent some time solving a challenge All Domains -> Data Structures- > Trie -> Contacts on HackerRank (https://www.hackerrank.com/challenges/contacts).

In a nutshell the challenge calls for solutions using a Trie. After hacker_rank_logoan hour or so I had a basic solution which passed the sample test. The issue was that the challenge has fourteen test cases. To help with ideas you can check for free the Discussions tab. For a fee (in hackos) you can purchase one of the tests. So far I have not purchase a test for any challenge. In my opinion that is how one learns.

how-hashmap-works-internally-in-javaIt seems that the issue was not on the Trie implementation but on the performance of the “find partial” query. I coded my solutions using Java. In my first approach I used a HashMap to implement the Trie. It passed two or three test cases.  My implementation follows:

import java.io.BufferedReader;

import java.io.IOException;

import java.io.InputStreamReader;

import java.util.HashMap;

import java.util.Iterator;

import java.util.Map;

import java.util.Map.Entry;

class TrieNode {

final int initialCapacity  = 26;

final float loadFactor            = 1.0f;

// **** fields ****

char                       c;

HashMap<Character, TrieNode>      children = new HashMap<Character, TrieNode>(initialCapacity, loadFactor);

boolean                           isLeaf;

// **** constructors ****

public TrieNode() {

this.c = ‘\0’;

}

public TrieNode (Character c) {

this.c = c;

}

}

class Trie {

// **** fields ****

private TrieNode     root;

// **** constructor ****

public Trie() {

root = new TrieNode();

}

// **** methods ****

public void insert(String word) {

HashMap<Character, TrieNode> children = root.children;

for (int i = 0; i < word.length(); i++) {

TrieNode      t = null;

char          c = word.charAt(i);

if (children.containsKey(c)){

t = children.get(c);

} else {

t = new TrieNode(c);

children.put(c, t);

}

children = t.children;

// **** set leaf node ****

if (i == (word.length() – 1))

t.isLeaf = true;

}

}

public void find(String partial) {

int                  count  = 0;

TrieNode      t             = searchNode(partial);

if (t == null) {

System.out.println(count);

} else {

count = leafNodes(t);

System.out.println(count);

}

}

private int leafNodes(TrieNode root) {

int count     = 0;

if (root.isLeaf) {

count++;

}

Iterator<Entry<Character, TrieNode>> it = root.children.entrySet().iterator();

while (it.hasNext()) {

Map.Entry<Character, TrieNode> pair = (Entry<Character, TrieNode>)it.next();

TrieNode t = pair.getValue();

count += leafNodes(t);

}

return count;

}

private TrieNode searchNode(String str){

HashMap<Character, TrieNode>      children      = root.children;

TrieNode                                        t                    = this.root;

for (int i = 0; i < str.length(); i++) {

char c = str.charAt(i);

if (children.containsKey(c)){

t = children.get(c);

children = t.children;

} else {

return null;

}

}

return t;

}

public void traverse() {

traverseTrie(this.root, “”);

}

private void traverseTrie(TrieNode root, String s){

if (root.isLeaf) {

System.out.println(“<<< s ==>” + s + “<==”);

}

Iterator<Entry<Character, TrieNode>> it = root.children.entrySet().iterator();

while (it.hasNext()) {

Map.Entry<Character, TrieNode> pair = (Entry<Character, TrieNode>)it.next();

Character c = (Character)pair.getKey();

traverseTrie((TrieNode)pair.getValue(), s + c.toString());

}

}

}

public class Solution {

public static void main(String[] args) {

String buffer = “”;

int    N             = 0;

String        op            = “”;

/// **** open scanner ****

//            Scanner sc = new Scanner(System.in);

BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

// **** read the number of operations to perform on the trie ****

//            N = sc.nextInt();

try {

buffer = in.readLine();

} catch (IOException e1) {

e1.printStackTrace();

}

N = Integer.parseInt(buffer);

//            System.out.println(“<<< N: ” + N);

// **** create the root for the trie ****

Trie root = new Trie();

// **** loop reading and performing the specified operations ****

for (int n = 0; n < N; n++) {

//                   String op = sc.next();

try {

op = in.readLine();

} catch (IOException e) {

e.printStackTrace();

}

//                   System.out.println(“<<< op ==>” + op + “<==”);

String namePartial[] = op.split(” “);

//                   System.out.println(“<<< namePartial[0] ==>” + namePartial[0] + “<==”);

//                   System.out.println(“<<< namePartial[1] ==>” + namePartial[1] + “<==”);

switch (namePartial[0]) {

case “add”:

//                                String name = sc.next();

//                                root.insert(name);

root.insert(namePartial[1]);

break;

case “find”:

//                                String partial = sc.next();

//                                root.find(partial);

root.find(namePartial[1]);

break;

}

}

// **** close scanner ****

//            sc.close();

try {

in.close();

} catch (IOException e) {

e.printStackTrace();

}

}

}

I decided to switch from Scanner to BufferedReader in hopes that the timeouts on some cases where due to the input throughput. Instead of replacing the Scanner I just commented out. I believe that addressed one or two cases at most.trie_array

On my third attempt I decided to switch from HashMap to Array. My code follows:

import java.io.BufferedReader;

import java.io.IOException;

import java.io.InputStreamReader;

class TrieNode {

char      letter;

TrieNode[]       links;

boolean   fullWord;

int              count;

TrieNode(char letter) {

this.letter  = letter;

this.links   = new TrieNode[26];

this.fullWord      = false;

this.count   = 0;

}

}

public class Solution {

static TrieNode createTree()

{

return(new TrieNode(‘\0’));

}

static void insertWord(TrieNode root, String word)

{

int offset          = 97;

int len            = word.length();

char[] letters     = word.toCharArray();

TrieNode curNode   = root;

for (int i = 0; i < len; i++) {

if (curNode.links[letters[i] – offset] == null) {

curNode.links[letters[i] – offset] = new TrieNode(letters[i]);

}

curNode.count++;

curNode = curNode.links[letters[i] – offset];

}

curNode.fullWord = true;

curNode.count++;

}

static TrieNode find(TrieNode root, String word) {

char[] letters     = word.toCharArray();

int len            = letters.length;

int offset         = 97;

TrieNode curNode   = root;

int i;

for (i = 0; i < len; i++) {

if (curNode == null) {

return null;

}

curNode = curNode.links[letters[i] – offset];

}

if ((i == len) && (curNode == null)) {

return null;

}

if ((curNode != null) && !curNode.fullWord) {

return null;

}

return curNode;

}

static TrieNode findStartNode(TrieNode root, String partial) {

char[] letters     = partial.toCharArray();

int len            = letters.length;

int offset         = 97;

TrieNode curNode   = root;

for (int i = 0; i < len; i++) {

if (curNode == null) {

return null;

}

curNode = curNode.links[letters[i] – offset];

}

return curNode;

}

static int countContacts(TrieNode root) {

int count = 0;

if (root.fullWord) {

count++;

}

for (int i = 0; i < 26; i++) {

TrieNode curNode = root.links[i];

if (curNode != null) {

count += countContacts(curNode);

}

}

return count;

}

public static void main(String[] args) {

TrieNode t         = null;

int N              = 0;

String line          = “”;

String op            = “”;

String name          = “”;

BufferedReader in = new BufferedReader(new InputStreamReader(System.in));

try {

N = Integer.parseInt(in.readLine());

} catch (NumberFormatException | IOException e) {

e.printStackTrace();

}

TrieNode root      = createTree();

for (int n = 0; n < N; n++) {

try {

line = in.readLine();

} catch (IOException e) {

e.printStackTrace();

}

String[] tokens = line.split(” “);

op                         = tokens[0];

name                 = tokens[1];

if (op.equals(“add”)) {

insertWord(root, name);

} else {

t = findStartNode(root, name);

if (t != null) {

System.out.println(t.count);

} else {

System.out.println(0);

}

}

}

}

}

As you can tell, there is a new field (count) in the Trie class. Disregard it for the moment. The implementation took care of nine out of fourteen cases. I only had five to clear to complete the challenge (easy said than done). I spent a couple more hours looking for ways to simplify the code. No joy.

I decided to read comments one more time to get inspired. The issue seemed to be related not to the insertion but the lookup (find) operation. I tested loading a full English dictionary and all appeared to work well. The find required to look for end (leaf) nodes starting at some node in the tree. If we could just add a field to count the leaf nodes when inserting, the lookups could be reduced to O(1).

Go back and look at the definition of the TrieNode and the insertWord() method. The count field accomplishes exactly what is needed. The code passed all fourteen test cases. The countContacts() method is no longer used.

more_to_comeAt this time, I was sure to be done experimenting with Tries. I did find a different approach which will be in the next blog entry.

If you have comments or questions, please do not hesitate to contact me.

John

john.canessa@gmail.com

Leave a Reply

Your email address will not be published. Required fields are marked *