Baby Names

Currently we live in a world of deception. The overflow of data that we are exposed to every day via news channels, companies and social media streams does not allow us to easily distinguish what is true, what is somewhat true and what are just plain lies.

I am not a political fanatic. Last week president Trump paid a visit to the Twin Cities of Minneapolis and St. Paul. He had a presentation at Target Center in downtown. My only thoughts were not to be in or around that area the day of the event. What was interesting is that some Tweets showed pictures of the event before it started while people were starting to arrive. The Tweets stated that very few people attended the event. This is a premeditated lie. I happen to know a couple people who attended the event and had pictures and videos. Target Center was packed.

I also read an article from the Daily Mail. At first I thought what Trevor Jackson in the UK had developed is something that would disrupt electric cars. The current trend is to power electric cars with lithium-ion batteries. Such batteries require rare metals for their manufacturing. Once the batteries outlive their useful lives they need to be recycled. Both events heavily pollute the environment and should be taken into account when making the decision to buy / lease an electric car (i.e., Tesla). In addition the rare elements are not readily available so producers may decide to raise prices or just simply not make them available to the general public.

Another issue is that the size and weight of lithium-ion batteries needed to power an electric vehicle is large, while the range of a Tesla is less than 400 miles. The invention claims that with a battery that is about seven times smaller, an electric vehicle (i.e., Tesla) should have a range of about 1,500 miles.

When the aluminum batteries stop working, you just replace them. Aluminum (the most abundant metal on earth) is an element readily available in most parts of the world. The electrolyte is not poisonous. The article claims that Trevor Jackson has drunk it to prove his point in demonstrations. Do not tamper or try to drink the electrolyte from a lithium-ion battery because it is extremely poisonous. Also lithium-ion batteries tend to catch on fire. We all remember the issue with the batteries introduced in Samsung phones a few years ago and the number of cases of Tesla cars that spontaneously catch on fire.

If all the claims of the article are true Tesla should run and license this new technology. It would require some stations in which to replace batteries but on the other hand would remove the need of charging stations by 1/7. Hope the article is true and best wishes Trevor Jackson with his invention.

This post is about problem 17.7 Baby Names in the Cracking the Coding Interview by Gayle Laakmann McDowell. If you are interested in continuous education or changing jobs, I do recommend getting a copy of the book and trying the problems. You always learn something new or at least refresh your knowledge.

I am not going to copy the complete text for the problem. It is based on the list of names that the government releases every year holding the frequencies of the most common names for a specified year. In addition, parents use synonyms when naming their child. A list of synonym pairs is also made available (e.g., John -> Johnny, Jon -> John). A name may have no, one or more synonyms. Some names may or may not have been used in a specified year. Your mission [insert your name here], should you choose/decide to accept it (quote from Mission Impossible) is to print a list of names with associated true frequency. For example, if John -> 17, Johnny -> 6 and Jon -> 5 then this name represented by any one of the synonyms (John, Johnny or Jon) should have a true frequency of 17 + 6 + 5 = 28 (Jon 28).

I did scribble some thought on my white board.

On the top left I wrote some names and associated frequencies. I edited the names and numbers because I was not carrying the book from room to room. I also wrote some synonyms / aliases. I thought of referring to the first list as frequencies and to the second as pairs.

On the top right side I tried to figure out what I had to do. My thoughts were to build a new list of list in which each list would hold all the synonyms for a particular name including the first time a name or synonym showed up the pairs list.

The object is to print a name associated per synonym and the sum of all the frequencies associated with that name. I called my method printNameCount(). During my thinking process I changed names and types. When I started to implement my solution I looked up the data used in the problem description and in the solution. I did not look at the solution presented in the book until I was done. Without a doubt the most complicated method would be the one that takes the synonym pairs and builds a list holding them. Once that is accomplished, it would be rather simple to traverse the list, add the frequencies and print each result. Note that the problem does not require us to return a data structure with our results, we just need to print them.

5
John 15
Jon 12
Chris 13
Kris 14
Christopher 19
4
Jon John
John Johnny
Chris Kris
Chris Christopher
main <<< n: 5
main <<< freqs: [John, 15, Jon, 12, Chris, 13, Kris, 14, Christopher, 19]
main <<< n: 4
main <<< pairs: [Jon, John, John, Johnny, Chris, Kris, Chris, Christopher]

Johnny 27
Christopher 46main <<< n: 5
main <<< freqs: [John, 15, Jon, 12, Chris, 13, Kris, 14, Christopher, 19]
main <<< n: 4
main <<< pairs: [Jon, John, John, Johnny, Chris, Kris, Chris, Christopher]
collectSynonyms <<< synonyms: [[Jon, John, Johnny], [Chris, Kris, Christopher]]

Johnny 27
Christopher 46

After developing my solution using Java and the Eclipse IDE (will use Visual Studio Core on the next problem) I processed the sample data and produced some results. I always use the Test Driven Development (TDD) approach. Start with the test scaffolding and a sample set. Write the shells for the proposed methods which will miserably fail. Then start writing code and refining not only the code but the test scaffolding as needed. This approach works very well for me. Before I lay out the skeleton is where some paper or a white board becomes very useful. That said, in the real world, you cannot stop and spend hours or days with a white board. You need to start coding and cycling between code and white board in order to refine your solution. When done with my approach I looked at the two proposed solutions. I liked the graph approach but did not implement it at this time.

9
John 10
Jon 3
Davis 2
Kari 3
Johnny 11
Carlton 8
Carleton 2
Jonathan 9
Carrie 5
5
Jonathan John
Jon Johnny
Johnny John
Kari Carrie
Carleton Carlton
main <<< n: 9
main <<< freqs: [John, 10, Jon, 3, Davis, 2, Kari, 3, Johnny, 11, Carlton, 8, Carleton, 2, Jonathan, 9, Carrie, 5]
main <<< n: 5
main <<< pairs: [Jonathan, John, Jon, Johnny, Johnny, John, Kari, Carrie, Carleton, Carlton]
collectSynonyms <<< synonyms: [[Jonathan, John, Johnny, Jon], [Kari, Carrie], [Carleton, Carlton]]

Jon 33
Carrie 8
Carlton 10
Davis 2

This second set of data is similar but if you notice Johnny does not have a frequency in the first example and Davis does not have an associated synonym. You need to take care of both conditions. This adds to the example. I find it hard to develop a complete and bullet proof solution in about 60 to 90 minutes, especially when you have to make it as efficient as possible.

	/*
	 * Test scaffolding.
	 */
	public static void main(String[] args) {

		// **** frequency list ****
		ArrayList<String> freqs = new ArrayList<String>();
		
		// **** open scanner ****
		Scanner sc = new Scanner(System.in);
		
		// **** read in the number of entries ****
		int n = sc.nextInt();
		
		// ???? ????
		System.out.println("main <<< n: " + n);
		
		// **** loop filling the frequency list ****
		for (int i = 0; i < n; i++) {
			
			// **** read name and frequency ****
			String name = sc.next();
			int freq = sc.nextInt();
			
			// **** ****
			freqs.add(name);
			freqs.add(String.valueOf(freq));
		}
		
		// ???? ????
		System.out.println("main <<< freqs: " + freqs.toString());
		
		// **** read in the number of pairs list ****
		n = sc.nextInt();
		
		// ???? ????
		System.out.println("main <<< n: " + n);
		
		// **** list of pairs ****
		ArrayList<String> pairs = new ArrayList<String>();
		
		// **** loop reading the name pairs ****
		for (int i = 0; i < n; i++) {
			pairs.add(sc.next());
			pairs.add(sc.next());
		}
		
		// ???? ????
		System.out.println("main <<< pairs: " + pairs.toString());
		
		// **** close scanner ****
		sc.close();
		
		// **** print the true trequencies ****
		trueFreq(freqs, pairs);
	}

We start by reading the number of entries in the frequencies list. We follow it by reading the list. At some point I was going to use a hash map but decided to stick the design and implementation to using lists. It would have been quite simple to switch to a hash map. The book solutions read this first set of data into a hash map.

The next item is to read in the count and the actual synonym pairs. We just use a single method to compute and print the required results.

	/*
	 * Print names and true frequencies.
	 */
	static void trueFreq(ArrayList<String> freqs, ArrayList<String> pairs) {
		
		// **** build list of synonyms ****
		ArrayList<ArrayList<String>> synonyms = collectSynonyms(pairs);
		System.out.println();

		// **** traverse list of list of synonyms adding frequencies ****
		for (int i = 0; i < synonyms.size(); i++) {
			
			// **** get the next list ****
			ArrayList<String> synLst = synonyms.get(i);
			
			// **** traverse this list ****
			int sum 	= 0;
			String name = "";
			for (int j = 0; j < synLst.size(); j++) {
				
				// **** for ease of use ****
				name 			= synLst.get(j);
				int index 		= freqs.indexOf(name);

				// **** check if name does not have a frequency ****
				if (index == -1)
					continue;
			
				int frequency 	= Integer.valueOf(freqs.get(index + 1));
				
				// **** remove the frequency and associated name from the list ****
				freqs.remove(index + 1);
				freqs.remove(index);
				
				// **** add to the true frequency ****
				sum += frequency;
			}
			
			// **** display the name and frequency sum ****
			System.out.println(name + " " + sum);
		}		
		
		// ***** display remaining names and frequencies (without synonyms) ****
		for (int i = 0; i < freqs.size(); i += 2)
			System.out.println(freqs.get(i) + " " + freqs.get(i + 1));	
	}

The trueFreq() method first builds a list of synonyms per name. This seems to be the most complicated aspect of this problem. The simplicity and performance is based on the representations and data structures used.

Once the list of list is composed we can traverse them adding the frequencies and displaying a name form the synonyms (in our case the last one in each list) and the sum. When done we need to display the names that do not have a synonym. For this I decided to remove from the list of frequencies the names as they were being processed. Using such approach what is left needs to be traversed to complete the list.

	/*
	 * Build a list with list of synonyms.
	 */
	static ArrayList<ArrayList<String>> collectSynonyms(ArrayList<String> pairs) {
		
		// **** list of list of synonyms ****
		ArrayList<ArrayList<String>> synonyms = new ArrayList<ArrayList<String>>();
		
		// **** ****
		while (pairs.size() != 0) {
			
			// **** get the first name and synonym out of the list ****
			String name 	= pairs.remove(0);
			String synonym 	= pairs.remove(0);

			// **** add them to a temp list ****
			ArrayList<String> temp = new ArrayList<String>();
			temp.add(name);
			temp.add(synonym);
			
			// **** loop finding synonyms for names in the temp list (add them as needed) ****
			int i = 0;
			for (boolean traversed = false; !traversed; ) {
							
				// **** ****
				String curName = temp.get(i);
								
				// **** look for this name in the pairs ****
				int j = pairs.indexOf(curName);
								
				// **** name found ****
				if (j != -1) {
					
					// **** add new synonym ****
					if ((j % 2) == 0) {
						temp.add(pairs.get(j + 1));
					}
					
					// **** add new name ****
					else {
						temp.add(pairs.get(j - 1));
					}
					
					// **** remove pair ****
					if ((j % 2) == 0) {
						pairs.remove(j + 1);
						pairs.remove(j);
					} else {
						pairs.remove(j);
						pairs.remove(j - 1);
					}
				}
				
				// **** name not found ****
				else {
					i++;
				}
								
				// **** check if we finish traversing the list ****
				if (i >= temp.size())
					traversed = true;
			}

			// **** add the temp list to the synonyms list ****
			synonyms.add(temp);
		}
		
		// ???? ????
		System.out.println("collectSynonyms <<< synonyms: " + synonyms.toString());

		// **** return list of list of synonyms ****
		return synonyms;
	}

The collectSynonyms() method returns a list of lists with each list holding all the related synonyms. We achieve this by building a temp list and then adding it to the main list. We traverse the temp list looking for synonyms. That way we do not miss an associated one for each of the names.

The entire code for this project can be found in my GitHub repository.

If you have comments or questions regarding this or any other post in this blog, or if you would like me to help with any phase in the SDLC (Software Development Life Cycle) of a product or service, please do not hesitate and leave me a note below. Requests for help will remain private.

Keep on reading and experimenting. It is the best way to learn and refresh your knowledge!

John

Follow me on Twitter:  @john_canessa

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.