It is a nice sunny day in the Twin Cities of Minneapolis and St. Paul, or at least it was when I had breakfast earlier this morning. Better yet, it is Friday!!!
I spoke with one of my sons. He and his family had scheduled a holiday and were on the road. When they moved, they built a home. Some years went by, and last fall they decided to buy a new one that they liked. Shortly after, they moved and put their first home on the market. A few months went by, and they finally closed on it yesterday. I am very glad for them. Having two mortgages is not convenient at all.
Earlier this morning I tackled the Tag Content Extractor from HackerRank. It calls for a solution based on regular expressions.
After spending some time watching videos to refresh my memory on the subject, I came up with the following Java code, which I implemented using my Eclipse IDE:
// **** imports needed by this method ****
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public static void regexChecker(String theRegex, String str2Check) {

    // ???? display the arguments (debug) ????
    System.out.println("theRegex ==>" + theRegex + "<==");
    System.out.println("str2Check ==>" + str2Check + "<== str2Check.length: " + str2Check.length());

    // **** compile the regular expression ****
    Pattern pattern = Pattern.compile(theRegex);

    // **** create a matcher for the string to check ****
    Matcher regexMatcher = pattern.matcher(str2Check);

    // **** look for a match ****
    boolean matchFound = false;
    while (regexMatcher.find()) {

        // ???? display the groups and the span of the match (debug) ????
        System.out.println("regexMatcher.groupCount: " + regexMatcher.groupCount());
        for (int i = 0; i <= regexMatcher.groupCount(); i++)
            System.out.println("group(" + i + "): " + regexMatcher.group(i));
        System.out.println("regexMatcher.start: " + regexMatcher.start());
        System.out.println("regexMatcher.end: " + regexMatcher.end());

        // **** display the contents of the tag ****
        System.out.println(regexMatcher.group(2));

        // **** flag that a match was found ****
        matchFound = true;
    }

    // **** if match not found ****
    if (!matchFound)
        System.out.println("None");

    // ???? blank line between test lines (debug) ????
    System.out.println();
}
We need a pattern that matches valid content, that is, text bracketed by a pair of matching opening and closing tags, a convention that happens to follow the HTML standard.
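To make the idea concrete, here is a minimal sketch built around the expression that shows up in the test output further down, <(.+)>([^<]+)</\1>. The class name TagContentSketch and the extract method are made up for illustration and are not part of the submitted solution. The key piece is the back-reference \1, which forces the closing tag to repeat whatever the first group captured.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
class TagContentSketch {

    // <(.+)>    opening tag; group 1 captures the tag name
    // ([^<]+)   content; group 2 is one or more characters other than '<'
    // </\1>     closing tag; \1 must repeat group 1 exactly (case sensitive)
    static final Pattern TAG = Pattern.compile("<(.+)>([^<]+)</\\1>");

    // Returns the content of every matching tag pair found in the line.
    static List<String> extract(String line) {
        List<String> contents = new ArrayList<>();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            contents.add(m.group(2));
        }
        return contents;
    }

    public static void main(String[] args) {
        System.out.println(extract("<h1>Nayeem loves counseling</h1>"));
        System.out.println(extract("<Amee>safat codes like a ninja</amee>"));
    }
}
```

The second call returns an empty list because Amee and amee differ in case, so the back-reference does not match.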
I do not know about you, but at work I am seldom exposed to problems that require regular expressions. I have a copy of the Regular Expression Pocket Reference and I also use Google to find material to refresh and learn. In this case I consulted both.
Allow me to show you the test examples that I used:
4
<h1>Nayeem loves counseling</h1>
<h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par>
<Amee>safat codes like a ninja</amee>
<SA premium>Imtiaz has a secret crush</SA premium>

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><h1>Nayeem loves counseling</h1><== str2Check.length: 32
regexMatcher.groupCount: 2
group(0): <h1>Nayeem loves counseling</h1>
group(1): h1
group(2): Nayeem loves counseling
regexMatcher.start: 0
regexMatcher.end: 32
Nayeem loves counseling

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><h1><h1>Sanjay has no watch</h1></h1><par>So wait for a while</par><== str2Check.length: 67
regexMatcher.groupCount: 2
group(0): <h1>Sanjay has no watch</h1>
group(1): h1
group(2): Sanjay has no watch
regexMatcher.start: 4
regexMatcher.end: 32
Sanjay has no watch
regexMatcher.groupCount: 2
group(0): <par>So wait for a while</par>
group(1): par
group(2): So wait for a while
regexMatcher.start: 37
regexMatcher.end: 67
So wait for a while

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><Amee>safat codes like a ninja</amee><== str2Check.length: 37
None

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><SA premium>Imtiaz has a secret crush</SA premium><== str2Check.length: 50
regexMatcher.groupCount: 2
group(0): <SA premium>Imtiaz has a secret crush</SA premium>
group(1): SA premium
group(2): Imtiaz has a secret crush
regexMatcher.start: 0
regexMatcher.end: 50
Imtiaz has a secret crush


1
<h1> Hello World </h1>

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><h1> Hello World </h1><== str2Check.length: 23
regexMatcher.groupCount: 2
group(0): <h1> Hello World </h1>
group(1): h1
group(2): Hello World
regexMatcher.start: 0
regexMatcher.end: 23
Hello World


3
<h1>Hello</h1><h2>World</h2>
<a>Good Day</a> <name>John</name>
<h1>Hi<</h2> <h2>John; how are you?</h2>

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><h1>Hello</h1><h2>World</h2><== str2Check.length: 28
regexMatcher.groupCount: 2
group(0): <h1>Hello</h1>
group(1): h1
group(2): Hello
regexMatcher.start: 0
regexMatcher.end: 14
Hello
regexMatcher.groupCount: 2
group(0): <h2>World</h2>
group(1): h2
group(2): World
regexMatcher.start: 14
regexMatcher.end: 28
World

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><a>Good Day</a> <name>John</name><== str2Check.length: 33
regexMatcher.groupCount: 2
group(0): <a>Good Day</a>
group(1): a
group(2): Good Day
regexMatcher.start: 0
regexMatcher.end: 15
Good Day
regexMatcher.groupCount: 2
group(0): <name>John</name>
group(1): name
group(2): John
regexMatcher.start: 16
regexMatcher.end: 33
John

theRegex ==><(.+)>([^<]+)</\1><==
str2Check ==><h1>Hi<</h2> <h2>John; how are you?</h2><== str2Check.length: 40
regexMatcher.groupCount: 2
group(0): <h2>John; how are you?</h2>
group(1): h2
group(2): John; how are you?
regexMatcher.start: 13
regexMatcher.end: 40
John; how are you?
The first one was provided by the challenge. The other two I just made up.
Note that the screen captures contain debug statements. They were commented out for the submission.
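For reference, here is a rough sketch of what a version without the debug output could look like, assuming the input format seen in the test examples (a count on the first line, followed by that many lines of text). The class name TagContentSubmissionSketch and the solve helper are hypothetical; the code actually submitted is the one in the repository.

```java
import java.util.Scanner;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name; not taken from the repository.
class TagContentSubmissionSketch {

    // Same expression that appears in the test output.
    static final Pattern TAG = Pattern.compile("<(.+)>([^<]+)</\\1>");

    // Builds the output for one input line:
    // one tag content per line, or "None" if nothing matched.
    static String solve(String line) {
        StringBuilder out = new StringBuilder();
        Matcher m = TAG.matcher(line);
        while (m.find()) {
            out.append(m.group(2)).append('\n');
        }
        return out.length() == 0 ? "None\n" : out.toString();
    }

    public static void main(String[] args) {
        Scanner in = new Scanner(System.in);

        // Nothing to do if no input was provided.
        if (!in.hasNextLine()) {
            return;
        }

        // First line holds the number of lines to process.
        int testCases = Integer.parseInt(in.nextLine().trim());
        for (int i = 0; i < testCases && in.hasNextLine(); i++) {
            System.out.print(solve(in.nextLine()));
        }
        in.close();
    }
}
```

Keeping the matching in a method that returns a string, instead of printing directly, makes it easy to check individual lines without feeding standard input.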
My entire solution can be found on my GitHub repository.
If you have comments or questions, or if you need my services for a software development project, please leave me a note below. Requests for assistance will not be made public.
Keep on reading and experimenting. It is the only way to learn.
John
Follow me on Twitter: @john_canessa