LeetCode 229. Majority Element II in Java

In this post we will solve the LeetCode 229. Majority Element II problem using two approaches.

Given an integer array of size n, find all elements that appear more than floor(n/3)times.

Constraints:

o 1 <= nums.length <= 5 * 10^4
o -10^9 <= nums[i] <= 10^9

We are given an array of integer values. We need to find all elements whose values are larger than floor(n / 3), where `n` is the number of elements in the input array.

If interested in this problem, I suggest that you read the current / actual description of the problem before attempting a solution.

After reading the requirements for the problem it seemed to me that a viable approach would be to use a hashmap to keep track of the frequencies of the values. Then using the values in the hashmap we could iterate while computing the floor adding the values that meet the requirement to a list. Our function of interest would then return the list.

We will be solving this problem using Java and the VSCode IDE on a Windows platform. I do this because I have developed my own approach to solve problems using a Test Driven Development approach. I find it easier to develop a solution and then copy it to the online IDE provided by LeetCode. That said; you should consider the simpler approach of solving the problem online using the IDE provided by LeetCode.

public List&lt;Integer&gt; majorityElement(int[] nums) {
	
}

The signature for the function of interest matches very well with the requirement description. We are provided an int[] array of integers and we need to return a list holding the result.

1,1,1,1,2,3,5
main <<< nums: [1, 1, 1, 1, 2, 3, 5]
<<< freqs: {1=4, 2=1, 3=1, 5=1}
main <<< output: [1]
<<<     n: 7
<<< floor: 2
<<< 1=4, 5=1
main <<< output: [1]


3,2,3
main <<< nums: [3, 2, 3]
<<< freqs: {2=1, 3=2}
main <<< output: [3]
<<<     n: 3
<<< floor: 1
<<< 3=2, 2=1
main <<< output: [3]


1
main <<< nums: [1]
<<< freqs: {1=1}
main <<< output: [1]
<<<     n: 1
<<< floor: 0
<<< 1=1, -1=0
main <<< output: [1]


1,2
main <<< nums: [1, 2]
<<< freqs: {1=1, 2=1}
main <<< output: [1, 2]
<<<     n: 2
<<< floor: 0
<<< 1=1, 2=1
main <<< output: [1, 2]


1,2,3
main <<< nums: [1, 2, 3]
<<< freqs: {1=1, 2=1, 3=1}
main <<< output: []
<<<     n: 3
<<< floor: 1
<<< 1=1, 2=1
main <<< output: []   


1,2,2,3,3,3
main <<< nums: [1, 2, 2, 3, 3, 3]
<<< freqs: {1=1, 2=2, 3=3}
main <<< output: [3]
<<<     n: 6
<<< floor: 2
<<< 3=3, 2=2
main <<< output: [3]


1,1,2,2,3,3,3
main <<< nums: [1, 1, 2, 2, 3, 3, 3]
<<< freqs: {1=2, 2=2, 3=3}
main <<< output: [3]
<<<     n: 7
<<< floor: 2
<<< 3=3, 2=2
main <<< output: [3]


1,1,2,2,2,3,3,3
main <<< nums: [1, 1, 2, 2, 2, 3, 3, 3]
<<< freqs: {1=2, 2=3, 3=3}
main <<< output: [2, 3]
<<<     n: 8
<<< floor: 2
<<< 3=3, 2=3
main <<< output: [3, 2]


1,1,2,2,2,3,3,3,4,4,4,4
main <<< nums: [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4]
<<< freqs: {1=2, 2=3, 3=3, 4=4}
main <<< output: []
<<<     n: 12
<<< floor: 4
<<< 4=4, 2=3
main <<< output: []


1,1,1,3,3,2,2,2
main <<< nums: [1, 1, 1, 3, 3, 2, 2, 2]
<<< freqs: {1=3, 2=3, 3=2}
main <<< output: [1, 2]
<<<     n: 8
<<< floor: 2
<<< 1=3, 2=3
main <<< output: [1, 2]

Since I am solving the problem on my Windows computer, I will need to develop a test scaffold. Please note that the test code IS NOT PART OF THE SOLUTION.

The test cases are separated by two blank lines. Each test case starts with a list of integers that our test code needs to read, create and populate an int[] array and for sanity purposes display the array to verify that before we call the function of interest all is well.

The first implementation seems to display the contents of a hashmap. The values and frequencies seem to match the ones in the input array. Our first implementation builds a list of integers and our test scaffold displays the list returned by the function of interest.

Our second implementation displays the number of elements in the input array and the floor which we will use to determine the integers that go into our output list.

Our test function displays none, one or two sets of values. The first is an integer from the input array followed by the associated frequency. Note that these pairs match the ones in the previous implementation, but the number of pairs may be smaller than in the previous implementation using a hashmap.

Note that both outputs match. We will discuss the second implementation shortly.

    /**
     * Test scaffold.
     * !!! NOT PART OF SOLUTION !!!
     * @throws IOException
     */
    public static void main(String[] args) throws IOException {
        
        // **** open buffered reader ****
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // **** read nums int[] ****
        int[] nums = Arrays.stream(br.readLine().trim().split(","))
                        .mapToInt(Integer::parseInt)
                        .toArray();

        // **** close bufferd reader ****
        br.close();
        
        // ???? ????
        System.out.println("main <<< nums: " + Arrays.toString(nums));

        // **** call function of interest and display output ****
        System.out.println("main <<< output: " + majorityElement0(nums).toString());

       // **** call function of interest and display output ****
       List<Integer> lst = majorityElement(nums);
       Collections.sort(lst);
       System.out.println("main <<< output: " + lst.toString());
    }

We use a buffered reader to read the input line and create and populate an int[] array. The array is displayed. We then call two implementations of the function of interest. Note that for display purposes, we have sorted the list from the second implementation. LeetCode is only interested in the values and not their order in the list. If you copy the code for both implementations they should be accepted.

    /**
     * Given an integer array of size n, 
     * find all elements that appear more than floor(n/3) times.
     * 
     * Execution: O(n) - Space: O(n)
     * 
     * 82 / 82 test cases passed.
     * Status: Accepted
     * Runtime: 7 ms
     * Memory Usage: 42.3 MB
     */
    static public List<Integer> majorityElement0(int[] nums) {

        // **** initialization ****
        int n                           = nums.length;
        HashMap<Integer, Integer> freqs = new HashMap<>();
        List<Integer> lst               = new ArrayList<>();

        // **** compute the floor ****
        // int floor = (int)Math.floor(n / 3);
        int floor = n / 3;

        // **** populate the freqs hashmap - O(n) ****
        for (int key : nums) {
            Integer val = freqs.putIfAbsent(key, 1);
            if (val != null)
                freqs.put(key, ++val);
        }

        // ???? ????
        System.out.println("<<< freqs: " + freqs.toString());

        // **** iterate through the freqs hash map - O(n) ****
        for (Map.Entry<Integer, Integer> e : freqs.entrySet()) {

            // **** for ease of use ****
            // int key = e.getKey();
            // int val = e.getValue();

            // **** add key to list (if needed) ****
            if (e.getValue() > floor) lst.add(e.getKey());
        }

        // **** return list of keys that met the floor condition ****
        return lst;
    }

We perform some initializations. We will use a hashmap to keep the frequencies of the integers and a list to return our result. We also initialize the `floor` variable which we need to use to select which frequencies are accepted so we can add them to the list.

We traverse the int[] and populate the hashmap with the frequencies for each value.

The contents of the hashmap are displayed. This is done for our benefit and IS NOT PART OF THE SOLUTION. When you see statements following a `// ???? ????` comment, they are for testing or to check alternate approaches.

Now that we have the frequencies, we iterate through the hashmap. If the key value matches the condition, we add the key to the output list.

When all is said and done, we return the list.

Now please take a look at the comments section in this implementation of the function of interest. It works and the solution was accepted, but it seems we could do better!

I then spent time online looking for better approaches. I found the Boyer–Moore majority vote algorithm which I summarized as follows:

Boyer–Moore majority vote algorithm
https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_majority_vote_algorithm

The Boyer–Moore majority vote algorithm is an algorithm for finding the majority of a sequence of elements 
using linear time and constant space.

It is not possible for a streaming algorithm to find the most frequent element in less than linear space, 
for sequences whose number of repetitions can be small.

I dug further and found some Java code in the Boyer-Moore Majority Voting Algorithm and some additional information from which I summarized as follows:

Boyer-Moore Majority Voting Algorithm
https://www.geeksforgeeks.org/boyer-moore-majority-voting-algorithm/

The Boyer-Moore voting algorithm is one of the popular optimal algorithms 
which is used to find the majority element among the given elements `that have more than N/2 occurrences`. 

This works perfectly fine for finding the majority element which takes 2 traversals over the given elements, 
which works in O(N) time complexity and O(1) space complexity.

This algorithm works on the fact that if an element occurs more than N/2 times, 
it means that the remaining elements other than this would definitely be less than N/2.

The code addresses a single value so it had to be adapted to using two values.

If interested I would suggest to read both articles before proceeding to view the second implementation of the function of interest.

    /**
     * Given an integer array of size n, 
     * find all elements that appear more than floor(n/3) times.
     * 
     * Using the Boyer–Moore majority vote algorithm.
     * 
     * Execution: O(n) - Space: O(1)
     * 
     * 82 / 82 test cases passed.
     * Status: Accepted
     * Runtime: 1 ms
     * Memory Usage: 43.1 MB
     */
    static public List<Integer> majorityElement(int[] nums) {

        // **** initialization ****
        int n               = nums.length;
        List<Integer> lst   = new ArrayList<>();

        // int floor        = (int)Math.floor(n / 3);
        int floor           = n / 3;

        // ???? ????
        System.out.println("<<<     n: " + n);
        System.out.println("<<< floor: " + floor);

        int n1  = -1;
        int c1  = 0;

        int n2  = -1;
        int c2  = 0;

        // **** algorithm first pass - O(n) ****
        for (int num : nums) {

            // **** increment count for n1 ****
            if (num == n1) {
                c1++;
            }

            // **** increment count for n2 ****
            else if (num == n2) {
                c2++;
            }

            // **** replace n1 with num and start counting ****
            else if (c1 == 0) {
                n1  = num;
                c1  = 1;
            }

            // **** replace n2 with num and start counting ****
            else if (c2 == 0) {
                n2  = num;
                c2  = 1;
            }

            // **** decrement both counters (num != n1 && num != n2) ****
            else {
                c1--;
                c2--;
            }          
        }

        // **** algorithm second pass - O(n) ****
        c1 = 0; c2 = 0;
        for (int num : nums){
            if (num == n1) c1++;
            if (num == n2) c2++; 
        }

        // ???? ????
        System.out.println("<<< " + n1 + "=" + c1 + ", " + n2 + "=" + c2);

        // **** found single element ****
        if (n1 == n2) {
            lst.add(n1);
            return lst;
        }

        // **** found n1 ****
        if (c1 > floor) lst.add(n1);

        // **** found n2 ****
        if (c2 > floor) lst.add(n2);

        // **** return list of keys that met the floor condition ****
        return lst;
    }

In this approach we do not use a hashmap. We replace it with four variables. Two hold the values of interest (n1 and n2) and the other two (c1 and c2) are used to keep track of their associated counts. 

We then traverse the input array updating the four variables mentioned in the previous paragraph. The GeeksforGeeks article has a very good description of the Boyer-Moore Majority Voting Algorithm.

Note that the algorithm uses two pases. Each pass is executed in O(n) time so the algorithm runs in O(n) time and O(1) space. 

When all is said and done, we need to populate and return the list with none, one or two values.

Please take a look at the comments section of this function. It runs much faster and uses less space than the first implementation.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository named MajorityElementII.

Please note that the code here presented might not be the best possible solution. In most cases, after solving the problem, you should take a look at the best accepted solution provided by the different websites (i.e., Codility_, HackerRank, LeetCode to name a few).

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer / engineering toolset.

Thanks for reading this post, feel free to connect with me John Canessa at LinkedIn.

Enjoy;

John

Leave a Reply

Your email address will not be published.

This site uses Akismet to reduce spam. Learn how your comment data is processed.