Fraudulent Activity Notification

Good day software developers and engineers. Hope your day is going well. The days are getting shorter and colder in the Twin Cities of Minneapolis and St. Paul.

As usual, my wife fixed lunch. It was quite good. Some type of meat stew. For desert we had `gypsy’s arm`. If interested in this dessert please read the article “Gypsy’s Arm Explained“. My wife has been making this desert since we were dating. At some point she used a single can of `dulce de leche`. In the past couple years she is been using two cans. That makes for a much better dessert. Please do not confuse `dulce de leche` with `tres leches cake`. They are completely different things.

Earlier today I took a stab at the HackerRank Fraudulent Activity Notifications problem. It is not that hard to come up with a working first pass. As expected, it times out.

The problem calls for the generation of the `median` on each pass. That requires updating a set of numbers and sorting them. The issue is that sorting takes O(n * log(n)) time. As you are developing the first pass, you know deep inside that the code will time out.

I experimented with `Counting Sort`. You can even find Java code in Counting Sort by GeeksforGeeks.

I was going to include a version of the code that would include Counting Sort, but decided to post what I have so far. I hope to write a post on Sorting Code later this week.

HackerLand National Bank has a simple policy for warning clients about possible fraudulent account activity. 

If the amount spent by a client on a particular day is greater than or equal to 2x the client's `median spending` 
for a trailing number of days, 
they send the client a notification about potential fraud.

The bank doesn't send the client any notifications until they have at least that trailing number of prior days' 
transaction data.

Given the number of trailing days `d` and a client's total daily expenditures for a period of  days, 
determine the number of times the client will receive a notification over all `n` days.

Median: 
o If the number of observations is odd, the number in the middle of the list is the median. 
  This can be found by taking the value of the (n+1)/2 -th term, where n is the number of observations.
  Else, if the number of observations is even, then the median is the simple average of the middle two numbers.

Input Format:
The first line contains two space-separated integers `n` and `d`, the number of days of transaction data, 
and the number of trailing days' data used to calculate median spending respectively.
The second line contains `n` space-separated non-negative integers where each integer `i` denotes `expenditure[i]`.

Constraints:
o 1 <= n <= 2 x 10**5
o 1 <= d <= n
o 0 <= expenditure[i] <= 200

The problem description is somewhat verbose. The key is given in the last constraint. I suggest you read the article `Counting Sort`, and see if you can draw the relationship between the two.

    /*
     * Complete the 'activityNotifications' function below.
     *
     * The function is expected to return an INTEGER.
     * The function accepts following parameters:
     *  1. INTEGER_ARRAY expenditure
     *  2. INTEGER d
     */
    public static int activityNotifications(List<Integer> expenditure, int d) {

    }

I will attempt to solve the problem on my computer using the Java programming language and the VSCode IDE on a Windows computer. Due to this approach, I will have to create a test scaffold that will read the input data, will call the function of interest and will display the result.

9 5
2 3 4 2 3 6 8 4 5
main <<< n: 9 d: 5
main <<< expenditure: [2, 3, 4, 2, 3, 6, 8, 4, 5]
main <<< nots: 2


5 4
1 2 3 4 4
main <<< n: 5 d: 4
main <<< expenditure: [1, 2, 3, 4, 4]
main <<< nots: 0


5 3
10 20 30 40 50
main <<< n: 5 d: 3
main <<< expenditure: [10, 20, 30, 40, 50]
main <<< nots: 1


7 4
1 4 1 2 7 5 2
main <<< n: 7 d: 4
main <<< expenditure: [1, 4, 1, 2, 7, 5, 2]
main <<< nots: 1

As specified by the requirements, we are given two input lines. The first contains two numbers. The first is the total number of integers and the second the number of days before and attempt is made to determine if the expenditure for a day requires the generation of an alert.

The input lines are read and placed into variables which are then displayed. The test code calls the function of interest and displays the result.

You will find two implementations. The first timed out. The second was accepted.

    /**
     * Test scaffold.
     */
    public static void main(String[] args) throws IOException {
        
        // **** open a buffered reader ****
        BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

        // **** read n and d ****
        String[] strs = br.readLine().trim().split(" ");

        // **** extract n and d ****
        int n = Integer.parseInt(strs[0]);
        int d = Integer.parseInt(strs[1]);

        // **** read expenditures ****
        List<Integer> expenditure = Stream.of(br.readLine().trim().split(" "))
                                        .map(Integer::parseInt)
                                        .collect(Collectors.toList());

        // **** close the buffered reader ****
        br.close();

        // ???? ????
        System.out.println("main <<< n: " + n + " d: " + d);
        System.out.println("main <<< expenditure: " + expenditure.toString());


        // // ???? verify operation of priority queues in the Window class ????
        // Window window = new Window();

        // window.largers.add(4);
        // window.largers.add(1);
        // window.largers.add(3);
        // window.largers.add(2);
        // window.largers.add(1);

        // window.smallers.add(4);
        // window.smallers.add(1);
        // window.smallers.add(3);
        // window.smallers.add(2);
        // window.smallers.add(1);

        // while (!window.largers.isEmpty())
        //     System.out.println("main <<<  larger: " + window.largers.poll());

        // while (!window.smallers.isEmpty())
        //     System.out.println("main <<< smaller: " + window.smallers.poll());


        // // **** compute and display the number of notifications ****
        // System.out.println("main <<< nots: " + activityNotifications0(expenditure, d));

        // **** compute and display the number of notifications ****
        System.out.println("main <<< nots: " + activityNotifications1(expenditure, d));
    }

The test code seems to follow the description of how it processes the input data.

For the second approach we will be using priority queues. I tested the queues and then commented out the code.

    /*
     * Complete the 'activityNotifications' function below.
     *
     * The function is expected to return an INTEGER.
     * The function accepts following parameters:
     *  1. INTEGER_ARRAY expenditure
     *  2. INTEGER d
     */
    public static int activityNotifications0(List<Integer> expenditure, int d) {

        // **** sanity check(s) ****
        if (expenditure.size() < d) return 0;

        // **** initialization ****
        double median               = 0.0;
        int nots                    = 0;
        LinkedList<Integer> window  = new LinkedList<Integer>();

        // **** populate window ****
        for (int i = 0; i < d; i++)
            window.add(expenditure.get(i));

        // **** sort window - O(n * log(n)) ****
        Collections.sort(window);

        // **** ****
        for (int i = d; i < expenditure.size(); i++) {

            // **** get the expenditure associated with the window ****
            Integer currentExpenditure = expenditure.get(i);

            // **** compute median of window ****
            if (d % 2 == 0)
                median = (window.get(d / 2 - 1) + window.get(d / 2)) / 2.0;
            else
                median = window.get(d / 2);         

            // **** count this notification (if needed) ****
            if (currentExpenditure >= 2.0 * median)
                nots++;

            // **** remove leftmost entry ****
            window.removeFirst();

            // ****  **** 
            ListIterator<Integer> sit   = window.listIterator();
            ListIterator<Integer> lit   = window.listIterator();
            int smallerValue            = 0;

            // **** insert current expenditure keeping window in sorted order **** 
            do {

                // **** ****
                int largerValue = lit.next();

                // **** insert before first element in window (if needed) ****
                if (smallerValue <= currentExpenditure && currentExpenditure <= largerValue) {
                    sit.add(currentExpenditure);
                    break;
                }

                // **** ****
                smallerValue = sit.next();
            } while (lit.hasNext());

            // **** ****
            if (!sit.hasNext())
                sit.add(currentExpenditure);
        }

        // **** return number of notifications ****
        return nots;
    }

This is not my first approach which passed the test cases but, as expected, timed out.

In this approach we sort the window of interest once. From that point on we try to update values in order not to sort the window to obtain the median. Not fast enough. The code timed out.

I took a look at priority queues which we have used many times in this blog.

In Java programming, Java Priority Queue is implemented using Heap Data Structures 
and Heap has `O(log(n))` time complexity to insert and delete element. 
Offer() and add() methods are used to insert the element in the in the priority queue java program. 
Poll() and remove() is used to delete the element from the queue.

With that in mind, we developed a more complex `window` class that makes use of two priority queues.

    /*
     * Complete the 'activityNotifications' function below.
     * Using priority queues.
     *
     * The function is expected to return an INTEGER.
     * The function accepts following parameters:
     *  1. INTEGER_ARRAY expenditure
     *  2. INTEGER d
     */
    public static int activityNotifications1(List<Integer> expenditure, int d) {

        // **** sanity check(s) ****
        if (expenditure.size() < d) return 0;

        // **** initialization ****
        int nots        = 0;
        double med      = 0.0;
        Window window   = new Window();

        // **** populate window ****
        for (int i = 0; i < d; i++)
            window.insert(expenditure.get(i));

        // **** ****
        for (int i = d; i < expenditure.size(); i++) {

            // ???? ????
            // System.out.println("<<< expenditure: " + expenditure.toString());
            // System.out.println("<<< window: " + window);

            // **** get the expenditure associated with the window ****
            Integer currentExpenditure = expenditure.get(i);

            // **** compute median of window ****
            med = window.median();
            
            // ???? ????
            // System.out.println("<<< med: " + med);

            // **** count this notification (if needed) ****
            if (currentExpenditure >= 2.0 * med)
                nots++;

            // **** remove leftmost entry ****
            window.remove(expenditure.get(i - d));

            // ???? ????
            // System.out.println("<<< window: " + window);

            // **** **** 
            window.insert(currentExpenditure);

            // ???? ????
            // System.out.println("<<< window: " + window);
        }

        // **** return number of notifications ****
        return nots;
    }

This code is quite similar to the first implementation. Note that the previous window represented as a linked list has been replaced with a class which we will see shortly.

The commented out statements were used while developing the code.


    /**
     * Optimize the window operations from the first pass.
     */
    static class Window {


        // **** ****
        PriorityQueue<Integer> smallers = new PriorityQueue<>((a, b) -> b - a); // decrementing
        PriorityQueue<Integer> largers  = new PriorityQueue<>();                // incrementing


        /**
         * Largers should have size() + 1 of smallers.
         */
        private void adjustSizes() {
            if (smallers.size() > largers.size())
                largers.offer(smallers.poll());
            else if (largers.size() > smallers.size() + 1)
                smallers.offer(largers.poll());
        }


        /**
         * 
         */
        public void insert(Integer x) {

            // **** insert x (where needed) ****
            if (largers.isEmpty() || x >= largers.peek())
                largers.offer(x);
            else
                smallers.offer(x);

            // **** ****
            adjustSizes();
        }


        /**
         * 
         */
        public void remove(Integer x) {

            // **** ****
            if (x >= largers.peek())
                largers.remove(x);
            else
                smallers.remove(x);

            // **** ****
            adjustSizes();
        }


        /**
         * Compute median from window.
         */
        public double median() {
            if (largers.size() > smallers.size())
                return largers.peek();
            else
                return (smallers.peek() + largers.peek()) /2.0;
        }


        /**
         * 
         */
        public String toString() {
            return smallers + " - " + largers;
        }
    }

This is the class that implements the Window.

The window is updated with values from the two priority queues.

There are two main operations, `insert` and `remove` values from the window. When needed the `median` method is called to compute and return the media. Note that there is no need to sort the elements in the window.

Hope you enjoyed solving this problem as much as I did. The entire code for this project can be found in my GitHub repository.

Please note that the code here presented might not be the best possible solution. In most cases, after solving the problem, you can take a look at the best accepted solution provided by the different web sites (i.e., HackerRank, LeetCode).

If you have comments or questions regarding this, or any other post in this blog, please do not hesitate and leave me a note below. I will reply as soon as possible.

Keep on reading and experimenting. It is one of the best ways to learn, become proficient, refresh your knowledge and enhance your developer toolset.

Thanks for reading this post, feel free to connect with me John Canessa at LinkedIn.

Regards;

John

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.