1454. Active Users - Leetcode Solution

Problem Description

The "Active Users" problem asks you to determine the number of users who were active in your system within a given period. You are given a list of user activity logs, where each log contains a user_id and a timestamp (representing when the user was active). Your task is to find out how many unique users were active at least once in the last k days, given a reference day today.

Key constraints:

Each activity log is a pair: (user_id, timestamp).
The timestamps are integers representing days.
You must count each user only once, even if they were active multiple times within the period.
Do not count users who were only active before the window of today - k + 1 to today (inclusive).

Thought Process

The first idea that comes to mind is to check every log and see if its timestamp falls within the last k days, then collect the user_id values. However, if we simply collect all logs and then count, we might double-count users active on multiple days.

To avoid double-counting, we need a way to keep track of unique users. Using a set data structure is ideal, since it automatically ignores duplicates. That way, every time we find a log within the window, we add the user_id to the set. The final answer is the size of the set.

Brute-force would be to scan all logs, check the timestamp, and add to a set. But if the data is huge, can we do better? The main optimization is to only look at logs within the date window and ignore the rest as early as possible.

Solution Approach

Let's break down the steps to solve this problem efficiently:

Initialize a set: Create an empty set to store unique user_id values.
Define the window: The active window is from today - k + 1 to today, inclusive.
Iterate through logs: For each log entry, check if its timestamp is within the active window.
Add to set: If the timestamp is valid, add the user_id to the set.
Return the result: The answer is the number of unique users in the set.

Why use a set? A set gives O(1) average time complexity for add and check operations, and automatically handles duplicates.

Why iterate through all logs? We must check each log, since any could fall within the window and represent a unique user.

Example Walkthrough

Let's say we have the following logs:

(1, 10)
(2, 11)
(1, 12)
(3, 8)
(2, 12)

Suppose today = 12 and k = 3. The active window is 10 to 12 (inclusive).

First log: (1, 10) is in the window. Add user 1 to the set.
Second log: (2, 11) is in the window. Add user 2 to the set.
Third log: (1, 12) is in the window. User 1 is already in the set, so no change.
Fourth log: (3, 8) is not in the window. Ignore.
Fifth log: (2, 12) is in the window. User 2 is already in the set, so no change.

The set now contains {1, 2}. The answer is 2.

Time and Space Complexity

Brute-force approach:

Time Complexity: O(n), where n is the number of logs, since we must check each log.
Space Complexity: O(u), where u is the number of unique users in the window.

Optimized approach:

Time Complexity: Still O(n), because we must examine each log at least once.
Space Complexity: O(u), storing only the unique users active in the window.

The solution is optimal because we cannot avoid looking at every log (they could all be in the window), and the set only grows as large as the number of unique users.

Summary

The "Active Users" problem is efficiently solved by using a set to track unique users who were active in a given date window. By scanning each log and adding user IDs from qualifying timestamps, we avoid double-counting and achieve optimal time and space complexity. The elegance comes from leveraging the set data structure, which handles duplicates and provides fast lookups, making the solution both simple and robust.

Code Implementation

def count_active_users(logs, k, today):
    """
    logs: List of tuples (user_id, timestamp)
    k: integer, number of days for the window
    today: integer, the reference day
    """
    active_users = set()
    window_start = today - k + 1
    for user_id, timestamp in logs:
        if window_start <= timestamp <= today:
            active_users.add(user_id)
    return len(active_users)

#include <vector>
#include <unordered_set>
using namespace std;

int countActiveUsers(vector<pair<int, int>>& logs, int k, int today) {
    unordered_set<int> activeUsers;
    int windowStart = today - k + 1;
    for (auto& log : logs) {
        int userId = log.first;
        int timestamp = log.second;
        if (timestamp >= windowStart && timestamp <= today) {
            activeUsers.insert(userId);
        }
    }
    return activeUsers.size();
}

import java.util.*;

public class Solution {
    public int countActiveUsers(int[][] logs, int k, int today) {
        Set<Integer> activeUsers = new HashSet<>();
        int windowStart = today - k + 1;
        for (int[] log : logs) {
            int userId = log[0];
            int timestamp = log[1];
            if (timestamp >= windowStart && timestamp <= today) {
                activeUsers.add(userId);
            }
        }
        return activeUsers.size();
    }
}

function countActiveUsers(logs, k, today) {
    const activeUsers = new Set();
    const windowStart = today - k + 1;
    for (const [userId, timestamp] of logs) {
        if (timestamp >= windowStart && timestamp <= today) {
            activeUsers.add(userId);
        }
    }
    return activeUsers.size;
}

AlgoMap.io

Want Help Cracking FAANG?