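# Python
class Solution:
    def numUniqueEmails(self, emails):
        unique_emails = set()
        for email in emails:
            # Split into the local name and the domain name.
            local, domain = email.split('@')
            # Ignore the first '+' and everything after it.
            local = local.split('+')[0]
            # Ignore periods in the local name only.
            local = local.replace('.', '')
            normalized = local + '@' + domain
            unique_emails.add(normalized)
        return len(unique_emails)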
// C++
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>
using namespace std;

class Solution {
public:
    int numUniqueEmails(vector<string>& emails) {
        unordered_set<string> unique;
        for (const string& email : emails) {
            size_t at = email.find('@');
            string local = email.substr(0, at);
            string domain = email.substr(at);  // keeps the leading '@'
            size_t plus = local.find('+');
            if (plus != string::npos)
                local = local.substr(0, plus);
            // Erase-remove idiom: strip every '.' from the local name.
            local.erase(remove(local.begin(), local.end(), '.'), local.end());
            unique.insert(local + domain);
        }
        return static_cast<int>(unique.size());
    }
};
// Java
import java.util.HashSet;
import java.util.Set;

class Solution {
    public int numUniqueEmails(String[] emails) {
        Set<String> unique = new HashSet<>();
        for (String email : emails) {
            String[] parts = email.split("@");
            // split() takes a regex, so '+' must be escaped.
            String local = parts[0].split("\\+")[0].replace(".", "");
            String normalized = local + "@" + parts[1];
            unique.add(normalized);
        }
        return unique.size();
    }
}
// JavaScript
var numUniqueEmails = function(emails) {
    const unique = new Set();
    for (const email of emails) {
        let [local, domain] = email.split("@");
        local = local.split("+")[0].replace(/\./g, "");
        unique.add(local + "@" + domain);
    }
    return unique.size;
};
You are given a list of email addresses as an array called emails. Every email address is composed of a local name and a domain name separated by the '@' sign.
There are two rules for normalizing emails:
1. All periods ('.') in the local name part are ignored.
2. Everything after the first plus sign ('+') in the local name is ignored.
Your task is to find out how many unique email addresses there are after normalization. Uniqueness is determined by the normalized form of each email.
Constraints:
Each email address contains exactly one '@' character.
At first glance, it might seem like we need to compare every email address to every other email to check for uniqueness, but the problem gives us a shortcut: normalization. By applying the given rules to each email, we can transform it into a standard form. If two emails normalize to the same string, they are considered the same.
A brute-force approach would involve comparing each email with every other, but that's inefficient. Instead, we can use a set (or hash set) to keep track of all the normalized emails. Sets automatically handle uniqueness for us, so we don't have to check manually.
The main challenge is to correctly parse and normalize each email according to the rules, and then count how many unique normalized emails we have.
Let's break down the solution step by step:
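1. For each email in emails, split it into two parts at the '@' sign: the local name and the domain name.
2. If there is a '+' character in the local name, ignore everything after (and including) the plus sign.
3. Remove all periods ('.') from the local name.
4. Join the cleaned local name and the domain name, separated by '@', to get the normalized email.
5. Add the normalized email to a set. Once every email has been processed, the size of the set is the answer.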
We use a set (or hash set) because lookups and insertions are both O(1) on average, making this approach very efficient.
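The normalization steps above can be isolated in a small helper. The following is a minimal sketch rather than the only way to write it: the names normalize_email and count_unique are hypothetical, and the code assumes every address contains exactly one '@'.

```python
from typing import Iterable


def normalize_email(email: str) -> str:
    """Return the normalized form of an address (hypothetical helper).

    Assumes the address contains exactly one '@'.
    """
    # Step 1: split into local name and domain name at the '@' sign.
    local, _, domain = email.partition("@")
    # Step 2: drop the first '+' and everything after it.
    local = local.partition("+")[0]
    # Step 3: remove every '.' from the local name only.
    local = local.replace(".", "")
    # Step 4: rejoin the two parts with '@'.
    return f"{local}@{domain}"


def count_unique(emails: Iterable[str]) -> int:
    # A set keeps a single copy of each normalized address.
    return len({normalize_email(e) for e in emails})
```

Keeping the normalization separate from the counting makes each rule easy to test on its own, and the domain is never touched, so periods in the domain still distinguish addresses.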
Let's walk through an example with the following input:
["test.email+alex@leetcode.com", "test.e.mail+bob.cathy@leetcode.com", "testemail+david@lee.tcode.com"]
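The first two inputs both normalize to "testemail@leetcode.com", and the third normalizes to "testemail@lee.tcode.com" because periods in the domain are kept. After normalization, we have two unique emails, so the answer is 2.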
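To see which inputs collapse into the same address, the example can be traced with a short script. This is an illustrative sketch only: the variable name groups is an assumption, and the script applies the same two rules while assuming one '@' per address.

```python
from collections import defaultdict

emails = [
    "test.email+alex@leetcode.com",
    "test.e.mail+bob.cathy@leetcode.com",
    "testemail+david@lee.tcode.com",
]

# Map each normalized address to the original addresses that produce it.
groups = defaultdict(list)
for email in emails:
    local, domain = email.split("@")
    # Cut at the first '+', then strip periods from the local name.
    local = local.split("+")[0].replace(".", "")
    groups[local + "@" + domain].append(email)

for normalized, originals in groups.items():
    print(normalized, "<-", originals)

print(len(groups))  # 2
```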
Brute-force approach: If we compared each email to every other, the time complexity would be O(N^2), where N is the number of emails.
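For comparison, a brute-force version might look like the sketch below. It is only an illustration of the quadratic behavior: the function name count_unique_brute_force is hypothetical, and the code assumes one '@' per address. Each normalized email is compared against every one stored so far, which is where the O(N^2) comparisons come from.

```python
def count_unique_brute_force(emails):
    """Count distinct normalized addresses without a set (hypothetical name)."""
    seen = []  # plain list: each membership check scans every stored entry
    for email in emails:
        local, domain = email.split("@")
        normalized = local.split("+")[0].replace(".", "") + "@" + domain
        # Compare against every normalized address kept so far: O(N) per email.
        if all(normalized != other for other in seen):
            seen.append(normalized)
    return len(seen)
```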
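Optimized approach: Each email is normalized in a single pass over its characters and inserted into the set in O(1) on average, so the total time is linear in the size of the input: O(N * L), where L is the maximum email length. The set holds at most N normalized emails, so the extra space is also O(N * L).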
To solve the Unique Email Addresses problem, we normalize each email according to the specified rules and track the unique normalized emails using a set. This approach is efficient and elegant because it leverages string manipulation and the properties of sets to solve the problem in linear time. The key insight is that normalization reduces the problem to a simple uniqueness check, which sets handle naturally.