Enable search and notifications for email addresses using the "+" syntax
A lot of people use a syntax such as email@example.com where foo is a unique identifier for the site. They do this so that if they begin getting spammed, they can identify the source their email came from.
At the moment, HIBP treats this as a totally unique email address, so if I search for the parent email address without the "+" syntax, the aliased address won't be found. The idea is to ensure that searches and notifications recognise the syntax and return addresses that are logically still the same account.
One thing HIBP would also need to do is specify which account alias was in the breach or paste. For example, I would want to know that it was firstname.lastname@example.org that was exposed in the XYZ breach.
Edit: Just to put the value of this into context, I've just run some stats on the Adobe breach. Of the 152,989,508 rows in the dump, only 49,905 email addresses have a "+" in the address, so that's 0.03% of entries. That number is also a bit high as it includes junk entries. I'm definitely not ruling this idea out - it's still planned - I just wanted to give a sense of how useful it would be.
Edit: To add to this idea, Robert's comment about a period in the email is also very valid. I'd want to be very clear about the ubiquity of this practice across mail providers, but it's certainly a good suggestion and worth further investigation.
I currently do not use plus aliasing because it is not supported by HIBP. If this was supported I would start to use + when signing up for online services. I believe the percentage of people using this feature in their usernames would increase if it was supported by HIBP.
Lee Brotherston commented
Just to add my 2c....
Although the numbers are low in the samples that you look at, I would suggest that the comment below regarding these being likely to be people with security responsibilities is likely to hold true.
I think that this is a fairly ubiquitous feature of mail providers these days (with the notable exception, I think, of the default Exchange setup) even if it's not adopted by users that often.
I'm not sure what the underlying stack is, so I can't gauge how much effort would be involved (e.g. a change to an SQL query vs needing to retrospectively update a bunch of metadata that's generated at import time, etc.), but I would suggest that there are a couple of implementation routes for this:
- Add this to the search facility, so that email@example.com will search for both the provided address as well as firstname.lastname@example.org
- Leave search as-is, but set up +stuff in the notifications much like whole domain notifications. This way, presumably, there is no need to retrospectively update data; it can be handled during the import of new data.
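To make the two routes above concrete, here is a minimal sketch in Python. Nothing here reflects how HIBP is actually built; the function names (`strip_plus_tag`, `search_candidates`, `notify_on_import`) and the example addresses are made up for illustration, and lowercasing the whole address is an assumption about how matching would be done.

```python
def strip_plus_tag(address: str) -> str:
    """Return the address with any "+tag" removed from the local part."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError(f"not an email address: {address!r}")
    base_local = local.split("+", 1)[0]
    # Assumption: matching is case-insensitive, so everything is lowercased.
    return f"{base_local}@{domain}".lower()


def search_candidates(address: str) -> list:
    """Route 1: the addresses to look up when a user runs a search."""
    exact = address.lower()
    base = strip_plus_tag(address)
    return [exact] if exact == base else [exact, base]


def notify_on_import(imported: list, subscriptions: set) -> dict:
    """Route 2: map each subscribed base address to the imported aliases that hit it."""
    hits = {}
    for addr in imported:
        base = strip_plus_tag(addr)
        if base in subscriptions:
            hits.setdefault(base, []).append(addr.lower())
    return hits
```

Route 2 only touches data at import time, which matches the point about not having to update existing records retrospectively. It also keeps the full alias in the result, so the subscriber can see which alias was exposed.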
Michael H commented
The sooner this is implemented, the better.
Mike Williams commented
While the absolute number of people affected is small, consider that they are more likely to be people with security responsibilities elsewhere and as such are valuable vectors.
I did an inventory of my gmail "+" and "." variants. I'm currently using 67 variants of my email address.
Joe Kirwin commented
Lead with an example:
Search for email@example.com and (if this was a real account) firstname.lastname@example.org and you'd get different results. Yet for all intents and purposes it's all your data that has been leaked, as they are the same account.
Would it be possible to define some canonicalizers for very popular email providers that remove superfluous things such as periods from the address both in the breach corpus and when searching?
I realize that there could be some user that likes to leverage things like pwned.hibp+salesConference@ to track down which party exposed some breach, but I feel like that use case is not as broadly useful as giving people a full breach list.
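A per-provider canonicalizer table like the one suggested here could look roughly like the sketch below. It is only an illustration: the `CANONICALIZERS` table and the `canonicalize` function are hypothetical names, and Gmail is the only provider listed because it is the only one whose dot and plus handling is described in this thread. Other providers would need their behaviour verified before being added.

```python
from typing import Callable, Dict


def _gmail_local(local: str) -> str:
    """Gmail ignores dots and anything after "+" in the local part."""
    return local.split("+", 1)[0].replace(".", "")


# Hypothetical registry: domain -> rule applied to the local part.
# Only gmail.com is listed here; any other entry would be an assumption.
CANONICALIZERS: Dict[str, Callable[[str], str]] = {
    "gmail.com": _gmail_local,
}


def canonicalize(address: str) -> str:
    """Apply the provider rule if one is registered, else leave the address alone."""
    local, _, domain = address.strip().lower().rpartition("@")
    rule = CANONICALIZERS.get(domain)
    if rule is not None:
        local = rule(local)
    return f"{local}@{domain}"
```

The same function would be applied both to the breach corpus and to the search input, so that both sides agree on one form. Addresses at providers without a registered rule pass through unchanged, which avoids merging accounts that are actually distinct.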
+1 for a needed feature. Though using "+..." and/or additional dot(s) is a small (but growing) percentage of email usage, breaches involving those accounts may be a disproportionately higher risk to security-savvy folks. They are the most likely to use these techniques but may also be juicier targets.
CB has an excellent solution for this below. Cleaning up the input addresses on entry into the HIBP database before hashing, would handle the most variations. With the addition of recording which +tags were removed, and positions of periods, the data would be comprehensive with very little compromise.
It may make sense to hash both versions. E.g. "HIBP found an exact match" and "the following variations were found in breach database" are both useful.
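As a rough sketch of what "clean up, record what was removed, and hash both versions" might look like: the record layout, the `normalize` function and the use of SHA-256 below are all assumptions made for illustration, not a description of how HIBP stores or hashes addresses. The sketch also strips dots unconditionally, which would only be appropriate for providers that ignore them (such as Gmail).

```python
import hashlib
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class NormalizedAddress:
    canonical: str                   # tag and dots removed
    original: str                    # lowercased input, otherwise untouched
    removed_tag: Optional[str]       # the "+tag" text, or None if absent
    dot_positions: Tuple[int, ...]   # indexes of dots in the local part before the tag
    canonical_hash: str
    exact_hash: str


def _digest(text: str) -> str:
    # SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def normalize(address: str) -> NormalizedAddress:
    original = address.strip().lower()
    local, _, domain = original.rpartition("@")
    name, plus, tag = local.partition("+")
    dot_positions = tuple(i for i, ch in enumerate(name) if ch == ".")
    canonical = f"{name.replace('.', '')}@{domain}"
    return NormalizedAddress(
        canonical=canonical,
        original=original,
        removed_tag=tag if plus else None,
        dot_positions=dot_positions,
        canonical_hash=_digest(canonical),
        exact_hash=_digest(original),
    )
```

A lookup on `exact_hash` would support an "exact match" result, while a lookup on `canonical_hash` would surface the other variations. The recorded tag and dot positions allow the original spelling to be rebuilt when reporting which variant was exposed.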
I would also expand this request to simply being able to search @domain.com
You can use the feature to generate emails on the fly without them having an associated account, similar to the + syntax.
Unfortunately, that would also mean searching potentially hundreds of addresses one at a time to see if they've been compromised.
If this idea gets implemented, please make it work only for "validated" emails. I don't want people to be able to type my email and see every variation of it. Hopefully we get this feature in the future =) Thank you Troy for all your hard work.
It's interesting that this is the most requested feature by far, but the FAQ makes it sound like it's unimportant. If we're all using this website, it's a given that we're more security and privacy aware than others and we will use all tools available to us, such as the plus tag and using different spacing (such as: my-email, my.email, myemail, m.y.e.mail).
It would also be nice to see an example of the international phone number for those of us that are not familiar with that format
I am not sure if this is a duplicate, but here goes... It would be nice if I could provide a base address (like email@example.com) and HIBP reported hits for:
1) any + variant of the base address (first-last+aNYstRing@gmail.com)
2) any valid dot format (i.e., firstname.lastname@example.org and variants)
3) user-supplied dots in the base name, without disabling #2 (i.e., email@example.com)
status on this?
Claudio Brandt commented
(I understand there were similar suggestions, to which the response was to look at https://haveibeenpwned.uservoice.com/forums/275398-general/suggestions/6774229-enable-search-and-notifications-for-email-addresse, which concerns email aliases with '+'. But while aliases are necessarily known to the user who created them, variations with dot can be arbitrarily created by hackers and will be accepted both for email AND login by Gmail)
So the problem is:
Gmail is a ubiquitous email provider.
Gmail accepts dots anywhere in the username.
Gmail ignores dots, so that:
user123 is the same as:
A hacker intent on evading HaveIBeenPwned monitoring could easily add dots to all Gmail addresses before selling and/or leaking a list of emails and passwords. This way, after a major leak is advertised, user123@gmail visiting HIBP may leave with a false sense of security that their password wasn't in the leak, because currently HIBP will only return a match for the exact address(es) input by the user.
But if the hacker added a dot somewhere in the address, the combination username+password would still be available to access the account, while the legit user would not have a clue that their password was compromised.
The solution: for each Gmail address, remove the dots before adding to HIBP's database, so that:
1) user123, user.123, u.ser123 etc will be stored as user123 within HIBP's database;
2) when a user visits HIBP and inputs their Gmail address, any variation caused by dots will be stripped of dots before matching against HIBP's database, resulting in a positive even if a dot variant username was leaked.
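The two steps above can be sketched as a tiny in-memory index that strips dots both when a breach is imported and when a user searches. This is only an illustration of the idea: `BreachIndex` and `strip_gmail_dots` are hypothetical names, the breach name "XYZ" is a placeholder, and a real data store would look nothing like a Python dictionary.

```python
def strip_gmail_dots(address: str) -> str:
    """Remove dots from the local part, but only for gmail.com addresses."""
    local, _, domain = address.strip().lower().rpartition("@")
    if domain == "gmail.com":
        local = local.replace(".", "")
    return f"{local}@{domain}"


class BreachIndex:
    """Hypothetical store: dot-stripped address -> set of breach names."""

    def __init__(self) -> None:
        self._breaches = {}

    def add(self, breach: str, address: str) -> None:
        # Step 1: strip dots before the address is stored.
        self._breaches.setdefault(strip_gmail_dots(address), set()).add(breach)

    def search(self, address: str) -> list:
        # Step 2: strip dots from the user's input before matching.
        return sorted(self._breaches.get(strip_gmail_dots(address), set()))


index = BreachIndex()
index.add("XYZ", "u.ser123@gmail.com")   # leaked with an added dot
print(index.search("user123@gmail.com"))
```

Because both sides go through the same function, a leak published with an added dot is still found when the owner searches for their usual spelling. Non-Gmail addresses are left untouched, since other providers may treat dots as significant.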
A more general syntax would be very helpful. I've been using spamgourmet.com for many years, and many of the addresses are valid for a long time period.
Myself and many of my colleagues ONLY use this aliasing, especially since Microsoft added support for it in Office 365 (G Suite has had it for a while). Please please make this a feature!
PS: we would not expect Adobe users to make use of aliases hahaha.