Enable search and notifications for email addresses using the "+" syntax
A lot of people use a syntax such as name+foo@example.org, where foo is a unique identifier for the site. They do this so that if they begin getting spammed, they can identify the source their address was leaked from.
At the moment, HIBP treats this as a totally unique email address, so if I search for the parent email address without the "+" syntax, it won't be found. This idea is to ensure that searches and notifications recognise the syntax and return addresses that are logically still the same account.
One thing HIBP would also need to do is specify which account alias was in the breach or paste. For example, I would want to know that it was email@example.com that was exposed in the XYZ breach.
Edit: Just to put the value of this into context, I've just run some stats on the Adobe breach. Of the 152,989,508 rows in the dump, only 49,905 email addresses have a "+" in the address, so that's 0.03% of entries. That number is also a bit high as it includes junk entries. I'm definitely not ruling this idea out - it's still planned - I just wanted to give a sense of how useful it would be.
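For anyone who wants to check the percentage, the arithmetic can be reproduced in a few lines of Python. The variable names are only illustrative; the two figures are the ones quoted above.

```python
# Figures quoted above for the Adobe breach dump.
total_rows = 152_989_508
plus_addresses = 49_905

# Share of rows whose address contains a "+", as a percentage.
share = plus_addresses / total_rows * 100
print(f"{share:.2f}% of rows contain a '+'")  # prints "0.03% of rows contain a '+'"
```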
Edit: To add to this idea, Robert's comment about a period in the email is also very valid. I'd want to be very clear about the ubiquity of this practice across mail providers, but it's certainly a good suggestion and worth further investigation.
I did an inventory of my gmail "+" and "." variants. I'm currently using 67 variants of my email address.
Joe Kirwin commented
Lead with an example:
Search for firstname.lastname@example.org and (if this was a real account) email@example.com and you'd get different results. Yet for all intents and purposes it's all your data that had been leaked as they are the same account.
Would it be possible to define some canonicalizers for very popular email providers that remove superfluous things such as periods from the address both in the breach corpus and when searching?
I realize that there could be some user that likes to leverage things like
pwned.hibp+salesConference@ to track down which party exposed some breach, but I feel like that use case is not as broadly useful as giving people a full breach list.
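To make the "canonicalizers for popular providers" idea concrete, here is a minimal Python sketch. It is not how HIBP works; the registry `CANONICALIZERS`, the function names, and the example addresses are all hypothetical. It assumes Gmail ignores both dots and "+tag" suffixes, leaves every unlisted provider untouched, and lower-cases the whole address, which is itself an assumption.

```python
from typing import Callable, Dict


def strip_plus_tag(local: str) -> str:
    """Drop everything from the first '+' onwards."""
    return local.split("+", 1)[0]


def gmail_local(local: str) -> str:
    """Assumed Gmail rule: ignore the '+tag' and any dots."""
    return strip_plus_tag(local).replace(".", "")


# Hypothetical registry: domain -> rule applied to the local part.
# Providers not listed here are left alone.
CANONICALIZERS: Dict[str, Callable[[str], str]] = {
    "gmail.com": gmail_local,
}


def canonicalize(address: str) -> str:
    """Return the canonical form used for both storing and searching."""
    if "@" not in address:
        raise ValueError(f"not an email address: {address!r}")
    local, _, domain = address.strip().lower().rpartition("@")
    rule = CANONICALIZERS.get(domain, lambda s: s)
    return f"{rule(local)}@{domain}"


print(canonicalize("Jane.Doe+shop@gmail.com"))    # janedoe@gmail.com
print(canonicalize("jane.doe+shop@example.org"))  # jane.doe+shop@example.org
```

Keeping the rules per domain means a provider is only normalised once its behaviour is known, which addresses the concern about how widespread each practice is across mail providers.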
+1 for a needed feature. Though using "+..." and/or additional dot(s) is a small (but growing) percentage of email usage, breaches involving those accounts may be a disproportionately higher risk to security-savvy folks. They are the most likely to use these techniques but may also be juicier targets.
CB has an excellent solution for this below. Cleaning up the input addresses on entry into the HIBP database, before hashing, would handle most variations. With the addition of recording which +tags were removed and the positions of periods, the data would be comprehensive with very little compromise.
It may make sense to hash both versions. E.g. "HIBP found an exact match" and "the following variations were found in breach database" are both useful.
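A rough Python sketch of "hash both versions" might look like the following. Everything here is an assumption for illustration: the `BreachIndex` class, the use of SHA-256, and the simplified `canonical` rule (lower-case, drop any "+tag") are not taken from HIBP. Original addresses are kept alongside the canonical hash so the removed tag can still be reported.

```python
import hashlib
from typing import Dict, List, Set, Tuple


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def canonical(address: str) -> str:
    """Assumed rule: lower-case and drop any '+tag' from the local part."""
    local, _, domain = address.lower().partition("@")
    return f"{local.split('+', 1)[0]}@{domain}"


class BreachIndex:
    """Hypothetical store keeping an exact hash and a canonical hash per address."""

    def __init__(self) -> None:
        self.exact: Set[str] = set()
        # canonical hash -> original addresses seen in the breach data
        self.by_canonical: Dict[str, Set[str]] = {}

    def add(self, address: str) -> None:
        address = address.lower()
        self.exact.add(sha256_hex(address))
        key = sha256_hex(canonical(address))
        self.by_canonical.setdefault(key, set()).add(address)

    def lookup(self, address: str) -> Tuple[bool, List[str]]:
        """Return (exact match found?, other variations of the same account)."""
        address = address.lower()
        exact = sha256_hex(address) in self.exact
        related = self.by_canonical.get(sha256_hex(canonical(address)), set())
        return exact, sorted(related - {address})


index = BreachIndex()
index.add("jane+shop@example.org")
print(index.lookup("jane@example.org"))       # (False, ['jane+shop@example.org'])
print(index.lookup("jane+shop@example.org"))  # (True, [])
```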
I would also expand this request to simply being able to search @domain.com
You can use the feature to generate emails on the fly without them having an associated account, similar to the + syntax.
Unfortunately, that would also mean searching potentially hundreds of addresses one at a time to see if they've been compromised.
If this idea gets implemented, please make it work only for "validated" emails. I don't want people to be able to type my email and see every variation of it. Hopefully we get this feature in the future =) Thank you Troy for all your hard work.
It's interesting that this is the most requested feature by far, but the FAQ makes it sound like it's unimportant. If we're all using this website, it's a given that we're more security and privacy aware than others and we will use all tools available to us, such as the plus tag and using different spacing (such as: my-email, my.email, myemail, m.y.e.mail).
It would also be nice to see an example of the international phone number for those of us that are not familiar with that format
I am not sure if this is a duplicate, but here goes... It would be nice if I could provide a base address (like firstname.lastname@example.org) and HIBP reported hits for:
1) any + variant of the base address (first-last+aNYstRing@gmail.com)
2) any valid dot format (i.e., email@example.com and variants)
3) can handle user-supplied dots in the base name without disabling #2 (i.e., firstname.lastname@example.org)
status on this?
Claudio Brandt commented
(I understand there were similar suggestions, to which the response was to look at https://haveibeenpwned.uservoice.com/forums/275398-general/suggestions/6774229-enable-search-and-notifications-for-email-addresse, which concerns email aliases with '+'. But while aliases are necessarily known to the user who created them, variations with dot can be arbitrarily created by hackers and will be accepted both for email AND login by Gmail)
So the problem is:
Gmail is a ubiquitous email provider.
Gmail accepts dots anywhere in the username.
Gmail ignores dots, so that:
user123 is the same as:
A hacker intent on evading HaveIBeenPwned monitoring could easily add dots to all Gmail addresses before selling and/or leaking a list of emails and passwords. This way, after a major leak is advertised, user123@gmail visiting HIBP may leave with a false sense of security that their password wasn't in the leak, because currently HIBP will only return a match for the exact address(es) input by the user.
But if the hacker added a dot somewhere in the address, the combination username+password would still be available to access the account, while the legit user would not have a clue that their password was compromised.
The solution: for each Gmail address, remove the dots before adding to HIBP's database, so that:
1) user123, user.123, u.ser123 etc will be stored as user123 within HIBP's database;
2) when a user visits HIBP and inputs their Gmail address, any variation caused by dots will be stripped of dots before matching against HIBP's database, resulting in a positive even if a dot variant username was leaked.
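The two steps above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the `GMAIL_DOMAINS` set, the function names, and the in-memory `breached` set are hypothetical stand-ins for whatever HIBP actually uses, and the sample addresses are made up.

```python
# Assumption: only addresses at this domain get the dot-stripping rule.
GMAIL_DOMAINS = {"gmail.com"}


def strip_gmail_dots(address: str) -> str:
    """Remove dots from the username of Gmail addresses; leave others alone."""
    local, sep, domain = address.strip().lower().rpartition("@")
    if sep and domain in GMAIL_DOMAINS:
        local = local.replace(".", "")
    return f"{local}{sep}{domain}"


# Step 1: normalise every leaked address before storing it.
breached = set()
for leaked in ["u.ser123@gmail.com", "user.123@gmail.com", "a.b@example.org"]:
    breached.add(strip_gmail_dots(leaked))


# Step 2: normalise the searched address the same way before matching.
def is_pwned(address: str) -> bool:
    return strip_gmail_dots(address) in breached


print(is_pwned("user123@gmail.com"))  # True: dot variants were leaked
print(is_pwned("ab@example.org"))     # False: dots still matter for other domains
```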
A more general syntax would be very helpful. I've been using spamgourmet.com for many years, and many of the addresses are valid for a long time period.
Myself and many of my colleagues ONLY use this aliasing, especially since Microsoft added support for it in Office 365 (G Suite has had it for a while). Please please make this a feature!
PS: we would not expect Adobe users to make use of aliases hahaha.
The tides might be changing https://docs.microsoft.com/en-us/exchange/recipients-in-exchange-online/plus-addressing-in-exchange-online and in the spirit of catching leaks (however obfuscated), not supporting https://en.wikipedia.org/wiki/Email_address#Subaddressing seems a bit unconstructive in supporting adoption.
Does / did Adobe allow subaddressing? (in all of its registration forms? typical hindrance)
Joe Weeds commented
plz plz plz do this... i use the + syntax to keep my logins unique and easily rememberable - which I think we can all agree is what we want most users doing as part of good digital hygiene. here you will be rewarding those taking the appropriate steps so that breaches do not spread further.
Joe Weeds commented
including a simple wildcard search where email@example.com would be really useful. even if you didn't enumerate every variant of firstname.lastname@example.org in the results but were at least able to link them in the search query. alternatively linking the email addresses (call them the parents) to the children (+something variants) would be fairly fast given that they represent 0.03% of the records and this can be done after the fact as a small post processing routine.
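As a rough illustration of the parent/child linking as a post-processing pass, here is a small Python sketch. The function name `link_children` and the sample addresses are hypothetical, and it assumes the parent is simply the address with everything from the first "+" in the local part removed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def link_children(addresses: Iterable[str]) -> Dict[str, List[str]]:
    """Map each parent address to the '+something' variants found in the data."""
    children = defaultdict(list)
    for address in addresses:
        local, _, domain = address.partition("@")
        base = local.split("+", 1)[0]
        # Skip addresses without a '+' and junk entries with no base or domain.
        if "+" not in local or not base or not domain:
            continue
        children[f"{base}@{domain}"].append(address)
    return dict(children)


records = [
    "joe+adobe@example.org",
    "joe+shop@example.org",
    "joe@example.org",
    "ann@example.org",
]
print(link_children(records))
# {'joe@example.org': ['joe+adobe@example.org', 'joe+shop@example.org']}
```

Because only addresses containing a "+" are touched, a pass like this would only need to process the small fraction of records mentioned above.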
fully agreed, please do it!
and the stats for + sign can be like that because probably tons of people are not aware that it is possible - but most likely they are also not aware of HIBP, password managers and other stuff, and Adobe is quite popular among people with lower IT and security awareness.