C# – Email Regular Expression
I wrote a regex for email that gets the best results of any I have found online. Along with getting better results, it is shorter too.
Download the C# project with unit tests here: EmailRegEx on GitHub
The pattern of an email is described as follows:
- It will always have a single @ sign
- 1 to 64 characters before the @ sign, called the local-part. It can contain the characters a-z, A-Z, 0-9, and ! # $ % & ' * + - / = ? ^ _ ` { | } ~, plus . as long as the period is not the first or last character of the local-part and does not appear twice in a row.
- Some characters after the @ sign, called the domain, with the following pattern:
  - It will always have a period ".".
  - One or more characters before the period.
  - Two to four characters after the period.
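The rules above can also be checked without a regex, which makes each rule easier to see on its own. The following is only a sketch: the class and method names are made up for illustration, and it does not validate which characters the domain may contain.

```csharp
using System;

static class EmailShape
{
    // Special characters the list above allows in the local-part, besides letters, digits, and the period.
    private const string LocalSpecials = "!#$%&'*+-/=?^_`{|}~";

    // Hypothetical helper: checks only the structural rules from the list above.
    public static bool HasBasicShape(string email)
    {
        if (string.IsNullOrEmpty(email)) return false;

        // Exactly one @ sign.
        int at = email.IndexOf('@');
        if (at < 0 || at != email.LastIndexOf('@')) return false;

        string local = email.Substring(0, at);
        string domain = email.Substring(at + 1);

        // Local-part: 1 to 64 characters, no leading, trailing, or doubled period.
        if (local.Length < 1 || local.Length > 64) return false;
        if (local[0] == '.' || local[local.Length - 1] == '.' || local.Contains("..")) return false;

        foreach (char c in local)
        {
            bool isLetterOrDigit = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
            if (!isLetterOrDigit && c != '.' && LocalSpecials.IndexOf(c) < 0) return false;
        }

        // Domain: at least one character before the last period and two to four after it.
        int dot = domain.LastIndexOf('.');
        if (dot < 1) return false;
        int tldLength = domain.Length - dot - 1;
        return tldLength >= 2 && tldLength <= 4;
    }
}
```

For example, `EmailShape.HasBasicShape("joe@home.org")` returns true, while `".joe@bob.com"` and `"joe@his.home.place"` return false.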
So simple patterns for an email address look something like these:
- This one just makes sure there are characters before and after the @:
  .+@.+
- This one makes sure there are characters before and after the @, as well as a character before and after the . in the domain:
  .+@.+\..+
- This one makes sure that there is only one @ symbol:
  ^[^@]+@[^@]+\.[^@]+$
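To see how the three simple patterns differ, here is a small sketch that runs each one through `Regex.IsMatch`. The class and constant names are made up for illustration. Two assumptions: the second pattern is written as `.+@.+\..+`, and the third is written with `^` and `$` anchors plus a trailing `[^@]+`, since without the anchors it can match a part of a string that contains two @ signs.

```csharp
using System.Text.RegularExpressions;

static class SimpleEmailPatterns
{
    // Characters before and after the @.
    public const string HasAt = @".+@.+";

    // Characters around the @ and around a period in the domain.
    public const string HasAtAndDot = @".+@.+\..+";

    // Anchored variant (an assumption about the intent): exactly one @ in the whole string.
    public const string SingleAt = @"^[^@]+@[^@]+\.[^@]+$";

    public static bool Check(string pattern, string email)
    {
        return Regex.IsMatch(email, pattern);
    }
}
```

With these, `joe@home` only passes `HasAt`, `joe@home.com` passes all three, and `joe@bob@home.com` passes the first two but not `SingleAt`.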
These are all quick and easy examples. They will not work in every instance, but they are usually accurate enough for casual programs.
But a comprehensive example is much more complex.
- I wrote one myself that is the shortest and gets the best results of any I have found:
^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*@((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z
- Here is another complex one I found: [reference]
^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$
So let me explain the first one that I wrote as it passes my unit tests below:
^ | The start of the string. |
[\w!#$%&'*+\-/=?\^_`{|}~]+ | At least one valid local-part character, not including a period. |
(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)* | Any number (including zero) of a group that starts with a single period and has at least one valid local-part character after the period. |
@ | The @ character. |
( | Start group 1. |
( | Start group 2. |
([\-\w]+\.)+ | At least one group of at least one valid word character or hyphen followed by a period. The attached project has a more complex hostname regex option too. |
[a-zA-Z]{2,4} | Any two to four valid top-level domain characters. |
) | End group 2. |
| | An OR between group 2 and group 3. |
( | Start group 3. |
([0-9]{1,3}\.){3}[0-9]{1,3} | A regular expression for an IP address. The attached project has a more complex IP regex example too. |
) | End group 3. |
) | End group 1. |
\z | The very end of the string. Unlike $, it does not allow a trailing \n. |
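One way to keep the breakdown above readable in code is to assemble the pattern from named pieces. This is only a sketch: the class and constant names are made up for illustration, and the assembled string is meant to be identical to the single-line pattern shown earlier.

```csharp
using System.Text.RegularExpressions;

static class EmailPatternParts
{
    // At least one valid local-part character, not including a period.
    private const string LocalChars = @"[\w!#$%&'*+\-/=?\^_`{|}~]+";

    // A run of local-part characters, optionally followed by period-separated runs.
    private const string LocalPart = LocalChars + @"(\." + LocalChars + ")*";

    // Labels of word characters or hyphens, then a two to four letter top-level domain.
    private const string HostName = @"([\-\w]+\.)+[a-zA-Z]{2,4}";

    // Four period-separated groups of one to three digits.
    private const string IpAddress = @"([0-9]{1,3}\.){3}[0-9]{1,3}";

    // Group 1 wraps the OR of the host name (group 2) and the IP address (group 3).
    public const string Pattern = "^" + LocalPart + "@((" + HostName + ")|(" + IpAddress + @"))\z";

    public static bool IsMatch(string email)
    {
        return Regex.IsMatch(email, Pattern);
    }
}
```

Because every piece is a `const`, the concatenation happens at compile time, so there is no runtime cost compared with writing the pattern as a single string.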
Code for the Email Regular Expression
Here is code for both examples. My email regular expression is enabled and the one I found online is commented out. To see how they work differently, just comment out mine and uncomment the one I found online.
```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

namespace RegularExpressionsTest
{
    class Program
    {
        static void Main(string[] args)
        {
            String theEmailPattern = @"^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*"
                                   + "@"
                                   + @"((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

            // The pattern I found online does not work in all instances.
            //String theEmailPattern = @"^(([^<>()[\]\\.,;:\s@\""]+(\.[^<>()[\]\\.,;:\s@\""]+)*)|(\"".+\""))"
            //                       + "@"
            //                       + @"((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])"
            //                       + "|"
            //                       + @"(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$";

            Console.WriteLine("Bad emails");
            foreach (String email in GetBadEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }

            Console.WriteLine("Good emails");
            foreach (String email in GetGoodEmails())
            {
                Log(Regex.IsMatch(email, theEmailPattern));
            }
        }

        private static void Log(bool inValue)
        {
            if (inValue)
            {
                Console.WriteLine("It matches the pattern");
            }
            else
            {
                Console.WriteLine("It doesn't match the pattern");
            }
        }

        private static List<String> GetBadEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe"); // should fail
            emails.Add("joe@home"); // should fail
            emails.Add("a@b.c"); // should fail because .c is only one character but must be 2-4 characters
            emails.Add("joe-bob[at]home.com"); // should fail because [at] is not valid
            emails.Add("joe@his.home.place"); // should fail because place is 5 characters but must be 2-4 characters
            emails.Add("joe.@bob.com"); // should fail because there is a dot at the end of the local-part
            emails.Add(".joe@bob.com"); // should fail because there is a dot at the beginning of the local-part
            emails.Add("john..doe@bob.com"); // should fail because there are two dots in the local-part
            emails.Add("john.doe@bob..com"); // should fail because there are two dots in the domain
            emails.Add("joe<>bob@bob.com"); // should fail because <> are not valid
            emails.Add("joe@his.home.com."); // should fail because it can't end with a period
            // Note: the simple [\-\w]+ hostname part above accepts the next two addresses;
            // rejecting them needs a stricter hostname pattern.
            emails.Add("john.doe@bob-.com"); // should fail because there is a dash at the end of a domain part
            emails.Add("john.doe@-bob.com"); // should fail because there is a dash at the start of a domain part
            emails.Add("a@10.1.100.1a");  // should fail because of the extra character
            emails.Add("joe<>bob@bob.com\n"); // should fail because it ends with \n
            emails.Add("joe<>bob@bob.com\r"); // should fail because it ends with \r
            return emails;
        }

        private static List<String> GetGoodEmails()
        {
            List<String> emails = new List<String>();
            emails.Add("joe@home.org");
            emails.Add("joe@joebob.name");
            emails.Add("joe&bob@bob.com");
            emails.Add("~joe@bob.com");
            emails.Add("joe$@bob.com");
            emails.Add("joe+bob@bob.com");
            emails.Add("o'reilly@there.com");
            emails.Add("joe@home.com");
            emails.Add("joe.bob@home.com");
            emails.Add("joe@his.home.com");
            emails.Add("a@abc.org");
            emails.Add("a@abc-xyz.org");
            emails.Add("a@192.168.0.1");
            emails.Add("a@10.1.100.1");
            return emails;
        }
    }
}
```
Well, now you have the best C# Email Regular Expression out there.
Update: My attached project has an even better and more accurate one now too.
(Reference: wikipedia)
joe@home is a valid email, not invalid, could you fix it?
See this question on Stack Overflow: https://stackoverflow.com/questions/21810464/why-is-this-angular-email-validation-valid
What is your use case? While, yes, joe@home is RFC valid, it is a special case that we don't want to allow to pass. Having this fail is desired.
While joe@home may be valid in a highly localized environment, it is invalid 99.99999% of the time on the internet.
On the internet, an email without a TLD is a bad email. Internet and remote environments require domain registration and DNS, which means the domain must have a format like {domain}.{tld}.
You want me to support the 0.00001% instead of the 99.99999%.
However, you are more than welcome to update the regex yourself. You can change this part:
(([\-\w]+\.)+[a-zA-Z]{2,4})
To something like this (untested):
([\-\w]+((\.[\-\w]+)*\.[a-zA-Z]{2,4})?)
public static string ComplexEmailPattern4 = "...";
I would add "readonly" there.
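As a sketch of what that suggestion looks like, here is a `static readonly` pattern field. The class and field names are hypothetical and the value is the first pattern from the article, not the project's actual `ComplexEmailPattern4`. With `readonly`, the field can only be assigned in its initializer or in a static constructor, so other code cannot replace the pattern by accident.

```csharp
using System.Text.RegularExpressions;

public static class EmailPatternExamples
{
    // Hypothetical field; readonly prevents reassignment after type initialization.
    public static readonly string ExampleEmailPattern =
        @"^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*"
        + @"@((([\-\w]+\.)+[a-zA-Z]{2,4})|(([0-9]{1,3}\.){3}[0-9]{1,3}))\z";

    // A shared Regex instance built once from the pattern above.
    public static readonly Regex ExampleEmailRegex = new Regex(ExampleEmailPattern);
}
```

A `const` would also work for a plain string, but `static readonly` is needed for the `Regex` instance, which cannot be a compile-time constant.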
Hi
It does not work when I enter xxx@xxx.xxxxxxxxxxxxxxx.
When the part after the last period exceeds 4 characters, it does not match.
Can someone fix this?
It used to be that nearly all TLDs were two to four characters. However, that has recently changed.
Here is a list of valid TLDs
http://data.iana.org/TLD/tlds-alpha-by-domain.txt
Using regex to validate against this list would be a bad idea. Here is a new regex that is the most complete and passes all RFC rules.
^[\w!#$%&'*+\-/=?\^_`{|}~]+(\.[\w!#$%&'*+\-/=?\^_`{|}~]+)*@((([\w]+([-\w]*[\w]+)*\.)+[a-zA-Z]+)|((([01]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])\.){3}([01]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])))\z
Then after that, I would add a follow-up check where your code verifies that the email ends with a TLD contained in this list: http://data.iana.org/TLD/tlds-alpha-by-domain.txt
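A minimal sketch of such a follow-up check is below. The class and method names are hypothetical, and it assumes the lines of the IANA file have already been read into memory (the file lists one upper-case TLD per line after a leading `#` comment line). Note that an address that uses an IP address instead of a host name would fail this check, so it would need to be skipped for those.

```csharp
using System;
using System.Collections.Generic;

static class TldCheck
{
    // Builds a case-insensitive set from the lines of the TLD file,
    // skipping blank lines and lines that start with '#'.
    public static HashSet<string> ParseTldList(IEnumerable<string> lines)
    {
        var tlds = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string line in lines)
        {
            string entry = line.Trim();
            if (entry.Length > 0 && entry[0] != '#')
            {
                tlds.Add(entry);
            }
        }
        return tlds;
    }

    // Returns true when the text after the last period is in the set.
    public static bool EndsWithKnownTld(string email, HashSet<string> tlds)
    {
        int dot = email.LastIndexOf('.');
        if (dot < 0 || dot == email.Length - 1) return false;
        return tlds.Contains(email.Substring(dot + 1));
    }
}
```

In use, the regex runs first and `EndsWithKnownTld` runs only on addresses that already matched, for example with a set built from `File.ReadLines` on a downloaded copy of the list.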
Tested. It works fine, with only one issue:
the regular expression does not match test@-test.com or test@test-.com.
Are you sure? Neither test-.com nor -test.com is valid. I have specific unit tests for both of those being invalid and the unit tests are passing.
From RFC 952
The last character must not be a minus sign or period.
The first character must be an alpha character.
From RFC 1123
One aspect of host name syntax is hereby changed: the restriction on the first character is relaxed to allow either a letter or a digit.
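Taken together, those two RFC excerpts say that a host name label must start with a letter or digit and must not end with a minus sign. A rough sketch of that rule for a single label is below. The class and member names are made up for illustration, and the pattern only allows ASCII letters, digits, and hyphens, which is an assumption rather than something the excerpts above spell out.

```csharp
using System.Text.RegularExpressions;

static class HostLabel
{
    // First and last characters must be a letter or digit; hyphens are allowed only in between.
    public const string LabelPattern = @"^[a-zA-Z0-9]([a-zA-Z0-9\-]*[a-zA-Z0-9])?\z";

    public static bool IsValidLabel(string label)
    {
        return Regex.IsMatch(label, LabelPattern);
    }
}
```

With this, `test` and `abc-xyz` are accepted, while `-test` and `test-` are rejected.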
If a newline character is at the end of the address, it passes as valid even against the most complex regex, when I think it should not. For example: "test@test.com\n"
Interesting...
A trailing \r is rejected as expected.
A trailing \r\n is rejected as expected.
A trailing \n is not rejected.
I am reading about it here:
http://stackoverflow.com/questions/988951/net-regex-and-newline
Looks like if I replace the very last $ with \z (lowercase z) it works.
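The difference comes from how .NET treats the two anchors: `$` matches at the end of the string or just before a final `\n`, while `\z` matches only at the very end. The sketch below shows this with a deliberately simplified pattern. The class and method names are made up for illustration, and the pattern is not the full email regex.

```csharp
using System.Text.RegularExpressions;

static class EndAnchors
{
    // Ends with $: a single trailing \n is still accepted.
    public static bool MatchesWithDollar(string s)
    {
        return Regex.IsMatch(s, @"^[^@\s]+@[^@\s]+$");
    }

    // Ends with \z: nothing may follow the last matched character.
    public static bool MatchesWithSmallZ(string s)
    {
        return Regex.IsMatch(s, @"^[^@\s]+@[^@\s]+\z");
    }
}
```

Both methods accept `"test@test.com"`, but only `MatchesWithDollar` accepts `"test@test.com\n"`. Both reject a trailing `\r` or `\r\n`.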
Thank you, the shortest and still accurate one I have come across so far.
I added a project with Unit tests and fixed a bug or two.
In the project, in the constructor of the EmailValidator object, just set the Pattern to the desired static email regular expression.