Regular expression for e-mail validation

32

I am trying to create a regular expression to validate any e-mail address. I wrote the expression below, but it is not working as expected:

var parse_email = /^[a-z0-9.]+@[a-z0-9]+\.[a-z]+\.([a-z]+)?$/i;

What I expected from each part:

  • [a-z0-9.]+ - the part of the email before the @, the user name;
  • @ - the mandatory at sign;
  • [a-z0-9]+ - the part of the email after the @, the provider name;
  • \. - the dot after the provider name;
  • [a-z]+ - usually where the com goes;
  • \. - the dot after the .com; it should be mandatory only when there is, for example, a .br or another country abbreviation at the end of the email;
  • ([a-z]+)? - usually where the country abbreviation goes.

How I tested the expression:

var espacos = '                           ';
var parse_email = /^[a-z0-9.]+@[a-z0-9]+\.[a-z]+\.[a-z]?$/i;
console.log("[email protected]" + espacos.substring("[email protected]".length) + parse_email.test("[email protected]"));
console.log("[email protected]" + espacos.substring("[email protected]".length) + parse_email.test("[email protected]"));
console.log("[email protected]" + espacos.substring("[email protected]".length) + parse_email.test("[email protected]"));
console.log("foo.bar@gmail." + espacos.substring("foo.bar@gmail.".length) + parse_email.test("foo.bar@gmail."));
console.log("foo.bar@gmailcom" + espacos.substring("foo.bar@gmailcom".length) + parse_email.test("foo.bar@gmailcom"));
console.log("foo.bargmail.com" + espacos.substring("foo.bargmail.com".length) + parse_email.test("foo.bargmail.com"));
console.log("@gmail.com" + espacos.substring("@gmail.com".length) + parse_email.test("@gmail.com"));
  • 7

    Suggestion: validate only the basics, because few things annoy an end user more than trying to register with a valid email and being rejected (I myself am in the habit of calling the company's marketing department and getting the programmer in trouble. It may seem a bit extreme, but I think a lot about the laypeople who don't understand what is happening). Some characters, an at sign, more characters, at least one dot after that, and 2 or more characters? Good enough. Today there are domains with accented characters, some providers allow dots in the name, some allow underscores, and some allow the plus sign. Better not to complicate things (or use an up-to-date and complete RFC implementation).

7 answers

36


First you need to accept that you will not be able to process all possible emails. Their specification is long and complicated. For example, here is a regex that accepts all emails and nothing else: http://ex-parrot.com/~pdw/Mail-RFC822-Address.html

With that in mind, it is possible to write a regex that handles most cases.

/^[a-z0-9.]+@[a-z0-9]+\.[a-z]+\.([a-z]+)?$/i

(regexplained)

Your only mistake here was leaving the last \. outside the parentheses. Moving it inside, so that the ending becomes (\.[a-z]+)?, I get this result:

[email protected]       true
[email protected]    true
[email protected] false
foo.bar@gmail.          false
foo.bar@gmailcom        false
foo.bargmail.com        false
@gmail.com                 false

I believe this is what you are looking for. But it will still fail in several other cases, for example if the email includes an underscore or a +, or if the domain has many parts (like some of the government ones, [email protected]; I've had trouble with those).
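As a minimal sketch of that fix, the snippet below moves the last \. inside the optional group and runs the pattern against a few made-up addresses. The example.com addresses are placeholders chosen for illustration, not the ones from the question.

```javascript
// Corrected pattern: the final dot now lives inside the optional group.
const parseEmail = /^[a-z0-9.]+@[a-z0-9]+\.[a-z]+(\.[a-z]+)?$/i;

// Hypothetical sample inputs, chosen only for illustration.
const samples = [
  "foo.bar@example.com",
  "foo.bar@example.com.br",
  "foo.bar@example.",
  "foo.bar@examplecom",
  "foo.barexample.com",
  "@example.com",
  "foo_bar@example.com", // underscore is not in the character class
];

// Pair each input with the result of the test.
const results = samples.map((s) => [s, parseEmail.test(s)]);
for (const [s, ok] of results) {
  console.log(s.padEnd(28) + ok);
}
```

Only the first two inputs pass; the last one shows the underscore limitation mentioned above.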

A more complete suggestion comes from the W3C HTML5 specification:

The following JavaScript- and Perl-compatible regular expression is an implementation of the above definition.

/^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/

(regexplained)

Remember that client-side validation should not be considered reliable, especially in JavaScript, because the user can change the code and bypass it. There should always be server-side validation, usually by sending a confirmation email. Client-side validation is normally used only to improve the user experience, showing what is wrong without the delay of waiting for a server response.
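A rough sketch of how the expression above might be used purely for user feedback. The helper name emailHint and its messages are assumptions made for this example; the real check still has to happen on the server.

```javascript
// Pattern quoted above from the HTML5 specification.
const html5Email = /^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$/;

// emailHint is a hypothetical helper: it returns a message for the user,
// or null when the address looks plausible.
function emailHint(value) {
  const trimmed = value.trim();
  if (trimmed === "") {
    return "Please enter an email address.";
  }
  if (!html5Email.test(trimmed)) {
    return "This does not look like an email address.";
  }
  return null; // plausible; the server still has to confirm it
}

console.log(emailHint("foo_bar+tag@example.com"));
console.log(emailHint("foo@example.com x"));
```

Returning a message instead of a boolean keeps the function focused on what it is actually for: telling the user what to fix before the form is submitted.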

  • 3

    +1 I was going to post the same answer, but you were faster... :) In this case, I leave here only one additional reference, with more regex options and an extensive test of its false positives and negatives.

  • 2

    I suggest putting a /g at the end of the expression to prevent any text from being injected at the end of the string. Example: "[email protected] xyz" is accepted without the g.

  • @utluiz Don't the ^ and $ already take care of that? I tested with "[email protected]" and "[email protected] x"; the first passes while the second fails. I did not understand.

  • You're right, William. They should. But I tested the suggested expression (the second one) and it accepted an email with a space. See here. It did not work even with the g. There's something wrong with that expression.

  • @utluiz It looks like you changed the text being shown but not the string used in regex.test(). I tested it here, works smoothly.

  • It’s true, I forgot to change in other places. Just one more thing: zignd.igor@gmailcom should be valid?

  • Well... I’ve never seen anything like this, but the link from @mgibsonbr claims that test@example and first.last@com are valid. Then this should be also.

  • This must have something to do with the recent change in top-level domains (TLDs). It used to be a short list, and at least one subdomain was required, but now anything is accepted (for whoever pays the price).

  • 3

    Emails with more than 2 dots after the at sign are perfectly valid, and very common in organizations with subdomains, especially academic ones. Example taken from some contacts of mine: [email protected]

  • 1

    +1 for the regexplained site.

  • I recommend using the regex from http://ex-parrot.com/~pdw/Mail-RFC822-Address.html, since the simplification will leave out valid emails that are currently in use, including emails whose domains or user names contain hyphens (-).

  • @motobói The expression suggested by W3C does cover addresses containing hyphens. Besides, there is no need to include such a large expression for cases that are practically never used. The W3C expression rejects emails that contain quotes or comments, or whose domain is an IPv6 address. Anyone who uses an email like that doesn't really expect it to pass validators.

13

Difficulties in Validating Emails

Validating emails with regex, especially in JavaScript, can be a double-edged sword: most expressions commonly used on websites around the world reject some valid emails and accept some invalid ones.

On the other hand, it is important to understand that it is neither necessary nor recommended to validate an email very rigidly. After all, we only know whether it is truly valid when we send a message to the address in question.

An interesting discussion about email validation can be found at this link.

Validation with Regex

A regular expression that properly validates an email according to the official definition of emails can be found at this link. However, it is very complex and probably not supported by JavaScript engines.

A simplification can be found in the aforementioned article (first link), which would be:

/[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?/gi

See the jsfiddle.

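Note the /gi suffix at the end. The i makes the match case-insensitive, so emails with uppercase and lowercase letters are accepted. The g does not make the expression check the string as a whole; that requires the ^ and $ anchors. As written, without anchors, an input such as [email protected] [email protected] is accepted by test(), and the g additionally makes test() stateful, because each call resumes from the regex's lastIndex.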
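A small sketch of how the anchors and the g flag behave with test(). The pattern is deliberately simplified and the addresses are made up; both are assumptions for illustration only.

```javascript
const input = "foo@example.com bar@example.org";

// Without anchors, test() succeeds as soon as any substring matches.
const unanchored = /[a-z0-9.]+@[a-z0-9.]+/i;
const unanchoredResult = unanchored.test(input); // true

// With ^ and $, the whole string has to match.
const anchored = /^[a-z0-9.]+@[a-z0-9.]+$/i;
const anchoredResult = anchored.test(input); // false

// The g flag does not anchor anything; it makes test() resume from lastIndex.
const globalRe = /[a-z0-9.]+@[a-z0-9.]+/gi;
const firstCall = globalRe.test("foo@example.com");  // true, lastIndex moves to 15
const secondCall = globalRe.test("foo@example.com"); // false, the search started at index 15

console.log(unanchoredResult, anchoredResult, firstCall, secondCall);
```

For a validator that is called repeatedly with test(), anchoring the pattern and dropping the g flag avoids both surprises.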

Adjusting the Question's Regular Expression

As for the original expression of the question, a small adjustment would make it work:

/^[a-z0-9.]+@[a-z0-9]+\.[a-z]+(\.[a-z]+)?$/i

Note that the last opening parenthesis now comes before the last dot (\.), so the dot is optional as well. On the other hand, this expression will produce many false negatives, rejecting valid emails. The previous expression is more appropriate.

See the jsfiddle.

Conclusion

The regular expression proposed in the question can be easily corrected, but it will prevent valid emails from being accepted. The other expression quoted above is the recommended one. Although it accepts emails like [email protected], they are, in theory, perfectly valid.

  • 2

    I suggest replacing the ? in the last part by *, to support an unlimited number of sub-domains: (\.[a-z]+)* Fewer false negatives... (or better yet, add the \.[a-z]+ previous to it, resulting in (\.[a-z]+)+)

  • 1

    @mgibsonbr I agree, but if you are referring to the "Adjusting the Regular Expression of the Question" section, this was just to make the expression work in the test. The above recommended expression already accepts unlimited subdomains.
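The suggestion in the comments above can be sketched as follows, assuming the question's character classes are kept as they are and using made-up addresses:

```javascript
// Question's pattern with the commenter's change: one or more ".label" groups
// after the provider name, instead of one mandatory and one optional group.
const withSubdomains = /^[a-z0-9.]+@[a-z0-9]+(\.[a-z]+)+$/i;

// Hypothetical inputs for illustration.
const twoLabels = withSubdomains.test("foo@example.com");            // true
const fourLabels = withSubdomains.test("foo@mail.dept.example.edu"); // true
const noDot = withSubdomains.test("foo@example");                    // false

console.log(twoLabels, fourLabels, noDot);
```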

5

Two expressions that I use without problem are:

"^[_a-z0-9-]+(\.[_a-z0-9-]+)*@[a-z0-9-]+(\.[a-z0-9-]+).(\.[a-z]{2,3})$"

"^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+.[a-zA-Z0-9-.]+$"

Thus:

 <script type="text/javascript">
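function validateEmail(email)
{
 // Word characters joined by - + . or ' before the @, labels joined by - or . after it
 var reg = /^\w+([-+.']\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$/;
 // test() already returns a boolean, so no if/else is needed
 return reg.test(email);
}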
</script> 

However, I believe it is no longer worthwhile to use these validations, because they fail with the new domains and extensions. I advise you to take a look at the article by Douglas Lovell, which is written for PHP but converts easily to JS.

3

Create a function named, for example, validEmail. It should receive the email as a parameter and check whether it is a valid email.

The user name (before the at sign) can contain any alphanumeric character, plus the underscore, the "+" sign, and the dot;

After the at sign, the domain can contain only alphanumeric characters and the underscore;

For the extension, the domain must be followed by a dot and at least 2 alphanumeric characters;

The final part of the domain is optional, but if it exists, it must start with a dot, followed by exactly 2 alphanumeric characters.

Following the explanation, here is the JavaScript code of the function with the regex:

function validEmail(email){
    return /^[\w+.]+@\w+\.\w{2,}(?:\.\w{2})?$/.test(email)
}

This function will return true or false
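For illustration, here are a few calls against made-up addresses (the inputs are assumptions, not taken from the answer). The function is repeated so the snippet runs on its own.

```javascript
function validEmail(email){
    return /^[\w+.]+@\w+\.\w{2,}(?:\.\w{2})?$/.test(email)
}

// Hypothetical inputs chosen to exercise each rule.
const plain = validEmail("foo.bar+tag@example.com"); // true
const country = validEmail("foo@example.com.br");    // true: optional two-character ending
const shortExt = validEmail("foo@example.c");        // false: extension needs at least 2 characters
const subdomain = validEmail("foo@sub.example.com"); // false: the optional ending allows exactly 2 characters
const hyphen = validEmail("foo@my-domain.com");      // false: \w does not include the hyphen

console.log(plain, country, shortExt, subdomain, hyphen);
```

The last two results show that the rules as stated reject some addresses that exist in practice.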

1

This expression works for me:

let email = "[email protected]";
let regex_validation = /^([a-z]){1,}([a-z0-9._-]){1,}([@]){1}([a-z]){2,}([.]){1}([a-z]){2,}([.]?){1}([a-z]?){2,}$/i;
console.log("Is the email valid? Answer: " + regex_validation.test(email));
  • Welcome to Stack Overflow in English. This code may be a solution to the question, but your answer might be better if you include an explanation of the key points of the code. The goal is not only to help those who asked the question, but the next visitors as well. Read more on Code-only answers - What to do?.

0

With HTML5 it is also possible to let the browser validate the field, simply by setting the input type to "email".

<input type="email" name="email"/>

-5

You can use this without fear; it works even in TypeScript.

[A-Za-z0-9\\._-]+@[A-Za-z0-9]+\\..(\\.[A-Za-z]+)*
