9
Feb

Validate Email Using POSIX Regular Expressions

You can validate an email address using POSIX Regular Expressions like the following examples:

^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$
^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.(com|edu|gov)$

These are two ways of doing much the same thing. If you want tighter control over what counts as valid, I recommend listing com, edu, gov, or whichever other top-level domains you consider valid, as shown in the second example above.

A quick PHP script to demonstrate:

<?php

// hit this in a browser and specify the email in the URL:
// http:[email protected]

// fall back to an empty string when no email parameter was passed
$email = isset($_GET['email']) ? $_GET['email'] : '';

//if (!eregi('^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$', $email)) {
if (!eregi('^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.(com|edu|gov)$', $email)) {
        echo "<p>That is not a valid email address.</p>".
             "<p>Please return to the previous page and try again.</p>";
        exit;
}
else
        echo "<p>That is one sexy email address!</p>";
?>
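
The script above relies on eregi(), which was removed in PHP 7.0.0. As a rough sketch of how the two patterns differ in practice, the same expressions can be wrapped in PCRE delimiters and passed to preg_match() with the /i flag. The helper names matches_loose() and matches_strict() are made up for this illustration, and the example.com / example.org addresses are placeholders:

```php
<?php
// Sketch only: the two patterns from this post, wrapped in "/" delimiters
// with the /i flag to mirror eregi()'s case-insensitive matching.
// Constant and function names are hypothetical, not part of the original post.
const LOOSE_PATTERN  = '/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9.-]+$/i';
const STRICT_PATTERN = '/^[a-zA-Z0-9_.-]+@[a-zA-Z0-9-]+\.(com|edu|gov)$/i';

// True when the address has the general shape name@host.something.
function matches_loose(string $email): bool
{
    return preg_match(LOOSE_PATTERN, $email) === 1;
}

// True only when the address ends in .com, .edu or .gov.
function matches_strict(string $email): bool
{
    return preg_match(STRICT_PATTERN, $email) === 1;
}

foreach (['user@example.com', 'user@example.org', 'not-an-email'] as $candidate) {
    printf(
        "%-18s loose=%s strict=%s\n",
        $candidate,
        var_export(matches_loose($candidate), true),
        var_export(matches_strict($candidate), true)
    );
}
```

With these patterns, user@example.org passes the loose check but fails the strict one. Note that the strict pattern also rejects addresses with a subdomain such as user@mail.example.com, because it allows only a single label between the @ and the top-level domain.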

There's 2 Comments So Far

  • Roland
    February 10th, 2011 at 2:13 am

    I find this post very useful. Only one problem: I would not use the eregi PHP function, because it is DEPRECATED as of PHP 5.3.0.

  • Andreas Haerter
    June 9th, 2011 at 1:07 pm

    Please allow “+” before the @. Addresses like “[email protected]” are totally valid, and many Gmail users rely on them.

    Additionally (as already mentioned), eregi() is deprecated. I would recommend the following:

    if ((strlen($data) > 320 //see http://tinyurl.com/6qztqv for…
    && mb_strlen($data) > 320) //…information about the 320 char limit
    || !preg_match('/^[_a-zA-Z0-9+-]+(\.[_a-zA-Z0-9+-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.([a-zA-Z]{2,6})$/u', $data)) {
        return false; //structure is broken, can't be a valid email address
    }

    A small hint about strlen(): the first check with strlen() is VERY fast, but its result is too large for UTF-8 strings that contain multibyte characters (because it counts bytes, not characters!). We can still use it to reject long, invalid input quickly before running the reliable check with the slower mb_strlen() and the regex.
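
The length check and the “+” allowance described in the comment above can be sketched as a small function. This is only an illustration under a few assumptions: the function name looks_like_email() is invented here, the mbstring extension is assumed to be available for mb_strlen(), and the sample addresses are placeholders. The 320 limit and the pattern follow the comment.

```php
<?php
// Hypothetical helper; not part of the original post or comment.
function looks_like_email(string $data): bool
{
    // strlen() counts bytes and is cheap. Only strings longer than 320 bytes
    // need the slower character count from mb_strlen().
    if (strlen($data) > 320 && mb_strlen($data, 'UTF-8') > 320) {
        return false;
    }

    // "+" is allowed in the local part; dots are escaped so they match literally.
    $pattern = '/^[_a-zA-Z0-9+-]+(\.[_a-zA-Z0-9+-]+)*@[a-zA-Z0-9-]+(\.[a-zA-Z0-9-]+)*\.([a-zA-Z]{2,6})$/u';

    return preg_match($pattern, $data) === 1;
}

// "\xC3\xA9" is the UTF-8 encoding of a single character (e with acute accent).
$sample = "\xC3\xA9";
echo strlen($sample), ' bytes, ', mb_strlen($sample, 'UTF-8'), " character\n";

var_dump(looks_like_email('user+tag@example.com'));
var_dump(looks_like_email('user@@example.com'));
```

The byte count can only overestimate the character count, so a string that passes the strlen() test never needs the mb_strlen() call at all.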
