While securing an email form at work, I noticed that making a secure web form in PHP is really hard, if not impossible. Reading the server logs, I found that spammers were using email form injection to send spam. Let’s analyze why it’s hard to properly secure a web form:
1. Inherent flaws in PHP
According to the Month of PHP Bugs (MOPB), the mail() function has an inherent flaw: it does not properly sanitize additional control characters after a newline. MOPB’s example:
<?php
mail("test@domain(dot)com", "Test\r\n \nAnother-Header: Blub", "Message");
?>
This shows how the subject can be used to inject an extra header with newlines.
UPDATE: This bug was fixed in PHP 5.2.2, so update PHP to the latest stable release to get the patch.
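Even on a patched PHP, it is reasonable to reject line breaks in anything that ends up in a mail header. Here is a minimal sketch of that idea; the function name is_safe_header_value() is my own invention for illustration and does not come from MOPB or the PHP manual.

```php
<?php
// Hypothetical helper: returns true only when the value contains
// no carriage return or line feed, the characters used to smuggle
// extra headers into mail().
function is_safe_header_value($value)
{
    // Non-strings (e.g. arrays from crafted form fields) are refused outright.
    if (!is_string($value)) {
        return false;
    }
    // preg_match() returns 1 when a CR or LF is found, 0 otherwise.
    return preg_match('/[\r\n]/', $value) === 0;
}
```

You would call this on the subject and on any user-supplied address before passing them to mail(), and refuse to send when it returns false.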
2. ereg() functions not binary-safe
Many PHP tutorial websites use the ereg functions to validate email forms. Unfortunately, the ereg() functions are NOT binary-safe.
Take this code, for instance:
if ( eregi("^[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}$", $email) )
return TRUE;
else
return FALSE;
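Seems good, right? Unfortunately, injecting a null byte (%00) into the email string will bypass an ereg function, because the function stops reading at the null byte and never sees what follows it. That makes the validation useless. (The ereg functions were later deprecated in PHP 5.3 and removed in PHP 7.0.) Fortunately, there is a binary-safe regex engine for PHP: PCRE, or Perl-Compatible Regular Expressions. Which brings us to #3.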
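If you want to catch the null byte explicitly before any other validation runs, a binary-safe string function can do it. This is only a sketch; contains_null_byte() is a hypothetical name I made up, and it assumes the %00 has already been URL-decoded by PHP into a real NUL byte, as happens with form input.

```php
<?php
// Hypothetical pre-check: strpos() is binary-safe, so it scans the
// whole string and finds a NUL byte wherever it sits.
function contains_null_byte($input)
{
    return strpos($input, "\0") !== false;
}
```

Any input for which this returns true can be rejected immediately, since a legitimate email address never contains a NUL byte.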
3. Wrong usage of the PCRE functions
Once you know the limitations in #1 and #2, the next choice is the binary-safe PCRE functions. But be wary: there is a major pitfall in using these functions too. Some PHP tutorial websites will show you this code for validating an email address:
if ( preg_match("/^[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}$/", $email) )
return TRUE;
else
return FALSE;
Unfortunately, there are two pitfalls in this code. First, it is case-sensitive and will only match uppercase email addresses. Second, it does not reject a newline at the end: by default, $ also matches just before a trailing newline, so replacing the email with test@test.com\n would bypass this filter. Appending the i modifier for case insensitivity and the D modifier (dollar matches only at the very end of the string) prevents both problems. In the end, this is a quick and practical way to match an email:
if ( preg_match("/^[A-Z0-9._%-]+@[A-Z0-9.-]+\\.[A-Z]{2,4}$/iD", $email) )
return TRUE;
else
return FALSE;
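To see the effect of the D modifier for yourself, here is a small sketch that wraps the pattern above in two functions, one with D and one without. The function names are hypothetical and only exist for this comparison.

```php
<?php
// Hypothetical wrapper around the final pattern (with i and D).
function is_valid_email($email)
{
    return is_string($email)
        && preg_match('/^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/iD', $email) === 1;
}

// Same pattern without D, kept only to show the trailing-newline bypass.
function is_valid_email_without_d($email)
{
    return is_string($email)
        && preg_match('/^[A-Z0-9._%-]+@[A-Z0-9.-]+\.[A-Z]{2,4}$/i', $email) === 1;
}
```

Passing "test@test.com\n" (in double quotes, so \n is a real newline) to both functions shows the difference: the version without D accepts it, while the version with D rejects it.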









