Help with E-mail Validation Script


ben_1uk

Hi everyone,

 

I have been asked to look into the e-mail validation script below because a number of people cannot register their e-mail address on a website of mine. For example, people whose address looks like a.bcde@fghij.com (a single character followed by a period) cannot register and receive an e-mail validation error message. TBH, I pinched the code from somewhere else and do not understand exactly how it works.

 

Could someone help identify which part of the code needs changing? I have highlighted the area of code I believe needs changing, but would appreciate some help.

 

Thanks,

 

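function checkemail() {
    var str = document.getElementById('register-email').value;

    // Passes only if the first "." is at index 3 or later and the first "@" is not at index 0
    if ((str.indexOf(".") > 2) && (str.indexOf("@") > 0)) {
        // Looks valid: move the indicator background to "top left"
        document.getElementById('emailcheck1').style.backgroundPosition = "top left";
        return true;
    }
    else {
        // Looks invalid: move the indicator background to "bottom left"
        document.getElementById('emailcheck1').style.backgroundPosition = "bottom left";
        return false;
    }
}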

Link to comment
Share on other sites

I would consider PHP/AJAX rather than JavaScript alone; client-side-only validation has too many issues and is too easily bypassed.

 

The PHP function (page):

<?php
function checkEmail($address) {
if (!eregi("^[_a-z0-9-]+(.[_a-z0-9-]+)*@[a-z0-9-]+(.[a-z0-9-]+)*(.[a-z]{2,3})$", $address)){
	return FALSE;
}
return TRUE;
}
if (checkEmail(trim(stripslashes($_GET['address'])))) {
     echo "<span style='color: Maroon;'>VALID</span>";
}
else {
     echo "<span style='color: Maroon;'>INVALID</span>";
}
?>

 

The AJAX/HTML page:

<script type="text/javascript">
function AjaxFunction(email) {
var httpxml;
try {
	//Firefox, Opera 8.0+, Safari
	httpxml=new XMLHttpRequest();
}
catch (e) {
	//Internet Explorer
	try {
		httpxml=new ActiveXObject("Msxml2.XMLHTTP");
	}
	catch (e) {
		try {
			httpxml=new ActiveXObject("Microsoft.XMLHTTP");
		}
		catch (e) {
			alert("Your browser does not support AJAX!");
			return false;
		}
	}
}
function stateck() {
	if(httpxml.readyState==4) {
		document.getElementById("msg").innerHTML=httpxml.responseText;

	}
}
var url="email-ajax.php";
url=url+"?email="+encodeURIComponent(email); // encode so characters such as + and & survive the query string
url=url+"&sid="+Math.random();
httpxml.onreadystatechange=stateck;
httpxml.open("GET",url,true);
httpxml.send(null);
}
</script>

<form name=f1 action=''>
Your First Name <input type=text name=n1><br>
Any email address <input type=text name=email onBlur="AjaxFunction(this.value);"><div id="msg"></div>
<br />
Your City Name <input type=text name=city>
<input type=submit value=Submit >
<input type=hidden value=test name=todo>
</form>
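
A hedged aside on the query string built in AjaxFunction: form-style query parsing (which, as far as I know, includes PHP's $_GET) reads a raw + as a space, so an address containing + needs to go through encodeURIComponent before it is appended to the URL. The sketch below uses a made-up address and the standard URLSearchParams class to stand in for the server-side decoding; it is an illustration, not part of the original snippet.

```javascript
// Made-up address containing a "+".
var email = "user+shop@example.com";

// Unencoded: form-style query parsing turns "+" into a space.
var raw = new URLSearchParams("email=" + email).get("email");

// Encoded: the value survives the round trip unchanged.
var encoded = "email=" + encodeURIComponent(email);
var roundTrip = new URLSearchParams(encoded).get("email");

console.log(raw);       // "user shop@example.com"
console.log(encoded);   // "email=user%2Bshop%40example.com"
console.log(roundTrip); // "user+shop@example.com"
```

Without the encoding step, the server would be validating a different string from the one the user typed.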

Link to comment
Share on other sites

http://php.net/manual/en/function.eregi.php

 

I think it is better to use preg_match instead, as eregi is deprecated:

if(!preg_match("/([\w\-]+\@[\w\-]+\.[\w\-]+)/",$email))

 

I would concur, on principle; using syntax that is known to be deprecated is never a good idea, however recent the deprecation (only in 5.3). The big difference is the choice of the PCRE matching engine over the POSIX matching engine. It's a polarizing debate: you either love POSIX or you hate it. Either way, PHP's trend would indicate that the ereg family of constructs aren't near being completely removed from the language.

Link to comment
Share on other sites

Ok, a number of things:

 

1) I didn't notice that the OP was using javascript.  Never rely on javascript for validation.  The PHP filter_var function I linked to is the only 100% reliable way to validate emails.  You can use JS to make it pretty, but don't rely on it.

 

2)  The link posted by freelance84 is wrong.  Don't use ereg, ever.  It's deprecated and will be removed in future versions of PHP.

 

3)  The regex provided by freelance84 is also wrong.  Sure, it will succeed for many (most) emails, but it's not correct. 

 

-Dan

Link to comment
Share on other sites

Dan,

 

At first, I posted a demonstration for the OP using PHP/regex and AJAX.  The snippet I provided used eregi().  Freelance84 pointed out that it was a deprecated construct and suggested the use of preg_match, which I agreed with in a following post that explained the difference between the ereg and preg families of constructs.

 

The filter_var() construct that you suggest, is a wrapper for the PCRE REGEX matching engine, which is what the preg family of constructs use. So by using preg_match, you will achieve the net result of filter_var().

 

The big difference is that with filter_var() you can use canned constants (like FILTER_VALIDATE_EMAIL), which take the guesswork out of common REGEX tasks like email validation.

 

Make no mistake, by using filter_var() you are using PCRE REGEX, it's just not as evident.

Link to comment
Share on other sites

 

2)  The link posted by freelance84 is wrong.  Don't use ereg, ever.  It's deprecated and will be removed in future versions of PHP.

 

3)  The regex provided by freelance84 is also wrong.  Sure, it will succeed for many (most) emails, but it's not correct. 

 

-Dan

 

Sorry, I meant with regard to phpORcaffine's post, as he was using ereg.

 

 

Link to comment
Share on other sites

Yeah...PHPOrCaffiene's first post didn't show up for me at all.  I just saw you (freelance84) posting what I thought was the wrong information two different ways.

 

Either way, filter_var on the PHP side is the only "correct" solution, plus some JS validation using regex.

 

The filter_var() construct that you suggest, is a wrapper for the PCRE REGEX matching engine, which is what the preg family of constructs use. So by using preg_match, you will achieve the net result of filter_var().
While it's true that filter_var implements regex, it's not a wrapper per se.  I never said anything about regex being wrong or...anything like that.  If you use filter_var you get the right regex.  If you write one yourself you get the wrong regex.  Therefore, use filter_var to get the right regex.  I'm a big fan of writing regex (in fact I moderate a regex forum as well), but only when there's no other solution available.

 

Link to comment
Share on other sites

Thanks for all the responses - even if they are far too technical for me!

 

I would like to understand the code I originally posted better, so I can work with it rather than having to completely rewrite the script.

 

How would I adapt the code highlighted in red to stop people with e-mail addresses like a.something@... from receiving error messages?

 

Thanks.

Link to comment
Share on other sites

If I'm not mistaken, there are a few errors in the snippet above:

1) trim(stripslashes($_GET['address'])) should be trim(stripslashes($_GET['email'])) if you want to keep the AJAX as it is (or, vice versa, modify the AJAX to put address in the URL).

2) Why pass sid in the GET request if you are not using it in your PHP code?

Having said that, I agree with using filter_var($your_mail, FILTER_VALIDATE_EMAIL) on the PHP side. Use JS only for graphical niceties on the client side and to modify the DOM.

Link to comment
Share on other sites

From what I can see, the code checks that the first period (.) appears somewhere after the 3rd character in the string (this is very cautious, allowing for one-letter domain names, e.g. a@b.com), and that the first @ appears somewhere after the first character (more plausible, as a single-letter username is easily possible).

 

However, if I were to modify this to allow e-mail addresses with a (.) immediately after the 1st character, would this then effectively render the validation next to useless?

 

Somebody has also suggested the below code to tidy it up a bit:

 

function isValidEmail(str) {

return (str.lastIndexOf(".") > 2) && (str.indexOf("@") > 0) && (str.lastIndexOf(".") > (str.indexOf("@")+1)) && (str.indexOf("@") == str.lastIndexOf("@"));

}

 

However, it is still looking for a (.) after the 3rd character, which will still generate error messages for anybody with an alphabetic character followed immediately by a (.)
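
A quick way to settle this is to run both checks on the problem address. The sketch below is an illustration only: originalCheck and suggestedCheck are hypothetical names, and both take the string as an argument instead of reading the form field. The point to watch is that indexOf(".") finds the first period, while lastIndexOf(".") finds the last one.

```javascript
// The check from the first post: the FIRST "." must be at index 3 or later.
function originalCheck(str) {
  return str.indexOf(".") > 2 && str.indexOf("@") > 0;
}

// The suggested tidy-up: it looks at the LAST "." instead, requires it to sit
// at least two characters after the "@", and allows only one "@".
function suggestedCheck(str) {
  var at = str.indexOf("@");
  var lastDot = str.lastIndexOf(".");
  return lastDot > 2 && at > 0 && lastDot > at + 1 && at === str.lastIndexOf("@");
}

var sample = "a.bcde@fghij.com";
console.log(originalCheck(sample));  // false: the first "." is at index 1
console.log(suggestedCheck(sample)); // true: the last "." is at index 12
```

So the suggested version appears to let a.bcde@fghij.com through already, because it no longer cares where the first period is. It is still only a cosmetic first pass, not real validation.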

Link to comment
Share on other sites

W3Schools suggests the following for a general-purpose first-pass email validation in JavaScript:

 

function validateForm()
{
var x=document.forms["myForm"]["email"].value;
var atpos=x.indexOf("@");
var dotpos=x.lastIndexOf(".");
if (atpos<1 || dotpos<atpos+2 || dotpos+2>=x.length)
  {
  alert("Not a valid e-mail address");
  return false;
  }
}

However, I don't think you've understood the main point:  Do not rely on JavaScript.  JavaScript cannot ever be relied on for anything.  It's used to make a page pretty where available.  It can be altered or disabled by the user.  You cannot use it for ACTUAL validation.  You must use filter_var for that.

 

-Dan

Link to comment
Share on other sites

I 100% agree with the statements above, but . . .

 

I also like to implement client-side validation to complement server-side validation to provide a more interactive experience for the user. Plus, depending on what version of PHP your host is using, filter_var() may not be available (but if that's the case you should probably change hosts).

 

Anyway, here are a JS and a PHP function I have used in the past for validating email formats. Again, you should be using the built-in email test using filter_var() if it is available to you.

 

JS

function validEmail(emailStr)
{
    //Return true/false for valid/invalid email
    var formatTest = /^[\w!#$%&\'*+\-\/=?^`{|}~]+(\.[\w!#$%&\'*+\-\/=?^`{|}~]+)*@[a-z\d]([a-z\d-]{0,62}[a-z\d])?(\.[a-z\d]([a-z\d-]{0,62}[a-z\d])?)*\.[a-z]{2,6}$/i;
    var lengthTest = /^(.{1,64})@(.{4,255})$/;
    return (formatTest.test(emailStr) && lengthTest.test(emailStr));
}

 

PHP:

function is_email($email) 
{
    $formatTest = '/^[\w!#$%&\'*+\-\/=?^`{|}~]+(\.[\w!#$%&\'*+\-\/=?^`{|}~]+)*@[a-z\d]([a-z\d-]{0,62}[a-z\d])?(\.[a-z\d]([a-z\d-]{0,62}[a-z\d])?)*\.[a-z]{2,6}$/i';
    $lengthTest = '/^(.{1,64})@(.{4,255})$/';
    return (preg_match($formatTest, $email) && preg_match($lengthTest, $email));
}

 

I have never had a problem with a "legitimate" valid email not being accepted. Here is a full description of the validation.

// NOTES:
//
// Format test
// - Username:
//     - Can contain the following characters: 
//         - Uppercase and lowercase English letters (a-z, A-Z) 
//         - Digits 0 to 9 
//         - Characters _ ! # $ % & ' * + - / = ? ^ ` { | } ~ 
//     - May contain '.' (periods), but cannot begin or end with a period
//       and they may not appear in succession (i.e. 2 or more in a row) 
//     - Must be between 1 and 64 characters 
// - Domain:
//     - Can contain the following characters: 'a-z', 'A-Z', '0-9', '-' (hyphen), and '.' (period). 
//     - There may be subdomains, separated by a period (.), but the combined domain may not
//       begin with a period and periods may not appear in succession (i.e. 2 or more in a row) 
//     - Domain/Subdomain name parts may not begin or end with a hyphen 
//     - Domain/Subdomain name parts must be between 1-64 characters
// - TLD accepts: 'a-z' & 'A-Z' (2 to 6 characters)
//
// Note: the domain and tld parts must be between 4 and 255 characters total 
//
// Length test
// - Username: 1 to 64 characters
// - Domain: 4 to 255 characters
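
As a rough check of the notes above, the JS function can be run against a handful of addresses. The addresses are made up for illustration, and the function is repeated here only so the snippet runs on its own.

```javascript
// Same two patterns as the JS function above, repeated so the snippet is self-contained.
function validEmail(emailStr) {
  var formatTest = /^[\w!#$%&\'*+\-\/=?^`{|}~]+(\.[\w!#$%&\'*+\-\/=?^`{|}~]+)*@[a-z\d]([a-z\d-]{0,62}[a-z\d])?(\.[a-z\d]([a-z\d-]{0,62}[a-z\d])?)*\.[a-z]{2,6}$/i;
  var lengthTest = /^(.{1,64})@(.{4,255})$/;
  return formatTest.test(emailStr) && lengthTest.test(emailStr);
}

// Made-up sample addresses, roughly one per rule in the notes.
var samples = [
  "a.bcde@fghij.com",         // period right after the first character
  "user+tag@example.com",     // plus addressing
  "someone@example.com.au",   // stacked TLD
  "user@hostname",            // no TLD
  ".user@example.com",        // username starts with a period
  "user..name@example.com",   // two periods in a row
  "user@example.photography"  // TLD longer than 6 letters
];

samples.forEach(function (address) {
  console.log(address + " -> " + validEmail(address));
});
```

The first three should come back true and the last four false. The final case is worth noting: the 2 to 6 letter TLD rule turns away longer TLDs, so a hand-written pattern like this needs revisiting as new TLDs appear.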

Link to comment
Share on other sites

The main problem I have with filter_var($email, FILTER_VALIDATE_EMAIL) is that it will accept email addresses without a tld extension (such as for an intranet). I.E. user@hostname is considered valid, which for an internet application presents an issue.

That's very true, but that's also a valid email address.  As is maniac+dan@ff::0.  There's an interesting debate about where your line could be drawn.

 

-Dan

Link to comment
Share on other sites

In the spirit of keeping this conversation running ....

 

One could, in theory, split or explode the email address by the '.' character and then evaluate the TLD separately from the hostname; if the TLD is NULL/empty, then it isn't a valid InTERnet address. If the TLD is not empty but doesn't match an ICANN-approved TLD, it must not be valid.

 

Admittedly, this would be a bit expensive to do inside a loop with a lot of iterations; it would take some creative implementation. However, for just validating a POST'ed value, it shouldn't be too expensive to break it out like that.

 

It's all about how badly you need to ensure good validation on email and what it means to your system  if you get one or two bunk email addresses.
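
The split-and-check idea can be sketched in a few lines. This is only an illustration under stated assumptions: hasKnownTld is a made-up name, and KNOWN_TLDS is a hypothetical, deliberately tiny list; a real one would have to be taken from the list IANA publishes and kept up to date.

```javascript
// Hypothetical, deliberately tiny allow-list (an assumption for this sketch).
var KNOWN_TLDS = ["com", "net", "org", "au", "uk"];

function hasKnownTld(email) {
  var at = email.lastIndexOf("@");
  if (at < 1) return false;             // no "@", or nothing before it

  // Split the domain into labels and look only at the last one.
  var labels = email.slice(at + 1).toLowerCase().split(".");
  if (labels.length < 2) return false;  // e.g. user@hostname has no TLD

  var tld = labels[labels.length - 1];
  // Note: an IP literal in place of a hostname is rejected here as well.
  return KNOWN_TLDS.indexOf(tld) !== -1;
}

console.log(hasKnownTld("user@example.com.au")); // true
console.log(hasKnownTld("user@hostname"));       // false
```

This is a TLD check only; it says nothing about whether the rest of the address is well formed, so it would sit alongside format validation rather than replace it.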

 

Link to comment
Share on other sites

With custom TLDs that would be even more annoying, and would break on IP addresses instead of hostnames. 

 

Still though, if it's a valid email address, I say allow it.  Even if that means a@b gets through.  It's valid, let it through.  Do further email validation to make sure the validly-formed email is actually accessible by the human being that filled out the form. 

Link to comment
Share on other sites

True, I think we have to resign ourselves to saying that, unless a human being manually approves every submission, there will be 'some' attrition. Is it better to annoy your users during signup with an overly tough validation process, or to just get them through as quickly as possible and deal with bunk data after the fact?

 

I think this thread has looked at email validation from every angle, lol.

 

I say, simple/basic validation upfront and a more stringent vet later.

Link to comment
Share on other sites

Well that's why I use filter_var.  It will allow ALL valid emails, and disallow anything else.  What goes through filter_var has to match the email specification, which means that somewhere, somehow, sending an email to that address will reach a valid box (or potentially a valid box).

 

Then you do the second phase, the "click here to confirm your account" link in the email itself. 

 

Putting your own validation spin on things will only annoy the users more.  I've had addresses with weird TLDs, and I like to use plus-addressing (like I demonstrated above).  Most custom by-hand email validators don't allow maniacdan+theSiteName@phpfreaks.com.  That's what I like to use to do proper filtering on my gmail box. 

Link to comment
Share on other sites

There is no one-size-fits-all solution. It all depends on the purpose of the email field and the "cost" of fixing things later. With lax rules, how many incorrect but valid emails would be accepted, and what is the "cost" of getting them changed later? The costs can be development costs to add functionality for dealing with these scenarios, customer service costs for someone to assist the user, and even costs associated with customer satisfaction. If they place an order and don't get their email because of lax validation, will they consider it their fault or something the company "should have known" to resolve?

 

For example, I think it would be much more likely that someone forgets to include the TLD rather than it being a valid email address. I, personally, wouldn't allow them.

Link to comment
Share on other sites

For example, I think it would be much more likely that someone forgets to include the TLD rather than it being a valid email address. I, personally, wouldn't allow them.

But the problem I have with that is:  That means you have to write your own custom validation which will absolutely 100% fail to match someone's actual valid email address.  Writing your own custom validation takes you a lot of time and it can't be correct.  I have never seen anyone (amateur or professional) write a proper email validation regex on their own.  So if you say "I have a reason not to use filter_var," then from my experience you're also saying "screw people from Australia and anyone who uses plus addressing and anyone whose email ends in a number and any number of other categories that I can't think of right now." 
Link to comment
Share on other sites

On the subject of plus addresses, the only real-world examples I've seen of plus addresses being used are from spammers, so they can register thousands of accounts without the hassle of actually having to create all those pesky email addresses. I think a legitimate argument could be made against allowing them in most web applications.

Link to comment
Share on other sites

For example, I think it would be much more likely that someone forgets to include the TLD rather than it being a valid email address. I, personally, wouldn't allow them.

But the problem I have with that is:  That means you have to write your own custom validation which will absolutely 100% fail to match someone's actual valid email address.  Writing your own custom validation takes you a lot of time and it can't be correct. I have never seen anyone (amateur or professional) write a proper email validation regex on their own.  So if you say "I have a reason not to use filter_var," then from my experience you're also saying "screw people from Australia and anyone who uses plus addressing and anyone whose email ends in a number and any number of other categories that I can't think of right now." 

 

Calm down and take a deep breath. No one said we shouldn't support people from Australia or plus addresses or any such nonsense. I was merely pointing out that even though the range of "possible" valid emails is quite large, the real-world usage is much more limited. I can speak from first-hand knowledge that for some usages allowing the "wrong" albeit "valid" email address can be an expensive proposition with respect to customer service, development, QA, etc. You are right that too many applications needlessly restrict valid email addresses, but that doesn't mean filter_var is the holy grail. There are a small percentage of people that are biologically both male and female, but I'm not going to provide users with two checkboxes to select gender even though that might be a valid response.

 

On the subject of plus addresses, the only real-world examples I've seen of plus addresses being used are from spammers, so they can register thousands of accounts without the hassle of actually having to create all those pesky email addresses. I think a legitimate argument could be made against allowing them in most web applications.

 

I use plus addresses all the time. Are you aware that if you have a gmail account, such as username@gmail.com, you can use any 'plus' address with that username (e.g. username+shopping@gmail.com) and it will all go to the same inbox? You can then use that information in your rules to easily categorize mail messages. It's also a godsend for testing when you need to have unique email addresses.

Link to comment
Share on other sites

On the subject of plus addresses, the only real-world examples I've seen of plus addresses being used are from spammers, so they can register thousands of accounts without the hassle of actually having to create all those pesky email addresses. I think a legitimate argument could be made against allowing them in most web applications.

But I just said...I use them.  They're valid email addresses and it's annoying when they don't work.

 

Calm down and take a deep breath. No one said we shouldn't support people from Australia or plus addresses or any such nonsense. I was merely pointing out that even though the range of "possible" valid emails is quite large, the real-world usage is much more limited.
It takes a lot more than you think to actually upset me.  All I'm saying is that you, personally, cannot possibly make the declaration that certain kinds of email addresses don't count. Once you have a form that marks a valid email as invalid, it is broken.  You may have broken it on purpose for various reasons, but it's broken.  You're just going to annoy people like me (and, apparently, you) by disallowing valid emails.

 

As for my comments about Australia, I've seen users decide that filter_var is not what they want and roll their own email validation.  Each and every one of them forgets about the stacked TLDs like com.au. 

 

There are better ways to combat spam than breaking email validation.  You could do duplicate analysis without the plus-addressing part.  You could run IP analysis.  There are plenty of other ways without rolling your own email validation.
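
The duplicate-analysis idea might look something like the sketch below. It is a rough illustration, not a recommendation: duplicateKey and isDuplicate are hypothetical names, and it assumes the mail provider treats user+tag@host as an alias of user@host (as described for Gmail earlier in the thread) and that addresses can be compared case-insensitively.

```javascript
// Build a comparison key by dropping any "+tag" suffix from the part before "@".
// Assumption: user+tag@host and user@host reach the same inbox.
function duplicateKey(email) {
  var at = email.lastIndexOf("@");
  if (at < 1) return email.toLowerCase();   // not address-shaped; compare as-is

  var local = email.slice(0, at);
  var domain = email.slice(at + 1);
  var plus = local.indexOf("+");
  if (plus > 0) local = local.slice(0, plus);

  return (local + "@" + domain).toLowerCase();
}

// True if any already-registered address maps to the same key.
function isDuplicate(email, existingEmails) {
  var key = duplicateKey(email);
  return existingEmails.some(function (existing) {
    return duplicateKey(existing) === key;
  });
}

console.log(duplicateKey("username+shopping@gmail.com")); // "username@gmail.com"
```

The key is only for comparison. The address the user actually typed, plus tag included, is still the one to store and send mail to, so plus-address users are not turned away.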
