Jump to content

Efficient Way of Checking for Duplicates in an Array


nz_mitch

Recommended Posts

Hi there,

 

I've written the following script which is working to check for duplicates in a database before allowing a new subscription.

 

The current script grabs subscriber data from the API and puts it in an array which looks like this (I've shortened it ;)):

 

Array
(
    [anyType] => Array
        (
            [subscriber] => Array
                (
                    [0] => Array
                        (
                            [EmailAddress] => amanda@domain.com
                            [Name] => Amanda
                            [Date] => 2009-03-15 10:52:00
                            [state] => Active
                        )

                    [1] => Array
                        (
                            [EmailAddress] => sally@domain.com
                            [Name] => Sally Smith
                            [Date] => 2009-03-13 22:21:00
                            [state] => Active
                                      )

                )

        )

)

 

The code I'm using to check for duplicates looks like this:

 

$array = GETS ARRAY FROM API;
           
$newArray=array();
array_flatten($array,$newArray);


// recursively reduces deep arrays to single-dimensional arrays
function array_flatten($array, &$newArray = Array() ,$prefix='',$delimiter='|') {
  foreach ($array as $key => $value) {
    if (is_array($child)) {
      $newPrefix = $prefix.$key.$delimiter;
      $newArray =& array_flatten($child, $newArray ,$newPrefix, $delimiter);
    } else {
      $newArray[$prefix.$key] = $child;
    }
  }
  return $newArray;
} 

if (in_array($email_out,$newArray)) {
	echo "Sorry, this email address is already in the database!";
	}
else { ... }

 

As you can imagine, this is a bit of kludge and it takes ages if there's more than a few hundred users in the database.

 

Any recommendation on how I could speed this up?

 

Thanks again!

Link to comment
Share on other sites

I would recommend going in a reverse route, really.  Depends on the database system you're using, but basically, first you would design a function to get the entry from the database as if it were there.  If it finds one, the row exists, and returns values.  If it doesn't exist, it returns a 0 or null string.  Therefore,

 

$test = query(database); 
/* Using the variables setup from your form, do a query on the database to see if it returns a row.  If the row returns, it fills $test with information/values.  If it doesn't return anything, it is empty. */
$test = make_usable($test);  // if using MySQL, this would be mysql_fetch_assoc($test);

if ($test['username'] == $_POST['username']) {
  echo "Username exists.  Try something else."; }
else { echo "Username is available.  Continuing..."; }

 

or

 

$test = query(database); 
/* Using the variables setup from your form, do a query on the database to see if it returns a row.  If the row returns, it fills $test with information/values.  If it doesn't return anything, it is empty. */
$number_rows = count_rows($test);  // if using MySQL, this would be mysql_num_rows($test);

if ($number_rows >= 1) {
  echo "Username exists.  Try something else."; }
else { echo "Username is available.  Continuing..."; }

 

The value of this is that it takes the load off your application and puts it with the database server in its own environment, with is alot faster than PHP can do from the outside.

Link to comment
Share on other sites

Thanks kittrellbj!

 

I'm not 100% sure I'll be able to do that - looking through the API I'm thinking this is the best I can do.

 

I'll send an email to the devs though and see if there's an alternative. In the meantime though, any other ideas to do it my side?

Link to comment
Share on other sites

I tried to edit my above post, but ran out of time.  Here it is:

 

What database software do you use?

 

If you are using MySQL, I will go ahead and post what I use on my registration form, works very fast and is quite nice.

 

I will explain it simply (I have about thirty fields in the registration/profile thing).

 

Let us consider each user will have a username, password, and email address.  Let us also consider that we don't want people using the same username, and we don't want the same email address signing up twice.  The password field is arbitrary; users can have the same password, and we don't want to alert them if that password is in use.  They could try every username in the database and they might be able to login to someone else's account.  But, I digress. :)  On with it!

 

So, we have a database (db), with a table (users), and that table has 3 rows (username, password, email).

 

So, the form looks something like this:

 

<FORM METHOD="<?php echo PHP_SELF; ?>>
<INPUT TYPE="TEXT" NAME="username">
<INPUT TYPE="PASSWORD" NAME="password">
<INPUT TYPE="TEXT" NAME="email">
<INPUT TYPE="SUBMIT" NAME="submit" VALUE="submit">

 

Okay, so we have a form with the variable names for the $_POST function to take in,  When users hit Submit, it reloads the current page, and takes the variables into $_POST's for use in processing the form.

 

So, let's preformat the page a little bit to get it how we want.  We need the form to execute if no variables exist, but we don't want it to appear if someone just hit Submit.  So, let's go:

 

<?php
if ($_POST['submit']) {
// Here, we will put the validation code,
// along with what to do if everything validates.
}
// else, display form
else { ?>
<FORM METHOD="<?php echo PHP_SELF; ?>>
<INPUT TYPE="TEXT" NAME="username">
<INPUT TYPE="PASSWORD" NAME="password">
<INPUT TYPE="TEXT" NAME="email">
<INPUT TYPE="SUBMIT" NAME="submit" VALUE="submit">

 

Now that we have that, we can move on to how to make sure multiples aren't placed into the database.  This can be easily accomplished by checking the database for an entry that we have the variables for (the ones the user submitted).  It is easier than checking against every entry in the database; the database simply either returns a field matching what you ask for or it says it can't find it.

 

So, using MySQL, the query and PHP looks something like this.

 

$query = "SELECT * FROM users WHERE username='" . $_POST['username'] . "' OR email='" . $_POST['email'] . "'";
$result = mysql_query($query);
// The above sets up the query to find what the user
// entered.  The result is queried on the second line.
$count = mysql_num_rows($result); // how many records already exist matching the above?
if ($count >= 1) { // if there is more than 0 records, i.e. this input is NOT unique, then...
echo "Cannot process.  There is already someone with that username or email address."; }
elseif ($count == 0) { // there are no records matching what the user has inputted, then...
echo "Processed!  Your account created successfully!"; // Success!
// Along with any other database insertion, such as actually putting the user in the database.
}
elseif (!$_POST['submit'] { // If the user got here by some other means than submitting a form...
//Display the form
?>
<FORM METHOD="<?php echo PHP_SELF; ?>>
<INPUT TYPE="TEXT" NAME="username">
<INPUT TYPE="PASSWORD" NAME="password">
<INPUT TYPE="TEXT" NAME="email">
<INPUT TYPE="SUBMIT" NAME="submit" VALUE="submit">

 

And there you (basically) have it.

Link to comment
Share on other sites

Hi kittrellbj,

 

Thank you very much for your thorough response!

 

I'm pretty sure I won't be able to get access to the database. I'm using this API, with the PHP wrapper provided here.

 

The call I'm using to the API then is:

 

$array = $cm->subscribersGetActive($date = 0,$list_id);

 

So it seems to be the only way I can get at this information is to at least start with an array, right?

Link to comment
Share on other sites

OK I think I've sussed out an alternative route, but now I'm having trouble getting this function to work. Does anyone know what I'm doing wrong?

 

Here's what I've got from the PHP wrapper of the API. I'm guessing this function should work perfectly:

 

	/**
* Given an array of lists, indicate whether the $email is subscribed to each of those lists.
*
* @param string $email User's email
* @param mixed $lists An associative array of lists to check against. Each key should be a List ID
* @param boolean $no_assoc If true, only returns an array where each value indicates that the user is subscribed
*        to that particular list. Otherwise, returns a fully associative array of $list_id => true | false.
* @return mixed An array corresponding to $lists where true means the user is subscribed to that particular list.
*/

function checkSubscriptions( $email, $lists, $no_assoc = true )
{
	$nlist = array();
	foreach ( $lists as $lid => $misc )
	{
		$val = $this->subscribersGetIsSubscribed( $email, $lid );
		$val = $val != 'False';
		if ( $no_assoc && $val ) $nlist[] = $lid;
		elseif ( !$no_assoc ) $nlist[$lid] = $val;
	}

	return $nlist;
}
}

I've tried the following in my form:

	// Get info from form
$name = $_POST["Field1"];			// Name
$email_in = $_POST["Field2"];		// Raw email from form
$email_out = strtolower($email_in); // Strip emails to lower case
$ref = $_POST["Field109"];			// Referring Agent

// Check if already subscribed, add if not, output error if is

$var = $cm->checkSubscriptions($email_out, $list_id, $no_assoc = true );

if ($var == TRUE) { }
	else { }

This results in following error and the subscriber is added to the database (even if already subscribed):

 

Warning: Invalid argument supplied for foreach() in /home/www/domain.com/web/api/CMBase.php on line 550

 

Does anyone know where I could be going wrong?

 

Thanks in advance!

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.