[SOLVED] Profanity filter


limitphp

$wordlist = "crap|dang|shoot";
$comment = preg_replace("/\b($wordlist)\b/ie", 'preg_replace("/./","*","\\1")', $comment);

This seems to work really well, because it only catches the words when they stand alone, not when they are part of another word like class.

e.g. it changes crap into ****.

Does anyone know how to modify it so that it changes it to cr*p instead?

In other words, how can I modify it so I can supply a list of replacement words?

Thanks

Link to comment
Share on other sites

Not to sound baffled by this, but isn't it better to have it all starred out, to completely get rid of the word?

Link to comment
Share on other sites

Probably yes. But I think it would be better to have a little of the word showing, kind of like on TV when you can hear them say "shhh", then you hear the beep, and maybe even hear the "t".

Plus, if I chose a lot of words to beep out, it might get confusing if there are ***** everywhere.

With some of the letters showing, it might be easier to understand what the user was saying or expressing.

Link to comment
Share on other sites

so I'm assuming you have a word list file?

I'm going to make one.

I'll just call it $replacement.

I like the way the code works, in that you don't have to make an array like

$wordlist[0] = "crap";

$wordlist[1]...

etc.

You just use the | to separate the words. I'm hoping to do that with the replacement words as well, but if it can't be done, oh well. I'll put them in an array if I have to.

Link to comment
Share on other sites

an array is fine.

I wouldn't even know how to do that.

The original code uses a preg_replace inside a preg_replace, so that right there sort of warps my mind.

I think the \b keeps it from replacing a word that has the bad word inside it, like class. But I'm not sure what the ie does, or the \\1, or the dot.
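
For anyone else puzzling over the pieces of that pattern, here is a small sketch that tries each one on its own. The sample strings and variable names are made up for illustration. The e modifier told PHP to evaluate the replacement string as code; it was deprecated in PHP 5.5 and removed in PHP 7, so it is only described in a comment here rather than run.

```php
<?php
// \b marks a word boundary, so "dang" only matches as a whole word.
$whole  = preg_match('/\bdang\b/i', 'Dang!');      // 1: matched
$inside = preg_match('/\bdang\b/i', 'dangerous');  // 0: no boundary after "dang"

// The i modifier makes the match case-insensitive ("Dang" matches "dang").
// The e modifier evaluated the replacement as PHP code; it no longer exists
// in PHP 7 and later, so it is not used here.

// The dot matches any single character (except a newline),
// so this swaps every character of the word for a star.
$masked = preg_replace('/./', '*', 'crap');         // "****"

// \1 in a replacement stands for whatever the first (...) group matched.
$tagged = preg_replace('/\b(crap|dang)\b/i', '[\1]', 'Dang, crap!');

var_dump($whole, $inside, $masked, $tagged);
```

Put together, the original one-liner finds each whole bad word, hands it to the inner preg_replace as \\1, and the inner call stars out every character of it.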

 

Link to comment
Share on other sites

I saw this:

$patterns[0] = '/crap/';
$patterns[1] = '/dang/';
$replacements[0] = 'cr*p';
$replacements[1] = 'd**g';
$string = preg_replace($patterns, $replacements, $string);

That would be perfect, but it doesn't differentiate between dang and dangerous.

dangerous would become d**gerous.

If it could be modified to change the word only when it's by itself, it would be perfect.
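
One way to get that, sketched here on the assumption that the same two parallel arrays are kept: wrap each word in \b word boundaries and add the i modifier so capitalised words are caught too. The sample sentence is made up.

```php
<?php
// Same parallel arrays as above, but each pattern is anchored with \b on
// both sides and made case-insensitive with the i modifier.
$patterns     = array('/\bcrap\b/i', '/\bdang\b/i');
$replacements = array('cr*p', 'd**g');

$string = 'Dang, that crap is dangerous.';
$string = preg_replace($patterns, $replacements, $string);

echo $string, "\n";  // d**g, that cr*p is dangerous.
```

Note that the replacement is inserted literally, so "Dang" comes out as lowercase "d**g".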

Link to comment
Share on other sites

Here is a much less elegant solution that I just made up and is very untested.

<?php
$comment = 'Dang this crap, I want to shoot somebody.  But that would be dangerous!';
// Each entry is badword:replacement, and entries are separated by |
$wordlist = "crap:cr*p|dang:d*ng|shoot:sh**t";
$words = explode('|', $wordlist);
foreach ($words as $word) {
    list($match, $replacement) = explode(':', $word);
    // \b only matches at a word boundary, so "dangerous" is left alone;
    // preg_quote() escapes any regex metacharacters in the bad word
    $comment = preg_replace('/\b' . preg_quote($match, '/') . '\b/i', $replacement, $comment);
}
echo $comment;
?>

d*ng this cr*p, I want to sh**t somebody.  But that would be dangerous!

Link to comment
Share on other sites

Lolz.

 

I'm trying to get this to work:

 

<?php
$str = "Crap! Dang! The dangerous gun didn't shoot!";
$wordlist = "crap|dang|shoot";
echo preg_replace("/\b($wordlist)\b/ie", 'preg_replace("/./ie",\'preg_replace("/a|e|i|o|u/","*","\\1")\',"\\1")', $str);
?>

 

But right now it prints:

 

Cr*pCr*pCr*pCr*p! D*ngD*ngD*ngD*ng! The dangerous gun didn't sh**tsh**tsh**tsh**tsh**t! 

Link to comment
Share on other sites

Got it!

 

Try this crazy sh*t:

 

<?php
$str = "Crap! Dang! The dangerous gun didn't shoot!";
$wordlist = "crap|dang|shoot";
echo preg_replace("/\b($wordlist)\b/ie", 'preg_replace("/\\1/ie",\'preg_replace("/a|e|i|o|u/","*","\\1")\',"\\1")', $str);
?>

 

Cr*p! D*ng! The dangerous gun didn't sh**t! 

 

*EDIT*

Super lolz!  Looks like this works just as well:

 

<?php
echo preg_replace("/\b($wordlist)\b/ie", 'preg_replace("/a|e|i|o|u/","*","\\1")', $str);
?>
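
A note for anyone reading this later: the e modifier was deprecated in PHP 5.5 and removed in PHP 7, so the snippets above will not run on current versions. A rough equivalent using preg_replace_callback() might look like the sketch below. The function name mask_vowels is made up for the example, and unlike the original it also stars out uppercase vowels.

```php
<?php
// Hypothetical helper: stars out the vowels of each listed word.
// $wordlist is a pipe-separated list such as "crap|dang|shoot".
function mask_vowels($text, $wordlist)
{
    return preg_replace_callback(
        "/\b($wordlist)\b/i",
        function ($m) {
            // $m[1] is the matched word; replace its vowels with *
            return preg_replace('/[aeiou]/i', '*', $m[1]);
        },
        $text
    );
}

$str = "Crap! Dang! The dangerous gun didn't shoot!";
echo mask_vowels($str, "crap|dang|shoot"), "\n";
// Cr*p! D*ng! The dangerous gun didn't sh**t!
```

The callback receives the matches array for each hit, which does the job the nested \\1 string did in the e version, without evaluating strings as code.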

 

Link to comment
Share on other sites

That is good, and I think you broke a lot of ground here, but what about words like @sshole? If I put s in the list of letters, it's going to turn sh*t into *h*t.

What we need is actual replacement control.

So basically, you replaced "/./" with the letters we want starred out.

But somehow we need to feed it a list of actual replacement words.
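
One possible shape for that kind of replacement control, sketched with a lookup table instead of a second regex. The names $replacements and censor are made up here, and the table is assumed to use lowercase keys.

```php
<?php
// Hypothetical lookup table: bad word (lowercase) => exact replacement.
$replacements = array(
    'crap'  => 'cr*p',
    'dang'  => 'd*ng',
    'shoot' => 'sh**t',
);

function censor($text, array $replacements)
{
    // Build crap|dang|shoot from the keys, escaping regex metacharacters.
    $quoted  = array_map(function ($w) { return preg_quote($w, '/'); }, array_keys($replacements));
    $pattern = '/\b(' . implode('|', $quoted) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) use ($replacements) {
        // Look the matched word up case-insensitively.
        return $replacements[strtolower($m[1])];
    }, $text);
}

echo censor("Crap! Dang! The dangerous gun didn't shoot!", $replacements), "\n";
// cr*p! d*ng! The dangerous gun didn't sh**t!
```

Each word gets exactly the replacement you typed, so adding one word never changes how another is displayed. One caveat: \b only matches next to a letter, digit, or underscore, so a list entry that starts with a symbol such as @ would need a different boundary test.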

 

 

Link to comment
Share on other sites

limitphp, if we want replacement control, let's use the explode-based code posted earlier. Make a form that explodes the list on | and then explodes each entry again on :, then feeds the pieces into text fields so you can change them to whatever you want. You would have a list like this (each value in its own text field):

Jerkwad    J*rkw*d

But if you decided J*rkw*d was too revealing, you could change it to J**kw*d.

Also, always have the script add a blank set of fields at the bottom so you can add a new entry, in case somebody slips a new word in there, since there is a true plethora of curse word derivatives. Then, upon post, reconstitute the string with the : and | characters and rewrite the file.
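
The round trip described above (split the list into pairs for the form, then glue the edited pairs back together) could be sketched roughly like this. The function names are invented for the example, and the form handling and file writing are left out.

```php
<?php
// Hypothetical helpers for the "word:replacement|word:replacement" format.

// Turn "crap:cr*p|dang:d*ng" into array('crap' => 'cr*p', 'dang' => 'd*ng').
function parse_wordlist($wordlist)
{
    $pairs = array();
    foreach (explode('|', $wordlist) as $entry) {
        if (strpos($entry, ':') === false) {
            continue; // skip blank or malformed entries
        }
        list($word, $replacement) = explode(':', $entry, 2);
        $pairs[trim($word)] = trim($replacement);
    }
    return $pairs;
}

// Glue the (possibly edited) pairs back into one string for saving.
function build_wordlist(array $pairs)
{
    $entries = array();
    foreach ($pairs as $word => $replacement) {
        if ($word === '' || $replacement === '') {
            continue; // the blank "add new" row was left empty
        }
        $entries[] = $word . ':' . $replacement;
    }
    return implode('|', $entries);
}

$pairs = parse_wordlist('crap:cr*p|dang:d*ng|shoot:sh**t');
$pairs['jerkwad'] = 'j**kw*d';   // a row added through the form
echo build_wordlist($pairs), "\n";
// crap:cr*p|dang:d*ng|shoot:sh**t|jerkwad:j**kw*d
```

Since : and | are the delimiters, the form would also need to reject (or escape) those two characters inside a word or its replacement before saving.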

 

The beauty of this is that your banned word list is truly customizable in how the words display. It doesn't just take out the vowels: you can ban whole words, or change one to *REMOVED* if you think it is that raunchy. Your list could start with just the basics and keep growing. As long as you are active in reading your comment threads, and your comment system has flag and edit functions, you can make your edit and then go add the word to the list.

I know, lots of reading, but it could work.

EDIT:

Thanks for starting this thread. I have been trying to do this for a while but am terrible with regex, so these expressions are really helping me out.

Link to comment
Share on other sites

Oh, OK, so it works! Awesome!

I misunderstood. I thought that didn't work.

Well, this is great. That is exactly what I needed.

I'll be doing exactly as you said, continuing to grow the bad word list.

Of course, right now the site isn't built yet. And once it's built, I'll only actually need this if it becomes popular.

And if it becomes popular, which is a long shot, I'll have hit the BIG TIME.

:)

I'm a bit neurotic though. I can't move on until I fix or build something I think I'll need.

lol....

Thanks again!

Link to comment
Share on other sites

It's good, but ya forget that preg_replace can take arrays for the patterns as well as the replacements,

so let's modify yer code to take advantage of that feature:

<?php
$comment = 'Dang this crap, I want to shoot somebody.  But that would be dangerous!';
$wordlist = "crap:cr*p|dang:d*ng|shoot:sh**t";
$words = explode('|', $wordlist);
$needle = array();
$replacement = array();
foreach ($words as $key => $word) {
   list($needle[$key], $replacement[$key]) = explode(':', $word);
   // wrap each bad word in \b...\b so it only matches as a whole word
   $needle[$key] = "/\b{$needle[$key]}\b/i";
}
// one call: pattern N is replaced by replacement N
$comment = preg_replace($needle, $replacement, $comment);
echo $comment;
?>

There ya have it. BTW, nice work so far...

Link to comment
Share on other sites

Awesome....thanks guys....

it will take me a good while here to figure out why this actually works.  But when I do, it will be another great lesson learned.

Oh, by the way, should I take out any "|" vertical lines before I send it to this filter?

 

Link to comment
Share on other sites

Haha. I completed a list of "bad words" for you if you want to use them. They are located at:

http://24.76.182.99/codesnippets/Swear%20Filter.txt

BTW: if what I did is offensive, please don't ban me. Just delete my post.

Your URL is not responding. Is there another way to get a copy? Would you be willing to ZIP a copy and email it?

Link to comment
Share on other sites

The pipes ("|") are just there as delimiters to separate the bad words, and the colons (":") are just there to separate each bad word from its replacement. You could use anything you wanted for these. For example, you could keep the bad words in a text file with one entry per line, where the bad word and its replacement are separated by a pipe, e.g.

 

badwords.txt

shoot|sh**t
dang|d*ng
crap|cr*p

 

And then you could alter the code to be a little bit more flexible (combined with laffin's optimization):

 

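<?php
$comment = 'Dang this crap, I want to shoot somebody.  But that would be dangerous!';

$word_separator = "\n";
$replacement_separator = "|";

// trim() drops the trailing newline so explode() does not yield an empty last entry
$wordlist = trim(file_get_contents('badwords.txt'));

$needle = array();
$replacement = array();
$words = explode($word_separator, $wordlist);
foreach ($words as $key => $word) {
   // trim() removes a stray "\r" if the file has Windows line endings
   list($bad, $replacement[$key]) = explode($replacement_separator, trim($word), 2);
   // preg_quote() escapes regex metacharacters in the bad word
   $needle[$key] = '/\b' . preg_quote($bad, '/') . '\b/i';
}
$comment = preg_replace($needle, $replacement, $comment);
echo $comment;
?>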

Link to comment
Share on other sites
