
Is there a way to remove scripts from a string?


simboski19


Is there a way/function to not only remove <script>, <embed> tags etc. but also remove the content within the tags, so that this:

 

"

some text

<script>

functionhere();

</script>

some more text

"

 

becomes this:

"

some text

some more text

"

 

In effect, remove the whole tag and the content within it? I haven't been able to find anything online that works.

 

Many thanks in advance.

Simon

 

Link to comment
Share on other sites

Yeah, I have tried around 4-5 of these functions that I found through Google, but this is slightly above my ability, so I just wondered if anyone has had the same need in the past and has a function that works.

 

Many of the preg_replace() functions didn't work: they failed to remove both the tags and all of the content between them.

 

Simon

Link to comment
Share on other sites

not the prettiest regex, threw it together in a minute, but it's tested.

 

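<?php
$string = "some text <script> functionhere(); </script> some more text";
// Match an opening tag, everything up to its closing tag, and the closing tag itself.
// \1 repeats the tag name; "s" lets . cross newlines, "i" ignores case.
// The tag list is only an example, and a regex will not catch every malformed tag.
$regex = '#<(script|object|iframe)\b[^>]*>.*?</\1\s*>#is';
$string = preg_replace($regex, '', $string);
echo $string;
?>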
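
If the regex route turns out too fragile, a parser-based version might be worth a try. This is only a rough sketch, assuming the DOM extension is loaded and the input is plain ASCII or already handled for encoding; `removeElements()` and its default tag list are names made up for illustration, not something from this thread.

```php
<?php
// Hypothetical helper: removes the listed elements and their contents using the DOM extension.
function removeElements(string $html, array $tags = ['script', 'embed', 'object', 'iframe']): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true); // keep warnings about sloppy markup quiet
    // Wrap in a <div> so the fragment has one root; the flags stop libxml adding <html>/<body>.
    $doc->loadHTML('<div>' . $html . '</div>', LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();

    foreach ($tags as $tag) {
        // Copy the live node list to an array before removing nodes from the document.
        foreach (iterator_to_array($doc->getElementsByTagName($tag)) as $node) {
            $node->parentNode->removeChild($node);
        }
    }

    // Serialise the wrapper's children, leaving the wrapper itself out.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

$clean = removeElements("some text <script> functionhere(); </script> some more text");
echo $clean;
```

The text on either side of the removed element is kept as-is, so you may want to collapse the leftover whitespace afterwards.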

Link to comment
Share on other sites

Thanks AyKay47, I will give this a go.

 

Hi Adam, I need to remove all tags and their content because I need to stop people inserting dangerous scripts into my DB. Those were just a few examples, but any further suggestions would be welcome.

 

Thanks guys

Simon

Link to comment
Share on other sites

Okay, though removing the contents of every tag would leave the posts not making sense. The reason strip_tags() only removes the actual tags is so that any text in <b>bold</b>, for example, will still be readable. If you don't want your users to be able to insert HTML, just escape it with htmlspecialchars() as you output it.

 

<?php echo htmlspecialchars($str); ?>
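
To make the difference concrete, here is a quick sketch comparing the two functions on the same input. The sample string and variable names are made up for illustration; passing `ENT_QUOTES` and `'UTF-8'` is a suggestion on my part, assuming your pages are served as UTF-8.

```php
<?php
$str = 'some <b>bold</b> text';

// strip_tags() drops the tags but keeps the text between them.
$stripped = strip_tags($str);

// htmlspecialchars() keeps everything but turns the markup into harmless entities.
$escaped = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n"; // some bold text
echo $escaped, "\n";  // some &lt;b&gt;bold&lt;/b&gt; text
```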

 

 

Link to comment
Share on other sites

Didn't know that you were using it for this purpose. The best method for sanitizing a user input string, in my opinion, is to escape the special characters before inserting the string into your DB. This guards against SQL injection, XSS, etc. You can use filter_var() and specify the filter to your liking, or you can use a combination of htmlspecialchars() and mysql_real_escape_string(). You can also use a regex to either disallow specific special characters or remove them completely, though the latter choice isn't very user friendly. It depends on what your logic for this is.

Link to comment
Share on other sites

the best method for sanitizing a user input string in my opinion is to escape the special characters before inserting the string into your db..

 

There's no need to HTML-escape the data within the database; it won't do any harm there. It would also take up more space with all the HTML in its entity form. HTML escaping is only required when you *output* the data. Of course, you should still protect against SQL injection before using the data within a SQL string.
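
As a rough sketch of that split (store raw, escape on output): the example below swaps in a PDO prepared statement instead of mysql_real_escape_string(), since the old mysql_* functions were removed in PHP 7. It assumes the pdo_sqlite extension is available, with an in-memory SQLite database standing in for MySQL; the `comments` table is made up for illustration.

```php
<?php
// Assumption: pdo_sqlite is loaded; an in-memory SQLite database stands in for MySQL.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)'); // hypothetical table

$input = "It's <script>alert(1)</script> here";

// Store the text exactly as typed; the placeholder keeps it out of the SQL syntax.
$stmt = $pdo->prepare('INSERT INTO comments (body) VALUES (:body)');
$stmt->execute([':body' => $input]);

// HTML-escape only at the point of output.
$stored = $pdo->query('SELECT body FROM comments WHERE id = 1')->fetchColumn();
$safe = htmlspecialchars($stored, ENT_QUOTES, 'UTF-8');
echo $safe;
```

The database holds the original text untouched, so you can still re-display or edit it later without undoing any entity encoding.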

Link to comment
Share on other sites

Thanks for the information guys.

 

One thing though, as I am not so clued up on inserting data in a safe manner apart from mysql_real_escape_string(). Are you saying that as long as the data is made safe on entering and exiting the database, there would never be an issue of security here?

 

Thanks

Simon

 

 

Link to comment
Share on other sites

Nope. I'm saying that if you sanitize the data correctly, users won't be able to inject exploits. One situation to be aware of is when you use numeric data and don't put quotes around it in the SQL:

 

select * from TableName where id = $id;

 

Here the user could insert an SQL injection such as "1 OR 1=1", which would break the WHERE condition logic. mysql_real_escape_string() wouldn't protect against this, as there are no quotes or special characters involved. You need to validate the data or cast it to an integer. Obviously in this case it wouldn't do any actual damage, but consider an UPDATE statement with the same exploit...
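
A minimal sketch of that validation, assuming ids are positive whole numbers. `cleanId()` is a name made up for illustration, and the hard-coded string stands in for whatever arrives in the request.

```php
<?php
// Hypothetical helper: returns the id as an int, or null when the input is not a plain integer.
function cleanId($raw): ?int
{
    $id = filter_var($raw, FILTER_VALIDATE_INT, ['options' => ['min_range' => 1]]);
    return $id === false ? null : $id;
}

$raw = '1 OR 1=1';   // stands in for something like $_GET['id']
$id  = cleanId($raw);

if ($id === null) {
    echo 'Invalid id';
} else {
    // Only a real integer can reach the query string here.
    $sql = 'select * from TableName where id = ' . $id;
    echo $sql;
}
```

Casting with (int) also works, but note that (int) '1 OR 1=1' quietly becomes 1 instead of being rejected, so validating first tells you when someone sent junk.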

 

It's also down to your application code to prevent any logic-based security holes. For example, if the user modified the GET parameters to try and view a page they didn't have permission to see, your application should check this every time.
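
For what it's worth, such a check might look roughly like the sketch below. Everything in it is hypothetical: the `$permissions` array stands in for whatever your application stores, and `canView()` is a made-up name.

```php
<?php
// Hypothetical data: which user ids may view which page ids.
$permissions = [7 => [1, 2], 8 => [2]];

function canView(array $permissions, int $userId, int $pageId): bool
{
    // Unknown users get an empty list, so they are refused by default.
    return in_array($pageId, $permissions[$userId] ?? [], true);
}

$userId = 8;          // would normally come from the session, never from the request
$pageId = (int) '1';  // stands in for (int) $_GET['page']

// Run the check on every request, whatever the URL says.
if (!canView($permissions, $userId, $pageId)) {
    http_response_code(403);
    echo 'Forbidden';
}
```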

 

Security is a very broad subject; I would recommend reading this tutorial for a better understanding.

Link to comment
Share on other sites
