Is this secure?

doubledee · March 26, 2011

The code below allows me to insert articles into my website without having to hard-code them in the home page.

Is this code secure? (Someone told me I should use a switch statement instead?!)

<?php
	if (isset($_GET['article'])) {
		$articleFile = preg_replace('#[^A-z0-9_\-]#', '', $_GET['article']).'.php';

		if(file_exists($articleFile)) {
			include($articleFile);
		}else{
			$title = 'Article Not Found';
			$content = '';
		}
	}else{
		include('default.php');
	}
?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">

<head>
<title>Dynamic Content Example</title>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<link type="text/css" rel="stylesheet" href="css/pagelayout.css" />
<link type="text/css" rel="stylesheet" href="css/dropdown.css" />
</head>

<body>
<div id="wrapper" class="clearfix">
	<div id="inner">
		<div id="header">
			<!-- DROP-DOWN MENU -->
			<ul id="topMenu">
				<li class="current"><a href="?article=article1">Article 1</a></li>
				<li><a href="?article=article2">Article 2</a></li>
				<li><a href="?article=article3">Article 3</a></li>
				<!-- and so on... -->
			</ul><!-- End of TOPMENU -->

		</div>
		<div id="left">
			<p>
				Other content goes here : Other content goes here : Other content goes here :
			</p>
		</div>
		<div id="middle">
			<div id="content">
				<h2>MAIN CONTENT</h2>
				<p>
				<!-- Dynamically insert Article here using PHP include!! -->
				<?php echo $content; ?>
				</p>
			</div>
		</div>
		<div id="right">
			<p>
				Advertising goes here : Advertising goes here : Advertising goes here :
			</p>
		</div>
	</div>
	<div id="l"></div>
	<div id="r"></div>
</div>
<div id="footer">
	<p>footer</p>
</div>
</body>

</html>

If there is a better way to accomplish the same thing, and/or a more secure way, I would be interested in hearing about it.

Thanks,

Debbie

gizmola · March 26, 2011

The concern from a security standpoint is that someone will use this routine to attempt to get the webserver to include a file you wouldn't want them to.

There are 2 security provisions here: preg_replace('#[^A-z0-9_\-]#'... uses a regex to strip out any characters that aren't alphanumeric, '_' or '-'.

The other provision is that it's adding the .php at the end, so only php files (which will be parsed as php code) will be looked at.

The only issue I see is that a person could probably crash your server by getting this script to include itself, which would recurse until the process ran out of memory or exceeded execution time. This is because the code will include any php file in the same directory as this script, including any of the scripts that are part of your site.

gizmola · March 26, 2011

Also a switch statement is really an alternative to a large "if then elseif" construct, and not applicable to this particular approach.

doubledee · March 26, 2011

The concern from a security standpoint is that someone will use this routine to attempt to get the webserver to include a file you wouldn't want them to.

There are 2 security provisions here: preg_replace('#[^A-z0-9_\-]#'... uses a regex to strip out any characters that aren't alphanumeric, '_' or '-'.

The other provision is that it's adding the .php at the end, so only php files (which will be parsed as php code) will be looked at.

Is the regular expression correct?

What I hear you saying is that this is pretty secure because of the two measures you have mentioned, right?

The only issue I see is that a person could probably crash your server by getting this script to include itself, which would recurse until the process ran out of memory or exceeded execution time. This is because the code will include any php file in the same directory as this script, including any of the scripts that are part of your site.

Well, I will be storing my articles in a separate directory - probably called "articles" - so that should solve that issue, right?

Debbie

doubledee · March 26, 2011

Also a switch statement is really an alternative to a large "if then elseif" construct, and not applicable to this particular approach.

Well, I like the above code much better, because a switch statement would get REALLY unwieldy with my top menu and sub-menu. I mean there could be upwards of 100 different choices, so that is a humongous switch or If-Then-Else, right?!

(BTW, this is the original question that I posted a week or so ago and got all kinds of flak because my question was too vague.)

I think this solution works nicely for what I want, which is the ability to pull out *content* from my home page template.

I suppose it could be made better by putting my articles in a database, but honestly, I think that is overkill for now.

At any rate, is there anything else I could do to the above code to make it more secure and/or more efficient? (If it looks good enough, then I'd like to run with it this weekend setting up my website!)

Thanks,

Debbie

gizmola · March 26, 2011

To your first point - yes forcing the files to all be in a particular directory would solve the problem I brought up, and is what I'd suggest as well.
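A minimal sketch of that idea, assuming the articles live in one dedicated directory. The helper name resolveArticle() is made up for illustration. It validates the name first, then uses realpath() to confirm that the resolved file really sits inside that directory:

```php
<?php
// Hypothetical helper: resolve an article name to a file inside one directory.
// Returns the full path, or null when the name is invalid or the file is missing.
function resolveArticle(string $name, string $articleDir): ?string
{
    // Accept only letters, digits, underscore and hyphen (\A and \z anchor
    // the whole string, so a trailing newline is rejected too).
    if (!preg_match('/\A[A-Za-z0-9_-]+\z/', $name)) {
        return null;
    }

    $base = realpath($articleDir);
    if ($base === false) {
        return null; // the articles directory itself does not exist
    }

    // realpath() returns false for a file that does not exist.
    $path = realpath($base . DIRECTORY_SEPARATOR . $name . '.php');

    // The prefix check confirms the file is inside the articles directory.
    if ($path === false || strpos($path, $base . DIRECTORY_SEPARATOR) !== 0) {
        return null;
    }

    return $path;
}
```

In the page script this could be called as resolveArticle($_GET['article'], __DIR__ . '/articles') after checking that the parameter is set and is a string, with include() run only when the result is not null.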

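The regex is close, but it has one flaw. Regular expressions can be tricky, but that one is fairly straightforward. It's using a "character class" as denoted by the '[]'. A "^" as the first character inside a character class means "not", so you are saying: match any character that is "not" one of the ones listed in the class. After that come the letter and digit ranges and the underscore. The flaw is the range "A-z": it runs through the ASCII table from "A" to "z", so it also lets through the six characters that sit between "Z" and "a" (the square brackets, the backslash, the caret, the underscore and the backtick). Write "A-Za-z" instead. Because the '-' character is special inside a character class (it denotes a range), it is escaped as '\-' at the end; a hyphen placed last in the class would also work unescaped.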
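One detail of the posted pattern is easy to check with a quick test: the range A-z covers ASCII 65 through 122, which includes a few punctuation characters as well as the letters. The sketch below compares it with A-Za-z on a made-up input string:

```php
<?php
// The range A-z runs from ASCII 65 to 122, so it also covers [ \ ] ^ _ and `.
// Sample input (made up): letters plus the punctuation that sits in that gap.
$input = 'my[article]^1`\\';

// Pattern as posted: nothing is stripped from this input.
$wide = preg_replace('#[^A-z0-9_\-]#', '', $input);

// Explicit upper- and lower-case ranges: the punctuation is removed.
$strict = preg_replace('#[^A-Za-z0-9_\-]#', '', $input);

var_dump($wide === $input);   // bool(true)
var_dump($strict);            // string(10) "myarticle1"
```

The backslash is the one to watch on Windows, where it acts as a directory separator.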

Since I'm on a PC I use http://weitz.de/regex-coach/ for testing regex routines. While not as fully featured, if you're on a Mac you can use http://sourceforge.net/projects/quregexmm/. Both give you a way to interactively test out a regex so you have a better idea of whether or not it's going to work for you.

For the case statement - I agree. For a group of URLs that will not change, a switch statement would be a good solution. For articles, where the assumption is that you will be adding to them, you want something that will not require you to recode your site every time you want to add a new article.

The only question you should seriously consider is whether or not you really want to have your articles be php files, especially if you don't really need or want them to have actual php code inside of them. You can include text files of any type, so this approach would work just as well for .html files and that might even be preferable, depending on your workflow and the tools you will be using to author your articles.

doubledee · March 26, 2011

Okay, good, so it sounds like my code - while simple - is fundamentally sound!

The only question you should seriously consider is whether or not you really want to have your articles be php files, especially if you don't really need or want them to have actual php code inside of them. You can include text files of any type, so this approach would work just as well for .html files and that might even be preferable, depending on your workflow and the tools you will be using to author your articles.

Hmmm... That spawns a bunch of questions and comments!

1.) Yes, if the articles were HTML files, then they would be useful in and of themselves, which is a good thing.

2.) Really newbie question... But can I "include" an HTML file and have it work the same way as I have things currently set up using PHP files? (i.e. Can I include an HTML file and insert/mesh it with a larger PHP file so it becomes one file to the user?)

3.) Do I compromise security using either a .txt or .html file? (If I keep my articles as .php, then if someone ever ran them they wouldn't see anything. Of course, they are supposed to see them - at least if they use my site the right way?!)

4.) What are the pros/cons of keeping things as separate files (e.g. HTML or PHP)?

5.) What is the benefit of putting the articles in a MySQL database?

The big thing I can see is that you can associate lots of meta-data (e.g. author, create date, update date, categories, etc) to the article itself.

Thanks,

Debbie

gizmola · March 26, 2011

Okay, good, so it sounds like my code - while simple - is fundamentally sound!

The only question you should seriously consider is whether or not you really want to have your articles be php files, especially if you don't really need or want them to have actual php code inside of them. You can include text files of any type, so this approach would work just as well for .html files and that might even be preferable, depending on your workflow and the tools you will be using to author your articles.

Hmmm... That spawns a bunch of questions and comments!

1.) Yes, if the articles were HTML files, then they would be useful in and of themselves, which is a good thing.

2.) Really newbie question... But can I "include" an HTML file and have it work the same way as I have things currently set up using PHP files? (i.e. Can I include an HTML file and insert/mesh it with a larger PHP file so it becomes one file to the user?)

Yes, because PHP and html can be intermingled. PHP drops out of "php parsing mode" whenever it includes a file, until it sees the start of a php block (the <?php).
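One consequence worth seeing in action: include() will run any PHP block it finds in the file, whatever the extension, while readfile() only copies the bytes to the output. The sketch below uses a throwaway temp file with made-up content to show the difference:

```php
<?php
// Write a small "article" that contains HTML plus an embedded PHP block.
$file = tempnam(sys_get_temp_dir(), 'art');
file_put_contents($file, '<p>Hello</p><?php echo "executed"; ?>');

// include() starts in HTML mode, switches to PHP mode at the opening tag,
// and runs the code it finds there.
ob_start();
include $file;
$included = ob_get_clean();

// readfile() copies the file to the output untouched; nothing is executed.
ob_start();
readfile($file);
$raw = ob_get_clean();

unlink($file);

var_dump($included); // string(20) "<p>Hello</p>executed"
var_dump($raw === '<p>Hello</p><?php echo "executed"; ?>'); // bool(true)
```

So if the articles are meant to be plain HTML with no PHP in them, readfile() is a reasonable alternative to include().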

3.) Do I compromise security using either a .txt or .html file? (If I keep my articles as .php, then if someone ever ran them they wouldn't see anything. Of course, they are supposed to see them - at least if they use my site the right way?!)

It just depends on what you have in the files. The advantage of .html files is that they will still be served directly by the web server if they are in web space, and that you can use a WYSIWYG editor to check that the articles look the way you want them to. One other thing about using include() is that the files in question do not have to be in web space. You can pick a location for the directory that is not under the webroot and it will still work, so you can have a special directory for the articles so that they are not accessible via a url.

4.) What are the pros/cons of keeping things as separate files (e.g. HTML or PHP)?

The pros are that this would be a reasonably high performance solution that requires very little in the way of infrastructure or moving parts. The cons are that you have to upload the files in order to publish them, and you really have no intrinsic way to test them before publishing.

There's also a question of navigation, although even that could be built into your system using routines like opendir() & readdir() to build a list of articles. You become highly dependent on the operating system and things like the file creation time, if for example, you want to have a list of articles show in the order in which they were added.
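A rough sketch of such a listing, assuming the articles are .html files in a single directory. The helper name listArticles() is made up, and it uses glob() rather than opendir()/readdir() only to keep the example short:

```php
<?php
// Hypothetical helper: list article names in a directory, newest first.
function listArticles(string $dir): array
{
    $articles = [];
    foreach (glob($dir . '/*.html') ?: [] as $path) {
        // filemtime() is the last-modified time. A true creation time is not
        // portably available, which is the OS dependence mentioned above.
        $articles[basename($path, '.html')] = filemtime($path);
    }
    arsort($articles); // sort by timestamp, newest first, keeping the names

    return array_keys($articles);
}
```

Note that editing an old article changes its modification time and therefore its position in the list, which is one reason to keep dates somewhere other than the file system.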

5.) What is the benefit of putting the articles in a MySQL database?

The big thing I can see is that you can associate lots of meta-data (e.g. author, create date, update date, categories, etc) to the article itself.

Yes those are the benefits, as well as supporting additional related data like user comments. As data piles up, the ability of the database to have indexed searching gives you efficiency when you want to do things like paginate articles using different criteria, or provide a list of all the articles by a particular author. There are file based solutions for any of these problems, as well, but you have to add your own procedural code to get the same features.
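To make that concrete, here is a small sketch of "articles by one author, paginated". It uses PDO with an in-memory SQLite database instead of MySQL purely so that it runs on its own (this assumes the pdo_sqlite extension is enabled). The table layout and the sample rows are invented for illustration:

```php
<?php
// In-memory SQLite stands in for MySQL so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

// Hypothetical schema: one row of meta-data per article.
$db->exec('CREATE TABLE articles (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    author TEXT NOT NULL,
    created_at TEXT NOT NULL
)');
// The index lets "articles by author X, newest first" avoid a full scan.
$db->exec('CREATE INDEX idx_author_created ON articles (author, created_at)');

// Made-up sample rows.
$insert = $db->prepare(
    'INSERT INTO articles (title, author, created_at) VALUES (?, ?, ?)'
);
foreach ([
    ['First post',  'Debbie', '2011-03-01'],
    ['Second post', 'Debbie', '2011-03-15'],
    ['Guest post',  'Alex',   '2011-03-20'],
    ['Third post',  'Debbie', '2011-03-26'],
] as $row) {
    $insert->execute($row);
}

// Page 1 of Debbie's articles, two per page.
$perPage = 2;
$page    = 1;
$stmt = $db->prepare(
    'SELECT title FROM articles WHERE author = :author
     ORDER BY created_at DESC LIMIT :limit OFFSET :offset'
);
$stmt->bindValue(':author', 'Debbie');
$stmt->bindValue(':limit', $perPage, PDO::PARAM_INT);
$stmt->bindValue(':offset', ($page - 1) * $perPage, PDO::PARAM_INT);
$stmt->execute();

$titles = $stmt->fetchAll(PDO::FETCH_COLUMN);
print_r($titles); // Third post, Second post
```

With files alone, the same query would mean opening every article to find out who wrote it and when.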

For an article oriented site, it's a good idea to look at the various "NoSQL" systems that have become popular in recent years, as a number of them are document oriented. CouchDB and MongoDB would be 2 of particular interest. A relational database isn't very good at dealing with text, and since you brought up meta data, you might find that a document oriented db is a better fit if you're primarily dealing with documents.

Probably a larger benefit is that using a database tends to be a better platform for facilitating a web based authoring system based on forms. You can still deal with documents, but there are a lot of permissions issues that come into play.

doubledee · March 26, 2011

Interesting conversation so far. Thanks for all of the comments!

2.) Really newbie question... But can I "include" an HTML file and have it work the same way as I have things currently set up using PHP files? (i.e. Can I include an HTML file and insert/mesh it with a larger PHP file so it becomes one file to the user?)

Yes, because PHP and html can be intermingled. PHP drops out of "php parsing mode" whenever it includes a file, until it sees the start of a php block (the <?php).

Okay, that is good to know.

3.) Do I compromise security using either a .txt or .html file? (If I keep my articles as .php, then if someone ever ran them they wouldn't see anything. Of course, they are supposed to see them - at least if they use my site the right way?!)

It just depends on what you have in the files. The advantage of .html files is that they will still be served directly by the web server if they are in web space, and that you can use a WYSIWYG editor to check that the articles look the way you want them to. One other thing about using include() is that the files in question do not have to be in web space. You can pick a location for the directory that is not under the webroot and it will still work, so you can have a special directory for the articles so that they are not accessible via a url.

True, but since I develop using NetBeans, I could still use included PHP files and see how the final PHP page looks just fine. (That is what I am doing on this code block we are discussing now. Maybe not sophisticated, but it works for me!)

I like the idea of separating the files.

Are there any reasons from a performance or security standpoint that you would not want to put your articles outside of the webroot? (Or did you just mean during creation?)

4.) What are the pros/cons of keeping things as separate files (e.g. HTML or PHP)?

The pros are that this would be a reasonably high performance solution that requires very little in the way of infrastructure or moving parts. The cons are that you have to upload the files in order to publish them, and you really have no intrinsic way to test them before publishing.

See my comment about NetBeans above.

There's also a question of navigation, although even that could be built into your system using routines like opendir() & readdir() to build a list of articles. You become highly dependent on the operating system and things like the file creation time, if for example, you want to have a list of articles show in the order in which they were added.

Okay, that might be an issue down the road.

5.) What is the benefit of putting the articles in a MySQL database?

The big thing I can see is that you can associate lots of meta-data (e.g. author, create date, update date, categories, etc) to the article itself.

Yes those are the benefits, as well as supporting additional related data like user comments. As data piles up, the ability of the database to have indexed searching gives you efficiency when you want to do things like paginate articles using different criteria, or provide a list of all the articles by a particular author. There are file based solutions for any of these problems, as well, but you have to add your own procedural code to get the same features.

Good to know, but for now, I don't think the benefits of a database are that large. (I'll just be happy if I have time to write 20-30 articles to flesh out my website!)

For an article oriented site, it's a good idea to look at the various "NoSQL" systems that have become popular in recent years, as a number of them are document oriented. CouchDB and MongoDB would be 2 of particular interest. A relational database isn't very good at dealing with text, and since you brought up meta data, you might find that a document oriented db is a better fit if you're primarily dealing with documents.

Can you give me an overview of how they differ from a regular database? (I'm guessing that they use a table with meta-data that is related to a physical file, e.g. "MyArticle.html"?!)

Probably a larger benefit is that using a database tends to be a better platform for facilitating a web based authoring system based on forms. You can still deal with documents, but there are a lot of permissions issues that come into play.

Okay, however, since this is a one-woman-show, I don't think that is such a big deal for now at least.

Thanks,

Debbie

doubledee · March 26, 2011

To make each article file more identifiable, it would seem that using the article's title as a filename would be a good strategy. (I have been doing that for over a decade on my own computer when I save a news article, e.g. "RisingGasPricesCouldStallRecovery.html".)

What are your thoughts on this versus using some derived naming convention (e.g. primary key "8951204329.html")?

Also, I see a lot of online newspapers that have user-friendly URLs like this:

www.usatoday.com/money/smallbusiness/startup/abrams-week2-position-yourself.htm

How do I go from my article to a URL like that?

1.) First, how do I go from a file name to a friendly filename like in the URL above?

2.) What is the most efficient way to include a directory structure in my code above?
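For the first question, one common approach is to derive a "slug" from the article title and use that as the file name. The sketch below is only an illustration, and the function name slugify() is made up. It lower-cases the title and replaces every run of other characters with a single hyphen:

```php
<?php
// Hypothetical helper: turn an article title into a URL-friendly file name.
function slugify(string $title): string
{
    $slug = strtolower($title);
    // Replace every run of characters that are not a-z or 0-9 with one hyphen.
    $slug = preg_replace('/[^a-z0-9]+/', '-', $slug);

    // Drop any hyphen left at the start or end.
    return trim($slug, '-');
}

echo slugify('Rising Gas Prices Could Stall Recovery');
// prints rising-gas-prices-could-stall-recovery
```

The result contains only lower-case letters, digits and hyphens, so it passes through the character filter in the code at the top of the thread unchanged.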

I tried referencing my articles sub-directory that you encouraged me to use, but I can't get it working in Netbeans...

<body>
<div id="wrapper" class="clearfix">
	<div id="inner">
		<div id="header">
			<!-- DROP-DOWN MENU -->
			<ul id="topMenu">
				<li class="current"><a href="/articles/?article=article1">Article 1</a></li>
				<li><a href="/articles/?article=article2">Article 2</a></li>
				<li><a href="/articles/?article=article3">Article 3</a></li>
				<!-- and so on... -->
			</ul><!-- End of TOPMENU -->

Adding even a simple file path in the HTML code makes things ugly too!

Debbie
