OO PHP Part 2: Boring OO Principles

by John Kleijn on Jun 7, 2008 11:20:14 AM - 90,971 views

Introduction

How about that catchy title, eh? If you are reading this, you must be REALLY dedicated to learning about Object Oriented applications!

This article will try to explain some OO principles to you, as well as some ‘good practice’.

Know what to expect: there will be no funny pictures, no diagrams. There will be some code, but that won’t be very exciting either. This is all pretty dry stuff. Regardless, it is important. This tutorial will include some of the things intentionally left out of part 1, as mentioned in section 1.2 of that tutorial.

Still feel like it? Come on then, time to get your foundations in place!

Index

1 Core OO(P) Principles
1.1 Inheritance
1.2 Polymorphism
1.3 Encapsulation
2 Coupling, Cohesion and Some Related Principles
2.1 A practical example
2.2 Single Responsibility Principle (SRP)
2.3 Don’t Repeat Yourself (DRY)
2.4 Open-Closed Principle (OCP)
3 Defensive Programming
4 Heuristics
4.1 Avoid global data
4.2 All properties should be private
5 In conclusion

1 Core OO(P) Principles

1.1 Inheritance

Remember these classes from part 1?

class Animal 
{
	public $hungry = 'hell yeah.';
	
	function eat($food)
	{
		$this->hungry = 'not so much.';
	}	
}
class Dog extends Animal 
{
	function eat($food)
	{
		if($food == 'cookie')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			echo 'barf, I only like cookies!';
		}
	}
}

I used them to explain the concept of inheritance to you. Class Dog inherits Animal’s properties and methods when an object is instantiated, if it doesn’t re-declare them itself.

Animal could be the base class for other animal classes, like Bird and Cat. Subclasses like these would have a common interface defined by Animal. Confused? Sharing an interface simply means other components of your application can interact with the classes in a similar way. This interaction is defined by the calling routines: the methods and their arguments. (The name, arguments and return value of a method are together also referred to as the method signature.)

In this ridiculously simple example, eat is the whole of the interface. Any class extending Animal is guaranteed to have an eat method which takes a single argument, because even if it doesn’t declare one itself, Animal defines the default behaviour.
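To make the ‘default behaviour’ point concrete, here is a minimal sketch. Cat is only mentioned in passing above, so its empty body here is an assumption made for illustration: because it declares nothing of its own, it falls back on everything Animal defines.

```php
<?php
class Animal 
{
	public $hungry = 'hell yeah.';
	
	function eat($food)
	{
		$this->hungry = 'not so much.';
	}	
}

// Hypothetical subclass: it re-declares nothing, so it inherits
// both the $hungry property and the eat() method from Animal.
class Cat extends Animal 
{
}

$cat = new Cat();
$before = $cat->hungry;  // value inherited from Animal

$cat->eat('fish');       // Animal's default eat() runs
$after = $cat->hungry;
```

Other code can call eat() on a Cat without knowing, or caring, whether Cat wrote that method itself.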

1.2 Polymorphism

Let’s take a look at a possible Bird…

class Bird extends Animal 
{	
	function eat($food)
	{
		if($food == 'seed')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			echo 'barf, I only like seed!';
		}
	}
}

As you can see, Bird isn’t all that different from Dog. They both eat food. However, because they are not the same type of Animal, they behave differently in response to the $food they are being fed. This is referred to as having their own implementations of the interface provided by Animal, or simply: implementations of Animal.

This is at the core of the principle of polymorphism: Different types of objects can be handled in the same way, even though their implementations vary.
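A small sketch of what ‘handled in the same way’ can look like in practice. The feed() function is hypothetical (it is not part of the tutorial’s code); it is written against Animal only, yet the outcome depends on which implementation it receives.

```php
<?php
class Animal 
{
	public $hungry = 'hell yeah.';
	
	function eat($food)
	{
		$this->hungry = 'not so much.';
	}	
}
class Dog extends Animal 
{
	function eat($food)
	{
		if($food == 'cookie')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			echo "barf, I only like cookies!\n";
		}
	}
}
class Bird extends Animal 
{	
	function eat($food)
	{
		if($food == 'seed')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			echo "barf, I only like seed!\n";
		}
	}
}

// Hypothetical helper: it only knows about the Animal interface,
// not about which concrete type it has been handed.
function feed(Animal $animal, $food)
{
	$animal->eat($food);
	
	return $animal->hungry;
}

$dog = new Dog();
$bird = new Bird();

$dogResult = feed($dog, 'cookie');   // Dog accepts the cookie
$birdResult = feed($bird, 'cookie'); // Bird refuses it and stays hungry
```

Adding another Animal subclass later would not require any change to feed().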

1.3 Encapsulation

Encapsulation refers to ‘isolating’ meaningful parts of your application from each other. This involves hiding information (both data and the implementation of a specific behaviour), and breaking your application into meaningful parts. Often encapsulation is used as a synonym for Information Hiding. But this is only half of the story. The other half, breaking your application into meaningful parts, will be covered in the next chapter.

For a practical example of Information Hiding, check out the heuristic “All Properties Should be Private” later in this tutorial.

2 Coupling, Cohesion and Some Related Principles

Coupling describes the relationship between different parts of an application, and is an indication of dependency between those parts.

Cohesion is the degree to which different parts form a meaningful unit. 'Parts' can refer to pieces of software on any level, from components, to packages, sub packages, classes, methods, or even blocks of code. Cohesion is not the opposite of coupling. In fact, high cohesion can promote loose coupling, because a highly cohesive part usually has fewer (or more closely related) responsibilities.

Parts of an application are coupled when a structural or behavioural change in one requires a change in other parts of the application. The ‘amount’ of change required represents the level of coupling.

Decoupling simply means to separate a specific implementation from its context, by generalization and/or encapsulation. It is key to achieving reusability of code.

OK, I assume that at this point you are really bored, confused, or both. So let’s try a practical example.

2.1 A practical example

Imagine some CMS with data access logic scattered throughout the application. A part of it is a User object, which looks sorta like below.

class User
{
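	private $_db;
	
	private $_name;
	
	private $_password;
	
	private $_loggedIn = false;
	
	public function __construct(mysqli $db, $username)
	{
		$this->_db = $db;
		$this->_name = $username;
		
		// Quote and escape the value; unquoted, the query is invalid SQL
		$name = $this->_db->real_escape_string($username);
		$result = $this->_db->query("SELECT * FROM user WHERE username = '$name'");
		
		$row = $result->fetch_assoc();
		
		$this->_password = $row['password'];
	}
	
	public function login($password)
	{
		if($this->_password === hash('sha256', $password))
		{
			$this->_loggedIn = true;
			$this->_update();
			return true;
		}
		else 
		{
			return false;
		}
	}
	
	private function _update()
	{
		// Cast to int (false would interpolate as an empty string) and
		// restrict the update to this user instead of the whole table
		$name = $this->_db->real_escape_string($this->_name);
		$loggedIn = (int) $this->_loggedIn;
		
		$this->_db->query("UPDATE user SET logged_in = $loggedIn WHERE username = '$name'");
	}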
}

Note that this isn’t a mysqli tutorial and there are better ways to use mysqli. Just to keep things simple, we’ll stick with the query() method.

2.2 Single Responsibility Principle (SRP)

If you haven’t fallen asleep you might remember that it is important to strive for high cohesion – form a meaningful unit. The SRP is not much more than a nice handle for that goal.

Each component (e.g. a class) should have a single, obvious responsibility. Unfortunately this is not always as easy as it sounds. The biggest challenge lies in correctly formulating a component’s responsibility.

Look at the example class, and ask yourself: what does it do? The simple (but untrue) answer would be ‘it handles user related stuff’. The right answer is: it handles user related stuff AND maps its data to the database. This is a direct violation of the SRP (bring out the handcuffs!). So how do we fix this? We encapsulate the concept that varies (more about that in a later article). Database mapping has nothing to do directly with the concept of a user, so out it goes!

class User
{	
	private $_name;
	private $_password;
	private $_loggedIn;
	
	public function __construct($username, $password, $loggedIn = false)
	{
		$this->_name = $username;
		$this->_password = $password;
		$this->_loggedIn = $loggedIn;
	}
	
	public function login($password)
	{
		if($this->getPassword() === hash('sha256', $password))
		{
			$this->_loggedIn = true;
			
			return true;
		}
		else 
		{
			return false;
		}
	}
	
	public function getUserName()
	{
		return $this->_name;
	}
	
	public function getPassword()
	{
		return $this->_password;
	}
	
	public function isLoggedIn()
	{
		return $this->_loggedIn;
	}
}

Of course we still need to map to the database, so we create a Data Mapper (more about that later as well; for now, a simplified example).

class UserDataMapper
{
	private $_db;
		
	public function __construct(mysqli $db)
	{
		$this->_db = $db;
	}
	
	public function find($username)
	{
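		// Quote and escape the value; unquoted, the query is invalid SQL
		$name = $this->_db->real_escape_string($username);
		$result = $this->_db->query("SELECT * FROM user WHERE username = '$name'");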
		
		$row = $result->fetch_assoc();
		
		return new User($row['username'], $row['password'], $row['logged_in']);
	}
	
	public function update(User $user)
	{
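		// Cast to int (false would interpolate as an empty string) and
		// restrict the update to this user instead of the whole table
		$name = $this->_db->real_escape_string($user->getUserName());
		$loggedIn = (int) $user->isLoggedIn();
		
		$this->_db->query(
			"UPDATE user SET logged_in = $loggedIn WHERE username = '$name'"
		);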
	}
}

The User class is now independent of the SQL and database adapter used. Coupling between User and mysqli is non-existent. The client code is still coupled though:

//Configuration
$mapper = new UserDataMapper(new mysqli());

//Later
$user = $mapper->find('448191');

if(!$user->login('foo'))
{
	echo 'Sorry dude, you\'re not 448191..';
}

But that is where it belongs, in the place where you pick the components you want to use. Sometimes, all this talk of abstraction and reducing responsibilities can get people confused, because in the bigger picture you’re not reducing responsibilities: you are simply moving them to a different context, somewhere you can be comfortable about committing to a specific implementation without losing flexibility. You can, for example, use a configuration option to decide what database adapter to use.
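As a rough sketch of that last idea, the choice of implementation can be pushed into a single configuration step. Everything named here (UserFinder, the two mapper classes, createUserMapper() and the $config array) is hypothetical and only stands in for real mappers; no actual database is involved.

```php
<?php
// Hypothetical interface that every user mapper would share
interface UserFinder
{
	public function find($username);
}

// Stand-in for a mapper backed by MySQL (no real queries here)
class MysqlUserMapper implements UserFinder
{
	public function find($username)
	{
		return "$username (from mysql)";
	}
}

// Stand-in for a mapper backed by an in-memory array
class ArrayUserMapper implements UserFinder
{
	public function find($username)
	{
		return "$username (from array)";
	}
}

// The only place that commits to a specific implementation
function createUserMapper(array $config)
{
	switch($config['adapter'])
	{
		case 'mysql':
			return new MysqlUserMapper();
		case 'array':
			return new ArrayUserMapper();
		default:
			throw new InvalidArgumentException("Unknown adapter: '{$config['adapter']}'");
	}
}

//Configuration
$config = array('adapter' => 'array');
$mapper = createUserMapper($config);

//Later: client code depends only on the UserFinder interface
$found = $mapper->find('448191');
```

Swapping the adapter then means changing one configuration value, not hunting through the client code.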

We have achieved decoupling. Where does the cohesion come in? In this case, simply by striving for decoupling, we have achieved higher cohesion as well! On a class level, the _update() method is gone, and User now forms a more meaningful unit.

If we apply this throughout our badly written app, the different mappers and Domain Objects (that’s what User is – other examples are ShoppingCart, Post, Board, etc.) will form a meaningful unit as well. Achieving loose coupling and high cohesion is that easy. ;)

2.3 Don’t Repeat Yourself (DRY)

A linear script, simply executing top down, may have to do the same or a similar thing several times. In the procedural model, you try to encapsulate these repeated procedures in functions. You have done this before.

In the context of OOP, you are provided with way more opportunities to reduce repeated logic than when you code procedurally. Going back to our UserDataMapper, we may want to add more find methods that’ll find users based on different criteria. Let’s say we just want the all-time highest poster. Using the copy and paste approach, we could produce this:

class UserDataMapper
{
	private $_db;
		
	public function __construct(mysqli $db)
	{
		$this->_db = $db;
	}
	
	public function find($username)
	{
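		// Quote and escape the value; unquoted, the query is invalid SQL
		$name = $this->_db->real_escape_string($username);
		$result = $this->_db->query("SELECT * FROM user WHERE username = '$name'");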
		
		$row = $result->fetch_assoc();
		
		return new User($row['username'], $row['password'], $row['logged_in']);
	}
	
	public function findHighestPoster()
	{
		$result = $this->_db->query(
			"SELECT user_id, username, password, logged_in, COUNT(post_id) AS postcount 
				FROM user
				JOIN posts USING(user_id)
			GROUP BY user_id ORDER BY postcount DESC LIMIT 1"
		);
		
		$row = $result->fetch_assoc();
		
		return new User($row['username'], $row['password'], $row['logged_in']);		
	}
	
	public function update(User $user)
	{
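		// Cast to int (false would interpolate as an empty string) and
		// restrict the update to this user instead of the whole table
		$name = $this->_db->real_escape_string($user->getUserName());
		$loggedIn = (int) $user->isLoggedIn();
		
		$this->_db->query(
			"UPDATE user SET logged_in = $loggedIn WHERE username = '$name'"
		);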
	}
}

But find and findHighestPoster now have very similar implementations. In fact, the only difference is the query used. It also reveals a responsibility of UserDataMapper we overlooked: creating new User objects. How do we fix this? There are two things that violate DRY: getting an array out of a query and initializing a User object. It would be overkill to move the fetching of an array to a different type, so we’ll just abstract it out.

How do we do that? We create an abstract class which all DataMappers will extend. This is where we will put the behaviour that is common to all types of Data Mappers.

abstract class DataMapper
{
	protected $_db;
		
	public function __construct(mysqli $db)
	{
		$this->_db = $db;
	}
	
	protected function _fetchSingleRecordAssoc($query)
	{
		$result = $this->_db->query($query);
		
		return $result->fetch_assoc();
	}
}
class UserDataMapper extends DataMapper
{
	public function find($username)
	{		
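		// Quote and escape the value; unquoted, the query is invalid SQL
		$name = $this->_db->real_escape_string($username);
		$row = $this->_fetchSingleRecordAssoc("SELECT * FROM user WHERE username = '$name'");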
		
		return new User($row['username'], $row['password'], $row['logged_in']);
	}
	
	public function findHighestPoster()
	{
		$row = $this->_fetchSingleRecordAssoc(
			"SELECT user_id, username, password, logged_in, COUNT(post_id) AS postcount 
				FROM user
				JOIN posts USING(user_id)
			GROUP BY user_id ORDER BY postcount DESC LIMIT 1"
		);
		
		return new User($row['username'], $row['password'], $row['logged_in']);		
	}
	
	public function update(User $user)
	{
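		// Cast to int (false would interpolate as an empty string) and
		// restrict the update to this user instead of the whole table
		$name = $this->_db->real_escape_string($user->getUserName());
		$loggedIn = (int) $user->isLoggedIn();
		
		$this->_db->query(
			"UPDATE user SET logged_in = $loggedIn WHERE username = '$name'"
		);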
	}
}

That fixes one problem, one to go. Instantiating a new User makes UserDataMapper coupled to the User class. There’s no way we can really eliminate this, but we can reduce it. We could do that in a variety of ways, but for the sake of brevity, we are simply going to abstract creating a new User. Then at least, we won’t be violating DRY, just the SRP, and I think we can get away with a fine and a month’s probation.

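public function find($username)
{		
	// Quote and escape the value; unquoted, the query is invalid SQL
	$name = $this->_db->real_escape_string($username);
	
	return $this->userFactory(
		$this->_fetchSingleRecordAssoc("SELECT * FROM user WHERE username = '$name'")
	);	
}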
public function userFactory(array $row)
{
	return new User($row['username'], $row['password'], $row['logged_in']);
}
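With creation centralised, every finder boils down to ‘get a row, hand it to the factory’. The sketch below shows that shape without a real database: StubUserDataMapper and its canned rows are hypothetical, and User is trimmed down to what the example needs.

```php
<?php
class User
{	
	private $_name;
	private $_password;
	private $_loggedIn;
	
	public function __construct($username, $password, $loggedIn = false)
	{
		$this->_name = $username;
		$this->_password = $password;
		$this->_loggedIn = $loggedIn;
	}
	
	public function getUserName()
	{
		return $this->_name;
	}
}

// Hypothetical stand-in for UserDataMapper: rows come from an array
// instead of mysqli, so the example runs without a database.
class StubUserDataMapper
{
	private $_rows;
	
	public function __construct(array $rows)
	{
		$this->_rows = $rows;
	}
	
	public function find($username)
	{
		return $this->userFactory($this->_rows[$username]);
	}
	
	public function findHighestPoster()
	{
		// Pretend the first row is the result of the postcount query
		return $this->userFactory(reset($this->_rows));
	}
	
	// The single place where User objects are created
	public function userFactory(array $row)
	{
		return new User($row['username'], $row['password'], $row['logged_in']);
	}
}

$mapper = new StubUserDataMapper(array(
	'448191' => array('username' => '448191', 'password' => 'some-hash', 'logged_in' => false)
));

$user = $mapper->find('448191');
$top = $mapper->findHighestPoster();
```

If the User constructor ever changes, userFactory() is the only method that has to follow.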

2.4 Open-Closed Principle (OCP)

The open closed principle prescribes that components should be “closed for modification” but “open for extension”.

If you go back to the example in chapter 1.1, you’ll see this principle in effect. At some point a component is complete and tested. When you need additional functionality or alternative behaviour, you could go back in and modify the component. But why change something that does exactly what you want it to do?

Instead we declare Animal closed (although we do not have any means to enforce this) for modification. It is still open for extension, and Dog gratefully takes advantage of this by redefining (thus overriding) method eat() to suit its own implementation of how an Animal should behave.

The OCP is very closely related to the core OOP principles of inheritance, encapsulation and polymorphism.

Boring, no? The bottom line: try to make your components extensible. That way you don’t have to hack code that works fine to support new behaviour.
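A minimal sketch of extending without modifying. Puppy is a hypothetical class, not part of the tutorial: it adds one new behaviour and hands everything else to the existing Dog implementation via parent::eat(), leaving Animal and Dog untouched.

```php
<?php
class Animal 
{
	public $hungry = 'hell yeah.';
	
	function eat($food)
	{
		$this->hungry = 'not so much.';
	}	
}
class Dog extends Animal 
{
	function eat($food)
	{
		if($food == 'cookie')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			echo "barf, I only like cookies!\n";
		}
	}
}

// Hypothetical extension: new behaviour lives in a subclass,
// while Animal and Dog stay exactly as they were.
class Puppy extends Dog 
{
	function eat($food)
	{
		if($food == 'milk')
		{
			$this->hungry = 'not so much.';
		}
		else 
		{
			// Anything else is handled by the existing Dog behaviour
			parent::eat($food);
		}
	}
}

$milkPuppy = new Puppy();
$milkPuppy->eat('milk');     // new behaviour

$cookiePuppy = new Puppy();
$cookiePuppy->eat('cookie'); // inherited Dog behaviour

$seedPuppy = new Puppy();
$seedPuppy->eat('seed');     // Dog refuses, so the puppy stays hungry
```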

3 Defensive Programming

Defensive programming is ‘hope for the best, plan for the worst’. When you write a component, you don’t just assume that it will be used correctly. You validate the arguments and context, and let the component fail when you find something amiss.

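// Custom exception type; assumed here to be a plain subclass
class NotATastyCookieException extends InvalidArgumentException
{
}
class Dog extends Animal 
{
	function eat($cookie)
	{
		// At this point Animal still declares the property as public $hungry
		switch($cookie)
		{
			case 'grain cookie':
				$this->hungry = 'Not so much.';
			break;
			case 'chocolate chip cookie':
				$this->hungry = 'I could eat some more.';
			break;
			case 'almond cookie':
				$this->hungry = 'Hungry for more.';
			break;
			default: 
				throw new NotATastyCookieException("Unknown cookie: '$cookie'");
		}
	}
}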

We will build on this example, and provide more insight into defensive programming in the next chapter.
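From the calling side, ‘letting the component fail’ looks something like the sketch below. NotATastyCookieException is assumed here to be a simple subclass of InvalidArgumentException, and Dog is cut down to a single accepted cookie to keep the example short.

```php
<?php
// Assumed definition: a simple custom exception type
class NotATastyCookieException extends InvalidArgumentException
{
}

class Animal 
{
	public $hungry = 'hell yeah.';
	
	function eat($food)
	{
		$this->hungry = 'not so much.';
	}	
}

class Dog extends Animal 
{
	function eat($cookie)
	{
		switch($cookie)
		{
			case 'grain cookie':
				$this->hungry = 'Not so much.';
			break;
			default: 
				throw new NotATastyCookieException("Unknown cookie: '$cookie'");
		}
	}
}

$dog = new Dog();
$message = null;

try
{
	$dog->eat('dog biscuit');
}
catch(NotATastyCookieException $e)
{
	// The component refused the input instead of silently accepting it
	$message = $e->getMessage();
}
```

The caller finds out immediately that something is wrong, and the Dog’s state is left as it was.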

4 Heuristics

There are many “heuristics” in OOD that, according to many, should be taken into account.

These “lessons to be learned from others’ experience” can indeed help you to avoid problems. They focus on a specific design decision, and are usually the result of trying to conform to OOD principles. OOD heuristics are not at all free of debate though.

Often OOD heuristics predefine the solution to a problem before you really encounter it. Therefore I like to think of them as ‘rules of thumb’: in most cases they hold true.

My advice: adopt these guidelines, but if they limit you in a way you cannot compensate for, start looking for alternative solutions.

There are many more published and more descriptive heuristics, but I’m unable (and unwilling) to list all of them.

4.1 Avoid global data

Also known as ‘All global data is evil’. It pains me when I see people do stuff like this:

function foo()
{
	global $abc, $lkj, $yty;

	$abc->do($lkj, $yty);
}

Do you have any idea what’s going on here? Neither do I. Not only is it unclear what is in those globals, it is also impossible to trace back the execution path by looking at the code. This makes it impossible to debug, unless you wrote it yourself and have an excellent memory. The latter definitely excludes me.

But there is another type of ‘global data’: statically available data. Consider the following example:

class SomeClass
{
	private static $_something = 'a string';
	
	public static function getSomething()
	{
		 return self::$_something;
	}
	public static function setSomething($somethingElse)
	{
		 self::$_something = $somethingElse;
	}
}
$something = SomeClass::getSomething();

SomeClass::setSomething('not ' . $something);

Although the data is not in the global space, it is globally available, and it is quite easy to litter your application with unclear dependencies on some global data. You cannot make any assumptions about the contents of the data. Any components using this data are coupled to the global environment of the application!

In general, you’re better off using known data, shielded from the rest of the environment and passed only by clearly defined and stateful interfaces.
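A minimal sketch of the alternative, using hypothetical names (Greeter, $greeting): the data a component needs is handed to it through its constructor, so the dependency is visible where the object is created and nothing else can change it behind the component’s back.

```php
<?php
class Greeter
{
	private $_greeting;
	
	// The dependency is explicit: you cannot create a Greeter without it
	public function __construct($greeting)
	{
		$this->_greeting = $greeting;
	}
	
	public function greet($name)
	{
		return $this->_greeting . ', ' . $name . '!';
	}
}

// Two instances with different data can live side by side,
// which a single static or global value would not allow.
$polite = new Greeter('Good day');
$casual = new Greeter('Hey');

$first = $polite->greet('448191');
$second = $casual->greet('448191');
```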

4.2 All properties should be private

This has its roots in Information Hiding as well as Defensive Programming.

Basically the argument is that you should have total control over your properties’ values, and it definitely has some merit.

Look back at the Defensive Programming example. If hungry were still public, we could pretend we fed Doggie a chocolate chip cookie, simply by setting hungry to 'I could eat some more.'.

But that’s cheating and nullifies the point of having that validation there. What we do instead is declare hungry private, and provide so-called ‘accessor methods’, also known simply as ‘setters and getters’. Declaring all of a class’s properties private also forces you to make a conscious choice about the accessibility of an object’s internal data.

You’ve seen some getters in the User class. Those allow UserDataMapper read-only access to the User properties.

‘Setters’ allow you to validate the input before setting it on the object. In either type of method you may want to do some other things as well, like delegating to an aggregate object (again, we will cover this later), or keeping track of the number of accesses and printing “Congratulations on being the 1000th customer!”.

So, how do we make doggy adhere to all of this?

Since hungry is actually a property of Animal, we’ll have to go in and edit that first.

class Animal 
{
	private $_hungry = 'hell yeah.';
	
	private static $_validHungryStrings = array(
		'Not so much.',
		'I could eat some more.',
		'Hungry for more.'
	);
	
	public function eat($food)
	{
		$this->_setHungry('not so much.');
	}

	public function getHungry()
	{
		return $this->_hungry;
	}
	
	protected function _setHungry($hungry)
	{
		if(!in_array($hungry, self::$_validHungryStrings))
		{
			throw new InvalidArgumentException("Invalid hungry string: '$hungry'");
		}
		$this->_hungry = $hungry;
	}
}

This is an example of very strict defensive programming: we do not trust children of Animal to provide a valid $hungry string; in fact, we do not even trust Animal’s own eat method to provide a valid string. This may seem like overkill, but it is in fact very good practice that will prevent bugs. If you look closely, you’ll see that the code above will in fact fail (‘not so much.’ needs to start with a capital letter).
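That failure is easy to trigger. The sketch below repeats the Animal class unchanged and simply calls eat() on it; the variable names are only there for illustration.

```php
<?php
class Animal 
{
	private $_hungry = 'hell yeah.';
	
	private static $_validHungryStrings = array(
		'Not so much.',
		'I could eat some more.',
		'Hungry for more.'
	);
	
	public function eat($food)
	{
		$this->_setHungry('not so much.');
	}

	public function getHungry()
	{
		return $this->_hungry;
	}
	
	protected function _setHungry($hungry)
	{
		if(!in_array($hungry, self::$_validHungryStrings))
		{
			throw new InvalidArgumentException("Invalid hungry string: '$hungry'");
		}
		$this->_hungry = $hungry;
	}
}

$animal = new Animal();
$error = null;

try
{
	$animal->eat('anything');
}
catch(InvalidArgumentException $e)
{
	// The lowercase string is not in the list of valid values
	$error = $e->getMessage();
}
```

The setter caught a mistake made inside Animal itself, and the object keeps its previous, valid state.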

Mending Dog to work with the stricter Animal is trivial:

class Dog extends Animal 
{
	function eat($cookie)
	{
		switch($cookie)
		{
			case 'grain cookie':
				$this->_setHungry('Not so much.');
			break;
			case 'chocolate chip cookie':
				$this->_setHungry('I could eat some more.');
			break;
			case 'almond cookie':
				$this->_setHungry('Hungry for more.');
			break;
			default: 
				throw new InvalidArgumentException("Unknown cookie: '$cookie'");
		}
	}
}

5 In conclusion

While this tutorial covers a lot of ground, you may have noticed that we are really only touching the surface. Much of this will be covered in more detail when we go into Design Patterns.

Look forward to a Class Diagram tutorial and a tutorial introducing the most common Design Patterns in the near future. Every tutorial will occasionally look back to provide a frame of reference.

I hope you learned a lot, and provided you didn’t fall asleep, you probably did ;D

Till next time.

Comments

Daniel Jun 7, 2008 12:19:04 PM

Great. You're moving faster than I would've thought. Personally I'm looking forward to reading part six through eight.

I haven't read this one yet, but when I have I'll try to write what I think about it :)

John Kleijn Jun 7, 2008 12:27:09 PM

Curses! You revealed that I have 8 parts planned! Jinx....

Daniel Jun 7, 2008 12:36:21 PM

Haha... now you're obliged to write them all :D

Daniel Jun 7, 2008 12:53:18 PM

Okay, so I've just finished reading it. I think it's really good, but I'm wondering, seeing as "All properties should be private", do you think it would be wrong to have protected properties so that children classes may modify it?

Edit: I modified your TOC so the page numbers from your .doc are no longer there.

John Kleijn Jun 7, 2008 2:58:14 PM

Strictly, yes.

Absolute best practice would be to have only private properties. Especially when keeping classes 'open for extension', you risk losing the benefits of defensive programming if you declare properties protected instead of private. If you want to give child classes some special privileges, write a protected method that allows them to execute those privileges.

But, I cheat a little in that respect myself and use protected in some cases as well. In particular when I have a class that doesn't have any accessors yet, I can be lazy and use protected instead of accessors. But being lazy doesn't exactly make good practice..

In all, it isn't cast in stone, just best practice.

Daniel Jun 7, 2008 3:17:48 PM

I suppose that makes sense. I'll keep that in mind for future projects I'll be working on.

John Kleijn Jun 7, 2008 3:24:33 PM

Putting defensive programming aside, there's another reason why using accessors, within a hierarchy or even in the class itself is a good idea:

"The reason why I chose to use the get_*() and set_*() methods inside the class even though the properties are accessible as well is because if one later wanted to change how to retrieve or set the data only those functions would have to be updated and not everything using them."

Does that quote look familiar? This is about cohesion and centralizing responsibility as well.

Daniel Jun 7, 2008 3:33:33 PM

Indeed it looks familiar :D

Corbin Hughes Jun 9, 2008 12:48:44 AM

Wow.... This was one of the best things I've ever read on OOP. For some reason, the whole responsibility and coupling and what not thing had never clicked before....

Looking forward to part 3 ;p.

allenskd Jul 5, 2008 8:52:37 PM

Puff! Yay, catching up with the tutorials, love them! :)

Brandon Frohs Sep 11, 2008 9:00:19 PM

Just finished reading all 3 of your articles on OOP and I was taking a second look at this and noticed a spelling error. Thought you might like to know so you can fix it :-)

2.3 Don’t Repeat Yourself (DRY)
Second Code Block
'findHigestPoster' -> 'findHighestPoster'

John Kleijn Sep 12, 2008 4:43:32 AM

Brandon,

Actually, that's a typo. Not to say there aren't loads and loads of spelling mistakes and grammatical errors in all of my tutorials. I need to find the time look them over properly, but I'm afraid that won't happen too soon.

Thanks for the feedback though.

If anyone wants to volunteer proofreading my tutorials in say a Language Editor function, let me know. ;)

SomeFunkyDude Mar 8, 2009 1:40:38 PM

Thanks for the run-through on the basic principles, really I didn't find this that boring, because it was very straight forward.

artdyke Aug 19, 2009 5:57:10 PM

This really helped me wrap my head around why OOP is useful. Thanks!

noverticallimit Oct 11, 2009 7:05:40 AM

thanks alot for your contribution/ sharing knowledge ^^, i hope i can learn php deeper, need to do php & MYaql for my data structures subject ><

god HELP ME! ^^

btw; thanks John Kleijn

Best Regard shootdatarget

Derleek Nov 6, 2009 5:53:23 PM

excellent... just plain excellent!!!

manjulait Jul 28, 2010 11:50:56 PM

This is good idea and excellent work. I am waiting for next section. thanks lot.