Tutorials

Data Joins Unions

by Barry Andrew on Jun 30, 2008 5:59:59 PM

The Data

After weeks of intensive data analysis and data modelling we present our redesigned database to the headmaster. (OK, it took a few seconds but if you were doing this on a corporate scale it is going to take that timescale, identifying all the processes and data elements and how they relate to one another. Get this bit right and life is a lot easier down the line.)

We have immediately identified three main items of data (entities), namely pupil, house and subject and these will be our main tables. We give each entity a unique identifier (ID). From the spreadsheet you might have counted four entities but a pupil's house-master depends on the house to which the pupil belongs, and is therefore an item in the house table.

house               pupil            subject
============        ===========      ==========
houseID             pupilID          subjectID
house_name          pupil_name       subject
house_master

The Data Relationships

We look at each pair of entities and consider how they are related.

House - subject

We will assume there is no relationship between house and subject. Our examination of the processes showed no requirement. If Beetsem-Daley had wanted, say, a league table of house results for each subject then there would be. (As a professional, it's my duty to make him aware of the potential though. No harm in generating extra work).

House - pupil

A house has many pupils. A pupil belongs to one house. Here we have a simple one-to-many relationship between the two. We handle this by adding the houseID to the pupil record so we know to which house each pupil belongs. (Computers handle numbers much more efficiently than text so we will will use these numeric ids to link our data.)

Pupil - subject

A pupil can choose many subjects. A subject can be chosen by many pupils. Here we have a many-to-many relationship. We handle these by creating another entity (in this case we'll call it choice) to link pupil and subject. Each choice will contain the pupilID and the subjectID of a chosen subject. Actually, a subject can be chosen by none, one or many pupils. This doesn't affect our table design but it will affect how we query the data (more of which later).

The Database Schema

Our final design, showing the entity relationships, looks like this

house               pupil                 choice                 subject
============        ===========           ==========             ==========
houseID      --+    pupilID     ---+      id             +----   subjectID
house_name     |    pupil_name     +---<  pupilID        |       subject
house_master   +--< houseID               subjectID   >--+

Notice that our many-to-many relationship between pupil-subject has now become two simple one-to-many relationships between pupil-choice and subject-choice. In our original spreadsheet we could see which subjects were chosen by a pupil. Now, by looking at the same relationship in the opposite direction, we can easily find which pupils chose a particular subject.

Had we been required to store who teaches each subject then there would also be a teacher entity so that the teacher's name (and other teacher details) would be stored in only one place. In this case the house-master item would have been a teacherID linked to the teacher table and each subject would have a "taught_by" item, also linked to the teacher table.

Now we can start querying our database using joins.

Comments

Very useful!
My php for dummies was useless for joins - many thanks!

1. Tom Butterworth on Jul 1, 2008 6:20:06 PM

Barand, could you extend the tutorial to include how to loop the results of the query. I find that when I use a Join , the query will return the first result and then stop. It will not run thru both tables and list all the results. When the 1st match is found, the query then doesnt look thru for other matches which seems to mean that the if stmt following the result is not running correctly .

Ex: $sql11 = "SELECT * FROM `TabA`
INNER JOIN
`TabB` USING (Sid)
WHERE TabA.Eid = $group AND TabA.Month = $month AND TabA.Year = $year ORDER BY TabB.Name asc" ;

$result11 = mysql_query($sql11);
if($myrow11 = mysql_fetch_array($result11))
{ do whatever}
while($myrow11 = mysql_fetch_array($result11);

Thanks !

2. swatisonee on Jul 3, 2008 4:07:11 AM

swatisonee: You'd do it like with all other query results:

3. Daniel Egeberg on Jul 3, 2008 5:45:05 AM

Thanks Daniel but I'm afraid I still fail to run thru the loop. I will post on the help forum the code as it runs without using Join and then the one I am working on to replace it using Joins as it is too long to post as a tute. comment.

4. swatisonee on Jul 4, 2008 4:38:28 AM
Login or register to post a comment.