Jump to content

Data normalization and displaying result set


Andy-H

Recommended Posts

Ok, say if I have a database containing these tables:



books
bookID
title
pages
publisherID


bookSubjects
bookID
subjectID


bookAuthors
bookID
authorID


subjects
subjectID
subject


authors
authorID
author


publishers
publisherID
publisher

 

 

books - authors = one to many

books - subjects = one to many

books - publishers = one to one

------------------------------------------------

 

 

and I wanted to display all the stored info about a book, lets say with bookID = 13, would this query be correct in order to retrieve the desired info?

 

 

SELECT b.title, b.pages, s.subject, a.author, p.publisher
FROM books b
	INNER JOIN bookSubjects bs USING (bookID),
	INNER JOIN subjects s USING (subjectID) ,
	INNER JOIN bookAuthors ba USING (bookID),
	INNER JOIN authors a USING (authorID),
	INNER JOIN publishers p USING (publisherID)
		WHERE b.bookID = 13;

 

 

Also, wouldn't I need to loop the result set to display multiple authors / subjects? and doesn't this defeat the object of the data being normalized - due to the redundant data (title, pages, publisher) being pulled/looped multiple times to display the publishers and subjects or is there a way around this?

 

 

Should I be using multiple queries or is my data not normalized properly?

 

 

Thanks - Andy

Link to comment
Share on other sites

Your normalization has an issue - you don't need to have pivot tables (like bookAuthors) for a 1:many relationship.  Simply put the primary key of the many as the foreign key of the 1, so Books would have an AuthorID column, but Authors would not have a BookID column.

 

Pivot tables, by their very structure, denote many:many relationships.  Why?  Because each key can be associated with any number of other keys.  Example:

 

BookAuthors:
BookID                              AuthorID
1                                       12
1                                       45
1                                       34
2                                       45
2                                       77
3                                       12
3                                       77

 

Notice the repeated key values on both sides.  AuthorID 12 points to BookIDs 1 and 3, for example.

Link to comment
Share on other sites

Perhaps I got the relationships wrong in that case, as one book record could reference many authors, so having comma-separated values would be bad practice, as would having author1 author2 etc...

 

 

It must be a many-many relationships, I think I confused myself by imagining a scenario of a single book record and how it would relate to the other tables. Thanks for the help.

Link to comment
Share on other sites

First, your data definition shows: books-to-authors as many-to-many; books-to-subjects as many-to-many; books-to-publishers as many-to-one. I'm not sure I see why books-to-subjects is many-to-many, I generally think of a book as having a single subject, but I don't know how you are defining "subjects".

 

To answer your questions:

 

Also, wouldn't I need to loop the result set to display multiple authors / subjects?

Yes, since there could be multiple Authors and/or Subjects, you will have to loop the result set of that query to present all of the data.

 

doesn't this defeat the object of the data being normalized - due to the redundant data (title, pages, publisher) being pulled/looped multiple times to display the publishers and subjects

No. Normalization is about storage and maintenance. If you normalize your data properly, you save storage space -- you only store the Title of a book in one place. And you reduce you maintenance overhead - if you need to correct the spelling of an Author's name, you edit it in one place only. Retrieval, will return multiple instances of the same book Title regardless of whether the data is properly normalized or not.

 

or is there a way around this?

Yes there is.  You could do multiple queries, with queries inside a loop -- which most people on this forum will say "don't ever do". Or you could aggregate the data using the GROUP_CONCAT() function of mySql. It all depends on your presentation requirements.

 

Should I be using multiple queries or is my data not normalized properly?

I think the outline you provided is normalized properly (other than using the wrong terminology for the relationships).

 

Be aware that with two (or more) many-to-many relationships, the query you proposed will return a lot more rows than you think. Say there are three Authors and two subjects for a given book. You probably expect three or five rows, but you will, in fact, get six. Author1 will be linked with every subject (2 rows), Author2 will be linked with every subject (2 rows), and Author3 will be linked with every subject (2 rows) -- that's six rows with every author listed twice (once for each subject) and every subject listed three times (once for every author). Having the front-end process this result could be tedious and error prone. I would likely use multiple queries or aggregate the data at the server.

 

 

As an example of aggregating the data, here is your proposed query, modified slightly:

SELECT b.title, b.pages, p.publisher, 
    GROUP_CONTCAT(s.subject SEPARATOR ', ') as Subj, 
    GROUP_CONCAT(a.author SEPARATOR ', ') as Auth
FROM books b
	INNER JOIN bookSubjects bs USING (bookID),
	INNER JOIN subjects s USING (subjectID) ,
	INNER JOIN bookAuthors ba USING (bookID),
	INNER JOIN authors a USING (authorID),
	INNER JOIN publishers p USING (publisherID)
		WHERE b.bookID = 13
GROUP BY b.bookID;

This will return a single row for the book, with multiple Authors in a comma-delimited field and multiple Subjects in a comma-delimited field. You could then manipulate the Subj and Auth fields in PHP to fit your presentation strategy.

 

I have not tested this specific query, but I have found the function useful. See the manual for GROUP_CONCAT() for more details on this function.

Link to comment
Share on other sites

This thread is more than a year old. Please don't revive it unless you have something important to add.

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Restore formatting

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...

Important Information

We have placed cookies on your device to help make this website better. You can adjust your cookie settings, otherwise we'll assume you're okay to continue.