Article
Concepts of Database Design and Management
Third Normal Form
So, you think your table is clear now. You think it is ready for the big leagues? No…not yet. The 3NF is ready to take on new challenges. Lets go right to our example table:
| Customer # | Name | Address | Credit | Sales Rep # | Sales Rep Name |
| 2343 | Bob | 2343 Wee St. | $1000 | 03 | Dave |
| 2344 | Daniel | 1230 Wow St. | $3000 | 04 | Pam |
| 2345 | Kate | 33 Soho | $5000 | 03 | Dave |
| 2346 | Less | 99 More Is | $1000 | 07 | Garrett |
| 2347 | Dan | 1 Less St. | $500 | 09 | Peter |
| 2348 | Bruno | 45 Candy Ln. | $2000 | 03 | Dave |
Now, you probably recognize a few problems, but be assured, this is 2NF. Every field depends on the Customer Number. For example, Customer Number 2345 will only have one name, address, credit, sales rep number, and sales rep name associated with it. However, this doesn't mean the table is ready for the prime time.
First, we need to define a determinant as any column or collection of columns that determine another column. By this definition, and primary key, or any candidate key will be a determinant. Also, that would make Sales Rep Number a determinant, but it's not the candidate key or a primary key.
So, how does this relate to the 3NF? Well, the definition of the 3NF is a table that complies with the 2NF (and of course, the 1NF) and if the only determinants it contains are candidate keys. This does, of course, include the primary key.
Now, you have just discovered the problem with the table: the determinant of Sales Rep Number. So how do you go about getting rid of it?
First, track down all the determinants that are NOT a candidate key. Then, remove all the fields that rely on this determinant key, but keep the determinant key in the table. Then, with all the fields you removed, put them into a table, with the primary key being the determinant key you left in the main table. So, that would mean our original table of:
Customer #, Name, Address, Credit, Sales Rep #, Sales Rep Name
Would be turned into these two tables:
Customer #, Name, Address, Credit, Sales Rep #
Sales Rep #, Sales Rep Name
Quality.
Remember, these two new tables need to comply with 1NF, 2NF, and 3NF as well, so double check them, though if you are up to this point, you should be doing okay.
Fourth Normal Form
Finally, we are up to the big one. 4NF is the father of the forms, as it is the be all and end all. This takes care of any problem that may occur. Lets start this time by defining a very important term, multivalued dependence (MD). MD is when field B is multidependent on A if each value of A is associated with a specific list of values for B, and this collection is independent of any values of C.
Lets take a deep breath, and illustrate this with a simple table. Lets assume that Faculty represents the guidance counselor for any number of students represented by Student Number. Also, we can assume that the faculty member will be a part of any number of committees, and we want to store this as well. So, this is what we come up with.
| Faculty | Student Number | Committee Code |
| 1243 | 2343 | ADV |
| 2345 | PER | |
| HSG | ||
| 1553 | 3243 | ADV |
| 4003 | 3408 | HSG |
| 4095 | ||
| 4403 |
Now, passing this to 1NF would give you this result:
| Faculty | Student Number | Committee Code |
| 1243 | 2343 | ADV |
| 1243 | 2345 | PER |
| 1243 | 2345 | HSG |
| 1553 | 3243 | ADV |
| 4003 | 3408 | HSG |
| 4003 | 4095 | HSG |
| 4003 | 4403 | HSG |
But, you can clearly see the problem developing there as well. This is what MD is. If Faculty number 1243 was no longer counseling Student Number 2345, and we deleted the data, then the Committee information would be deleted as well. This is not quality.
A 4NF is 3NF compliant, and there are no multivalued dependencies.
You would handle this problem like in 2NF and 3NF, by splitting the table up into smaller tables with each containing the field that multidetermined them, and in this case, that would be the Faculty field. So, the new tables would be:
Faculty, Student Number
Faculty, Committee Code
Conclusion
Truly, that was a lot of work, but it is well worth it. Making sure your tables are optimized is key to quality design. Taking care of the problem before it becomes a problem is vital to managing a database.
Before you sit down to design the database, gather all the information you want to include in the database. I mean everything. Go around to each department of the company (or just write it out yourself if this is just for you) and find out what everyone wants in the database. Once you have everything, bring it back, and create one huge table.
From there, break that table down to 1NF, then 2NF, and so on. Go back over each table, and make sure they all work together, and are all 4NF tables. If they aren't, then it can be assured the tables will suffer problems in the future.
Quality is in the design. And for those people who know, this helps comply with Codd's first 2 rules for a truly relational database system.
Happy coding.