The Database Corner
Robert L. Stewart, SIG Leader

Putting Normalization into Practice
Last issue, I talked about 3 of the 6 levels of normalization.
The last 3 are generally only used in extremely well-designed systems. And, I doubt if you will ever see one designed to that level unless you do it yourself. If you are working in the real world, you will generally hear that they want a 3rd normal form database. Unfortunately, most people do not understand what a normalized database means.
To give you an example that you can really use, I am going to show you how to design a contact management database. This will be designed to give you the maximum flexibility. I will also introduce you to a quick method of determining how to normalize the tables.
First, we will start with something that looks like the typical database (or spreadsheet) that you will probably run into. The structure would look something like this:
Name_ID
First_Name
Middle_Name
Last_Name
Company_Name
Home_Address
Home_City
Home_State
Home_Postal_Code
Company_Address
Company_City
Company_State
Company_Postal_Code
Home_Phone
Office_Phone
Personal_Cell_Phone
Work_Cell_Phone
Personal_Email
Work_Email
Contact_History
Hobbies
Special_Dates
Starting with the first rule, everything in a column should stand alone; we need to look at 3 columns in our design above. First, here is the simple method. Ask the question “Can there ever be more than one?” Looking at our design above, we need to ask that question of hobbies, special_dates, and contact_history. Can a person ever have more than one hobby? Can a person ever be associated with more than one special date? Can you ever contact a person more than once? If the answer is ever “Yes” to these questions, then it means that you need another table to store the data in. Like below:

Contact_History_ID

Hobby_ID

Special_Date_ID

Name_ID

Name_ID

Name_ID

Contact_Date

Hobby_Description

Special_Date

Contact_Method

 

Special_Date_Description

Contact_Description

 

 

Going to the next level, we need to eliminate repeating data. At first, you may say there is nothing that is repeating there. But, look closer. Don’t we repeat addresses, phones, and emails? In this case, just the type of address, phone, and email are different. We would normalize those out like this:

Name_Phone_ID 

Name_Email_ID

Name_Address_ID

Name_ID

Name_ID

Name_ID

Phone_Type

EmailType

AddressType

Phone_Number

Email_Address

Address_Line_1

 

 

Address_Line_2

 

 

City

 

 

State

 

 

Postal_Code

Take a look at what we have done. Now, if you want to store a new phone number and associate it to a name, you can. Before, you would have had to change the database structure to accommodate something so simple.
But, looking at it, I also see some things that are repeating again. Can there ever be more than one phone type? The same applies to addresses and email addresses. So, we need to go a bit further here.

Name_Phone_ID 

Name_Email_ID

Name_Address_ID

Name_ID

Name_ID

Name_ID

Phone_Type

EmailType

AddressType

Phone_Number

Email_Addres

Address_Line_1

 

 

Address_Line_2

 

 

City

 

 

State

 

 

Postal_Code

Phone_Type_ID

Email_Type_ID

Address_Type_ID

Phone_Type

Email_Type

Address_Type

Now, we will never be repeating the text “home” in Address type, because it is stored in a table that is linked by an ID column to the address itself. This is a real timesaver when you have to update the address type. For instance, if you wanted to change “home” to “residence,” you would have to edit all the records in the address table that had home in the address type. But with the design just above, you would simply go to the address type lookup table and make the change in a single record.
Our initial table now contains only the following columns:


Name_ID

First_Name

Middle_Name

Last_Name

Company_Name

If we make our design a little bit more generic, and call it a names database, we can go a bit further in normalization.

Name_ID

Related_Names_ID

Relationship_Type_ID

First_Name

Name_ID

Relationship_Type

Middle_Name

Related_Name_ID

 

Last_Name

Relationship_Type_ID

 

Now, we have what is sometimes known as an intersection or resolver table. This is where we resolve what is known as a many-to-many relationship. Going back to our question, “Can a company name ever be related to more than one person?” Of course the answer is “Yes.” By pulling the company name out and storing the company in with all the other names in the name table, we have eliminated it repeating. But, we also need a way of relating it to the people that are associated with that company and identifying what kind of relationship they have. This can work between individuals within the names table also. There is actually no limit to the number of relationships you can set between names using this method. In the example above, I would store company names in the Last_Name column.
Well, there you have it, a well-designed and normalized database to hold names and information about them. This kind of design can be extended much further than what we started out with. And, we do not have to change the structure of the database every time we want to add a new type of something we want to store. You will also find that doing complex reporting is much easier using this normalized structure rater that the “spreadsheet” design we started out with.
To learn more about database and GUI design, visit the Database and GUI Design Workshop at HAL-PC on the 2nd Saturday of the month from 9 AM to Noon. For more information, feel free to contact me at Robert@WeBeDb.com.