Thursday, September 2, 2010

Primary Keys Do Not or Should Not Equal Clustered Index... Repeat After Me... Learn How To Add A Clustered Index On A Non-Primary-Key Column

Today I'm going to tell you about a novel concept that I just had to demonstrate to one of my employees who didn't know/believe that it was possible. Before I tell you about this concept, I need to give you a little bit of background information. If you are using SQL Server, you know that when you create a table and add a primary key, a clustered index is automagically created for you on the primary key column (that is the default when the table has no clustered index yet). However, the problem is that the primary key column may not be the column you want your clustered index on. There can be only one clustered index on a table. So what are YOU going to do if you need the clustered index on a column that is not the primary key? The answer is to create the clustered index first and then create your primary key. To do that, you need to remember a few things.


  1. A primary key is not the same thing as a clustered index. A primary key uniquely identifies each row in your table - nothing more and nothing less. SQL Server does enforce it with a unique index, but that index does not have to be the clustered one.
  2. A clustered index determines the physical order of the rows in your table. So if you are consistently doing searches on a particular column AND it is unique in your table, you can use that column in a clustered index. Don't forget this. I always use the example of a telephone book. Clustered indexes are like telephone books. Why would you want to have a telephone book ordered by a randomly generated primary key? The answer is that you wouldn't. Instead, you need a real-world way to order your information.
  3. You SHOULD NOT let SQL Server decide the clustered index for you. It picks the primary key by default, but choosing the clustered index yourself is part of good database design.
  4. Setting up a clustered index that IS NOT your primary key is easy to do.

I love #4.... I love #4....repeat after me....
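If you want to see the default behavior for yourself before changing it, here is a minimal sketch. The table dbo.DefaultPkDemo and its columns are made up for this demonstration and are not part of the Options example; the script drops the table again when it is done.

```sql
-- Hypothetical throwaway table, used only to show the default behavior.
CREATE TABLE dbo.DefaultPkDemo (
    DemoId   uniqueidentifier NOT NULL
        CONSTRAINT DF_DefaultPkDemo_DemoId DEFAULT (NEWID()),
    DemoName nvarchar(200) NOT NULL
);

-- No CLUSTERED / NONCLUSTERED keyword here, and the table has no
-- clustered index yet, so SQL Server makes the primary key clustered.
ALTER TABLE dbo.DefaultPkDemo
    ADD CONSTRAINT PK_DefaultPkDemo PRIMARY KEY (DemoId);

-- List the indexes on the demo table and how each one is stored.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.DefaultPkDemo');

-- Clean up the demo table.
DROP TABLE dbo.DefaultPkDemo;
```

The type_desc column for PK_DefaultPkDemo should read CLUSTERED, which is exactly the default we are about to avoid.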

So with all of those things in mind, let me give you an example. Let's say that you have a table called Options. Your table is defined like this.



CREATE TABLE [dbo].[Options](
    [OptionId] [uniqueidentifier] NOT NULL CONSTRAINT [DF_Options_OptionId] DEFAULT (newid()),
    [OptionName] [nvarchar](200) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [OptionDisplayName] [nvarchar](200) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [OptionDescription] [nvarchar](max) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,
    [OptionStatus] [nvarchar](50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL CONSTRAINT [DF_Options_OptionStatus] DEFAULT ('ACTIVE'),
    [LastStatusChangeDate] [datetime] NOT NULL CONSTRAINT [DF_Options_LastStatusChangeDate] DEFAULT (getdate()),
    [CreatedBy] [uniqueidentifier] NOT NULL,
    [DisplayOrder] [int] NOT NULL,
    [DateAdded] [datetime] NOT NULL CONSTRAINT [DF_Options_DateAdded] DEFAULT (getdate())
) ON [PRIMARY]



Here is what this table looks like in the flesh without a key:


Looks good, right? Now what? Well, if you are like most people, your next step is to add the primary key! But this isn't correct. Adding the primary key now will automatically create a clustered index on it. INSTEAD, we are going to create the clustered index FIRST! That's right! I said FIRST! You create the clustered index first and then you add the primary key after the fact.
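In T-SQL, those two steps could look something like the sketch below. The names IX_Options_DisplayOrder and PK_Options are my own picks for illustration, so use whatever naming convention you follow. The NONCLUSTERED keyword is not strictly required once a clustered index exists, but spelling it out makes the intent obvious.

```sql
-- Step 1: cluster the table on DisplayOrder BEFORE any primary key exists.
-- (Index name is illustrative. Add the UNIQUE keyword only if
-- DisplayOrder values really are unique in your data.)
CREATE CLUSTERED INDEX IX_Options_DisplayOrder
    ON dbo.Options (DisplayOrder);

-- Step 2: add the primary key afterwards. The table already has its one
-- clustered index, so the key is backed by a unique nonclustered index.
ALTER TABLE dbo.Options
    ADD CONSTRAINT PK_Options PRIMARY KEY NONCLUSTERED (OptionId);
```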

Here is what you end up with.

You didn't see it, but the table is now physically ordered by the DisplayOrder column - NOT the OptionId column. As you can see, we have a clustered index AND a primary key column which is unique, yet nonclustered. How did we do that? We did it by remembering to do the following:

  1. Create your Clustered Index FIRST - before you add a Primary Key!
  2. Add your unique Primary Key after you have added the Index.

This will allow you to have the clustered index set on the column that you actually should have it on instead of using the default setup that SQL Server offers you.
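If you would rather confirm the result with a query than take my word for it, the catalog views will tell you which index is clustered and which one backs the primary key. This is just a quick check against the Options table from this post.

```sql
-- One row per index key column on dbo.Options, showing whether the index
-- is CLUSTERED or NONCLUSTERED and whether it backs the primary key.
SELECT i.name           AS index_name,
       i.type_desc      AS index_type,
       i.is_primary_key AS is_primary_key,
       c.name           AS column_name
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id
   AND ic.index_id  = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id
   AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'dbo.Options')
ORDER BY i.index_id, ic.key_ordinal;
```

After following the two steps, you should see a CLUSTERED index on DisplayOrder and a NONCLUSTERED primary key index on OptionId.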

Problem:
You don't know how to put the clustered index on a non-primary-key column in SQL Server.

Solution:
Add the clustered index first on the column the table should be ordered on and then add the primary key.

* Note - Always remember to add your clustered index and save, then add your primary key and save, AND THEN add your relationships. If you are using the Database Diagram feature, don't add your clustered index and then use the diagram to add the primary key and the relationships in one go. You must SAVE the primary key BEFORE YOU ADD RELATIONSHIPS. If you add the primary key and the relationships in the diagram and save them together, the system will remove your custom clustered index and try to put the clustered index on the primary key instead. This seems to be a quirk of using the database diagram feature.

Always remember the order -
  1. Add clustered index on your column
  2. Save
  3. Add primary key
  4. Save
  5. Add relationships
  6. Save
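If you script your changes instead of using the diagram, the relationship step comes last in the same way. Here is a rough sketch of step 5. It assumes the Options table already has its clustered index and primary key, and it assumes a parent table called dbo.Users with a UserId key, which I made up for this example because the post doesn't say what CreatedBy points to.

```sql
-- Hypothetical parent table, invented only for this illustration.
CREATE TABLE dbo.Users (
    UserId uniqueidentifier NOT NULL
        CONSTRAINT PK_Users PRIMARY KEY NONCLUSTERED
);

-- Step 5: add the relationship only after the clustered index and the
-- primary key on dbo.Options are already in place.
-- (Constraint name is illustrative.)
ALTER TABLE dbo.Options
    ADD CONSTRAINT FK_Options_Users_CreatedBy
    FOREIGN KEY (CreatedBy) REFERENCES dbo.Users (UserId);
```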

Smooches,

Kila Morton
