8 Weeks of Indexes: Types of Indexes

This is the 3rd post in a series entitled “8 Weeks of Indexes”.  The purpose of this series was more for me to learn more about indexes. 

There are many types of indexes used in Microsoft SQL. I could easily use up the remainder of my weekly posts to discuss each one in depth, but in order to actually get to the meat and bones of how to use indexes, this post is an overview of the more commonly used indexes.

Each Edition of MS SQL from 2005 and up, continues to add types of index to the engine. SQL 2005 had a total of 7 types of indexes, SQL 2008 added 1 additional type, SQL 2012 has a total of 10 types of indexes and SQL 2014 lists 12 types of different indexes.

My personal experiences with index usage unfortunately has been limited to numbers 1-6 for SQL 2005. I currently do not manage any data store with spatial data, so I haven’t had a need to use that one.  I am really excited about experimenting with Filtered Indexes any I do not (yet) have a SQL 2012 server in production, so I have not looked at the two new added indexes.

Here is the break down of each type of index by SQL Engine. Highlighted indexes are new to that specific edition. 

SQL 2005

  1. Clustered – based on a column or multiple columns, this index determines the physical sort order of how the data is physically written to the data base.  A clustered index for a phone book would be on “Last Name”
  2. Non-clustered – can be created on any column or multiple columns in addition to a Clustered index. this index create a “pointer” to find what row data is located on. If you wanted to know all the “Robert”s in a phone book, you could create a Non-Clustered index on First Name column.
  3. Unique – this ensures that either a clustered or non-clustered indexes key does not have duplicate values.
  4. Index with included columns – extends the functionality of non-clustered indexes to include “non-key” items in the index.
  5. Indexed Views – a unique way to present structured, indexed data in a view rather than a table. can use Non-Clustered as well, but only after a Clustered index was created.
  6. Full-text – a special type of index used exclusively with the Microsoft Full-Text Engine to facilitate word searches within data columns.
  7. XML – a way to index XML data type columns

SQL 2008 and 2008 R2

  1. Clustered
  2. Non-Clustered
  3. Unique
  4. Index with included columns
  5. Full-Text
  6. Spatial  provides the ability to index “spatial data” in a column of geometry data type.
  7. Filtered—provides a way to filter out unnecessary rows of a commonly used subset of data
  8. XML

SQL 2012

  1. Clustered
  2. Non-Clustered
  3. Unique
  4. Column store – based on vertical partitioning of the data by columns
  5. Index with included columns
  6. Index on computed columns – an index on a column that is derived of one or more columns
  7. Filtered
  8. Spatial
  9. XML
  10. Full-Text

SQL 2014

  1. Hash – an index that provides data through an in-memory hash table
  2. memory-optimized nonclustered indexes – used in the new feature of “in memory optimized tables”
  3. Clustered
  4. Non-clustered
  5. Unique
  6. Column store
  7. Index with Included columns
  8. Index on computed columns
  9. Filtered
  10. Spatial
  11. XML
  12. Full-Text

8 Weeks of Indexes: What is an Index?

First and foremost, I am a slacker!  It takes discipline to write a weekly blog, I am not sure how some people can do it daily!  However, I hope to get back on track for this series.

Merriam-Webster defines index as:

a list … arranged usually in alphabetical order of some specified datum

One of the most common real world examples of an Index is your telephone book. The book stores information (Name, address and phone number) sorted alphabetically by last name. The pages are written in such a way that A comes before B and B comes before C, etc.. If one knows their alphabet, then any name can be easily looked up.  Typically the first “key” to finding a name is at the top of the page, which tells you what section of the book you are in.  If you were to locate my entry in the phonebook, you would quickly scan through the key until you found the letter B at the top of the page.  Then you would continue to scan until you find the group of entries for BISHOP.  And of course, then locate which entry matched my name, BISHOP, ROBERT. If there were no key at the top of the page, you would have to seek through all the pages one at a time until you got to the B section.  Another excellent real-world example of an index system, is the Dewey Decimal System.  Libraries have been indexing their books with a numbering system for years. 

So, how does this all relate to SQL Server?  There are several bold print words above that translate to SQL Server terms and how SQL works the same way as a phone book.  To fully understand how SQL Indexes work one really needs to know how SQL stores data. We know SQL has the .mdf files that actually stores all your data.  However, the data file is made of pages that are 8 KB in size.  At the top of each page is a “page header” used to store system information about that page.  There are many different types of pages that store different things, but two specific types of pages I want to talk about are “data” pages and “index” pages.

A “data page” is where your actual data (based on data types) is stored and as you guessed it, the index page stores index information. The “key” to proper storage of data is a Clustered Index.  A Clustered Index, physically writes and stores a row of data in a page by selected column and by sort order selected.  So a Clustered Index on a user table could be by the column “Last Name”, just like a phone book. This will ensure that the data rows are written in alphabetical order on each page and in turn each page will be in alphabetical order as well, very efficient.  SQL Engine “scans” the index to determine exactly what page the “B” last names are located.  If a table did not have a clustered index, the data would be stored in a “first come-first served” fashion.  In this scenario, the SQL Engine would have to scan then entire page or multiple pages to find your entry, very inefficient.  Imagine how inefficient a phone book would be if the publisher just kept adding rows to the end of the book every year without being sorted by name.  How long would it take you to find my name then?

So, the key to storing data in SQL, is to have a pre-determined way you want the data rows saved to the page. Ideally this would be the most used method of finding a row, i.e. by “Last Name”. 

Next week…..Types of Indexes