Search This Blog & Web

Tuesday, May 18, 2010

Index structure and Types -- Part 2 : Index Considerations


Database Considerations
When you design an index, consider the following database guidelines:
·         Large numbers of indexes on a table affect the performance of INSERT, UPDATE, and DELETE statements because all indexes must be adjusted appropriately as data in the table changes.
·         Avoid over-indexing heavily updated tables and keep indexes narrow, that is, with as few columns as possible.
·         Use many indexes to improve query performance on tables with low update requirements, but large volumes of data. Large numbers of indexes can help the performance of queries that do not modify data, such as SELECT statements, because the query optimizer has more indexes to choose from to determine the fastest access method.
·         Indexing small tables may not be optimal because it can take the query optimizer longer to traverse the index searching for data than to perform a simple table scan. Therefore, indexes on small tables may never by used, but must still be maintained as data in the table changes.
·         Indexes on views can provide significant performance gains when the view contains aggregations, table joins, or a combination of aggregations and joins. The view does not have to be explicitly referenced in the query for the query optimizer to use it.
·         Use the Database Engine Tuning Advisor to analyze your database and make index recommendations.
Query Considerations
When you design an index, consider the following query guidelines:
·         Create nonclustered indexes on all columns that are frequently used in predicates and join conditions in queries. Covering indexes can improve query performance because all the data needed to meet the requirements of the query exists within the index itself. That is, only the index pages, and not the data pages of the table or clustered index, are required to retrieve the requested data; therefore, reducing overall disk I/O. For example, a query of columns a and b on a table that has a composite index created on columns a, b, and c can retrieve the specified data from the index alone.
·         Write queries that insert or modify as many rows as possible in a single statement, instead of using multiple queries to update the same rows. By using only one statement, optimized index maintenance could be exploited.
·         Evaluate the query type and how columns are used in the query. For example, a column used in an exact-match query type would be a good candidate for a nonclustered or clustered index.
Column Considerations
When you design an index consider the following column guidelines:
·         Keep the length of the index key short for clustered indexes. Additionally, clustered indexes benefit from being created on unique or nonnull columns.
·         Columns that are of the ntext, text, image, varchar(max), nvarchar(max), and varbinary(max) data types cannot be specified as index key columns. However, varchar(max), nvarchar(max), varbinary(max), and xml data types can participate in a nonclustered index as nonkey index columns.
·         An xml data type can only be a key column only in an XML index.
·         Examine column uniqueness. A unique index instead of a nonunique index on the same combination of columns provides additional information for the query optimizer that makes the index more useful.
·         Examine data distribution in the column. Frequently, a long-running query is caused by indexing a column with few unique values, or by performing a join on such a column. This is a fundamental problem with the data and query, and generally cannot be resolved without identifying this situation. For example, a physical telephone directory sorted alphabetically on last name will not expedite locating a person if all people in the city are named Smith or Jones.
·         Consider the order of the columns if the index will contain multiple columns. The column that is used in the WHERE clause in an equal to (=), greater than (>), less than (<), or BETWEEN search condition, or participates in a join, should be placed first. Additional columns should be ordered based on their level of distinctness, that is, from the most distinct to the least distinct.

For example, if the index is defined as
LastName, FirstName the index will be useful when the search criterion is WHERE LastName = 'Smith' or WHERE LastName = Smith AND FirstName LIKE 'J%'. However, the query optimizer would not use the index for a query that searched only on FirstName (WHERE FirstName = 'Jane')
.
Considerations
·         A unique index, UNIQUE constraint, or PRIMARY KEY constraint cannot be created if duplicate key values exist in the data.
·         If the data is unique and you want uniqueness enforced, creating a unique index instead of a nonunique index on the same combination of columns provides additional information for the query optimizer that can produce more efficient execution plans. Creating a unique index (preferably by creating a UNIQUE constraint) is recommended in this case.
·         A unique nonclustered index can contain included nonkey columns. 

Implementation Considerations
The following table lists the maximum values that apply to clustered, nonclustered, and XML indexes. Unless specified, the limitations apply to all index types.
Maximum index limits
Value
Additional information
Clustered indexes per table
1

Nonclustered indexes per table
249
Includes nonclustered indexes created by PRIMARY KEY or UNIQUE constraints, but not XML indexes.
XML indexes per table
249
Includes primary and secondary XML indexes on xml data type columns.
Indexes on xml Data Type Columns
Number of key columns per index
16*
Maximum Size of Index Keys.
Clustered index is limited to 15 columns if the table also contains a primary XML index.
Maximum index key record size
900 bytes*
Does not pertain to XML indexes.
Maximum Size of Index Keys.
*You can avoid index key column and record size limitations of nonclustered indexes by including nonkey columns in the index.
Data Types
Generally, any column in a table or view can be indexed. The following table lists data types that have restricted index participation.
Data type
Index participation
Additional information
CLR user-defined type
Can be indexed if the type supports binary ordering.
Large object (LOB) data types: image, ntext, text, varchar(max), nvarchar(max), varbinary(max), and xml
Cannot be an index key column. However, an xml column can be a key column in a primary or secondary XML index on a table.
Can participate as nonkey (included) columns in a nonclustered index except for image, ntext, and text.
Can participate if part of a computed column expression.
Computed columns
Can be indexed. This includes computed columns defined as method invocations of a CLR user-defined type column, as long as the methods are marked deterministic.
Computed columns derived from LOB data types can be indexed either as a key or nonkey column as long as the computed column data type is allowed as an index key column or nonkey column.
Varchar columns pushed off-row
The index key of a clustered index cannot contain varchar columns that have existing data in the ROW_OVERFLOW_DATA allocation unit. If a clustered index is created on a varchar column and the existing data is in the IN_ROW_DATA allocation unit, subsequent insert or update actions on the column that would push the data off-row will fail.
Additional Considerations
The following are some additional considerations for creating an index:
·         You can create an index, if you have CONTROL or ALTER permission on the table.
·         When created, the index is automatically enabled and available for use. You can remove access to an index by disabling it.
Disk Space Requirements
The amount of disk space required to store the index depends on the following factors:
·         The size of each data row in the table and the number of rows per page. This determines the number of data pages that must be read from disk to create the index.
·         The columns in the index and the data types used. This determines the number of index pages that have to be written to disk. For more information,
·         Temporary disk space required during the index creation process.
Performance Considerations
The time taken to physically create an index is largely dependent on the disk subsystem. Important factors to consider are the following:
·         The recovery model of the database. The bulk-logged recovery model provides greater performance and reduced log-space consumption during the index creation operation than full recovery. However, bulk-logged recovery reduces the flexibility for point-in-time recovery
·         The RAID (redundant array of independent disks) level used to store the database and transaction log files. Generally, RAID levels that use striping will have better I/O bandwidth.
·         Number of disks in the disk array, if RAID was used. More drives in the array increases data transfer rates proportionally.
·         Where the intermediate sort runs of the data is stored. Using the SORT_IN_TEMPDB option can reduce the time that is required to create an index when tempdb is on a different set of disks than the user database
·         Creating the index offline or online.

When an index is created offline (the default), exclusive locks are held on the underlying table until the transaction creating the index has completed. The table is inaccessible to users while the index is being created.

In SQL Server 2005, you can specify the index be created online. When the online option is set to ON, long-term table locks are not held, enabling queries or updates to the underlying table to continue while the index is being created. Although we recommend online index operations, you should evaluate your environment and specific requirements. It may be better to run index operations offline. In doing this, users have restricted access the data during the operation, but the operation finishes faster and uses fewer resources.
The SQL Server 2005 Database Engine automatically maintains indexes whenever insert, update, or delete operations are made to the underlying data. Over time these modifications can cause the information in the index to become scattered in the database (fragmented). Fragmentation exists when indexes have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the data file. Heavily fragmented indexes can degrade query performance and cause your application to respond slowly.
In SQL Server 2005 you can remedy index fragmentation by either reorganizing an index or by rebuilding an index.
For partitioned indexes built on a partition scheme, you can use either of these methods on a complete index or on a single partition of an index.

for more information go to Index structure and Types -- Part 3 : Index Creation & Detecting Fragmentation

No comments: