R.DBMSUnit2.3

Unit-I Unit-II  Unit-III Unit-IVUnit-V
Part-I Part-II Part-I Part-IPart-ITotal
Part-III Part-IV Part-II Part-IIPart-II
Part-VPart-III


 

TYPES OF INDEXES

1.     Explain Types of indexes?

Ans:

One of the term used during the file organisation is the term index.

In databases an index is defined as the <index value, address> pair.

 

The databases that have very large number of records.  

 

A database index allows fast search on the index value in database records.

Thus, without an index it may be difficult to search a database.

·        An index file is very small compared to a data file that stores a relation.

·        Also index entries are ordered, so that an index can be searched using an efficient search method like binary search.

Primary index

If the index is created on the basis of the primary key of the table, then it is known as primary indexing.

A primary index is a file that contains a sorted sequence of records having two columns: the ordering key field; and a block address for that key field in the data file.

The ordering key field for this index can be the primary key of the data file. An entry in primary index file contains the index value of the first record of the data block and a pointer to that data block.

Let us assume a student database as (Assuming that one block stores only four student records.):

Fig 4 A student file stored in the order of student enrolment numbers

The primary index on this file would be on the ordering field – enrolment number.

The primary index on this file would be


Fig5: The Student file and the Primary Index on Enrolment Number

Please note the following points in the figure above.

·        We have used the first key value stored in the block as the index key value. This is also called the anchor value.

·        The number of entries in the primary index file is the same as the number of disk block in the ordered data file.

·        All the records need to have an entry in the index file. This type of index is called non-dense index. Thus, the primary index is non-dense index.

To locate the record of a student whose enrolment number is 2238422, we need to find two consecutive entries of indexes such that

index value1 < 2238422 < index value 2.

In the figure above we find the third and fourth index values as: 2238412 and 2258015 respectively satisfying the properties as above. Thus, the required student record must be found in Block 3.

 Disadvantages:

A primary index requires the data file to be ordered, this causes problems during insertion and deletion of records in the file.

 

Clustering Indexes:

A clustered index can be defined as an ordered data file. Sometimes the index is created on non-primary key columns which may not be unique for each record.

In this case, to identify the record faster, we will group two or more columns to get the unique value and create index out of them.

This method is called a clustering index.




Please note the following points about the clustering index as shown in the Figure 6.

• The clustering index is an ordered file having the clustering index value and a block pointer to the first block where that clustering field value first appears.

• Clustering index is also a sparse index.

• The size of clustering index is smaller than primary index as far as number of entries is concerned.

In the Figure 6 the data file have blocks where multiple Programme students exist.

• Clustering index is another example of a non-dense index as it has one entry for every distinct value of the clustering index field and not for every record in the file. 

Secondary Indexes

Secondary index can be defined on an alternate key or non-key attributes. A secondary index that is defined on the alternate key will be dense while secondary index on non-key attributes would require a bucket of pointers for one index entry.

Let us explain them in more detail with the help of Figures 8.


Figure 8: A dense secondary Index on a non-ordering key field of a file (The file is organised on the clustering field “Programme”

Please note the following in the Figure8.

• The names in the data file are unique and thus are being assumed as the alternate key. Each name therefore is appearing as the secondary index entry.

• The pointers are block pointers, thus are pointing to the beginning of the block and not a record. For simplicity of the figure we have not shown all the pointers 

• This type of secondary index file is dense index as it contains one entry for each record/district value.

• The secondary index is larger than the Primary index as we cannot use block anchor values here as the secondary index attributes are not the ordering attribute of the data file.

Let us now see an example of a secondary index that is on an attribute that is not an alternate key.


Figure 9: A secondary index on a non-key field implemented using one level of indirection so that index entries are fixed length and have unique field values (The file is organised on the primary key.

 

Sparse and Dense Indexes

Dense index

The dense index contains an index record for every search key value in the data file. It makes searching faster.

In this, the number of records in the index table is same as the number of records in the main table.

It needs more space to store index record itself. The index records have the search key and a pointer to the actual record on the disk.


Sparse index

In the data file, index record appears only for a few items. Each item points to a block.

In this, instead of pointing to each record in the main table, the index points to the records in the main table in a gap.

Multilevel Indexing Scheme

For a large file the size of index can also be very large. In such a case, one can create a hierarchy of indexes with the lowest level index pointing to the records, while the higher level indexes point to the indexes on indexes. The following figure shows this scheme.


Figure 10: Hierarchy of Indexes

Please note the following points about the multi-level indexes:

• The lowest level index points to each record in the file; thus is costly in terms of space.

• Updating, insertion and deletion of records require changes to the multiple index files as well as the data file. Thus, maintenance of the multi-level indexes is also expensive.

The indexes are implemented through B Trees.

No comments:

Post a Comment