Unit-I | Unit-II | Unit-III | Unit-IV | Unit-V | |
Part-I | Part-II | Part-I | Part-I | Part-I | Total |
Part-III | Part-IV | Part-II | Part-II | Part-II | |
Part-V | Part-III |
TYPES OF INDEXES
1.
Explain
Types of indexes?
Ans:
One of the term used during the file organisation is the term index.
In databases an index is defined as the <index value, address>
pair.
The databases that have very large number of records.
A database index allows fast search on the index value in database
records.
Thus, without an index it may be difficult to search a database.
·
An index file is very small compared to a data file
that stores a relation.
·
Also index entries are ordered, so that an index can
be searched using an efficient search method like binary search.
Primary index
If the index is created on the basis of the primary
key of the table, then it is known as primary indexing.
A primary index is a file
that contains a sorted sequence of records having two columns: the ordering
key field; and a block address for that key field in the data file.
The ordering key field for this index can be the primary key of the data file. An
entry in primary index file contains the index value of the first record of the
data block and a pointer to that data block.
Let us assume a student
database as (Assuming that one block stores only four student records.):
Fig 4 A student file stored in the order of student enrolment
numbers
The primary index on this
file would be on the ordering field –
enrolment number.
The primary index on this
file would be
Fig5: The Student file and the Primary Index on Enrolment Number
Please note the following
points in the figure above.
·
We have used the first key value stored in the block
as the index key value. This is also called the anchor value.
·
The number of entries in the primary index file is
the same as the number of disk block in the ordered data file.
·
All the records need to have an entry in the index
file. This type of index is called non-dense index. Thus, the primary
index is non-dense index.
To locate the record of a
student whose enrolment number is 2238422, we need to find two consecutive
entries of indexes such that
index value1 < 2238422 < index value 2.
In the figure above we find the third and fourth index values as: 2238412 and 2258015 respectively satisfying the properties as above. Thus, the required student record must be found in Block 3.
A primary index requires the
data file to be ordered, this causes problems during insertion and deletion of
records in the file.
Clustering Indexes:
A clustered index can be defined as an ordered data
file. Sometimes the index is created on non-primary key columns which may not
be unique for each record.
In this case, to identify the record faster, we will
group two or more columns to get the unique value and create index out of them.
This method is called a clustering index.
Please note the following
points about the clustering index as shown in the Figure 6.
• The clustering index is an
ordered file having the clustering index value and a block pointer to the first
block where that clustering field value first appears.
• Clustering index is also a
sparse index.
• The size of clustering index is smaller than primary index as far as number of entries is concerned.
In the Figure 6 the data
file have blocks where multiple Programme students exist.
• Clustering index is another example of a non-dense index as it
has one entry for every distinct value of the clustering index field and not
for every record in the file.
Secondary Indexes
Secondary index can be
defined on an alternate key or non-key attributes. A secondary index that is
defined on the alternate key will be dense while secondary index on non-key
attributes would require a bucket of pointers for one index entry.
Let us explain them in more
detail with the help of Figures 8.
Figure 8: A dense secondary
Index on a non-ordering key field of a file (The file is organised on the
clustering field “Programme”
Please note the following in
the Figure8.
• The names in the data file
are unique and thus are being assumed as the alternate key. Each name therefore
is appearing as the secondary index entry.
• The pointers are block pointers, thus are pointing to the beginning of
the block and not a record. For simplicity of the figure we have not shown all
the pointers
• This type of secondary
index file is dense index as it contains one entry for each record/district
value.
• The secondary index is
larger than the Primary index as we cannot use block anchor values here as the
secondary index attributes are not the ordering attribute of the data file.
Let us now see an example of a secondary index that is on an attribute that is not an alternate key.
Figure 9: A secondary index
on a non-key field implemented using one level of indirection so that index
entries are fixed length and have unique field values (The file is organised on
the primary key.
Sparse and Dense Indexes
Dense
index
The dense index contains an index record for every
search key value in the data file. It makes searching faster.
In this, the number of records in the index table is
same as the number of records in the main table.
It needs more space to store index record itself.
The index records have the search key and a pointer to the actual record on the
disk.
Sparse
index
In the data file, index record appears only for a
few items. Each item points to a block.
In this, instead of pointing to each record in the
main table, the index points to the records in the main table in a gap.
Multilevel Indexing Scheme
For a large file the size of
index can also be very large. In such a case, one can create a hierarchy of
indexes with the lowest level index pointing to the records, while the higher
level indexes point to the indexes on indexes. The following figure shows this
scheme.
Figure 10: Hierarchy of
Indexes
Please note the following
points about the multi-level indexes:
• The lowest level index
points to each record in the file; thus is costly in terms of space.
• Updating, insertion and
deletion of records require changes to the multiple index files as well as the
data file. Thus, maintenance of the multi-level indexes is also expensive.
The indexes are implemented
through B Trees.
No comments:
Post a Comment