Dbms1.3

BRIEF HISTORY OF THE RELATIONAL MODEL

Introduction:

The relational model was first proposed by E. F. Codd . In the relational model, all data is logically arranged within relations (tables). The relational model is based on the mathematical concept  Relation. E.F. Codd used mathematical terminology from the set theory and predicate logic. Each relation has a name and is made up of  named attributes (columns) of data. Each tuple (row) contains one value per attribute. This simple logical structure is a great strength of the relational model.

The relational model’s objectives are as follows:

• To allow a high degree of data independence. Application programs must not be affected by changes to the internal data representation.

• To deal with data semantics, consistency, and redundancy problems

• To enable the expansion of set-oriented data manipulation languages.

Three major projects are contributed for development of the Relational Model:

System R:

This is the first project that proved practicality the Relational Model. It was developed at IBM’s San José Research Laboratory in California. This  provides an implementation of  relational model its data structures and operations.

The System R project led to two major developments:

• The development of a structured query language called SQL.

•The production of various commercial relational DBMS products for example, DB2 and SQL/DS from IBM and Oracle from Oracle Corporation.

INGRES:

This is second project for relational model, which was active at the same time as the System R project. The INGRES(Interactive Graphics Retrieval System) project involved in development of  a prototype RDBMS. It was developed at the University of  California at Berkeley.

• This research led to development of commercial products  like INGRES from Relational Technology Inc. and the Intelligent Database Machine from Britton Lee Inc.

Peterlee Relational Test Vehicle:

The third project was the Peterlee Relational Test Vehicle at the IBM UK Scientific Centre in Peterlee .This project had a more theoretical orientation than the System R and INGRES projects. It supports query processing and optimization as well as  functional extension.

 TERMINOLOGY

The relational model is based on the mathematical concept of a relation, which is physically represented as a table. Codd used terminology taken from mathematics, principally set theory and predicate logic.

Relational Data Structure

Relation

A relation is a table with columns and rows. An RDBMS requires only that  database to be perceived by the user as tables. This applies only to the logical structure of the database: that is, the external and conceptual levels of the ANSI-SPARC architecture .

Attribute

          An attribute is a named column of a relation. The table columns treated as  attributes. Attributes can appear in any order .

For example, the information on branch offices is represented by the Branch relation, with columns for attributes branchNo, street, city, and postcode. Similarly, the information on staff is represented by the Staff relation, with columns for attributes staffNo, fName, IName, position, sex, DOB, salary, and branchNo. 




Domain

A domain is the set of allowable values for one or more attributes. Every attribute in a relation is defined on a domain. Domains may be distinct for each attribute, or two or more attributes may be defined on the same domain .

Tuple

A tuple is means a row in a relation. Tuples can appear in any order .

Degree

The degree of a relation is the number of attributes it contains.

1.     A relation with only one attribute would have degree one and be called a unary relation or one-tuple.

2.     A relation with two attributes is called binary.

3.     A relation  one with three attributes is called ternary, and after that the term n-ary is usually used.

Cardinality

The cardinality of a relation is the number of tuples it contains. This will changes as tuples are added or deleted.

Relational database

A relational database consists of relations that are appropriately structured. This can be known as normalization.

Alternative terminology

A relation may be referred as a file, the tuples as records, and the attributes as fields. This terminology comes from the fact that, physically, the RDBMS may store each relation in a file.

Properties of Relations

A relation has the following properties:

• The relation has a name that is distinct from all other relation names in the relational schema;

• Each cell of the relation contains exactly one atomic (single) value;

• Each attribute has a distinct name;

• The values of an attribute are all from the same domain;

• Each tuple is distinct; there are no duplicate tuples;

• The order of attributes has no significance;

• The order of tuples has no significance, theoretically.

 

Relational Keys

We need to be able to identify one or more attributes (called relational keys) that uniquely identifies each tuple in a relation.

Superkey

A superkey uniquely identifies each tuple within a relation.  It may be an attribute, or set of attributes, that uniquely identifies a tuple within a relation. However, a superkey may contain additional attributes that are not necessary for unique identification.

Candidate key

A Candidate key is a superkey such that no proper subset is a superkey within the relation.

A candidate key K for a relation R has two properties:

Uniqueness. In each tuple of R, the values of K uniquely identify that tuple.

Irreducibility. No proper subset of K has the uniqueness property.

There may be several candidate keys for a relation. When a key consists of more than one attribute, we call it a composite key.


Consider the Branch relation shown in figure we can determine one tuple with  branchNo. So it  is a candidate key. Similarly, postcode is also a candidate key for this relation.

For example, from the Figure the  suitable candidate key for the Staff relation would be IName, the employee’s surname.

Primary key

Because a relation has no duplicate tuples, it is always possible to identify each row uniquely. This means that a relation always has a primary key. In the worst case, the entire set of attributes could serve as the primary key, but usually some smaller subset is sufficient to distinguish the tuples.

Alternate keys

The candidate keys that are not selected to be the primary key are called alternate keys.

For the Branch relation, if we choose branchNo as the primary key, postcode would then be an alternate key.

Foreign key

An attribute, or set of attributes, within one relation that matches the candidate key of some relation. When an attribute appears in more than one relation, its appearance usually represents a relationship between tuples of the two relations.

For example, the inclusion of branchNo in both the Branch and Staff relations is it links each branch to the details of staff working at that branch. In the Branch relation, branchNo is the primary key. However, in the Staff relation, the branchNo attribute exists to match staff to the branch office they work in. In the Staff relation, branchNo is a foreign key.

INTEGRITY CONSTRAINTS

Integrity constraints are the rules that ensure the data is accurate. There are two important integrity rules for the relational models. Those are:

1. Entity integrity

2. Referential integrity

Nulls

Null represents a value for an attribute that is currently unknown or is not applicable for this tuple. It represents the absence of a value. A null mean the logical value “unknown.” It can mean that a value is not applicable to a particular tuple, or it could merely mean that no value has yet been supplied. Nulls are a way to deal with incomplete or exceptional data.

Entity Integrity

Entity integrity is a constraint that states that in a base relation no attribute of a primary key can be null.

For example, as branchNo is the primary key of the Branch relation, we should not be able to insert a tuple into the Branch relation with a null for the branchNo attribute.


Referential Integrity

If a foreign key exists in a relation, either the foreign key value must match a candidate key value of some tuple in its home relation or the foreign key value must be wholly null.

For example, branchNo in the Staff relation is a foreign key targeting the branchNo attribute in the Branch relation, Branch. It should not be possible to create a staff record with branch number B025, for example, unless there is already a record for branch number B025 in the Branch relation. However, we should be able to create a new staff record with a null branch number to allow for the situation where a new member of staff has joined the company but has not yet been assigned to a particular branch office.

General Constraints

Additional rules specified by the users or database administrators of a database that define or constrain some aspect of the enterprise.

For example, if an upper limit of 20 has been placed upon the number of staff that may work at a branch office, then the user must be able to specify this general constraint and expect the DBMS to enforce it. In this case, it should not be possible to add a new member of staff at a given branch to the Staff relation if the number of staff currently assigned to that branch is 20.

VIEWS

A view is a virtual or derived relation.

Terminology

Base relation

A named relation corresponding to an entity in the conceptual schema, whose tuples are physically stored in the database.

View

A view is a virtual relation that does not necessarily exist in the database but can be produced upon request by a particular user, at the time of request. It is dynamic  result of one or more relational operations operating on the base relations to produce another relation.

·        A view is a relation that appears to the user to exist, can be manipulated as if it were a base relation, but does not necessarily exist in storage in the sense that the base relations do.

·        The contents of a view are defined as a query on one or more base relations. Any operations on the view are automatically translated into operations on the relations from which it is derived.

·        Views are dynamic, meaning that changes made to the base relations that affect the view are immediately reflected in the view. When users make permitted changes to the view, these changes are made to the underlying relations.

Purpose of Views

The view mechanism is desirable for several reasons:

• It provides a powerful and flexible security mechanism by hiding parts of the database from certain users

• It permits users to access data in a way that is customized to their needs, so that the same data can be seen by different users in different ways, at the same time.

• It can simplify complex operations on the base relations.

Updating Views

If a view is updated, then the underlying base relation should reflect the change. The conditions most systems determine whether an update is allowed through a view:

• Updates are allowed through a view defined using a simple query involving a single base relation and containing either the primary key or a candidate key of the base relation.

• Updates are not allowed through views involving multiple base relations.

• Updates are not allowed through views involving aggregation or grouping operations.

Based on the updates allowed, Views are classified into the following classes:

1.     Theoretically not updatable

2.     Theoretically updatable

3.     Partially updatable

 THE RELATIONAL ALGEBRA

The Relational algebra is a procedural language . It can be used  to tell the DBMS how to build a new relation from one or more relations in the database.

The following are the possible operations of relational algebra:


Unary Operations

The two unary operations are:

1.     Selection

2.     Projection.

Selection (or Restriction) σ

·        Selection Operator (σ) is a unary operator in relational algebra that performs a selection operation.

·        It selects those rows or tuples from the relation that satisfies the selection condition.

 Syntax

σ<selection_condition>(R)

 

 

We may use logical operators like  , ! and relational operators like  = , ≠ , > , < , <= , >= with the selection condition.

·         Selection operator only selects the required tuples according to the selection condition.

·         It does not display the selected tuples.

Examples

·         Select tuples from a relation “Books” where subject is “database”

σsubject = “database” (Books)

·         Select tuples from a relation “Books” where subject is “database” and price is “450”

σsubject = “database” ^price="450" (Books)

 

Projection

Projection Operator (Ï€) is a unary operator in relational algebra that performs a projection operation.

It displays the columns of a relation or table based on the specified attributes.

 Syntax:

 Ï€<attribute list>(R)

 Projection operator automatically removes all the duplicates while projecting the output relation.

Example:




 







Set Operations

The relational algebra, starting with the set operations of Union, Set difference, Intersection, and Cartesian product.

 

Union

This operation is used to fetch data from two relations(tables) or temporary relation(result of another operation).

For this operation to work, the relations(tables) specified should have same number of attributes(columns) and same attribute domain. Also the duplicate tuples are autamatically eliminated from the result.

Syntax: A B

where A and B are relations.

For example, if we have two tables RegularClass and ExtraClass, both have a column student to save name of student, then,

Student(RegularClass) Student(ExtraClass)

Above operation will give us name of Students who are attending both regular classes and extra classes, eliminating repetition.

 

Set difference

This operation is used to find data present in one relation and not present in the second relation. This operation is also applicable on two relations, just like Union operation.

 

Syntax: A – B

 

where A and B are relations.

For example, if we want to find name of students who attend the regular class but not the extra class, then, we can use the below operation:

 

Student(RegularClass) - ∏Student(ExtraClass)

 

Intersection

Let R and S be two relations.

Then-

  • R ∩ S is the set of all tuples belonging to both R and S.
  • In R∩S, duplicates are automatically removed.
  • Intersection operation is both commutative and associative


 














d). Cartesian product:

The Cartesian product operation multiplies two  relations to define another relation consisting of all possible pairs of tuples from the two relations.

Therefore, if one relation has I tuples and N attributes and the other has J tuples and M attributes, the Cartesian product relation will contain (I * J) tuples with (N + M) attributes.

Product operation can be denoted by the following notation:

(A × B)

Example: Find the Cartesian Product of the relations A and B








Join Operations

The Join operation, which combines two relations to form a new relation. There are various forms of the Join operation

·        Theta join

·        Equijoin (a particular type of Theta join)

·        Natural join

·        Outer join

·        Semijoin

 

Natural Join ()

Natural join is a binary operator. Natural join between two or more relations will result set of all combination of tuples where they have equal common attribute.

Let us see below example



Equijoin

When a theta join uses only equivalence condition, it becomes an equi join.

Outer join

An outer join doesn't require each record in the two join tables to have a matching record. In this type of join, the table retains each record even if no other matching record exists.



Three types of Outer Joins are:

  • Left Outer Join
  • Right Outer Join
  • Full Outer Join

 Left Outer Join

The LEFT JOIN returns all the rows from the table on the left even if no matching rows have been found in the table on the right. Where no matching record found in the table on the right, NULL is returned.

Right Outer join

outer JOIN is the opposite of LEFT JOIN. The RIGHT JOIN returns all the columns from the table on the right even if no matching rows have been found in the table on the left. Where no matches have been found in the table on the left, NULL is returned.

Full Outer join

In a full outer join, all tuples from both relations are included in the result, irrespective of the matching condition.

Semi join

Semi-Join matches the rows of two relations and then show the matching rows of the relation whose name is mentioned to the left side of  Semi Join operator.












No comments:

Post a Comment