INTRODUCTION
A database is a collection of related data.
A database management system
(DBMS) is the software that manages and controls access to the database.
A database application is simply a program that interacts with the
database at some point in its execution.
Databases are used in many everyday scenarios, including the following:
1. Purchases from the supermarket
2. Purchases using your credit card
3. Booking a vacation with a travel agent
4. Using the local library
5. Taking out insurance
6. Renting a DVD
7. Using the Internet
8. Studying at College
TRADITIONAL FILE-BASED SYSTEMS
Before database systems were introduced, data was organized using file-based systems.
File-based System Approach
File-based systems were an early attempt to computerize the manual filing system.
For example, an organization might
have physical files set up to hold all external and internal data relating to a
project, product, task, client, or employee. There are many such files, and for
safety they are labeled and stored in one or more cabinets. For security, the
cabinets may have locks and be located in secure areas of the building.
The manual filing system works well
as long as the number of items to be stored is small. It even works adequately
when there are large numbers of items
and we only have to store and retrieve them. However, the manual filing system
breaks down when we have to cross-reference or process the information in the
files.
For example, a real estate agent’s office might have a separate file for each property for sale or
rent, each buyer and renter, and each member of staff. The effort required to answer questions
that cross-reference these files, such as which properties a given buyer has viewed, would be considerable.
In the file-based approach, rather than establishing a
centralized store for the organization’s operational data, a decentralized
approach was taken, where each department, with the assistance of Data Processing (DP) staff, stored and
controlled its own data.
Limitations of the File-Based Approach
Separation and isolation of data
When data is isolated in separate
files, it is more difficult to access data that should be available together. The difficulty is compounded if we
require data from more than two files.
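The difficulty can be illustrated with a minimal sketch in which a program cross-references two separate files by hand. The file contents, the field names (`property_no`, `client_no`, and so on), and the helper `clients_for_city` are hypothetical, invented for this example; in-memory strings stand in for files on disk.

```python
import csv
import io

# Hypothetical contents of two departmental files (stand-ins for files on disk).
PROPERTY_FILE = """property_no,city,rent
PG4,Glasgow,350
PA14,Aberdeen,650
PG21,Glasgow,600
"""

VIEWING_FILE = """client_no,property_no
CR56,PG4
CR76,PG4
CR56,PA14
"""


def clients_for_city(city):
    """Return the sorted client numbers that viewed any property in `city`.

    The program itself must read both files, know their layouts, and match
    records on property_no; nothing in the files links them together.
    """
    properties = csv.DictReader(io.StringIO(PROPERTY_FILE))
    wanted = {row["property_no"] for row in properties if row["city"] == city}

    viewings = csv.DictReader(io.StringIO(VIEWING_FILE))
    return sorted({row["client_no"] for row in viewings
                   if row["property_no"] in wanted})


print(clients_for_city("Glasgow"))
```

Every additional file involved in a question adds another hand-written matching step of this kind to the program.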
Duplication of data
• Duplication is wasteful. It costs
time and money to enter the data more than once.
• It takes up additional storage
space, again with associated costs.
• Duplication can lead to loss of
data integrity; in other words, the data is no longer consistent.
Data dependence
The physical structure
and storage of the data files and records
are defined in the application code. This means that changes to an existing
structure are difficult to make: every program that uses the file has to be changed
and the existing data converted to the new structure. Clearly, this process could be very
time-consuming and subject to error.
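As a rough illustration of data dependence, the sketch below hard-codes a record layout in the application using Python's `struct` module. The layout (a 5-byte staff number, a 20-byte name, and a 4-byte integer salary) and the helper names `write_staff` and `read_staff` are assumptions made up for this example.

```python
import struct

# Hypothetical fixed-width record layout, hard-coded in the application:
# 5-byte staff number, 20-byte name, 4-byte little-endian integer salary.
STAFF_RECORD = struct.Struct("<5s20si")


def write_staff(staff_no, name, salary):
    """Pack one staff record into its on-disk byte layout."""
    return STAFF_RECORD.pack(staff_no.encode("ascii"), name.encode("ascii"), salary)


def read_staff(record):
    """Unpack one record; this only works if the layout above matches the file."""
    staff_no, name, salary = STAFF_RECORD.unpack(record)
    return (staff_no.rstrip(b"\x00").decode("ascii"),
            name.rstrip(b"\x00").decode("ascii"),
            salary)


record = write_staff("SG37", "Ann Beech", 12000)
print(len(record), read_staff(record))
```

Widening the name field would mean editing `STAFF_RECORD` in every program that reads the file and converting the existing file to the new layout.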
Incompatible file formats
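Because the structure of files is embedded in the application programs, the structures
depend on the application programming language. For example, the
structure of a file generated by a COBOL program may be different from the structure
of a file generated by a C program. The direct incompatibility of such files makes
them difficult to process jointly.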
Fixed queries/proliferation of application programs
File-based systems are
very dependent upon the application developer, who has to write any queries or
reports that are required. As a result, two things happened.
In some organizations, the type of
query or report that could be produced was fixed. There was no facility for
asking unplanned queries either about the data itself or about which types of data
were available.
In other organizations, there was a
proliferation of files and application programs. Under this pressure, certain
functionality was often omitted:
• Recovery, in the event of a
hardware or software failure, was limited or nonexistent.
• Access to the files was restricted
to one user at a time; there was no provision for shared access by staff in the
same department.
DATABASE APPROACH
The Database
A shared collection of logically
related data and its description, designed to meet the information needs of an
organization.
• The database is a single, possibly
large repository of data.
• It can be used by many departments
and users simultaneously.
• All data items are integrated with a
minimum amount of duplication.
• The database is not owned by one
department but is a shared corporate resource.
• The database holds not only the
organization’s operational data, but also a description of this data.
• The description of the data is known as the system catalog (or data dictionary or metadata, the “data about data”).
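To make the idea of a system catalog concrete, here is a minimal sketch that uses SQLite (through Python's standard `sqlite3` module) as a convenient stand-in for a DBMS. SQLite keeps its catalog in a built-in table called `sqlite_master`; other DBMSs expose their catalogs differently. The `staff` table is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary INTEGER)")

# SQLite keeps its catalog in a built-in table named sqlite_master.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master WHERE name = 'staff'"
).fetchall()

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(staff)")]

print(catalog)
print(columns)
```

The point is that the description of the data is itself stored in the database and can be queried, rather than living only in application code.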
We analyze the information needs of an organization by identifying
entities, attributes, and relationships.
An entity is a distinct object
(a person, place, thing, concept, or event) in the organization that is to be represented
in the database.
An attribute is a property
that describes some aspect of the object that we wish to record, and a relationship
is an association between entities.
The database represents the entities, the attributes, and the logical relationships between the entities.
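The three terms can be sketched in plain Python, outside any DBMS. The `Branch` and `Staff` classes and their fields are hypothetical examples chosen for illustration, not a prescribed design.

```python
from dataclasses import dataclass


@dataclass
class Branch:            # entity
    branch_no: str       # attribute
    city: str            # attribute


@dataclass
class Staff:             # entity
    staff_no: str        # attribute
    name: str            # attribute
    works_at: Branch     # relationship: each staff member works at one branch


glasgow = Branch("B003", "Glasgow")
ann = Staff("SG37", "Ann Beech", works_at=glasgow)
print(ann.works_at.city)
```

In a relational database the same relationship would typically be recorded by storing the branch number alongside each staff record rather than by an object reference.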
The Database Management System (DBMS)
A software system that enables users
to define, create, maintain, and control access to the database.
The DBMS is the software that
interacts with the users’ application programs and the database. Typically, a
DBMS provides the following facilities:
• It allows users to define the
database, usually through a Data Definition Language (DDL). The DDL allows users to
specify the data types and structures and the constraints on the data to be
stored in the database.
• It allows users to insert, update, delete, and retrieve data from the database, usually through a Data Manipulation Language (DML). The part of the DML that provides a general inquiry facility to the data is called a query language.
The most common query language is
the Structured Query Language (SQL).
• It provides controlled access to the database. For example, it may provide:
– a security system, which
prevents unauthorized users accessing the database;
– an integrity system, which
maintains the consistency of stored data;
– a concurrency control system,
which allows shared access of the database;
– a recovery control system,
which restores the database to a previous consistent state following a hardware or software
failure;
– a user-accessible catalog,
which contains descriptions of the data in the database.
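The difference between DDL and DML can be seen in a short sketch, again using SQLite through Python's `sqlite3` module as a stand-in DBMS. The `staff` table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: define the structure, data types, and constraints.
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   INTEGER NOT NULL
    )
""")

# DML: insert, update, and delete data.
conn.executemany(
    "INSERT INTO staff (staff_no, name, salary) VALUES (?, ?, ?)",
    [("SG37", "Ann Beech", 12000),
     ("SG14", "David Ford", 18000),
     ("SL21", "John White", 30000)],
)
conn.execute("UPDATE staff SET salary = salary + 1000 WHERE staff_no = ?", ("SG37",))
conn.execute("DELETE FROM staff WHERE staff_no = ?", ("SL21",))

# DML (query): retrieve data.
rows = conn.execute("SELECT staff_no, salary FROM staff ORDER BY staff_no").fetchall()
print(rows)
```

The `CREATE TABLE` statement is DDL; the `INSERT`, `UPDATE`, `DELETE`, and `SELECT` statements are DML, with `SELECT` being the query part.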
(Database) Application Programs
Users interact with the
database using application programs that are used to create and maintain
the database and to generate information. The application programs may be
written in a general-purpose programming language or in a higher-level fourth-generation language.
Components of the DBMS Environment
There are five major
components in the DBMS environment: hardware, software, data, procedures, and people.
Hardware
Some DBMSs run only on particular hardware or
operating systems, while others run on a wide variety of hardware and operating
systems. A DBMS requires a minimum amount of main memory and disk space to run.
Software
The software component comprises
the DBMS software itself, the application programs, the operating system, and,
if the DBMS is used over a network, the networking software.
Data
Perhaps the most
important component of the DBMS environment is the data. The data acts as a
bridge between the machine components and the human components.
The database contains both the operational data and the metadata.
Procedures
Procedures refer to the
instructions and rules for the design and use of the database. The users of the
system and the staff who manage the database should know the procedures on how
to run the system. These may consist of instructions on how to:
• Log on to the DBMS.
• Use a particular DBMS facility or
application program.
• Start and stop the DBMS.
• Make backup copies of the
database.
• Handle hardware or software
failures.
• Change the structure of a table.
People
People refers to the different types of
users who are involved in the database environment. There are four different types
of people: data and database administrators, database designers, application developers,
and end-users.
ROLES IN THE DATABASE ENVIRONMENT
Four distinct types of people participate in the database environment.
Data and Database Administrators
Data and database
administration are associated with the management and control of a DBMS and its
data.
The Data Administrator (DA)
is responsible for the management of the data resource, including database
planning; development and maintenance of standards, policies and procedures;
and conceptual/logical database design.
The Database Administrator (DBA)
is responsible for physical database
design and implementation, security and integrity control, maintenance of the
operational system, and ensuring satisfactory performance of the applications
for users. The DBA requires detailed knowledge of the target DBMS and the system
environment. The role of the DBA is more technically oriented than the role of
the DA.
Database Designers
There are two types of designers:
logical database designers and physical database designers.
The logical database designer is
concerned with identifying the data, the relationships between the data, and the
constraints on the data that is to be stored in the database.
The physical database designer decides
how the logical database design is to be physically implemented.
This involves:
• Mapping the logical database
design into a set of tables and integrity constraints;
• Selecting specific storage
structures and access methods for the data to achieve good performance;
• Designing any security measures
required on the data.
Logical database design
is concerned with the what; physical database design is concerned with
the how.
Application Developers
Typically, the application
developers work from a specification produced by systems analysts. Each program
contains statements that request the DBMS to perform some operation on the
database, which includes retrieving data, inserting, updating, and deleting
data.
End-Users
The end-users are the
“clients” of the database. End-users can be classified according to the way
they use the system:
• Naïve users are typically
unaware of the DBMS. They access the database through specially written
application programs. They invoke database operations by entering simple
commands or choosing options from a menu.
An example is the checkout assistant
at the local supermarket.
• Sophisticated users are familiar with the structure of the database and
the facilities offered by the DBMS. They may use a high-level query language such as SQL to
perform the required operations. Some sophisticated end-users may even write application
programs for their own use.
ADVANTAGES AND DISADVANTAGES OF DBMS
Advantages
Control of data redundancy
The database approach attempts
to eliminate the redundancy by integrating the files so that multiple copies of
the same data are not stored.
Data consistency
By controlling redundancy, we reduce the risk of inconsistencies
occurring. If a data item is stored only once in the database, any update to its
value has to be performed only once and the new value is available immediately to
all users. This improves data consistency.
More information from the same amount of data
With the integration of
the operational data, it may be possible for the organization to derive additional
information from the same data.
Sharing of data
The database belongs to
the entire organization and can be shared by all authorized users. In this way,
more users share more of the data.
Improved data integrity
Database integrity
refers to the validity and consistency of stored data. Integrity is usually
expressed in terms of constraints, which are consistency rules that the
database is not permitted to violate.
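A minimal sketch of integrity constraints, using SQLite through Python's `sqlite3` module as a stand-in DBMS. The `staff` table, the rule that salaries must be non-negative, and the helper `try_insert` are hypothetical choices for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        salary   INTEGER NOT NULL CHECK (salary >= 0)
    )
""")


def try_insert(staff_no, salary):
    """Attempt an insert; return True if accepted, False if a constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO staff VALUES (?, ?)", (staff_no, salary))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    try_insert("SG37", 12000),   # valid row
    try_insert("SG14", -500),    # violates the CHECK constraint
    try_insert("SG37", 9000),    # violates the PRIMARY KEY (duplicate staff_no)
]
print(results)
```

The rules are declared once, with the data, and the DBMS refuses any change that would violate them, regardless of which application attempts it.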
Improved security
Database security is the
protection of the database from unauthorized users. This security may take the
form of user names and passwords to identify people authorized to use the
database.
Enforcement of standards
Database integration
allows the DBA to define and the DBMS to enforce the necessary standards. These
may include departmental, organizational, national, or international standards
for such things as data formats to facilitate exchange of data between systems,
naming conventions, documentation standards, update procedures, and access
rules.
Economy of scale
Combining all the
organization’s operational data into one database and creating a set of applications that work on
this one source of data can result in cost savings leading to an economy of
scale.
Balance of conflicting requirements
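Because the database is under
the control of the DBA, the DBA can make decisions about the design and
operational use of the database that provide the best use of resources for the
organization as a whole, balancing the conflicting requirements of different users or departments.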
Improved data accessibility and responsiveness
As a result of integration, data that crosses departmental boundaries
is directly accessible to the end-users. This
provides more functionality and better services to the end-users.
Increased productivity
Many DBMSs provide a
fourth-generation environment with tools that simplify the development of
database applications. This results in increased programmer productivity and
reduced development time.
Improved maintenance through data independence
A DBMS separates the
data descriptions from the applications, thereby making applications immune to
changes in the data descriptions. This is known as data independence.
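Data independence can be sketched with SQLite through Python's `sqlite3` module, used here as a stand-in DBMS. The `staff` table, the added `email` column, and the function `staff_names` (playing the role of an existing application) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO staff VALUES ('SG37', 'Ann Beech')")


def staff_names():
    """An 'existing application': it names the columns it needs and nothing else."""
    return [row[0] for row in conn.execute("SELECT name FROM staff ORDER BY name")]


before = staff_names()

# The data description changes: a new column is added to the table.
conn.execute("ALTER TABLE staff ADD COLUMN email TEXT")

after = staff_names()
print(before, after)
```

The application asks for data by name and leaves the storage layout to the DBMS, so adding a column does not require the application to be rewritten.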
Increased concurrency
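In some file-based systems, if two or more users are allowed to access the same
file simultaneously, the accesses may interfere with each other, resulting in
loss of information or even loss of integrity. Many DBMSs manage
concurrent database access and ensure that such problems cannot occur, employing
various concurrency control techniques.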
Improved backup and recovery services
Modern DBMSs provide
facilities to minimize the amount of processing that is lost following a
failure. They also provide automatic backup facilities.
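One ingredient of such recovery facilities is the transaction: a group of changes that is either applied completely or not at all. The sketch below uses SQLite through Python's `sqlite3` module and simulates a failure inside a running program; it does not demonstrate recovery after a real crash or automatic backups. The `account` table and the `transfer` and `balances` helpers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 50)])
conn.commit()


def transfer(amount, fail_midway=False):
    """Move `amount` from A to B as a single transaction."""
    with conn:  # commit if the block finishes, roll back if it raises
        conn.execute("UPDATE account SET balance = balance - ? WHERE name = 'A'", (amount,))
        if fail_midway:
            raise RuntimeError("simulated failure between the two updates")
        conn.execute("UPDATE account SET balance = balance + ? WHERE name = 'B'", (amount,))


def balances():
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM account"))


try:
    transfer(30, fail_midway=True)
except RuntimeError:
    pass

after_failure = balances()   # the half-finished transfer was rolled back
transfer(30)
after_success = balances()
print(after_failure, after_success)
```

After the simulated failure the database is back in its previous consistent state, so no money has left account A without arriving in account B.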
Disadvantages
Complexity
The functionality of a
good DBMS makes the DBMS an extremely complex piece of software. Failure to understand the system can lead to bad
design decisions, which can have serious consequences for an organization.
Size
The complexity and breadth of functionality make the DBMS an extremely large
piece of software, occupying many megabytes of disk space and requiring substantial
amounts of memory to run efficiently.
Cost of DBMSs
The cost of DBMSs varies
significantly, depending on the environment and functionality provided. For
example, a single-user DBMS for a personal computer may only cost $100.
However, a large mainframe multi-user DBMS servicing hundreds of users can be extremely expensive, perhaps
$100,000.
Additional hardware costs
The disk storage
requirements for the DBMS and the database may necessitate the purchase of additional
storage space. To achieve the required performance, it may be necessary to purchase a
larger machine, perhaps even a machine dedicated to running the DBMS.
Cost of conversion
Converting
existing applications to run on the new DBMS and hardware carries a cost of its own. This cost also
includes the cost of training staff to use these new systems, and possibly the
employment of specialist staff to help with the conversion and running of the
systems.
Performance
Unlike a file-based system, which is typically written for a specific application, the DBMS is written to
be more general, to serve many applications rather than just one. The result is
that some applications may not run as fast as they used to.
Greater impact of a failure
The centralization of resources increases the vulnerability of the system. Because all users and applications rely on the availability of the DBMS, the failure of certain components can bring operations to a halt.