Sunday, 3 July 2011

Relational Database Management System (RDBMS)

A DBMS that is based on relational model is called as RDBMS. Relation model is most successful mode of all three models. Designed by E.F. Codd, relational model is based on the theory of sets and relations of mathematics.

Relational model represents data in the form a table. A table is a two dimensional array containing rows and columns. Each row contains data related to an entity such as a student. Each column contains the data related to a single attribute of the entity such as student name.

One of the reasons behind the success of relational model is its simplicity. It is easy to understand the data and easy to manipulate.

Another important advantage with relational model, compared with remaining two models is, it doesn’t bind data with relationship between data item. Instead it allows you to have dynamic relationship between entities using the values of the columns.

Almost all Database systems that are sold in the market, now- a-days, have either complete or partial implementation of relational model.

Figure 1 shows how data is represented in relational model and what are the terms used to refer to various components of a table. The following are the terms used in relational model.

Tuple / Row
A single row in the table is called as tuple. Each row represents the data of a single entity.

Attribute / Column
A column stores an attribute of the entity. For example, if details of students are stored then student name is an attribute; course is another attribute and so on.

Column Name
Each column in the table is given a name. This name is used to refer to value in the column.

Table Name
Each table is given a name. This is used to refer to the table. The name depicts the content of the table.

The following are two other terms, primary key and foreign key, that are very important in relational model.

Primary Key
A table contains the data related entities. If you take STUDETNS table, it contains data related to students. For each student there will be one row in the table. Each student’s data in the table must be uniquely identified. In order to identify each entity uniquely in the table, we use a column in the table. That column, which is used to uniquely identify entities (students) in the table is called as primary key.

In case of STUDENTS table (see figure 1) we can use ROLLNO as the primary key as it in not duplicated.

So a primary key can be defined as a set of columns used to uniquely identify rows of a table.

Some other examples for primary keys are account number in bank, product code of products, employee number of an employee.

Composite Primary Key
In some tables a single column cannot be used to uniquely identify entities (rows). In that case we have to use two or more columns to uniquely identify rows of the table. When a primary key contains two or more columns it is called as composite primary key.

In figure 2, we have PAYMENTS table, which contains the details of payments made by the students. Each row in the table contains roll number of the student, payment date and amount paid. Neither of the columns can uniquely identify rows. So we have to combine ROLLNO and DP to uniquely identify rows in the table. As primary key is consisting of two columns it is called as composite primary key.

Foreign Key
In relational model, we often store data in different tables and put them together to get complete information. For example, in PAYMENTS table we have only ROLLNO of the student. To get remaining information about the student we have to use STUDETNS table. Roll number in PAYMENTS table can be used to obtain remaining information about the student.

The relationship between entities student and payment is one-to-many. One student may make payment for many times. As we already have ROLLNO column in PAYMENTS table, it is possible to join with STUDENTS table and get information about parent entity (student).

Roll number column of PAYMENTS table is called as foreign key as it is used to join PAYMENTS table with STUDENTS table. So foreign key is the key on the many side of the relationship.

ROLLNO column of PAYMENTS table must derive its values from ROLLNO column of STUDENTS table.

Integrity Rules
Data integrity is to be maintained at any cost. If data loses integrity it becomes garbage. So every effort is to be made to ensure data integrity is maintained. The following are the main integrity rules that are to be followed.

Domain integrity
Data is said to contain domain integrity when the value of a column is derived from the domain. Domain is the collection of potential values. For example, column date of joining must be a valid date. All valid dates form one domain. If the value of date of joining is an invalid date, then it is said to violate domain integrity.

Entity integrity
This specifies that all values in primary key must be not null and unique. Each entity that is stored in the table must be uniquely identified. Every table must contain a primary key and primary key must be not null and unique.

Referential Integrity
This specifies that a foreign key must be either null or must have a value that is derived from corresponding parent key. For example, if we have a table called BATCHES, then ROLLNO column of the table will be referencing ROLLNO column of STUDENTS table. All the values of ROLLNO column of BATCHES table must be derived from ROLLNO column of STUDENTS table. This is because of the fact that no student who is not part of STUDENTS table can join a batch

Relational Algebra
A set of operators used to perform operations on tables is called as relational algebra. Operators in relational algebra take one or more tables as parameters and produce one table as the result.

The following are operators in relational algebra:

 Union
 Intersect
 Difference or minus
 Project
 Select
 Join

This takes two tables and returns all rows that are belonging to either first or second table (or both).

This takes two tables and returns all rows that are belonging to first and second table.

Difference or Minus
This takes two tables and returns all rows that exist in the first table and not in the second table.

Takes a single table and returns the vertical subset of the table.

Takes a single table and returns a horizontal subset of the table. That means it returns only those rows that satisfy the condition.

Rows of two table are combined based on the given column(s) values. The tables being joined must have a common column.

Top 50 SQL Interview Questions & Answers

1. What is DBMS? A Database Management System (DBMS) is a program that controls creation, maintenance and use of a database. DBMS can be...