Search

ചിതറിയ ചിന്തകള്‍…

Mahi's Blog: ഒരു സാധാരണക്കാരന്‍റെ മനോവ്യാപാരങ്ങള്‍…

Category

RDBMS

Relational Database Management System (RDBMS) is an unavoidable Part of IT Industry. It is the base of most of the software applications. This is a Domain which “Eagles” Dare to explore. It’s not kid’s play, but of course, it is enjoyable once you understand the concepts.

Use “INDEX”-ing to make your queries faster and faster…

For searching a data in a faster way, we actually need some sort of cross reference facilities where for certain columns of information within a table, it should be possible to get whole records of information quickly. When the data became larger and larger, it is very much necessary to get to the data very quickly. This is where an index within RDBMS comes in. Using index you can access of information Rapidly and efficiently also Enforce for uniqueness constraints.

Def: An index is an on-disk structure associated with a table or views that speed retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view. These keys are stored in a B-tree structure that enables the RDBMS (eg: SQL Server) to find the row or rows associated with the key values quickly and efficiently . It is a database object created and maintained by DBMS. It is essentially a list of the contents of a column or group of columns. Indexes are ordered so that extremely first search can be computed through them to find data.

 There are two type of indexes “Clustered Index” and “Non-Clustered Index”.

  • A clustered index is sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. Only one clustered index is possible for a table, since only one sort order can be defined for a table.
  • A non-clustered index is stored in a separate structure to the base table. So it is possible to create the non-clustered index on a different file group to the base table. It have a structure separate from the data rows. A non-clustered index contains the non-clustered index key values and each key value entry has a pointer to the data row that contains the key value. The pointer from an index row in a non-clustered index to a data row is called a row locator. The structure of the row locator depends on whether the data pages are stored in a heap or a clustered table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the clustered index key. You can add non-key columns to the leaf level of the Non-Clustered index to by-pass existing index key limits and execute fully covered, indexed, queries.

An index (Both clustered and non-clustered) can be defined either as unique or non-unique. A unique index ensures that the data contained within the unique index columns appear only once within the table, including “NULL”. A unique index is commonly implemented to support the constraints.  Indexes are automatically maintained for a table or view whenever the table data is modified. A non-unique index is also applicable as there can be duplicate data; a non-unique index has more overhead than a unique index when retrieving data.

Compare the Top Bulls – Oracle Vs SQL Server

A White Paper on the RDBMS Giants

Coming up soon…

The “Keys”

In the relational model of databases, a Candidate Key of a relation is a minimal superkey for that relation; that is, a set of attributes such that

  1. the relation does not have two distinct tuples with the same values for these attributes
  2. there is no proper subset of these attributes for which holds.

Since a relation contains no duplicate tuples, the set of all its attributes is a superkey. It follows that every relation will have at least one candidate key. The candidate keys of a relation tell us all the possible ways we can identify its tuples. As such they are an important concept for the design database schema. For practical reasons RDBMSs usually require that for each relation one of its candidate keys is declared as the primary key, which means that it is considered as the preferred way to identify individual tuples. Foreign keys, for example, are usually required to reference such a primary key and not any of the other candidate keys.

In relational database design, a unique key or primary key is a candidate key to uniquely identify each row in a table. A unique key or primary key comprises a single column or set of columns. No two distinct rows in a table can have the same value (or combination of values) in those columns. Depending on its design, a table may have arbitrarily many unique keys but at most one primary key.

A Unique Key (UK) must uniquely identify all possible rows that exist in a table and not only the currently existing rows. Examples of unique keys are Social Security numbers, Passport Numbers, Permanent Account Number (associated with a specific person) or ISBNs (associated with a specific book). Telephone books and dictionaries cannot use names, words, or Dewey Decimal system (inLibrary System)  numbers as candidate keys because they do not uniquely identify telephone numbers or words.

A Primary Key (PK) is a special case of unique keys. The major difference is that for unique keys the implicit NOT NULL constraint is not automatically enforced, while for primary keys it is enforced. Thus, the values in unique key columns may or may not be NULL. Another difference is that primary keys must be defined using another syntax.

The relational model, as expressed through relational calculus and relational algebra, does not distinguish between primary keys and other kinds of keys. Primary keys were added to the SQL standard mainly as a convenience to the application programmer.

An Alternate Key (or Secondary Key) is any candidate key which is not selected to be the primary key (PK), but it can be act like primary key. The alternate key also have all the defined properties of Primary key. this also uniquely identifies the record and this key also doesnot hold any null value. For Example consider an Employee Information table, suppose there are some columns like EmployeeId (PK), PAN Number, Passport Number etc. All these fields uniquely Identifies the employee, but EmployeeId is supposed as PK, and PAN/Passport numbers can be considered as Secondary/Alternate Key.

A Superkey can be defined as a set of attributes of a relation variable upon which all attributes of the reation variable are functionally dependent.

A Foreign Key (FK) is a referential constraint between two tables.

The foreign key identifies a column or a set of columns in one (referencing) table that refers to a column or set of columns in another (referenced) table. The columns in the referencing table must be the primary key or other candidate key in the referenced table. The values in one row of the referencing columns must occur in a single row in the referenced table. Thus, a row in the referencing table cannot contain values that don’t exist in the referenced table (except potentially NULL). This way references can be made to link information together and it is an essential part of database normalization. Multiple rows in the referencing table may refer to the same row in the referenced table. Most of the time, it reflects the one (master table, or referenced table) to many (child table, or referencing table) relationship.
The referencing and referenced table may be the same table, i.e. the foreign key refers back to the same table. Such a foreign key is known as self-referencing or recursive foreign key.
A table may have multiple foreign keys, and each foreign key can have a different referenced table. Each foreign key is enforced independently by the database system. Therefore, cascading relationships between tables can be established using foreign keys.
Improper foreign key/primary key relationships or not enforcing those relationships are often the source of many database and data modeling problems.

ø A relational database matches data using common characteristics found within the data set. The resulting groups of data are organized and are much easier for people to understand.

ø ø Referential integrity is a property of data which, when satisfied, requires every value of one attribute (column) of a relation (table) to exist as a value of another attribute in a different (or the same) relation (table).
For referential integrity to hold, any field in a table that is declared a foreign key can contain only values from a parent table’s primary key or a candidate key. For instance, deleting a record that contains a value referred to by a foreign key in another table would break referential integrity. Some relational database management systems (RDBMS) can enforce referential integrity, normally either by deleting the foreign key rows as well to maintain integrity, or by returning an error and not performing the delete. Which method is used may be determined by a referential integrity constraint defined in a data dictionary.

øøø A functional dependency (FD) is a constraint between two sets of attributes in a relation from a database.

øøøø A compound key is a key that consists of 2 or more attributes that uniquely identify an entity occurrence. Each attribute that makes up the compound key is a simple key in its own right.
This is often confused with a composite key whereby even though this is also a key that consists of 2 or more attributes that uniquely identify an entity occurrence, at least one attribute that makes up the composite key is not a simple key in its own right.

Database Normalization – What, Why and How?

Coming up Next…

ACID Properties of Relational DBMS

In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.
An example of a transaction is a transfer of funds from one bank account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another).In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.
An example of a transaction is a transfer of funds from one bank account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another).
In computer science, ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably. In the context of databases, a single logical operation on the data is called a transaction.

An example of a transaction is a transfer of funds from one bank account to another, even though it might consist of multiple individual operations (such as debiting one account and crediting another).

Atomicity

Atomicity refers to the ability of the DBMS to guarantee that either all of the tasks of a transaction are performed or none of them are.

For example, the transfer of funds from one account to another can be completed or it can fail for a multitude of reasons, but atomicity guarantees that one account won’t be debited if the other is not credited.

Atomicity states that database modifications must follow an “all or nothing” rule. Each transaction is said to be “atomic” if when one part of the transaction fails, the entire transaction fails. It is critical that the database management system maintain the atomic nature of transactions in spite of any DBMS, operating system or hardware failure.

Consistency

The consistency property ensures that the database remains in a consistent state before the start of the transaction and after the transaction is over (whether successful or not).

Consistency states that only valid data will be written to the database. If, for some reason, a transaction is executed that violates the database’s consistency rules, the entire transaction will be rolled back and the database will be restored to a state consistent with those rules. On the other hand, if a transaction successfully executes, it will take the database from one state that is consistent with the rules to another state that is also consistent with the rules.

Isolation

Isolation refers to the requirement that other operations cannot access or see the data in an intermediate state during a transaction. This constraint is required to maintain the performance as well as the consistency between transactions in a DBMS. Thus, each transaction is unaware of other transactions executing concurrently in the system.

Durability

Durability refers to the guarantee that once the user has been notified of success, the transaction will persist, and not be undone. This means it will survive system failure, and that the database system has checked the integrity constraints and won’t need to abort the transaction.

Many databases implement durability by writing all transactions into a transaction log that can be played back to recreate the system state right before a failure. A transaction can only be deemed committed after it is safely in the log.

Durability does not imply a permanent state of the database. Another transaction may overwrite any changes made by the current transaction without hindering durability.

Differences Between DBMS & RDBMS

DBMS – Data Base Management System

RDBMS – Relational Data Base Management System or Relational DBMS

A DBMS has to be persistent, that is it should be accessible when the program created the data ceases to exist or even the application that created the data restarted. A DBMS also has to provide some uniform methods independent of a specific application for accessing the information that is stored.

RDBMS adds the additional condition that the system supports a tabular structure for the data, with enforced relationships between the tables. This excludes the databases that don’t support a tabular structure or don’t enforce relationships between tables. The father of RDBMS is famous Mathematician E. F. Codd.

DBMS does not impose any constraints or security with regard to data manipulation it is user or the programmer responsibility to ensure the ACID Properties of the database whereas the RDBMS is more with this regards because RDBMS defines the integrity constraint for the purpose of holding ACID Properties. RDBMS may be or may not be Client Server Database System.

See the Comparison Chart:

Comparison Chart DBMS RDBMS
Concept No relationship concept Establishes Relationships concepts with two DB Objects (e.g. tables, views etc)
Data Treatment Data as files internally Data as Tables Internally
Accessibility Support Single User Support Multiple Users
Rules Support 3 Rules of “Codd’s 12 Rules of relational Model Support minimum 6 Rules of “Codd’s 12 Rules of relational Model
Infrastructure Low Software and Hardware Requirements High Software and harware requirements
Examples XML, FoxPro SQL Server, Oracle

 

Blog at WordPress.com.

Up ↑