Unit-4 notes PDF

Title	Unit-4 notes
Author	Harman Singh
Course	Database Management System
Institution	Guru Gobind Singh Indraprastha University
Pages	43
File Size	1.6 MB
File Type	PDF
Total Downloads	119
Total Views	255

Preview

CLICK TO PREVIEW PDF

Summary

Database Management SystemUnit-NotesTransaction Processing A transaction is a logical unit of work of database processing that includes one or more database access operations. A transaction can be defined as an action or series of actions that is carried out by a single user or application program t...

Description

Database Management System Unit-4 Notes Transaction Processing • A transaction is a logical unit of work of database processing that includes one or more database access operations. • A transaction can be defined as an action or series of actions that is carried out by a single user or application program to perform operations for accessing the contents of the database. The operations can include retrieval, insertion, deletion and modification • A transaction must be either completed or aborted. • A transaction is a program unit whose execution may change the contents of a database. It can either be embedded within an application program or can be specified interactively via a high level query language such as SQL. • If the database is in a consistent state before a transaction executes, then the database should still be in consistent state after its execution. • Therefore, to ensure these conditions and preserve the integrity of the database transaction must be atomic (also called serialisability). • Atomic transaction is a transaction in which either all actions associated with the transaction are executed to completion or none are performed. Basic operations on database are read and write 1. read_item(X): Reads a database item named X into a program variable. To simplify our notation, we assume that the program variable is also named X. 2. write_item(X): Writes the value of program variable X into the database item named X.

Example: You are working on a system for a bank. A customer goes to the ATM and instructs it to transfer Rs. 1000 from savings to a checking account. This simple transaction requires two steps: • Subtracting the money from the savings account balance. Savings -1000 • Adding the money to the checking account balance. Checking + 1000 The code to create this transaction will require two updates to the database. For example, there will be two SQL statements: one UPDATE command to decrease the balance in savings and a second UPDATE command to increase the balance in the checking account. You have to consider what would happen if a machine crashed between these two operations. The money has already been subtracted from the savings account will not be added to the checking account. It is lost. You might consider performing the addition to checking first, but then the customer ends up with extra money, and the bank loses. The point is that both changes must be made successfully. Thus, a transaction is defined as a set of changes that must be made together. What causes a Transaction to fail There are several reasons for a transaction to fail in the middle of execution. 1. Computer failure: a hardware, software or network error occurs in the computer system during transaction execution. 2. Transaction or system: some operations in the transaction may cause it to fail such as integer overflow or division by 0. The user may also interrupt the transaction during its execution.

3. Local errors or exception conditions detected by the transaction :during transaction execution , certain condition may occur that necessitate the cancellation of transaction. Example: insufficient account balance in a banking database may cause a transaction to be cancelled. 4. Concurrency control enforcement: this method may decide to abort the transaction because several transactions are in a state of deadlock. 5. Disk failure: some disk blocks may lose their data because of a disk read/write head crash. This may happen during a read/ write operation of a transaction. 6. Physical problem & catastrophes: this refers to an endless list of problems that include fire, theft etc.

A transaction can be in one of the following states:1. Active state:- After the transaction starts its operation. 2. Partially committed:- When the last state is reached. 3. Aborted:- After the transaction has been rolled back and the database has been restored to its state prior to the start of the transaction. 4. Committed:- After successful completion of transaction. 5. Failed:- When the normal execution can no longer proceed. State diagram of transition/transaction

Transaction Properties A transaction must have the following four properties, called ACID properties, to ensure that a database remains in stable state after the transaction is executed: • Atomicity • Consistency • Isolation • Durability

Atomicity(All or nothing):- The atomicity property of a transaction requires that all operations of a transaction be completed, if not, the transaction is aborted. In other words, a transaction is treated as single, individual logical unit of work. Therefore, a transaction must execute and complete each operation in its logic before it commits its changes.

Consistency(No violation of integrity constraints):-Database consistency is the property that every transaction sees a consistent database instance. In other words, execution of a transaction must leave a database in either its prior stable state or a new stable state that reflects the new modifications (updates) made by the transaction.

If the transaction fails, the database must be returned to the state it was in prior to the execution of the failed transaction. If the transaction commits, the database must reflect the new changes. Isolation(concurrent changes invisibles):- Isolation property of a transaction means that the data used during the execution of a transaction cannot be used by a second transaction until the first one is completed. This property isolates transactions from one another. In other words, if a transaction T1 is being executed and is using the data item X, that data item cannot be accessed by any other transaction (T2…Tn) until T1 ends. Durability(committed update persist):It states that the changes made by a transaction are permanent. They cannot be lost by either a system failure or by the erroneous operation of a faulty transaction. When a transaction is completed, the database reaches a consistent state and that state cannot be lost , even in the event of system’s failure. Durability property is the responsibility of the recovery subsystem of the DBMS. Let Ti be a transaction that transfers Rs 50 from account A to account B. This transaction can be defined as:-

Overview of serializability When multiple transactions are being executed by the operating system in a multiprogramming environment, there are possibilities that instructions of one transaction are interleaved with some other transaction.

Schedule A schedule is a sequence of actions or operations (for example, reading writing, aborting or committing) that is constructed by merging the actions of a set of transactions, respecting the sequence of actions within each transaction. e.g. if in a transaction T1 the instruction write(A) appears before read(A), then in any valid schedule this sequence must also be preserved. A schedule S of n transactions is a sequential ordering of the operations of the n transactions. A schedule maintains the order of operations within the individual transaction. For each transaction T if operation a is performed in T before operation b, then operation a will be performed before operation b in S. The operations are in the same order as they were before the transactions were interleaved A transaction schedule is a tabular representation of how a set of transactions were executed over time Types of schedule:

1. 2. 3. 4. 5.

Complete schedule Serial schedule Non-serial schedule Equivalent schedule Serializable schedule

Complete Schedule: A Schedule that contains either an abort or a commit for each transaction whose actions are listed in it is called a complete schedule. A complete schedule must contain all the actions of every transaction that appears in it.

Serial Schedule: Each serial schedule consists of a sequence of instructions from various transactions, where the instructions belonging to one single transaction appear together in that schedule. If the actions of different transactions are not interleaved-that is, transactions are executed from start to finish, one by one-we call that schedule a serial schedule. T1 A=A+100 B=B-100

T2

A=A*7.06 B=B*7.06 Schedule 1 T1

T2 A=A+100 B=B-100

A=A*7.06 B=B*7.06 Schedule 2

Non Serial Schedule: A schedule in which the operations from a set of concurrent transactions are interleaved is called a non- serial schedule. T1 A=A+100

T2 A=A*7.06

B=B-100 B=B*7.06 Schedule 3

Ser i alSchedul e

NonSer i alSchedul e

A serial schedule is a sequence of operation by a set A non-serial schedule is a schedule where the operations of a of concurrent transaction that preserves the order of group of concurrent transactions are interleaved. operations in each of the individual transactions. Transactions are performed in serial order.

Transactions are performed in non-serial order, but result should be same as serial.

No interference between transactions

Concurrency problem can arise here.

It does not matter which transaction is executed first, The problem we have seen earlier lost update, uncommitted as long as every transaction is executed in its entirely data, inconsistent analysis is arise if scheduling is not proper. from the beginning to end. EXAMPLE: EXAMPLE: If some transaction T is long, the other transaction In this schedule the execution of other transaction goes on must wait for T to complete all its operations. without waiting the completion of T. In case of banking transaction, there are 2 transactions – one transaction calculates the interest on the account and another transaction deposits some money into the account. Here the order of the execution is important, as the results will be different

depending on whether the interest is calculated before or after the money is deposited into the account.

Equivalent Schedule: For any database state, the effect (on the set of objects in the database) of executing the first schedule is identical to the effect of executing the second schedule. e.g. Schedule 3 is equivalent to Schedule 1 Serializable Schedule: A non serial schedule that is equivalent to some serial execution of transactions is called a serializable schedule. For e.g. Schedule 3 is serializable schedule, which is equivalent to schedule-1 and schedule-2. The objective of Serializability is to find non-serial schedules that allow transactions to execute concurrently without interfering with one another, and thereby produce a database state that could be processed by a serial execution. It is important to guarantee serializability of concurrent transactions in order to

prevent inconsistency from transactions interfering with one another. In serializability the ordering read and write operations are important.

Operations Conflict • In a given schedule, the order of read/write operations can be changed but not always. • Such read/write operations whose order cannot be changed without affecting the consistency of data are called as conflict operations.

Rules to define conflict operations • Rule 1: If two transactions just read a data object then they do not conflict and the order is not important. • Rule 2: If two transactions either read or write completely separate data objects then they do not conflict and the order is not important. • Rule 3: If one transaction writes a data object and another either reads or writes the same data object then the order of execution is important.

• Rule 4: Two actions on the same data object conflict if at least one of them is a write. Consider the following Schedule:-

• Are there any conflicting operations in T1 or T2? • Find out a serial schedule for it? Make it conflict serializable?

• If a schedule ,S, can be transformed into a schedule, S’, by a series of swaps of non-conflicting instructions, we say that S and S’ are conflict equivalent.

• Here Schedule S and S’ are conflict equivalent as Read and Write of T1 can be swapped with Read and Write of T2. • This leads to the concept of conflict serializability also. • We say that a Schedule S, is conflict serializable if it is conflict equivalent to a serial schedule.

• Here, Schedule S is conflict serializable as it is conflict equivalent to S’, serial schedule

Testing for Serializability The test is as follows:Let us assume that there is a schedule, S. We construct it’s directed graph or precedence graph. This graph G, is a pair G=(V, E) Where ‘V’ is a set of vertices ‘E’ is a set of edges. Set of vertices consists of all the transactions participating in the schedule. Set of edges consists of all edges Tià Tj for which one of the three conditions hold: Ti executes Write(A) before Tj executes Read(A)  Ti executes Read(A) before Tj executes Write(A)

 Ti executes Write(A) before Tj executes Write(A) We also say that, if an edge Tià Tj exists in the precedence graph then in any serial schedule S’ equivalent to S, Ti must appear before T. e.g. Say Schedule S is:-

• Its precedence graph is:-

• There is one edge and two vertices T1 and T2. Arrow indicates that all instructions of T1 are executed before the execution of first instruction of T2. • But if the schedule is

• Then the precedence graph is:-

• This is because firstly T1 is executed then T2 then again T1 and then T2. Here the graph contains a cycle. • Note: If the precedence graph for S has a cycle, then the schedule, S is not conflict serializable. But, if the graph contains no cycles, then the schedule, S, is conflict serializable.

Concurrency Control • Concurrency control is the process of managing simultaneous execution of transactions (such as queries, updates, inserts, deletes and so on) in a multiprocessing database system without having them interfere with one another. • This property of DBMS allows many transactions to access the same database at the same time without interfering with each other. • The primary goal of concurrency is to ensure the atomicity of the execution of transactions in a multi-user database environment.

Problems of Concurrency Control When concurrent transactions are executed in an uncontrolled manner, several problems can occur. The concurrency control has the following three main problems:• Lost updates • Dirty read • Unrepeatable read Lost updates:- A lost update problem occurs when two transactions that access the same database items have their operations in a way that makes the value of some database item incorrect. In other words, if transaction T1 and T2 both read a record and then update it, the effects of the first update will be overwritten by the second update.

eg:

T1

T2

read (A) A:= A-100 read (A) X:= A*0.02 A:= A+X write (A) read (B) write (A) B:=B+100 write (B)

Suppose that the initial values of account A and B are $2000 and $1500, respectively. Then, after the serial execution of transactions T1 and T2 (T1 followed by T2), the value of account A should be $1938 and that of account B should be $1600. Suppose that the operations of T1 and T2 are interleaved in such a way that T2 reads the value of account A before T1 updates its value in the database. Now, when T2 updates the value of account A in the database, the value of account A updated by transaction T1 is overwritten and hence is lost. This is known as lost update problem. Dirty Read:- A dirty read problem occurs when one transaction updates a database item and then the transaction fails for some reason. The updated database item is accessed by another transaction before it is changed back to the original value. In other words, a transaction T1 updates a record, which is read by the transaction T2. Then T1 aborts and T2 now has values which have never formed part of the stable database. eg: Assume that T1 fails after debiting $100 from account A, but before crediting this amount to account B. This will leave the database in an inconsistent state. The value of account A is now $1900, which must be changed back to original one, that is, $2000. However, before the transaction T1 is rolled back, let another transaction T2 reads the incorrect value of account A. This incorrect value of account A that is read by transaction T2 is called dirty data, and the problem is called dirty read problem. Unrepeatable Read:- The third problem occurs when a transaction tries to read the value of the data item twice, and another transaction updates the same data item in between the two read operations of the first transaction. As a result, the first transaction reads varied values of same data item during its execution. eg:

T3 read (A)

T4

read (A) A:= A+100 write (A) read (A) sum:= sum + A write (sum) Consider a transaction T3 that reads the value of account A. At this point let another transaction T4 updates the value of account A. Now if T3 again tries to read the value of account A, it will get a different value. As a result, the transaction T3 receives different values for two reads of account A. This interleaved schedule of transaction T3 and T4 that leads to a problem of unrepeatable read. Lock-Based Protocols A lock is defined as a variable associated with a data item that describes the status of the item with respect to possible operations that can be applied to it. It prevents access to a database record by a second transaction until the first transaction has completed all of its actions. Lock Types Locks are put under two categories:• Binary Locks • Shared/Exclusive (for R/W) Locks. Binary Locking:- In binary locking there are two states of locking namely (a) locked (‘1’) (b) unlocked (‘0’). If an object of a database table, page, tuple or attribute is locked by a transaction, no other transaction can use that object. A distinct lock is associated

with each database item. If the value of lock on data item X is 1, item X cannot be accessed by a database operation that requires the item. If a data item X is unlocked, any transaction can lock the object for its use. Two operations are used with binary locking:1. lock_item (data item) 2. Unlock_item (data item) Rules followed during Binary Locking • A transaction, T must issue the operation, lock_item(A) before any read_item(A) or write_item(A) operations are performed in T. • T must issue the operation, unlock_item(A) after all read_item(A) and write_item(A) operations are completed in T. • T will not issue a lock_item(A) operation if it already holds the lock on itemA. • T will not issue an unlock_item(A) operation unless it already holds the lock on item-A. • At the most, only one transaction can hold the lock on a particular item. No two transactions can access the same item concurrently. Advantages and Disadvantages of Binary Locks • Binary locking is easy to implement but it is restrictive to yield optimal concurrency conditions. • DBMS will not allow the transactions to read the same database object even if neither of the transaction updates the database. • At most one transaction can hold the lock on a particular item. No two transactions can access the same data concurrently.

Shared/Exclusive locking:- A shared/exclusive (or Read/Write) lock uses multiplemode lock. In this type of locking, there are three locking operations namely (a) Read_lock (A) (b) Write_lock (B) (c) Unlock (A) A read_locked item is also called share-locked, because other transactions are allowed to read the item. A write_locked item is called exclusive lock, because a single transaction exclusively holds the lock on the item. A shared lock is denoted by S and the exclusive lock is denoted by X. Here, there are three locking Operations: • Lock_S(A) • LOCK_X(A) • UNLOCK(A)

There are two approaches to write concurrency control Algorithms: • Pessimistic Approach • Optimistic Approach

ctgn2lpo ceao-sp pncia ah2kmp bpais ahsniet eaglc delPpka lorpg cop kto nioa gnch go l

n i o re m s ys t t lm g p p s a t s ci ds o sp...