Lecture for Chapter 6.3 (Fall 13)

Topic 6.3: Transactions and Concurrency Control
Hari Uday
* Part 1: Basic Knowledge
* Originally Developed for Databases, but can be
applied to Distributed File management
* Motivation is to make use of clean and
powerful atomic transactions semantics [ACID]
* Transaction Services:
* Clients at each distributed site issues transaction
service requests to the local transaction processing
system (TPS)
* TPS consists of transaction manager(TM), Scheduler
(SCH) and Object Manager (OM)
* A transaction may invoke operations on remote
objects [sub-transactions], send to remote TPS
* Transaction Manager works in tandem with other
TMs to make sure local as well as remote
transactions work consistently.
* Scheduler makes sure that scheduling is done to
objects and that conflicts are avoided.
* Object Manager ensures the coherency of replicas
and caches and provides interface to file systems.
* Atomicity, Consistency, Isolation and Durability
* Atomicity: either all actions are carried or none
* Consistency: when a user’s transaction runs to
completion by itself
* against a ‘consistent’ database instance, it will
leave the database in a ‘consistent’ state.
* Isolation: users can understand a transaction
without considering the effect of other concurrently
executing transactions.
* Durability: once a transaction has been successfully
completed, its effects should persist even if the
system crashes before all its changes are reflected
on disk
* Client Process: A client initiates start of
transaction by issuing a begin transaction to its
local TM. Tm generates transaction ID and work
space for subsequent RW operations.
* TM then sends the transaction ID to the
scheduler . Called execution phase. Access
requests from the TM are handled by
Scheduler. To maintain atomicity, a rejected
operation causes TM to abort. Finally, the TM
either commits or aborts.
* SCH chooses a concurrency control protocol.
* Three approaches: Prevent inconsistency or validate
consistency: Locks (Shared and Exclusive)
* 2nd approach: Individual access operations is
checked by the SCH and is either accepted,
tentatively accepted or rejected. Timestamps are
* 3rd approach: Conflicts are ignored in exec. phase.
Consistency is validated at the end of exec. phase.
Only transactions that are globally validated are
allowed to commit.
* OM is responsible for interfacing with the
underlying file service for actual operations on
an object.
* Provides cache management for efficiency
* Participates in failure recovery protocols to
support durability
* If objects are replicated, consistency of
replicas are managed by Replica Management
* Concept of Transactions are very important in
multiuser file system (NFS, NTFS)
* Scheduling Transactions: A transaction is a series of
actions including READs / WRITEs of files and
* A schedule is a list of actions from a set of
transactions, and the order in which two actions of
a transaction T appear in a schedule must be the
same as the order in which they appear in T
* Serial schedule: Schedule that does not interleave
the actions of different transactions
* Equivalent schedules: For any file system, the
effect (on the set of objects in the database)
of executing the first schedule is identical to
the effect of executing the second schedule
* Serializable schedule: A schedule that is
equivalent to some serial execution of the
* If each transaction preserves consistency, every
serializable schedule preserves consistency
* Two actions on the same data object conflict if at
least one of them is a WRITE.
* – write-read (WR) conflict: T2 could read a DB
object that has been modified by T1, which has not
yet committed.
* – read-write (RW) conflict: T2 could change the
value of an object that has been read by T1, while
T1 is still in progress.
* – write-write (WW) conflict: T2 could overwrite the
value of an object which has already been modified
by T1, while T1 is still in progress.
* A serializable schedule is a schedule whose
effect on any consistent database instance is
guaranteed to be identical to that of some
complete serial schedule over the set of
committed transactions
* In a recoverable schedule, transactions commit
only after all transactions whose changes they
read commit
* A lock is a small bookkeeping object associated with a
* A locking protocol is a set of rules to be followed by each
transaction to ensure that, even though actions of several
transactions might be interleaved, the net effect is
identical to executing all transactions in some serial order.
* Strict 2 PL Locking
* Timestamp
* Strict 2PL Protocol:
* Each transaction must obtain a S (shared) lock on object
before reading, and an X (exclusive) lock on object
before writing.
* All locks held by a transaction are released when the
transaction completes.
* If a transaction holds an X lock on an object, no other
transactions can get a lock (S or X) on that object.
* Strict 2PL allows only serializable schedules
* If a transaction Ti is aborted, all its actions have to be
undone. Not only that, if Tj reads an object last written
* by Ti, Tj must be aborted as well.
* Most systems avoid such cascading aborts by releasing a
transaction’s locks only at commit time.
- If Ti writes an object, Tj can read this only after Ti
* In order to undo the actions of an aborted transaction, the
OS maintains a log in which every write is recorded. This
mechanism is also used to recover from system crashes: all
active transactions at the time of the crash are aborted when
the system comes back up.
* The following actions are recorded in the log:
– Ti writes an object: the old value and the new value.
Log record must go to disk before the changed page!
– Ti commits/aborts: a log record indicating this
* Log records are chained together by transaction id, so
it’s easy to undo a specific transaction
* Log is often duplexed and archived on stable storage
* All log related activities are handled transparently by
the DBMS
* Crash Recovery
* A recovery manager of is responsible for ensuring
transaction atomicity and durability.
* Can the changes made to an object in the buffer
pool by a transaction T be written to disk before T
commits? (steal?)
* When a transaction commits, must all the changes
it has made to objects in the buffer pool be
immediately forced to disk? (force?)
* Recommend: Use a steal, no-force approach.
* Three phases for recovery algorithm:
* Analysis: Scan the log forward (from the most recent
checkpoint) to identify all transactions that were
active, and all dirty pages in the buffer pool at the
time of the crash.
* Redo: Redo all updates to dirty pages in the buffer
pool, as needed, to ensure that all logged updates are
in fact carried out and written to disk.
* Undo: The writes of all transactions that were active
at the crash are undone, working backwards in the
log. (Some care must be taken to handle the case of a
crash occurring during the recovery process!)
* Development, Visualization, and Application
of the ARIES Systems Code
Carlson, L. ; Center for Energy
Res., Univ. of California-San Diego, La Jolla, CA, USA ; Tillack,
M. ; Najmabadi, F. ; Kessel, C.
* A model of atomicity for multilevel
Blaustein, B.T. ; MITRE Corp., McLean, VA, USA
; Jajodia, S. ; McCollum, C.D. ; Notargiacomo, L.

similar documents