Transactions and Locks in SQL Server
• What is a “Database Transactions “?
A database transaction is a unit of work performed against a database
management system or similar system that is treated in a coherent and
reliable way independent of other transactions. A database transaction,
by definition, must be atomic, consistent, isolated and durable. These
properties of database transactions are often referred to by the acronym
ACID.
Transactions provide an “all-or-nothing” proposition stating that
work units performed in a database must be completed in their entirety
or take no effect whatsoever. Further, transactions must be isolated
from other transactions, results must conform to existing constraints in
the database and transactions that complete successfully must be
committed to durable storage.
In some systems, transactions are also called LUWs for Logical Units of Work.
• What is ACID?
The ACID model is one of the oldest and most important concepts of
database theory. It sets forward four goals that every database
management system must strive to achieve: atomicity, consistency,
isolation and durability. No database that fails to meet any of these
four goals can be considered reliable.
Let’s take a moment to examine each one of these characteristics in detail:
Atomicity states that database modifications must
follow an “all or nothing” rule. Each transaction is said to be
“atomic.” If one part of the transaction fails, the entire transaction
fails. It is critical that the database management system maintain the
atomic nature of transactions in spite of any DBMS, operating system or
hardware failure.
Consistency states that only valid data will be
written to the database.If, for some reason, a transaction is executed
that violates the database’s consistency rules, the entire transaction
will be rolled back and the database will be restored to a state
consistent with those rules. On the other hand, if a transaction
successfully executes, it will take the database from one state that is
consistent with the rules to another state that is also consistent with
the rules.
Isolation requires that multiple transactions
occurring at the same time not impact each other’s execution. For
example, if Joe issues a transaction against a database at the same time
that Mary issues a different transaction, both transactions should
operate on the database in an isolated manner. The database should
either perform Joe’s entire transaction before executing Mary’s or
vice-versa. This prevents Joe’s transaction from reading intermediate
data produced as a side effect of part of Mary’s transaction that will
not eventually be committed to the database. Note that the isolation
property does not ensure which transaction will execute first, merely
that they will not interfere with each other.
Durability ensures that any transaction committed to
the database will not be lost. Durability is ensured through the use of
database backups and transaction logs that facilitate the restoration
of committed transactions in spite of any subsequent software or
hardware failures.
• What is “Begin Trans”, “Commit Tran”, “Rollback Tran” and “Save Tran”?
Transactions group a set of tasks into a single execution unit. Each
transaction begins with a specific task and ends when all the tasks in
the group successfully complete. If any of the tasks fails, the
transaction fails. Therefore, a transaction has only two results:
success or failure. Incomplete steps result in the failure of the
transaction.
Users can group two or more Transact-SQL statements into a single transaction using the following statements:
* Begin Transaction
* Rollback Transaction
* Commit Transaction
Begin Transaction
Marks the starting point of an explicit, local transaction. BEGIN TRANSACTION increments @@TRANCOUNT by 1.
Rollback Transaction
If anything goes wrong with any of the grouped statements, all changes
need to be aborted. The process of reversing changes is called rollback
in SQL Server terminology.
A ROLLBACK, on the other hand, works regardless of the level at which it
is issued, but rolls back all transactions, regardless of the nesting
level
Commit Transaction
If everything is in order with all statements within a single
transaction, all changes are recorded together in the database. In SQL
Server terminology, we say that these changes are committed to the
database.
A COMMIT issued against any transaction except the outermost one doesn’t
commit any changes to disk – it merely decrements the@@TRANCOUNT
automatic variable.
Save Tran
Savepoints offer a mechanism to roll back portions of transactions. A
user can set a savepoint, or marker, within a transaction. The savepoint
defines a location to which a transaction can return if part of the
transaction is conditionally canceled. SQL Server allows you to use
savepoints via the SAVE TRAN statement, which doesn’t affect the
@@TRANCOUNT value. A rollback to a savepoint (not a transaction) doesn’t
affect the value returned by @@TRANCOUNT, either. However, the rollback
must explicitly name the savepoint: using ROLLBACK TRAN without a
specific name will always roll back the entire transaction.
• What are “Checkpoint’s” in SQL Server?
Forces all dirty pages for the current database to be written to disk.
Dirty pages are data or log pages modified after entered into the buffer
cache, but the modifications have not yet been written to disk.
Syntax
CHECKPOINT
• What are “Implicit Transactions”?
Microsoft SQL Server operates in three transaction modes:
Autocommit transactions
Each individual statement is a transaction.
Explicit transactions
Each transaction is explicitly started with the BEGIN TRANSACTION
statement and explicitly ended with a COMMIT or ROLLBACK statement.
Implicit transactions
A new transaction is implicitly started when the prior transaction
completes, but each transaction is explicitly completed with a COMMIT or
ROLLBACK statement.
• Is it good to use “Implicit Transactions”?
If you want all your commands to require an explicit COMMIT or ROLLBACK
in order to finish, you can issue the command SET IMPLICIT_TRANSACTIONS
ON. By default, SQL Server operates in the autocommit mode; it does not
operate with implicit transactions. Any time you issue a data
modification command such as INSERT, UPDATE, or DELETE, SQL Server
automatically commits the transaction. However, if you use the SET
IMPLICIT_TRANSACTIONS ON command, you can override the automatic
commitment so that SQL Server will wait for you to issue an explicit
COMMIT or ROLLBACK statement to do anything with the transaction. This
can be handy when you issue commands interactively, mimicking the
behavior of other databases such as Oracle.
What’s distinctive about implicit transactions is that reissuing SET
IMPLICIT_TRANSACTIONS ON does not increase the value of @@TRANCOUNT.
Also, neither COMMIT nor ROLLBACK reduce the value of @@TRANCOUNT until
after you issue the command SET IMPLICIT_TRANSACTIONS OFF. Developers do
not often use implicit transactions; however, there is an interesting
exception in ADO. See the sidebar, Implicit Transactions and ADO
Classic.
• What is Concurrency?
When many people attempt to modify data in a database at the same time, a
system of controls must be implemented so that modifications made by
one person do not adversely affect those of another person. This is
called concurrency control.
Concurrency control theory has two classifications for the methods of instituting concurrency control:
Pessimistic concurrency control
A system of locks prevents users from modifying data in a way that
affects other users. After a user performs an action that causes a lock
to be applied, other users cannot perform actions that would conflict
with the lock until the owner releases it. This is called pessimistic
control because it is mainly used in environments where there is high
contention for data, where the cost of protecting data with locks is
less than the cost of rolling back transactions if concurrency conflicts
occur.
Optimistic concurrency control
In optimistic concurrency control, users do not lock data when they read
it. When an update is performed, the system checks to see if another
user changed the data after it was read. If another user updated the
data, an error is raised. Typically, the user receiving the error rolls
back the transaction and starts over. This is called optimistic because
it is mainly used in environments where there is low contention for
data, and where the cost of occasionally rolling back a transaction
outweighs the costs of locking data when read.
• What are “Dirty reads”?
Uncommitted dependency occurs when a second transaction selects a row
that is being updated by another transaction. The second transaction is
reading data that has not been committed yet and may be changed by the
transaction updating the row.
• What are “Unrepeatable reads”?
Inconsistent Analysis (Nonrepeatable Read)
Inconsistent analysis occurs when a second transaction accesses the same
row several times and reads different data each time. Inconsistent
analysis is similar to uncommitted dependency in that another
transaction is changing the data that a second transaction is reading.
However, in inconsistent analysis, the data read by the second
transaction was committed by the transaction that made the change. Also,
inconsistent analysis involves multiple reads (two or more) of the same
row and each time the information is changed by another transaction;
thus, the term nonrepeatable read.
• What are “Phantom rows”?
Phantom reads occur when an insert or delete action is performed against
a row that belongs to a range of rows being read by a transaction. The
transaction’s first read of the range of rows shows a row that no longer
exists in the second or succeeding read, as a result of a deletion by a
different transaction. Similarly, as the result of an insert by a
different transaction, the transaction’s second or succeeding read shows
a row that did not exist in the original read.
For example, an editor makes changes to a document submitted by a
writer, but when the changes are incorporated into the master copy of
the document by the production department, they find that new unedited
material has been added to the document by the author. This problem
could be avoided if no one could add new material to the document until
the editor and production department finish working with the original
document.
• What are “Lost Updates”?
Lost updates occur when two or more transactions select the same row and
then update the row based on the value originally selected. Each
transaction is unaware of other transactions. The last update overwrites
updates made by the other transactions, which results in lost data.
• What are different levels of granularity of locking resources?
Microsoft SQL Server 2000 has multigranular locking that allows
different types of resources to be locked by a transaction. To minimize
the cost of locking, SQL Server locks resources automatically at a level
appropriate to the task. Locking at a smaller granularity, such as
rows, increases concurrency, but has a higher overhead because more
locks must be held if many rows are locked. Locking at a larger
granularity, such as tables, are expensive in terms of concurrency
because locking an entire table restricts access to any part of the
table by other transactions, but has a lower overhead because fewer
locks are being maintained.
SQL Server can lock these resources (listed in order of increasing granularity).
RID: Row identifier. Used to lock a single row within a table.
Key: Row lock within an index. Used to protect key ranges in serializable transactions.
Page: 8 kilobyte –(KB) data page or index page.
Extent: Contiguous group of eight data pages or index pages.
Table: Entire table, including all data and indexes.
DB: Database.
• What are different types of Isolation levels in SQL Server?
READ COMMITTED
Specifies that shared locks are held while the data is being read to
avoid dirty reads, but the data can be changed before the end of the
transaction, resulting in nonrepeatable reads or phantom data. This
option is the SQL Server default.
READ UNCOMMITTED
Implements dirty read, or isolation level 0 locking, which means that no
shared locks are issued and no exclusive locks are honored. When this
option is set, it is possible to read uncommitted or dirty data; values
in the data can be changed and rows can appear or disappear in the data
set before the end of the transaction. This option has the same effect
as setting NOLOCK on all tables in all SELECT statements in a
transaction. This is the least restrictive of the four isolation levels.
REPEATABLE READ
Locks are placed on all data that is used in a query, preventing other
users from updating the data, but new phantom rows can be inserted into
the data set by another user and are included in later reads in the
current transaction. Because concurrency is lower than the default
isolation level, use this option only when necessary.
SERIALIZABLE
Places a range lock on the data set, preventing other users from
updating or inserting rows into the data set until the transaction is
complete. This is the most restrictive of the four isolation levels.
Because concurrency is lower, use this option only when necessary. This
option has the same effect as setting HOLDLOCK on all tables in all
SELECT statements in a transaction.
• If you are using COM+, what “Isolation” level is set by default?
SERIALIZABLE transaction isolation level is the default isolation level for the COM+ application.
• What are “Lock” hints?
A range of table-level locking hints can be specified using the SELECT,
INSERT, UPDATE, and DELETE statements to direct Microsoft SQL Server
2000 to the type of locks to be used. Table-level locking hints can be
used when a finer control of the types of locks acquired on an object is
required. These locking hints override the current transaction
isolation level for the session.
• What is a “Deadlock”?
Deadlocking occurs when two user processes have locks on separate
objects and each process is trying to acquire a lock on the object that
the other process has. When this happens, SQL Server identifies the
problem and ends the deadlock by automatically choosing one process and
aborting the other process, allowing the other process to continue. The
aborted transaction is rolled back and an error message is sent to the
user of the aborted process. Generally, the transaction that requires
the least amount of overhead to rollback is the transaction that is
aborted.
• What are the steps you can take to avoid “Deadlocks”?
Here are some tips on how to avoid deadlocking on your SQL Server:
* Ensure the database design is properly normalized.
* Have the application access server objects in the same order each time.
* During transactions, don’t allow any user input. Collect it before the transaction begins.
* Avoid cursors.
* Keep transactions as short as possible. One way to help accomplish
this is to reduce the number of round trips between your application and
SQL Server by using stored procedures or keeping transactions with a
single batch. Another way of reducing the time a transaction takes to
complete is to make sure you are not performing the same reads over and
over again. If your application does need to read the same data more
than once, cache it by storing it in a variable or an array, and then
re-reading it from there, not from SQL Server.
* Reduce lock time. Try to develop your application so that it grabs
locks at the latest possible time, and then releases them at the very
earliest time.
* If appropriate, reduce lock escalation by using the ROWLOCK or PAGLOCK.
* Consider using the NOLOCK hint to prevent locking if the data being locked is not modified often.
* If appropriate, use as low of an isolation level as possible for the user connection running the transaction.
* Consider using bound connections.
• What is Bound Connection?
Bound connections allow two or more connections to share the same
transaction and locks. Bound connections can work on the same data
without lock conflicts. Bound connections can be created from multiple
connections within the same application, or from multiple applications
with separate connections. Bound connections make coordinating actions
across multiple connections easier.
To participate in a bound connection, a connection calls
sp_getbindtoken or srv_getbindtoken (Open Data Services) to get a bind
token. A bind token is a character string that uniquely identifies each
bound transaction. The bind token is then sent to the other connections
participating in the bound connection. The other connections bind to the
transaction by calling sp_bindsession, using the bind token received
from the first connection.
• Specity the types of Bound Connections
Local bound connection
Allows bound connections to share the transaction space of a single transaction on a single server.
Distributed bound connection
Allows bound connections to share the same transaction across two or
more servers until the entire transaction is either committed or rolled
back by using Microsoft Distributed Transaction Coordinator (MS DTC).
• How can I know what locks are running on which resource?
Use SP_Locks system stored procedure