SQL Transactions

Transactions in SQL Server are very useful, however I have come across that not many people use them. Not sure why this could be, so thought I would share how transactions work and the different modes.

A transaction is a single unit of work. If a transaction is successful, all of the data modifications made during the transaction are committed and become a permanent part of the database. If a transaction encounters errors and must be canceled or rolled back, then all of the data modifications are erased.

SQL Server operates in the following transaction modes:
Autocommit transactions
Each individual statement is a transaction.

Explicit transactions
Each transaction is explicitly started with the BEGIN TRANSACTION statement and explicitly ended with a COMMIT or ROLLBACK statement.

Implicit transactions
A new transaction is implicitly started when the prior transaction completes, but each transaction is explicitly completed with a COMMIT or ROLLBACK statement.

Batch-scoped transactions
Applicable only to multiple active result sets (MARS), a Transact-SQL explicit or implicit transaction that starts under a MARS session becomes a batch-scoped transaction. A batch-scoped transaction that is not committed or rolled back when a batch completes is automatically rolled back by SQL Server.

From my experience Explicit transactions are the most commonly used out of all the different modes.

Distributed Transactions:
You can use Distributed transactions which specifies the start of a SQL distributed transaction managed by Microsoft Distributed Transaction Coordinator (MS DTC). This is only applicable to SQL Server and no Azure. Transaction-level snapshot isolation does not support distributed transactions.

Is a user-defined transaction name used to track the distributed transaction within MS DTC utilities. transaction_name must conform to the rules for identifiers and must be <= 32 characters.

Is the name of a user-defined variable containing a transaction name used to track the distributed transaction within MS DTC utilities. The variable must be declared with a char, varchar, nchar, or nvarchar data type.

The instance of the SQL Server Database Engine executing the BEGIN DISTRIBUTED TRANSACTION statement is the transaction originator and controls the completion of the transaction. When a subsequent COMMIT TRANSACTION or ROLLBACK TRANSACTION statement is issued for the session, the controlling instance requests that MS DTC manage the completion of the distributed transaction across all of the instances involved.

For example, if BEGIN DISTRIBUTED TRANSACTION is issued on ServerA, the session calls a stored procedure on ServerB and another stored procedure on ServerC. The stored procedure on ServerC executes a distributed query against ServerD, and then all four computers are involved in the distributed transaction. The instance of the Database Engine on ServerA is the originating controlling instance for the transaction.

USE AdventureWorks2012;  
-- Delete candidate from local instance.  
DELETE AdventureWorks2012.HumanResources.JobCandidate  
    WHERE JobCandidateID = 13;  
-- Delete candidate from remote instance.  
DELETE RemoteServer.AdventureWorks2012.HumanResources.JobCandidate  
    WHERE JobCandidateID = 13;  

Begin Transaction
Marks the starting point of an explicit, local transaction. Explicit transactions start with the BEGIN TRANSACTION statement and end with the COMMIT or ROLLBACK statement. BEGIN TRANSACTION increments @@TRANCOUNT by 1.

BEGIN TRANSACTION represents a point at which the data referenced by a connection is logically and physically consistent. If errors are encountered, all data modifications made after the BEGIN TRANSACTION can be rolled back to return the data to this known state of consistency. Each transaction lasts until either it completes without errors and COMMIT TRANSACTION is issued to make the modifications a permanent part of the database, or errors are encountered and all modifications are erased with a ROLLBACK TRANSACTION statement.

Although BEGIN TRANSACTION starts a local transaction, it is not recorded in the transaction log until the application subsequently performs an action that must be recorded in the log, such as executing an INSERT, UPDATE, or DELETE statement.

Naming multiple transactions in a series of nested transactions with a transaction name has little effect on the transaction. Only the first (outermost) transaction name is registered with the system. A rollback to any other name (other than a valid save point name) generates an error. None of the statements executed before the rollback is, in fact, rolled back at the time this error occurs. The statements are rolled back only when the outer transaction is rolled back.

	DELETE	FROM HumanResources.JobCandidate
	WHERE	JobCandidateID = 13

Hope this helps you in future when using transactions. If you are doing INSERTS, UPDATES and DELETES it is advisable that you use a transaction.

Row Versioning-based Isolation Levels

Following on from my previous post on SQL Transaction Isolation Levels, we get to the juicier part of Row Versioning-based Isolation Levels. Read Committed Snapshot and Snapshot isolation levels allow for the use of row versioning.

By using the row versioning-based isolation levels, it will improve read concurrency by eliminating locks for read operations.

If you are creating an application to interact with the database, it is recommended that you use read committed isolation using row versioning over snapshot isolation.

Benefits of read committed isolation using row versioning

  • Consumes less tempdb space
  • Works with distributed transactions
  • Works with most existing applications. The behaviour of read committed, whether to use row versioning or not, is determined by the database option setting, and this can be changed without affecting the application
  • Snapshot isolation is vulnerable to update conflicts, whereas this is not applicable to read committed isolation using row versioning. When a transaction running under snapshot isolation reads data that is then modified by another transaction, an update by the snapshot transaction to the same data causes an update conflict and the transaction terminates and rolls back

Benefits of Row Versioning-based Isolation Levels

  • Read operations retrieve a consistent snapshot of the database
  • SELECT statements do not lock data during a read operation (readers do not block writers, and vice versa)
  • SELECT statements can access the last committed value of the row, while other transactions are updating the row without getting blocked
  • The number of deadlocks is reduced
  • The number of locks required by a transaction is reduced, which reduces the system overhead required to manage locks
  • Fewer lock escalations take place

Costs of Row Versioning-based Isolation Levels
With anything we decide to implement in a database we need to look at the costs and determine whether the benefits outweigh the costs. With this we weigh the cost of the concurrency benefit of minimizing locking against increased resource usage needed to maintain and read the row versions.

Below is the list of costs:

  • The biggest thing to be aware of is that the tempdb needs to have enough disk space. All versions generated by update transactions during the time the transaction is running is kept in the tempdb. If the tempdb runs out of space, update operations do not fail, but read operations using row versioning might fail
  • Read performance can be affected when versions needed by queries become old and long version chains must be scanned
  • Row versioning increases resource usage during data modification as row versions are maintained in tempdb
  • When either the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION database options are ON, update and delete transactions for a particular database must maintain row versions even when there are no transactions using a row versioning-based isolation level.
  • Row versioning information requires 14 bytes added to the database row
  • Update performance can be slower due to the work involved in maintaining row versions.
  • The performance cost of versioned updates could be higher when larger amounts of data change during update operations
  • Data readers face the extra cost of traversing the version link list. The older the snapshot, the slower the process of accessing it in a snapshot isolation transaction
  • Some update transactions using snapshot isolation might have to be rolled back because of mandatory conflict detection for update operations

Now the question comes in on when to use which isolation level for you to get the most benefit out of it after considering the cost that comes with using the isolation levels.

Use Read Committed Isolation Using Row Versioning
A new data snapshot is taken is taken as each statement within the transaction executes, this remains consistent for each statement until the statement finishes execution. This provides statement-level read consistency.

Enable read committed isolation using row versioning when:

  • The concurrency benefits of reader/writer blocking occurrences outweigh the increased overhead of creating and managing row versions
  • You require data values to be consistent to the point in time that a query starts
  • The application requires absolute accuracy for long-running aggregations

Snapshot Isolation
A new data snapshot is taken when the snapshot transaction starts, this remains consistent for the duration of the transaction. This provides transaction-level read consistency.

Enable snapshot isolation when:

  • Optimistic concurrency control is desired.
  • Probability is low that a transaction would have to be rolled back because of an update conflict.
  • An application needs to generate reports based on long-running, multi-statement queries that must have point-in-time consistency.

Limitations of Transactions Using Row Versioning-based Isolation Levels

  • READ_COMMITTED_SNAPSHOT cannot be enabled in tempdb, msdb, or master
  • Global temp tables are stored in tempdb. When accessing global temp tables inside a snapshot transaction, one of the following must happen:
    • Set the ALLOW_SNAPSHOT_ISOLATION database option ON in tempdb
    • Use an isolation hint to change the isolation level for the statement.
  • Snapshot transactions fail when:
    • A database is made read-only after the snapshot transaction starts, but before the snapshot transaction accesses the database
    • If accessing objects from multiple databases, a database state was changed in such a way that database recovery occurred after a snapshot transaction starts, but before the snapshot transaction accesses the database. For example: the database was set to OFFLINE and then to ONLINE, database autoclose and open, or database detach and attach.
  • Distributed transactions, including queries in distributed partitioned databases, are not supported under snapshot isolation.
  • SQL Server does not keep multiple versions of system metadata. Data definition language (DDL) statements on tables and other database objects (indexes, views, data types, stored procedures, and common language runtime functions) change metadata. If a DDL statement modifies an object, any concurrent reference to the object under snapshot isolation causes the snapshot transaction to fail. Read-committed transactions do not have this limitation when the READ_COMMITTED_SNAPSHOT database option is ON.

Differences between snapshot isolation and read committed isolation using row versioning

Hope this has been as interesting for you as it was for me!


SQL Transaction Isolation Levels

Isolation levels can be quite useful when retrieving data and inserting data to maintain data integrity. Isolation levels control the locking and row versioning behaviour between SQL statements issued by a connection to SQL server. It defines the degree to which one transaction must be isolated from resource or other modification made by other transactions.

There are 5 types of isolation levels

  • Read uncommitted
  • Read committed
  • Repeatable Read
  • Serializable
  • Snapshot

Read Uncommitted
This allows you to read rows that have been modified by other transaction but not yet committed. This is the lowest isolation level. Read Uncommitted does not use shared locks and is not blocked by exclusive locks. This option has the same effect as setting a NOLOCK on all tables in a SELECT statement in a transaction.

Read Committed
This allows you to only read records that have been modified and committed by other transactions. This prevents dirty reads from happening. The SQL server default will still allow non-repeatable reads and phantom reads; this is where the data ban be changed by other transactions between individual statements within the current transaction.

Repeatable Read
Statements cannot read data that has been modified but not yet committed by other transactions and that no other transactions can modify data that has been read by the current transaction until the current transaction completes.

Shared locks are placed on all data read and are held until the transaction completes. Therefore, no transactions can modify any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. Phantom reads will occur if the current statement tries to read the new rows.

Statements cannot read data that has been modified but not yet committed by other transactions. No other transactions can modify data that has been read by the current transaction until the current transaction completes. Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.

This is exactly what it sounds like, this takes a snapshot of the data at time of execution. Any modification made to the data after the start of the transaction is not visible to the current transaction.

When a database is being recovered, snapshot transactions do not request lock transactions when reading data and they do not block other transactions from writing data.

Considering Isolation which level to use
When considering which isolation level to use, you need to think in terms of concurrency side effects, such as dirty reads, phantom reads or non-repeatable reads. You also need to consider the balancing of data integrity against the overhead of the isolation level.

The below table shows the concurrency side effects allowed by the different isolation levels.
Transaction isolation levels control the following:

  • Whether locks are taken when data is read, and what type of locks are requested.
  • How long the read locks are held.
  • Whether a read operation referencing rows modified by another transaction:
    • Block until the exclusive lock on the row is freed.
    • Retrieve the committed version of the row that existed at the time the statement or transaction started.
    • Read the uncommitted data modification.

Regardless of any isolation level set on a transaction, the transaction will always get an exclusive lock on any data it modifies and holds the lock until the transaction completes. If you choose to use a lower isolation level such as read uncommitted to increase the amount of possible users that can access the data at the same time, this shall also increase the number of concurrency effects such as dirty reads or lost updates that users may encounter.

A higher isolation level such as serializable will reduce the amount on concurrency effects, however it shall require more system resources and increases the possibility that one transaction will block another.

FILESTREAM-enabled databases support the following transaction isolation levels.
Look out for my next post on using row versioning-based isolation levels!