Monday, September 5, 2011

MS SQL Server vs Oracle: Row Versioning

I sometimes hear that SQL Server, now that it supports row versioning, is just the same as Oracle. Is that so? I am going to compare and see.

Indeed, the basic principle is the same: when one transaction modifies data in a table, the original row is stored in a temporary place, so other transactions can read the unchanged data.

SQL Server uses tempdb (the version store); Oracle uses undo segments. Conceptually the two look pretty much the same. Let's dig deeper now.



STEP 1: Create and set up the databases.

A. SQL Server TEST_01, with Read Committed Snapshot isolation enabled:

ALTER DATABASE TEST_01 SET READ_COMMITTED_SNAPSHOT ON;

B. SQL Server TEST_02, with Snapshot isolation enabled:

ALTER DATABASE TEST_02 SET ALLOW_SNAPSHOT_ISOLATION ON;
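
Either setting can be verified afterwards against sys.databases (a quick check; column names as in SQL Server 2005 and later):

SELECT name,
       is_read_committed_snapshot_on,    -- 1 for TEST_01
       snapshot_isolation_state_desc     -- ON for TEST_02
FROM sys.databases
WHERE name IN ('TEST_01', 'TEST_02');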

C. Any Oracle database.


STEP 2: Create a table and fill it with demo data.

1. SQL Server (Read Committed Snapshot, Snapshot)


CREATE TABLE dbo.LockTest (I INT PRIMARY KEY, C VARCHAR(50));
INSERT INTO dbo.LockTest (I, C) VALUES (1, 'A'), (2, 'B'), (3, 'C');

2. Oracle

CREATE TABLE LockTest (I INT PRIMARY KEY, C VARCHAR(50));
INSERT INTO LockTest (I, C) VALUES (1, 'A');
INSERT INTO LockTest (I, C) VALUES (2, 'B');
INSERT INTO LockTest (I, C) VALUES (3, 'C');


STEP 3: Start the first transaction, which modifies a table row (do NOT commit).

1. SQL Server (Read Committed Snapshot, Snapshot)


SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN

UPDATE dbo.LockTest SET C = 'B Modified' WHERE I = 2

Exclusive locks will be acquired:
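
A minimal query against sys.dm_tran_locks shows them from another session (assuming the database name TEST_01 from Step 1):

SELECT request_session_id, resource_type, request_mode, request_status
FROM sys.dm_tran_locks
WHERE resource_database_id = DB_ID('TEST_01');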


We can check that the original version of the row is now stored in tempdb:

SELECT * FROM sys.dm_tran_version_store;




2. Oracle


SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

UPDATE LockTest SET C = 'B Modified' WHERE I = 2;

Exclusive locks will be acquired:
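
In Oracle the locked objects can be inspected through the v$ views (a sketch; requires the corresponding SELECT privileges, and locked_mode 3 means row exclusive):

SELECT lo.session_id, lo.oracle_username, o.object_name, lo.locked_mode
FROM v$locked_object lo
JOIN all_objects o ON o.object_id = lo.object_id;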



STEP 4: Start another transaction and try to read

STEP 4.a READ UNCOMMITTED

1. SQL Server (Read Committed Snapshot)


SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
BEGIN TRAN

SELECT * FROM dbo.LockTest




Even though the original row version is stored in tempdb, the transaction reads the dirty data.

2. SQL Server (Snapshot)

The same result: the transaction reads the dirty data.

3. Oracle

In fact, Oracle does not support the READ UNCOMMITTED isolation level at all. Oracle always reads committed data. It can afford to, because, as we are about to see, reading committed data costs it nothing.
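
The attempt itself is rejected outright; only SERIALIZABLE and READ COMMITTED are accepted as isolation levels:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
-- fails with an ORA- error: the only valid isolation levels
-- are SERIALIZABLE and READ COMMITTED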



STEP 4.b READ COMMITTED


1. SQL Server (Read Committed Snapshot)



SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN

SELECT * FROM dbo.LockTest



So we can see that SQL Server could read all the rows, including the one changed in another transaction, obtaining the original version from tempdb.

2. SQL Server (Snapshot)

SET TRANSACTION ISOLATION LEVEL READ COMMITTED
BEGIN TRAN

SELECT * FROM dbo.LockTest



Now SQL Server cannot return any results, being blocked by the first, uncommitted transaction:


So we can see that SQL Server is trying to obtain a Shared lock on the row in question (a KEY lock, technically).

3. Oracle

SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

SELECT * FROM LockTest;


Pretty much the same as SQL Server with Read Committed Snapshot: we can read all the rows, including the one being updated by the first transaction. Oracle reads the original row from the undo segments.


STEP 4.c REPEATABLE READ

1. SQL Server (Read Committed Snapshot)

SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRAN

SELECT * FROM dbo.LockTest


Once again SQL Server blocks us, and the reason is the same: the transaction is trying to obtain a Shared lock:


2. SQL Server (Snapshot)

The transaction will be blocked, for the same reason.

3. Oracle

Oracle does not support REPEATABLE READ isolation level - it doesn't have to, and we'll see why.


STEP 4.d SERIALIZABLE

1. SQL Server (Read Committed Snapshot)

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN

SELECT * FROM dbo.LockTest


And we are blocked again, this time because the transaction wants a Shared lock on the whole range of keys satisfying the WHERE condition (which is empty here, so the whole table).


2. SQL Server (Snapshot)

The same reason, the same result.

3. Oracle

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

SELECT * FROM LockTest;

Oracle reads all the rows without being blocked: writers do not block readers, even in SERIALIZABLE.



STEP 4.e SQL SERVER ONLY: SNAPSHOT

Because the results with the SERIALIZABLE isolation level are unimpressive, SQL Server offers something different: SNAPSHOT isolation.

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN

SELECT * FROM dbo.LockTest


And now we finally have it - the same behavior Oracle shows in SERIALIZABLE.
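
One caveat the test above does not show: instead of blocking, SNAPSHOT detects write-write conflicts after the fact. If another session commits a change to the same row while our snapshot transaction is open, our own update of that row is rolled back with the update-conflict error (Msg 3960):

SET TRANSACTION ISOLATION LEVEL SNAPSHOT
BEGIN TRAN

SELECT * FROM dbo.LockTest  -- the snapshot is taken here

-- ...meanwhile another session commits:
-- UPDATE dbo.LockTest SET C = 'B2' WHERE I = 2

UPDATE dbo.LockTest SET C = 'B Conflict' WHERE I = 2
-- Msg 3960: Snapshot isolation transaction aborted due to update conflict

Oracle behaves similarly in SERIALIZABLE, raising ORA-08177 ("can't serialize access for this transaction").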


Wrap-up:

1. Oracle never reads uncommitted data.
SQL Server always reads uncommitted (dirty) data in READ UNCOMMITTED.

2. In READ COMMITTED, an Oracle transaction reads the committed data and is never blocked by a parallel transaction modifying the same data. When data is modified by another uncommitted transaction, Oracle reads the original version from the undo segments.
A SQL Server (READ COMMITTED SNAPSHOT) transaction behaves the same as Oracle, reading the rows modified in another uncommitted transaction from tempdb.

3. Oracle does not support REPEATABLE READ, because writers do not block readers in SERIALIZABLE, which covers the same need.
SQL Server (READ COMMITTED SNAPSHOT) does not handle REPEATABLE READ well - readers get blocked - but SNAPSHOT can be used as an alternative.

4. Oracle does fine in SERIALIZABLE, writers do not block readers.
SQL Server does fine in SNAPSHOT, writers do not block readers.

Well, are they the same? When used correctly, pretty much they are. Performance (tempdb vs undo segments) is the only open question.

Saturday, September 3, 2011

SQL Server: Querying Locks



SELECT
    lock.request_session_id AS "Session Id"
    , task.blocking_session_id AS "Blocked By"
    , DB_NAME(lock.resource_database_id) AS "DB Name"
    , obj.[type] AS "Object Type"
    , SCHEMA_NAME(obj.schema_id) + '.' + obj.Name AS "Object Name"
    , lock.resource_type AS "Resource Type"
    , lock.resource_description AS "Resource Id"
    , lock.request_mode AS "Lock Type"
    , lock.request_status AS "Lock Status"
    , lock.request_reference_count AS "Ref Count"
    , stmt.text AS "Sql Statement"
    , ssntran.is_user_transaction as "User Transaction"
    , acttran.name as "Transaction Name"
    , '***' AS "***"
    , ssn.login_name AS "Login"
    , ssn.host_name AS "Host"
    , ssn.program_name AS "Program"
    , ssn.login_time AS "Login Time"
    , conn.net_transport AS "Transport"
    , conn.auth_scheme as "Authentication"
FROM
    sys.dm_tran_locks lock
    LEFT JOIN sys.dm_os_waiting_tasks task ON lock.lock_owner_address = task.resource_address
    JOIN sys.partitions prtn ON prtn.hobt_id = lock.resource_associated_entity_id
    JOIN sys.objects obj ON obj.object_id = prtn.object_id
    JOIN sys.dm_exec_sessions ssn ON ssn.session_id = lock.request_session_id
    JOIN sys.dm_tran_session_transactions ssntran ON ssn.session_id = ssntran.session_id
    JOIN sys.dm_tran_active_transactions acttran ON ssntran.transaction_id = acttran.transaction_id
    JOIN sys.dm_exec_connections conn ON conn.session_id = ssn.session_id
    CROSS APPLY sys.dm_exec_sql_text(conn.most_recent_sql_handle) AS stmt
WHERE resource_database_id = db_id()
ORDER BY lock.request_session_id

SQL Server vs Oracle: Memory Management

There are many things that MS SQL Server and Oracle do the same way. Let's look at the memory management.

Whatever happens to the data, it has to be read from permanent storage first, and written back after. Things might change in the next ten years, but so far the reality is that this permanent storage is a hard disk.

The hard disk performs read/write operations in minimal units called sectors. No surprise, then, that databases also read and write data not in single bytes, but in reasonably small pieces. In SQL Server this piece is called a page; in Oracle it is called a block. The SQL Server page has a fixed size of 8 KB; the Oracle block size may vary, but 8 KB can safely be called "typical".
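
The Oracle block size can be checked per instance (a sketch; requires access to v$parameter), while the SQL Server page size is fixed and needs no lookup:

SELECT value AS block_size_bytes
FROM v$parameter
WHERE name = 'db_block_size';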


Optimal performance is achieved when the data pieces are contiguous. Thus both SQL Server and Oracle group their minimal units - pages and blocks, respectively - into larger chunks of contiguous pieces, called extents.

In SQL Server, an extent is 8 logically contiguous pages (64 KB).


In Oracle, an extent is 1 to N logically contiguous blocks, where N = 2 GB / block size.


"Logically contiguous" should be understood as "contiguous in a file". Files themselves might not be contiguous on disk.

With some exceptions, extents are the units in which the storage is consumed by tables and indexes, both in SQL Server and Oracle.

The difference is in how storage is optimized while objects are still too small to effectively occupy a whole extent.

SQL Server will not allocate a whole extent to an object until the object grows beyond 64 KB; until then it allocates pages from an extent shared between several objects (a mixed extent).

Oracle solves this problem by simply using different extent sizes: one for newly created objects (INITIAL), and another for existing objects that keep growing (NEXT).
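
Both sizes can be set per object in Oracle's STORAGE clause (a sketch with made-up values; in locally managed tablespaces with AUTOALLOCATE they act as hints rather than exact sizes):

CREATE TABLE ExtentDemo (I INT PRIMARY KEY, C VARCHAR(50))
STORAGE (INITIAL 64K NEXT 1M);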

Friday, September 2, 2011

Validation rules

Validation, like any other constraint, is a source of assumptions in the application code.

Not only does it provide a data-quality check at a certain point; the whole logic of processing that piece of data afterwards assumes the validation rule was enforced.

For example, if you cannot create a customer account without specifying the postal code, you should not be surprised to see SQL statements that break as soon as this check is removed. A primitive example would be the following SQL query:

SELECT c.FirstName + ' ' + c.LastName + ', ' + c.PostalCode FROM dbo.Customer c WHERE...

So relaxing a validation rule is not as simple as removing the condition check and an error message.
You have to trace the usage of the data and make sure none of the functionality is broken, either from the technical or from the logical point of view.

Say you now allow the postal code to be NULL. Then, firstly, the previous SELECT will yield NULL for any customer without a postal code, since concatenating with NULL yields NULL.
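
So the query has to be made NULL-safe explicitly, for example with COALESCE (the 'N/A' fallback here is made up):

SELECT c.FirstName + ' ' + c.LastName + ', ' + COALESCE(c.PostalCode, 'N/A') FROM dbo.Customer c WHERE...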

Secondly, the reason for the change might not be that we no longer send customers their invoices. It might be that we now have a default postal code somewhere else, to be used when the customer's postal code is not set. I am making this up, but you get the idea.