Setting Up SSIS in SQL Server Management Studio

Before you can deploy an SSIS project you need to set up SSIS in SQL Server Management Studio. When installing SQL Server you can select to install SSIS; however, the installation does not set up the SSISDB catalog or configure any settings for SSIS.

Here are the steps for setting up SSIS in SQL Server Management Studio.

Step 1:
Right-click on Integration Services Catalogs and select Create Catalog.

Step 2:
Check the box next to Enable CLR Integration, set a password for your SSISDB and click OK.
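If you prefer to double-check the CLR setting from a query window, something along these lines should work; note the catalog itself still has to be created through Management Studio (or the SSIS management APIs), this only covers the CLR option:

-- Check whether CLR integration is enabled (value_in_use = 1 means enabled)
SELECT name, value_in_use FROM sys.configurations WHERE name = 'clr enabled';

-- Enable it manually if needed (the Create Catalog dialog normally does this for you)
EXEC sp_configure 'clr enabled', 1;
RECONFIGURE;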

You will then see your SSISDB listed under Databases and under Integration Services Catalogs.

Now you will be able to deploy your SSIS project to your server where you have set up SSIS.

The next post will be on deploying an SSIS project.

GenesisOne™ T-SQL Source Code Unscrambler™

Have you ever worked with code that is impossible to read and decipher? You could use SQL Prompt to format the code into a readable layout; however, what if there is so much going on that you are still unable to decipher what is happening?

Recently I came across this problem where a function was doing so many things I couldn’t tell where it started and where it ended. There is an amazing tool that was super helpful: the T-SQL Source Code Unscrambler offered by GenesisOne.

This was a lifesaver for me, saving many hours of trying to figure out code. This tool allows you to connect to your SQL Server database and view any tables, views, stored procedures and functions.

This tool allowed me to select the function that was a mess and in the diagram view I could see what the function was trying to do. It shows if there are conditions, variables and the flow of the function.

This tool is also brilliant for doing technical documentation with diagrams of the systems and data flow in stored procedures and functions.

Below are a few screen shots of the different views and information this tool provides:

Diagram View:

Table View:

Summary View:

Stored Procedure View:

Dependency View:

Graphical View of Dependencies:

How to Export:

For a full list of all the amazing features that come with this application, click here.

Hope you all decide to try out this tool and then get it after you realise how much time it will save you in development!

MDS Backup Strategies

Recently a colleague and I had a discussion about which method is best for backing up MDS (Master Data Services) for disaster recovery. We came to the conclusion that it all depends on the SQL Server version of the environment you are recovering to. Below are the two different methods for backing up an MDS database.

Model Backup
Pros:

  • Fast to create; can back up specific versions
  • Will always be able to deploy a model to an MDS install
  • Can be automated via SSIS
  • Can be restored to a server version that is one higher (+1), the same (0) or one lower (-1)

Cons:

  • If you want to save backups of different models, then the models will have to be backed up separately.
  • Security is not retained; it will have to be redone. Can take about 2 minutes per user. On a small model this should not be a problem, however, when scaled out it could be very time consuming.
  • Saving more than one backup of a model would require a script/maintenance task to remove backups older than X amount of time.

When deploying the model, the option DeployClone must be used, not DeployNew. DeployNew will remove the MUID and therefore will not allow any changes to be deployed to that model at a later stage. DeployClone keeps it all intact and will allow for future updates.

Database Backup
Pros:

  • Security is retained
  • Simple to set-up and maintain
  • Easy to restore
  • Maintenance tasks to clean up old backups

Cons:

  • Cannot be restored to a server version less than the current version.

The database backup for MDS can be scheduled just like other backups. You can use the maintenance plan wizard to set up the backup and clean-up tasks on a schedule; a basic backup statement is shown below.
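As a rough sketch, the backup itself is just a normal full database backup; the database name and file path below are placeholders for your own MDS database:

-- Full backup of the MDS database (name and path are placeholders)
BACKUP DATABASE [MDS]
TO DISK = N'D:\Backups\MDS_Full.bak'
WITH COMPRESSION, CHECKSUM, INIT;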

If you know what version the disaster recovery environment will be, for example the same version as the current MDS server or higher, then it is simpler to use the Database Backup method. If you are uncertain what version the disaster recovery environment will be, then you would want to use the Model Backup method in case you have to restore to a version lower than the current MDS server version.

Hope this helps you to choose what is best suited for your environment!

Special thanks to Janlo Du Toit for the discussion on these methods 🙂

Row Versioning-based Isolation Levels

Following on from my previous post on SQL Transaction Isolation Levels, we get to the juicier part of Row Versioning-based Isolation Levels. Read Committed Snapshot and Snapshot isolation levels allow for the use of row versioning.

Using the row versioning-based isolation levels improves read concurrency by eliminating locks for read operations.

If you are creating an application to interact with the database, it is recommended that you use read committed isolation using row versioning over snapshot isolation.

Benefits of read committed isolation using row versioning

  • Consumes less tempdb space
  • Works with distributed transactions
  • Works with most existing applications. The behaviour of read committed, whether to use row versioning or not, is determined by the database option setting, and this can be changed without affecting the application
  • Snapshot isolation is vulnerable to update conflicts, whereas this is not applicable to read committed isolation using row versioning. When a transaction running under snapshot isolation reads data that is then modified by another transaction, an update by the snapshot transaction to the same data causes an update conflict and the transaction terminates and rolls back

Benefits of Row Versioning-based Isolation Levels

  • Read operations retrieve a consistent snapshot of the database
  • SELECT statements do not lock data during a read operation (readers do not block writers, and vice versa)
  • SELECT statements can access the last committed value of the row, while other transactions are updating the row without getting blocked
  • The number of deadlocks is reduced
  • The number of locks required by a transaction is reduced, which reduces the system overhead required to manage locks
  • Fewer lock escalations take place

Costs of Row Versioning-based Isolation Levels
With anything we decide to implement in a database we need to look at the costs and determine whether the benefits outweigh them. Here we weigh the concurrency benefit of minimized locking against the increased resource usage needed to maintain and read the row versions.

Below is the list of costs:

  • The biggest thing to be aware of is that tempdb needs to have enough disk space. All versions generated by update transactions while those transactions are running are kept in tempdb. If tempdb runs out of space, update operations do not fail, but read operations using row versioning might fail (see the query after this list for one way to monitor this)
  • Read performance can be affected when versions needed by queries become old and long version chains must be scanned
  • Row versioning increases resource usage during data modification as row versions are maintained in tempdb
  • When either the READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION database options are ON, update and delete transactions for a particular database must maintain row versions even when there are no transactions using a row versioning-based isolation level.
  • Row versioning information requires 14 bytes added to the database row
  • Update performance can be slower due to the work involved in maintaining row versions.
  • The performance cost of versioned updates could be higher when larger amounts of data change during update operations
  • Data readers face the extra cost of traversing the version link list. The older the snapshot, the slower the process of accessing it in a snapshot isolation transaction
  • Some update transactions using snapshot isolation might have to be rolled back because of mandatory conflict detection for update operations
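One simple way to keep an eye on the tempdb cost mentioned above is to check how much space the version store is currently holding. This is a minimal sketch using the sys.dm_db_file_space_usage DMV (pages are 8 KB):

-- Approximate space reserved by the version store in tempdb, in MB
SELECT SUM(version_store_reserved_page_count) * 8 / 1024.0 AS version_store_mb
FROM tempdb.sys.dm_db_file_space_usage;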

The question now is when to use which isolation level so that you get the most benefit out of it, after considering the costs that come with using it.

Use Read Committed Isolation Using Row Versioning
A new data snapshot is taken as each statement within the transaction executes and remains consistent until that statement finishes execution. This provides statement-level read consistency. Enabling it is a single database option change, shown after the list below.

Enable read committed isolation using row versioning when:

  • The concurrency benefits of avoiding reader/writer blocking outweigh the increased overhead of creating and managing row versions
  • You require data values to be consistent to the point in time that a query starts
  • The application requires absolute accuracy for long-running aggregations
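A minimal sketch of switching a database over, assuming a database called MyDatabase; WITH ROLLBACK IMMEDIATE kicks out other connections, since the option cannot be changed while other sessions are active in the database:

-- Turn on read committed snapshot for the database (placeholder name)
ALTER DATABASE MyDatabase
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;
-- Existing READ COMMITTED transactions now read row versions instead of taking shared locks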

Snapshot Isolation
A new data snapshot is taken when the snapshot transaction starts and remains consistent for the duration of the transaction. This provides transaction-level read consistency. A small example of enabling and using it follows the list below.

Enable snapshot isolation when:

  • Optimistic concurrency control is desired.
  • Probability is low that a transaction would have to be rolled back because of an update conflict.
  • An application needs to generate reports based on long-running, multi-statement queries that must have point-in-time consistency.
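Here is a minimal sketch, again assuming a database called MyDatabase and a placeholder table dbo.Orders. The option only allows snapshot transactions; each session still has to request the isolation level explicitly:

-- Allow snapshot isolation on the database (placeholder name)
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;

-- In a session
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    SELECT COUNT(*) FROM dbo.Orders;  -- all statements in the transaction see the same point-in-time view
COMMIT TRANSACTION;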

Limitations of Transactions Using Row Versioning-based Isolation Levels

  • READ_COMMITTED_SNAPSHOT cannot be enabled in tempdb, msdb, or master
  • Global temp tables are stored in tempdb. When accessing global temp tables inside a snapshot transaction, one of the following must happen:
    • Set the ALLOW_SNAPSHOT_ISOLATION database option ON in tempdb
    • Use an isolation hint to change the isolation level for the statement.
  • Snapshot transactions fail when:
    • A database is made read-only after the snapshot transaction starts, but before the snapshot transaction accesses the database
    • If accessing objects from multiple databases, a database state was changed in such a way that database recovery occurred after a snapshot transaction starts, but before the snapshot transaction accesses the database. For example: the database was set to OFFLINE and then to ONLINE, database autoclose and open, or database detach and attach.
  • Distributed transactions, including queries in distributed partitioned databases, are not supported under snapshot isolation.
  • SQL Server does not keep multiple versions of system metadata. Data definition language (DDL) statements on tables and other database objects (indexes, views, data types, stored procedures, and common language runtime functions) change metadata. If a DDL statement modifies an object, any concurrent reference to the object under snapshot isolation causes the snapshot transaction to fail. Read-committed transactions do not have this limitation when the READ_COMMITTED_SNAPSHOT database option is ON.

Differences between snapshot isolation and read committed isolation using row versioning
[Table image: differences between snapshot isolation and read committed isolation using row versioning]

Hope this has been as interesting for you as it was for me!


SQL Transaction Isolation Levels

Isolation levels can be quite useful when retrieving and inserting data to maintain data integrity. Isolation levels control the locking and row versioning behaviour of SQL statements issued by a connection to SQL Server. They define the degree to which one transaction must be isolated from resource or data modifications made by other transactions.

There are five isolation levels:

  • Read uncommitted
  • Read committed
  • Repeatable Read
  • Serializable
  • Snapshot

Read Uncommitted
This allows you to read rows that have been modified by other transactions but not yet committed. This is the lowest isolation level. Read uncommitted does not take shared locks and is not blocked by exclusive locks. This option has the same effect as setting NOLOCK on all tables in a SELECT statement in a transaction.
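As a small illustration (dbo.Orders is just a placeholder table name), the two reads below behave the same way and can both return uncommitted, dirty rows:

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.Orders;

-- Equivalent effect using a table hint under the default isolation level
SELECT COUNT(*) FROM dbo.Orders WITH (NOLOCK);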

Read Committed
This allows you to only read records that have been modified and committed by other transactions. This prevents dirty reads from happening. The SQL Server default still allows non-repeatable reads and phantom reads; this is where the data can be changed by other transactions between individual statements within the current transaction.

Repeatable Read
Statements cannot read data that has been modified but not yet committed by other transactions, and no other transactions can modify data that has been read by the current transaction until the current transaction completes.

Shared locks are placed on all data read and are held until the transaction completes. Therefore, no transactions can modify any rows that have been read by the current transaction. Other transactions can insert new rows that match the search conditions of statements issued by the current transaction. Phantom reads will occur if the current statement tries to read the new rows.

Serializable
Statements cannot read data that has been modified but not yet committed by other transactions. No other transactions can modify data that has been read by the current transaction until the current transaction completes. Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes. This option has the same effect as setting HOLDLOCK on all tables in all SELECT statements in a transaction.

Snapshot
This is exactly what it sounds like: a snapshot of the data is taken at the start of the transaction. Any modification made to the data after the start of the transaction is not visible to the current transaction.

Snapshot transactions do not request locks when reading data, and they do not block other transactions from writing data.

Considering which isolation level to use
When considering which isolation level to use, you need to think in terms of concurrency side effects, such as dirty reads, phantom reads or non-repeatable reads. You also need to consider the balancing of data integrity against the overhead of the isolation level.

The below table shows the concurrency side effects allowed by the different isolation levels.
Isolation level      Dirty read    Non-repeatable read    Phantom read
Read uncommitted     Yes           Yes                    Yes
Read committed       No            Yes                    Yes
Repeatable read      No            No                     Yes
Snapshot             No            No                     No
Serializable         No            No                     No
Transaction isolation levels control the following:

  • Whether locks are taken when data is read, and what type of locks are requested.
  • How long the read locks are held.
  • Whether a read operation referencing rows modified by another transaction:
    • Blocks until the exclusive lock on the row is freed.
    • Retrieves the committed version of the row that existed at the time the statement or transaction started.
    • Reads the uncommitted data modification.

Regardless of any isolation level set on a transaction, the transaction will always take an exclusive lock on any data it modifies and hold that lock until the transaction completes. If you choose a lower isolation level such as read uncommitted to increase the number of users that can access the data at the same time, this will also increase the number of concurrency effects, such as dirty reads or lost updates, that users may encounter.

A higher isolation level such as serializable will reduce the number of concurrency effects; however, it requires more system resources and increases the possibility that one transaction will block another.

FILESTREAM-enabled databases support the following transaction isolation levels.
[Table image: transaction isolation levels supported by FILESTREAM-enabled databases]
Look out for my next post on using row versioning-based isolation levels!

Contained Database

A contained database is a database that is isolated from other databases and from the instance of SQL Server that hosts the database. Basically it includes all database settings and the metadata within itself thereby resulting in no configuration dependencies on the instance of the SQL Server Database Engine where the database is actually installed.

This feature is available from SQL Server 2012 and it is available on Azure SQL Database. This feature is not available on Azure SQL Data Warehouse or Parallel Data Warehouse.

One of the benefits of contained databases is that you will no longer struggle with orphaned users when moving databases or working with high availability. SQL Server 2012 introduced the notion of boundaries, which are borders that define where a database and its metadata and settings end and where server-level features, metadata, and dependencies begin. In addition to implicitly allowing increased portability, boundaries are also used to help explicitly isolate data, settings, features, and capabilities.

There are 3 types of containment levels:

  • NONE. This default containment mode is what exists prior to SQL Server 2012. There are no boundaries defined or visible.
  • PARTIAL. With partially contained databases, you have the ability to define clearer boundaries between databases and the server, making it easier for metadata to be hosted within the databases. This, in turn, makes SQL Server databases more portable and less dependent on underlying hosts.
  • FULL. Full containment will enable greater database portability and allow for strict enforcement of containment boundaries as a means of fully isolating databases from each other and from the underlying host.

In SQL Server 2016 there are 4 ways in which SQL Server helps users to isolate the database from the instance.

  • The metadata that describes the database is maintained in the database and not in the master database.
  • All metadata are defined using the same collation
  • User authentication can be performed by the database, reducing the database’s dependency on the logins of the instance of SQL Server
  • The SQL Server environment (DMVs, XEvents, etc.) reports and can act upon containment information.

There are two types of users for contained databases:

  • Contained database user with password. These are authenticated by the database.
  • Windows principals. Authorized Windows users and members of authorized Windows groups can connect directly to the database and do not need logins in the master database. The database trusts the authentication by Windows.
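A minimal sketch of creating a partially contained database with a contained SQL user; the database name, user name and password are placeholders, and the instance-level option has to be enabled first:

-- Allow contained database authentication on the instance
EXEC sp_configure 'contained database authentication', 1;
RECONFIGURE;

-- Create a partially contained database (placeholder name)
CREATE DATABASE SalesApp CONTAINMENT = PARTIAL;
GO

USE SalesApp;
GO

-- Contained database user with password, authenticated by the database itself
CREATE USER AppUser WITH PASSWORD = 'Str0ng!Passw0rd';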

Benefits of using Partially Contained Databases
Database Movement
When moving a database, problems can occur because some important information is unavailable after the move to another instance. Login information is stored within an instance and not within a database, so moving the database to another instance loses this information. A partially contained database can store this important information inside the database, so it is still available after the database is moved.

Benefit of Contained Database Users with Always On
By reducing the ties to the instance of SQL Server, partially contained databases can be useful during failover when you use Always On Availability Groups. Contained users are a very significant feature in an Always On solution. If the users are contained users, in the event of a failover people would be able to connect to the failover node without creating logins on the instance hosting the failover node.

Initial Database Development
When creating a database, the developer may not always know where the database will be deployed or all the environmental variables that could impact the database. With a non-contained model, the developer would need to consider all the environmental variables and the impact they could have on the database and develop accordingly. By using partially contained databases, the developer can identify instance-level impacts on the database.

Database Administration
Database administration becomes easier when it comes to maintaining database settings: instead of keeping the settings in the master database, you can now maintain them in the database itself. This allows each database owner to have more control over their database without giving the database owner sysadmin permission.

Limitations with Partially Contained Databases
As with almost any feature, there are certain limitations. The following features are not allowed in a partially contained database:

  • Replication, change data capture, and change tracking
  • Numbered procedures
  • Schema-bound objects that depend on built-in functions with collation changes
  • Binding change resulting from collation changes, including references to objects, columns, symbols, or types.

Identifying Database Containment
There are two tools to identify the containment status of a database: the sys.dm_db_uncontained_entities view and the database_uncontained_usage extended event. sys.dm_db_uncontained_entities is a view that shows all potentially uncontained entities in the database; however, if any entities use dynamic SQL, it cannot be determined until runtime whether the entity is contained or not. The database_uncontained_usage XEvent fires whenever an uncontained entity is identified at run time; therefore, any uncontained user entities you have not run will not be identified by this XEvent. A sample query against the view is shown below.
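As a rough sketch, a query along these lines lists the objects that would break containment and the feature responsible (the join to sys.objects is my own addition to make the output friendlier):

-- Potentially uncontained entities and the feature that breaks containment
SELECT e.feature_name, o.name AS object_name, o.type_desc
FROM sys.dm_db_uncontained_entities AS e
LEFT JOIN sys.objects AS o ON e.major_id = o.object_id
WHERE e.class = 1;  -- class 1 = object or column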

This could be an awesome feature to use for database developers from SQL Server 2012, hope this post was helpful for you!


Temp Table VS Table Variable

There have been many discussions on whether a temp table is better to use or a table variable. It depends on what you would like to do. They both have their pros, cons and limitations.

Table Variable Limitations

  • Table variable changes do not get rolled back during a transaction ROLLBACK whereas temp table changes will.
  • Table variables cannot have Non-Clustered Indexes
  • You cannot create constraints in table variables
  • You cannot create default values on table variable columns
  • Statistics cannot be created against table variables
  • Table variables cannot have a parallel plan, whereas temp tables are not restricted in this manner
  • Table variables cannot be selected into, altered, truncated or be the target of DBCC commands such as DBCC CHECKIDENT or of SET IDENTITY_INSERT
  • Query hints are not supported
  • CHECK constraints on table variables are not considered by the optimizer for simplification, implied predicates or contradiction detection.

Similarities between Temp Tables and Table Variables

  • Clustered indexes can be created on both
  • Both are logged in the transaction log
  • Instantiated in the tempdb
  • You can perform DML queries against the tables

Logical Location
Table variables behave more as if they were part of the current database. If not specified explicitly, table variable column collations will be those of the current database, whereas temp tables use the default collation of tempdb. User-defined data types and XML collections must be in tempdb in order for you to use them in temp tables; table variables, however, can use them from the current database.

  • In a contained database, temporary table data is collated in the collation of the contained database. All metadata associated with the temporary tables will be in the catalog collation.
  • Named constraints may not be used in temporary tables.
  • Temporary tables may not refer to user-defined functions, user-defined types or XML collections.

Visibility to different scopes
Table variables are limited in that they can only be accessed in the batch and scope in which they are declared. Temp tables are accessible within child batches (nested triggers, procedures, EXEC calls). Temp tables created at the outer scope (@@NESTLEVEL = 0) can span batches too, as they persist until the session ends.

Lifetime
Table variables are created implicitly when a batch containing a DECLARE @MyTableVariable TABLE statement is executed, before any user code in that batch runs, and they are dropped implicitly at the end. Temp tables are created explicitly when the CREATE TABLE #MyTemp statement is encountered and can be dropped explicitly with DROP TABLE, or will be dropped when the session ends. If created in a child batch with @@NESTLEVEL > 0, they will be dropped implicitly when that batch ends.

Within stored routines both types can be cached rather than repeatedly creating and dropping new tables. There are restrictions on when this caching can occur; it is possible to violate them for temp tables, whereas the restrictions on table variables prevent any violations. The maintenance overhead on cached temp tables is slightly greater than for table variables. For more info on this, take a look here.

Transactions
Temp table operations are carried out as part of the user transaction itself, whereas table variable operations are carried out as system transactions, independent of any outer user transaction. A ROLLBACK command will therefore affect a temp table but leave a table variable untouched.
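A quick way to see this difference for yourself is a script along these lines (names are placeholders):

DECLARE @Ids TABLE (Id int PRIMARY KEY);
CREATE TABLE #Ids (Id int PRIMARY KEY);

BEGIN TRANSACTION;
    INSERT INTO @Ids VALUES (1), (2);
    INSERT INTO #Ids VALUES (1), (2);
ROLLBACK TRANSACTION;

SELECT COUNT(*) AS TableVariableRows FROM @Ids;  -- 2: the table variable keeps its rows
SELECT COUNT(*) AS TempTableRows FROM #Ids;      -- 0: the temp table changes were rolled back

DROP TABLE #Ids;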

Logging
Both generate log records in the tempdb transaction log. A common misconception is that this is not the case for table variables. The quantity of logging is very similar; a very important difference is that log records related to temp tables cannot be cleared out until any containing user transaction finishes. A long-running transaction that at some point writes to a temp table will prevent log truncation in tempdb, whereas the autonomous transactions spawned for table variables do not.

Indexes
For versions prior to SQL Server 2014, indexes can only be created implicitly on table variables as a side effect of adding a unique constraint or primary key. This means that only unique indexes are supported. Table variables do not support included columns, filtered indexes (until 2016) or partitioning, whereas temp tables do.

In SQL Server 2014 non unique indexes can be declared inline in the table variable definition. Now in SQL Server 2016 filtered indexes can be applied to table variables.
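For example, on SQL Server 2014 or later a non-unique index can be declared inline in the table variable definition (names are placeholders):

DECLARE @Orders TABLE
(
    OrderId int PRIMARY KEY,
    CustomerId int INDEX IX_CustomerId NONCLUSTERED  -- inline, non-unique index
);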

Hope the above gives you a bit more clarity on the benefits of each one in helping you decide whether to use a temp table or table variable in your scenarios.


Limit Memory Used By SQL

Ever noticed SQL Server using most of the memory (RAM) on the server while all your other applications start slowing down? This could be because SQL Server is using the maximum amount of memory it can get hold of.

When SQL Server is installed, the default maximum amount of memory it is allowed to use is set to 2147483647 MB. You can check this by right-clicking on the server at the top of the Object Explorer, selecting Properties, then selecting Memory.
To decide what to set SQL Server’s max memory to, you can use this simple rule: leave 4GB or 10% of total memory free, whichever is larger on your instance, to start with, and adjust as needed. The free memory will be consumed by the OS and other applications on the server.

To calculate this, remember 1GB is 1024MB. If your server has 32GB of RAM, you leave 4GB free and give SQL Server the remaining 28GB: 28 x 1024 = 28672MB. You would then enter this as the max server memory.
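If you prefer to set it with T-SQL rather than the Properties dialog, a sketch along these lines does the same thing (28672 comes from the 32GB example above):

EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 28672;
RECONFIGURE;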
NB! This does not require restarting your SQL Server instance, but it will cause execution plans to recompile.

SSRS Reporting Server Recycle Time

Many people complain about their reports taking a long time to load when they access them first thing in the day. This is due to the reporting server’s recycle time. What happens is SSRS regularly restarts the application domain after a specific time period. Upon first request to the report server after the application domain has been restarted, SSRS needs to load all the settings and this takes quite a long time.

The default recycle time for the application domain is 720 minutes. This can be increased to a more reasonable time; however, this still does not solve the problem. The first request to the report server will still take a long time to process.

If you would like to increase the recycle time you can do this in the rsreportserver.config file. The file can be located in “C:\Program Files\Microsoft SQL Server\MSRS13.MSSQLSERVER\Reporting Services\ReportServer”

You can search for the below line of code in the config file and increase the time:

<RecycleTime>720</RecycleTime>

The better solution is to schedule a PowerShell script that refreshes the report server twice a day. This makes access to the report server fast for the first user, who will no longer have to wait long for the report to load.

There are two PowerShell scripts that you can use. The first script will access the report server:

$root = 'https://Hostname/reports/'
Invoke-Webrequest $root

The second script will start up IE and open up a report that is deployed on the server. You can specify which report to open up.

$root = 'http://Hostname/reports/report/ReportNameToAccess'
$IE = New-Object -ComObject InternetExplorer.Application
$IE.Navigate2($root)
$IE.Visible = $true

You can schedule either of these scripts to run with Windows Task Scheduler.

Hope this helps you with speeding up the load time of the reports!
