20 Dec 2006

To Collate or not to Collate

Collation plays an important role within the database, as it sets out how the SQL Server has to manage string data within storage & queries, yet I don’t script out the collation when scripting out my databases.  So why don’t I script out the collation if it so important? Its because if the collation is specified but not used properly within the database it can cause some of most common problems, like “Cannot resolve collation conflict for EQUAL TO operation” or “Cannot resolve collation conflict for UNION operation”. These errors are caused when string data within different collations are being compared or combined into the same result set; this is nothing new so why talk about collation now? Well it’s to do with the release of Visual Studio 2005 Team Edition for Database Professionals (VSTDB) something I think needed to be taken in consideration if this is being used to manage the database side of the project.


A cause for theses types of problem is when a table is created on a database with the collation scripted out and deployed on another database which has a different collation to the create script.  With the VSTDB this will not occur as much because a project will be assigned a collation and the database will be created with this collation or if the database does not get recreated and has a different collation then the tables will be created with the project collation (This is the desired behavior when I saw this last and was confirmed when I spoke to the VSTDB Team in November). So I can see that VSTDB will help with reducing these types of errors.


Another cause of the error is when creating temp tables without setting the collation to the same as the database collation and the server collation is different.  There are 2 ways to resolve this issue. The first is to rebuild your system tables to match the collation of the database that has been created or another way is by using the collation clause within the create temp table script(s) within the T-SQL code.  These errors could be become more frequent when using the VSTDB; because as mentioned before the database or tables that are create under the VSTDB are assigned the project collation, which will have to be of a type collation as there is no server default option and this could be different to that of the server that is being deployed to. 


By taking the time to put in the extra collation clause in the temp table script(s) it reduce the risk of database code failing with “cannot resolve collation” errors on any SQL server configuration.  Also it reduce the temptation of people changing the VSTDB project collation to match there server collation if they can’t change their SQL server default collation and causing different behavior of SQL statements.


I can understand the reason for making the database project ask for a collation because it means that when you unit test the code on any server it will always behave in the same way. Also it means as stated before, the database can run on any server with less risk of collation errors.

No comments: