Comparing Master Data Services instances
As I have mentioned in previous posts, I've been working on a data warehouse project. Within the project we decided to use SQL Server 2008 R2 Master Data Services (MDS) to store all the warehouse specific reference data. Here are some reasons why MDS was used:
I came up with a method which helped to identify the changes fairly easily, below is the method I used. I will say that it’s not a perfect solution and might not work for everyone or continue to work when MDS updates have been applied.
Run the following SQL script on both instance of the MDS database servers:
Then to test that all the entity which are required have been created use the following script:
Then use the visual studio 2010 schema compare tool against the two instances of the MDSCompare databases to highlight any structural changes which have been made to the entities.
To find data changes use the following script on the MDSCompare databases to create primary keys:
Then use the visual studio 2010 data compare tool to highlight any data differences.
While using the above methods to do the comparisons I found some differences which I had to ignore. The reason we had to ignore them was because of the following scenario:
While loading data through the batch staging process we had to set the default name attribute of the entities to be an empty string. This is because it would not allow nulls and we did not want to use this attribute. However the development instance has the default name attribute set with nulls. I believe this was because the MDS deployment tool had converted the empty element of the deployment package, which was created because of the empty string, to null while uploading.
- Allowed us to manage the reference data.
- Reference data could be loaded into the warehouse like a source system
- Would allow the client’s data governance team to easily update the reference data and keep the warehouse up-to-date
I came up with a method which helped to identify the changes fairly easily, below is the method I used. I will say that it’s not a perfect solution and might not work for everyone or continue to work when MDS updates have been applied.
Run the following SQL script on both instance of the MDS database servers:
1: CREATE DATABASE MDSCompare
2:
3: DECLARE @vColList AS VARCHAR(MAX)
4: DECLARE @vViewName AS SYSNAME
5: DECLARE @vSQL AS VARCHAR(MAX)
6: DECLARE @vEntityID AS INT
7: DECLARE @vModelID AS INT = (SELECT id FROM mdm.tblModel where Name = '<ModelName,Char,Master Data>')
8:
9: DECLARE EntityID_Cursor CURSOR LOCAL FORWARD_ONLY FAST_FORWARD READ_ONLY
10: FOR SELECT ID FROM mdm.tblEntity e WHERE model_id= @vModelID order by id
11:
12: OPEN EntityID_Cursor
13:
14: FETCH NEXT FROM EntityID_Cursor
15: INTO @vEntityID
16:
17: WHILE @@FETCH_STATUS = 0
18: BEGIN
19:
20: SELECT @vViewName = REPLACE(e.Name,' ','')
21: , @vColList = COALESCE(@vColList + ', [' + a.name + ISNULL('_' + la.name,'') +']', '[' + a.name + ISNULL('_' + la.name,'') +']')
22: FROM mdm.tblAttribute a
23: INNER JOIN mdm.tblEntity e
24: ON e.id = a.entity_id
25: LEFT OUTER JOIN mdm.tblAttribute la
26: ON a.domainEntity_Id = la.entity_id
27: AND la.attributeType_id = 1
28: AND la.IsSystem = 1
29: WHERE a.entity_id = @vEntityID
30: AND a.attributeType_id <> 3
31:
32: SET @vSQL = 'SELECT ' + @vColList + ' INTO MDSCompare.dbo.' + @vViewName + ' FROM mdm.' + @vViewName
33:
34: EXEC (@vSQL)
35:
36: FETCH NEXT FROM EntityID_Cursor
37: INTO @vEntityID
38:
39: SELECT @vColList = null
40: , @vViewName = null
41: , @vSQL = null
42:
43: END
44:
45: CLOSE EntityID_Cursor
46: DEALLOCATE EntityID_Cursor
Then to test that all the entity which are required have been created use the following script:
1: DECLARE @vModelID AS INT = (SELECT id FROM mdm.tblModel where Name = '<ModelName,Char,Master Data>')
2:
3: ;WITH MDSEntity(Entityname)
4: AS
5: (
6: select REPLACE(e.Name,' ','') Entityname
7: from mdm.tblEntity e WHERE model_id= @vModelID
8: )
9: , CompareTables (TableName)
10: AS
11: (
12: SELECT Table_Name
13: FROM MDSCompare.INFORMATION_SCHEMA.TABLES t
14: where t.table_schema = 'dbo'
15: )
16: SELECT *
17: FROM MDSEntity e
18: LEFT OUTER JOIN CompareTables c
19: ON e.Entityname = c.TableName
20: WHERE c.TableName IS NULL
Then use the visual studio 2010 schema compare tool against the two instances of the MDSCompare databases to highlight any structural changes which have been made to the entities.
To find data changes use the following script on the MDSCompare databases to create primary keys:
1: USE MDSCompare
2:
3: SELECT 'ALTER TABLE [' + table_schema + '].[' + table_name + '] WITH NOCHECK ADD CONSTRAINT PK_' + table_name+'_Code PRIMARY KEY CLUSTERED (Code)'
4: FROM INFORMATION_SCHEMA.COLUMNS
5: WHERE column_name = 'code'
Then use the visual studio 2010 data compare tool to highlight any data differences.
While using the above methods to do the comparisons I found some differences which I had to ignore. The reason we had to ignore them was because of the following scenario:
While loading data through the batch staging process we had to set the default name attribute of the entities to be an empty string. This is because it would not allow nulls and we did not want to use this attribute. However the development instance has the default name attribute set with nulls. I believe this was because the MDS deployment tool had converted the empty element of the deployment package, which was created because of the empty string, to null while uploading.
Comments