Azure Data Factory Pre and Post Deployment for Octopus Deploy
This post covers how I solved a deployment and maintenance problem with Azure Data Factory (ADF). The problem was ensuring that the ADF instance contains only the active datasets, pipelines, dataflows and triggers. As mentioned in a previous post, my current company has tight security controls, which prevent us from linking our ADF instance to our git repository; instead, our deployment method uses Octopus Deploy with a mixture of ARM templates and Terraform.
I use Terraform to create the ADF instance and an ARM template for the ADF objects, such as datasets, pipelines and dataflows. This is what causes the maintenance problem: the ARM template deployment process does not delete objects that no longer exist in the ARM template.
We found this out after making a significant change to our ADF code, when we discovered that many unused objects were still in our ADF instance. I didn’t want to leave the unused objects in place for the following reasons:
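To make the gap concrete, here is a minimal Python sketch of the comparison involved. It is not the script discussed later in this post (that one is PowerShell); the names `short_name`, `template_objects` and `orphaned_objects` are hypothetical, the example template and deployed list are made up, and the sketch assumes resource names in the exported template look like `[concat(parameters('factoryName'), '/ObjectName')]`.

```python
ADF_PREFIX = "Microsoft.DataFactory/factories/"


def short_name(arm_name):
    """Reduce "[concat(parameters('factoryName'), '/LoadSales')]" to "LoadSales"."""
    name = arm_name.rsplit("/", 1)[-1]
    # Drop the closing "')]" left over from the ARM concat() expression, if present.
    if name.endswith("')]"):
        name = name[:-3]
    return name


def template_objects(template):
    """Return the (kind, name) pairs of ADF objects declared in an ARM template."""
    objects = set()
    for resource in template.get("resources", []):
        rtype = resource.get("type", "")
        if rtype.startswith(ADF_PREFIX):
            kind = rtype[len(ADF_PREFIX):]  # e.g. "pipelines", "datasets"
            objects.add((kind, short_name(resource["name"])))
    return objects


def orphaned_objects(deployed, template):
    """Objects present in the factory but absent from the template."""
    return set(deployed) - template_objects(template)


# Made-up example data: the template no longer declares the "OldLoad" pipeline.
template = {
    "resources": [
        {"type": "Microsoft.DataFactory/factories/pipelines",
         "name": "[concat(parameters('factoryName'), '/LoadSales')]"},
        {"type": "Microsoft.DataFactory/factories/datasets",
         "name": "[concat(parameters('factoryName'), '/SalesCsv')]"},
    ]
}
deployed = {
    ("pipelines", "LoadSales"),
    ("pipelines", "OldLoad"),
    ("datasets", "SalesCsv"),
}

print(sorted(orphaned_objects(deployed, template)))
```

Deploying the template leaves `OldLoad` untouched in the factory, so something else has to work out this difference and act on it.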
- It would be hard for other developers to tell which ADF objects our project uses and which it does not.
- We would waste time updating unused ADF objects whenever the underlying data structures changed.
- People would become too scared to remove unused ADF objects, because no one could remember what is in use and no one would want to break the process.
I was also not too fond of the idea of deleting unused objects manually, as it would take a while and I could remove the wrong object by mistake. I looked for a built-in automated method to remove unused objects and didn’t find one, but I did come across this link: Sample pre- & post-deployment script (https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-delivery-sample-script).
The sample script was useful because within Octopus you can create PowerShell Script Modules and use them in your deployment projects for Pre-Deployment, Post-Deployment and Custom Deployment scripts. I made some changes to the sample script so that it works with our ARM template, and converted it into a PowerShell module with two functions:
- Invoke-ADFPreDeploymentStep
- Invoke-ADFPostDeploymentStep
Invoke-ADFPreDeploymentStep helped resolve another problem with deploying changes to our ADF instance: it disables any triggers linked to pipelines in the ARM template. Invoke-ADFPostDeploymentStep is where all the deleting of unused objects happens. I also added an extra flag to control the starting of triggers, so that triggers can be configured to run only in our production environment.
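The trigger handling in the two steps can be sketched roughly as follows. This is a Python illustration of the decision logic only, not the PowerShell module itself: the function names, the `start_triggers` parameter name and the example data are hypothetical, and the sketch assumes that a trigger resource lists its pipelines under `properties.pipelines` and its desired state under `properties.runtimeState`.

```python
TRIGGER_TYPE = "Microsoft.DataFactory/factories/triggers"


def template_triggers(template):
    """Map trigger name -> properties for triggers linked to at least one pipeline."""
    found = {}
    for resource in template.get("resources", []):
        props = resource.get("properties", {})
        if resource.get("type") == TRIGGER_TYPE and props.get("pipelines"):
            name = resource["name"].rsplit("/", 1)[-1]
            if name.endswith("')]"):  # tail of an ARM concat() expression
                name = name[:-3]
            found[name] = props
    return found


def pre_deployment_stops(template, started_in_factory):
    """Pre-deployment: triggers in the template that are currently running."""
    return sorted(set(template_triggers(template)) & set(started_in_factory))


def post_deployment_starts(template, start_triggers):
    """Post-deployment: triggers to start, or none when the flag is off."""
    if not start_triggers:
        return []
    return sorted(
        name
        for name, props in template_triggers(template).items()
        if props.get("runtimeState") == "Started"
    )


# Made-up example: one scheduled trigger, one stopped trigger, one with no pipelines.
example_template = {
    "resources": [
        {"type": TRIGGER_TYPE,
         "name": "[concat(parameters('factoryName'), '/Nightly')]",
         "properties": {"runtimeState": "Started",
                        "pipelines": [{"pipelineReference": {"referenceName": "LoadSales"}}]}},
        {"type": TRIGGER_TYPE,
         "name": "[concat(parameters('factoryName'), '/AdHoc')]",
         "properties": {"runtimeState": "Stopped",
                        "pipelines": [{"pipelineReference": {"referenceName": "LoadSales"}}]}},
        {"type": TRIGGER_TYPE,
         "name": "[concat(parameters('factoryName'), '/Unlinked')]",
         "properties": {"runtimeState": "Started", "pipelines": []}},
    ]
}
running = {"Nightly", "Legacy"}

print(pre_deployment_stops(example_template, running))
print(post_deployment_starts(example_template, start_triggers=True))
print(post_deployment_starts(example_template, start_triggers=False))
```

Passing the flag as false in non-production environments means the deployment still updates the trigger definitions there, but nothing is left running afterwards.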
A copy of this script can be downloaded from this location: DeployDataFactory.ps1.
There is also a test script, which you can use to check whether the script works with your ARM template, at this location: Testing-DataFactory.ps1