Shared Integration Runtime in Azure Data Factory

Azure Data Factory is an ETL and orchestration tool for building cloud-native data engineering pipelines. It offers a large number of source connectors, and the list keeps growing. Azure Data Factory uses integration runtimes as the compute infrastructure that executes data movement and data transformation activities. This article describes how to set up a shared self-hosted integration runtime in Azure Data Factory.

What is Integration Runtime?

An integration runtime (IR) in Azure Data Factory provides the compute infrastructure on which pipelines run. It is also instrumental in securing data movement in the cloud. The official Azure Data Factory documentation covers it in more detail.

There are three types of integration runtime:

  1. Azure: The default runtime, fully managed by Azure and suited to connecting to Azure resources. Because Azure manages it, data movement goes over the public network by default.
  2. Self-hosted: This IR can connect to on-premises sources and secure data movement between systems. It supports data movement over a private network.
  3. Azure-SSIS: This IR runs SSIS packages in Azure Data Factory.

Let’s see it in Action

This section describes how to set up a shared self-hosted integration runtime in Azure Data Factory.

Prerequisites:
1. An Azure subscription and a resource group containing an Azure Data Factory.
2. A self-hosted integration runtime already set up in that Azure Data Factory.

1. Select the integration runtime that needs to be shared.

In the source Data Factory, select the self-hosted integration runtime that you want to share with other Data Factories and open it for editing.


Copy the Resource ID and save it. We will use it in a later step.

Click the "Grant permission to another Data Factory or user-assigned managed identity" button.
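Before pasting the copied Resource ID anywhere else, it can help to check that it really points at an integration runtime. The sketch below is a minimal example that assumes the usual Azure Resource Manager layout for a Data Factory integration runtime ID; the function name parse_ir_resource_id and the sample values in the usage note are hypothetical.

```python
import re

# Assumed layout of an integration runtime Resource ID (standard ARM path):
# /subscriptions/<id>/resourceGroups/<rg>/providers/Microsoft.DataFactory
#     /factories/<factory>/integrationruntimes/<runtime>
_IR_ID = re.compile(
    r"^/subscriptions/(?P<subscription>[^/]+)"
    r"/resourceGroups/(?P<resource_group>[^/]+)"
    r"/providers/Microsoft\.DataFactory"
    r"/factories/(?P<factory>[^/]+)"
    r"/integrationruntimes/(?P<runtime>[^/]+)$",
    re.IGNORECASE,  # the portal does not always use the same casing
)


def parse_ir_resource_id(resource_id: str) -> dict:
    """Split an IR Resource ID into its parts, or raise ValueError."""
    match = _IR_ID.match(resource_id.strip())
    if match is None:
        raise ValueError(f"not an integration runtime Resource ID: {resource_id!r}")
    return match.groupdict()
```

For example, parse_ir_resource_id("/subscriptions/0000/resourceGroups/rg-data/providers/Microsoft.DataFactory/factories/adf-source/integrationruntimes/shared-ir") would return the subscription, resource group, factory, and runtime names as a dictionary.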

2. Grant permission to the target Azure Data Factory.

Grant access to the target Azure Data Factory, which will use the shared integration runtime.
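The grant in step 2 can also be scripted. The article does not say which role the portal assigns. The sketch below assumes the Contributor role, scoped to the integration runtime itself and assigned to the target Data Factory's managed identity, which is what Microsoft's PowerShell walkthrough for shared runtimes uses. It only builds the Azure CLI argument list and does not run anything; the function name and parameter names are hypothetical.

```python
from typing import List


def role_assignment_command(ir_resource_id: str, target_principal_id: str) -> List[str]:
    """Build (but do not run) an 'az role assignment create' command.

    ir_resource_id:      Resource ID copied from the shared IR in step 1.
    target_principal_id: object ID of the target Data Factory's managed identity.
    """
    return [
        "az", "role", "assignment", "create",
        "--assignee-object-id", target_principal_id,
        "--assignee-principal-type", "ServicePrincipal",
        "--role", "Contributor",    # assumption: role is not stated in the article
        "--scope", ir_resource_id,  # scope is the IR itself, not the whole factory
    ]
```

The returned list can be passed to subprocess.run once the Azure CLI is installed and logged in. Creating the assignment needs the Microsoft.Authorization/roleAssignments/write permission.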

3. Go to the target Data Factory and add an integration runtime.

Click New and select "Azure, Self-Hosted".

Under External Resources, select Linked Self-Hosted and click Continue.

You will be asked for a few details about the shared IR.

Name: enter a name for the new IR.

Resource ID: paste the Resource ID saved in step 1, then click Create.

And we are done!

The same IR is now used by two different Azure Data Factory instances.
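For readers who keep Data Factory resources in source control, the linked runtime from step 3 can be pictured as a small JSON definition. The sketch below is an assumption about that shape, based on the ARM schema for linked self-hosted runtimes (a linkedInfo block with a resourceId and an RBAC authorization type). The exact casing of the authorizationType value may differ in what the portal generates, so compare it with the JSON view of your own linked IR. The function name is hypothetical.

```python
import json


def linked_ir_definition(name: str, shared_ir_resource_id: str) -> dict:
    """Return an assumed JSON-style definition of a linked self-hosted IR."""
    return {
        "name": name,
        "properties": {
            "type": "SelfHosted",
            "typeProperties": {
                "linkedInfo": {
                    # Resource ID of the shared IR in the source Data Factory
                    "resourceId": shared_ir_resource_id,
                    # Assumed value; access is granted via the role assignment
                    "authorizationType": "RBAC",
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values, for illustration only
    definition = linked_ir_definition("linked-shared-ir", "<resource ID from step 1>")
    print(json.dumps(definition, indent=2))
```

The two inputs are the same Name and Resource ID fields that the portal dialog asks for.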

Pro tips:
1. Make sure you have the Microsoft.Authorization/roleAssignments/write permission on the subscription. Without it, the grant in step 2 will fail.
2. A self-hosted IR requires a virtual machine with the integration runtime software installed, so you pay for that virtual machine. By sharing the IR, multiple Data Factory instances can use the same virtual machine instead of each paying for its own.


Pavan Bangad

9+ years of experience building data warehouses and big data applications. Helping customers on their digital transformation journey to the cloud. Passionate about data engineering.