Use Key Vault Secrets in Azure Data Factory

Azure Data Factory is an ETL and orchestration tool for building cloud-native data engineering pipelines. It offers many source connectors, and the list keeps growing. To keep connection details secure, we can store credentials in Azure Key Vault and reference them from Azure Data Factory. This article describes how to access Key Vault secrets from Azure Data Factory linked services.

What is Azure Key Vault?

Azure Key Vault is a cloud service for securely storing and accessing secrets. A secret can be anything we want to protect, such as API keys or credentials. Data is encrypted in transit between the key vault and the client application, which keeps secrets protected while they are being retrieved.

Let’s see it in action

Pre-requisites:
To set up Azure Key Vault for storing Azure Data Factory credentials, we need:
1. An Azure subscription and a resource group containing an Azure Data Factory and an Azure Key Vault.
2. Permission on the key vault to set up access policies.

Follow the steps below to use Azure Key Vault secrets in Azure Data Factory.

1. Create a linked service for the key vault in Azure Data Factory (ADF).

Key vault base URL is https://<keyvaultName>.vault.azure.net
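For readers who prefer to see what the linked service looks like outside the UI, here is a minimal sketch that builds its JSON definition. It assumes the standard "AzureKeyVault" linked service type with a single baseUrl property; the linked service name "AzureKeyVaultLS" and the vault name "my-key-vault" are placeholders, not names from this article.

```python
import json


def key_vault_linked_service(name: str, vault_name: str) -> dict:
    """Build the JSON definition of an Azure Key Vault linked service."""
    return {
        "name": name,
        "properties": {
            "type": "AzureKeyVault",
            "typeProperties": {
                # Base URL pattern: https://<keyvaultName>.vault.azure.net
                "baseUrl": f"https://{vault_name}.vault.azure.net",
            },
        },
    }


if __name__ == "__main__":
    # Placeholder names, used only for illustration.
    definition = key_vault_linked_service("AzureKeyVaultLS", "my-key-vault")
    print(json.dumps(definition, indent=2))
```

The printed JSON should resemble what ADF shows in the code view of the linked service.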

2. Grant the Azure Data Factory service principal access to the Azure key vault, as follows.

a. Navigate to the key vault resource in the Azure portal and click on Access policies.

b. Click on Add Access Policy, then Select principal, and search for the Azure Data Factory resource by its name. Here, the service principal is the managed identity (an application) created for the data factory in Azure Active Directory.

If the managed identity for Azure Data Factory does not exist in Azure Active Directory, you can create it by running the Azure PowerShell cmdlet below.

#Generate managed identity for Data Factory.
Set-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Name $factoryName -Location $location

Select at least the Get permission under Secret permissions (List is commonly granted as well). Data Factory only reads secrets, so Set and the Key permissions are not required for this scenario.
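The same access policy can also be granted from the command line. The sketch below only assembles an `az keyvault set-policy` command and prints it; it does not call Azure. It assumes the vault uses the access policy permission model and grants only Get and List on secrets, which is enough for Data Factory to read a secret. The vault name and object ID are placeholders; the object ID would be that of the data factory's managed identity.

```python
import shlex


def set_policy_command(vault_name: str, principal_object_id: str) -> list[str]:
    """Assemble an Azure CLI command that grants secret read access."""
    return [
        "az", "keyvault", "set-policy",
        "--name", vault_name,
        # Object ID of the data factory's managed identity.
        "--object-id", principal_object_id,
        # Read-only access to secrets.
        "--secret-permissions", "get", "list",
    ]


if __name__ == "__main__":
    # Placeholder values, used only for illustration.
    cmd = set_policy_command("my-key-vault", "00000000-0000-0000-0000-000000000000")
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch safe to run anywhere; copy the output into a shell that is signed in to Azure to apply it.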

3. Once the permissions are configured, the connection test for the key vault linked service should succeed.
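With the connection working, other linked services can reference a secret instead of holding the credential inline. The sketch below builds such a reference using the "AzureKeyVaultSecret" type. The Azure SQL Database linked service, its connection string, and the names "AzureKeyVaultLS", "AzureSqlLS", and "sql-password" are assumptions chosen for illustration; other connectors accept the same secret reference in their own credential fields.

```python
import json


def key_vault_secret(key_vault_linked_service: str, secret_name: str) -> dict:
    """Build a reference to a secret stored in Azure Key Vault."""
    return {
        "type": "AzureKeyVaultSecret",
        "store": {
            # Name of the Key Vault linked service created in step 1.
            "referenceName": key_vault_linked_service,
            "type": "LinkedServiceReference",
        },
        "secretName": secret_name,
    }


def sql_linked_service(name: str, connection_string: str, password: dict) -> dict:
    """Illustrative Azure SQL Database linked service with a Key Vault password."""
    return {
        "name": name,
        "properties": {
            "type": "AzureSqlDatabase",
            "typeProperties": {
                # Connection string without the password.
                "connectionString": connection_string,
                # Password resolved from Key Vault at run time.
                "password": password,
            },
        },
    }


if __name__ == "__main__":
    # All names below are placeholders.
    secret = key_vault_secret("AzureKeyVaultLS", "sql-password")
    conn = "Data Source=tcp:<servername>.database.windows.net,1433;Initial Catalog=<databasename>;User ID=<username>"
    print(json.dumps(sql_linked_service("AzureSqlLS", conn, secret), indent=2))
```

Because the definition holds only the secret name, rotating the password in Key Vault does not require changing the linked service.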

Pro tips:
1. When Git integration is enabled, Data Factory does not store credentials in the Git repository, so a linked service that contains an inline secret has to be published immediately. Storing credentials in Azure Key Vault avoids this immediate publishing of linked services during development.


Pavan Bangad

9+ years of experience in building data warehouses and big data applications.
Helping customers in their digital transformation journey to the cloud.
Passionate about data engineering.