Azure Data Factory is a Microsoft cloud ETL service for serverless data integrations and orchestrations. It supports a lot of source and target connectors, including the Azure SQL database. In this article, I will describe how to connect to Azure SQL from Data Factory using managed Identity authentication.
Pre-requisites:
To implement linked service to Azure SQL database from Azure data factory, you would need:
1. Azure subscription and a resource group with Azure Data Factory and Azure SQL database.
2. Admin permission to grant access to Azure SQL database.
3. Allow Azure services and resources to access this server checkbox is checked on Azure SQL database. This will enable data factory to access Azure SQL database.
What is Managed Identity?
Managed identity is a service principal associated with resources in Azure. When we create a data factory in Azure, it automatically creates an app in Azure Active Directory. Suppose we want to connect Azure SQL using Azure Data Factory. In that case, we need to grant Azure Data Factory app access to the Azure SQL database like any other Azure Active Directory user. This makes access management more secure and easy.
Benefits of using managed identity authentication
1. You don’t need to manage credentials. Credentials aren’t even accessible to you.
2. You can use managed identities to authenticate to any resource that supports Azure AD authentication, including your own applications.
3. Managed identities can be used at no extra cost.
Types of managed identities
There are two types of managed identities:
1. System assigned managed identity – This is the identity that is associated with Azure resources like Azure Data Factory.
2. User assigned managed identity – This identity is created and managed by user in Azure portal. Learn more about it here.
This article will cover how to grant access to Azure data factory’s system assigned managed identity on Azure SQL database.
Enable Managed Identity on Azure Data Factory
System Assigned Managed, identity for Azure Data Factory will be automatically created when we create it through the Azure portal or PowerShell. If you do not find the managed identity of Azure Data Factory, you can create it using the below PowerShell command. You would need the Az PowerShell module installed on the local machine.
#Generate managed identity for Data Factory.
Set-AzDataFactoryV2 -ResourceGroupName $resourceGroupName -Name $factoryName -Location $location
You can also achieve this in the Azure portal. You can go to the Azure Data Factory in the Azure portal and enable the managed identity for the data factory, as shown in the image below. This will create a managed identity for the data factory in Azure Active Directory.
Note: The managed identity of Azure Data Factory must be in the same Azure Entra ID tenant as the Azure SQL database server. If the managed identity is not in the same tenant, you can use a service principal with an access token instead.
Permission to grant Azure Data Factory managed identity access to Azure SQL database
Azure Active Directory users with admin permission on Azure SQL database can only grant access to the managed identity of Azure Data Factory. In case you do not have an Active Directory Admin account set for Azure SQL database, you can do this with the below steps.
- Navigate to Azure SQL database resource in the Azure portal and Click on Azure Active Directory under the Settings section.
- Click on the Set admin button.
- Search for the Azure active directory user or security group to which you want to grant admin access.
- Click Select, and this will select the admin account.
- Click Save to enable the admin.
Grant Azure Data Factory managed identity access to Azure SQL database
Now that we have an Azure Active Directory admin account set for Azure SQL database, we can grant access to Azure Data Factory Managed Identity to Azure SQL database. You can use Azure SQL database client tools like SQL Server Management Studio 18 or Directly from Azure SQL database portal. If you want to grant permission using client tools like SSMS 18, log in to Azure SQL database using Active Active Directory – Universal with MFA and use an Azure SQL admin account.
Create a user in Azure SQL database for Azure Data Factory and grant roles to this newly created user based on your needs.
--Add Azure Data Factory managed identity as an account to Azure SQL database.
CREATE USER [name_of_the_adf] FROM EXTERNAL PROVIDER;
--Grant appropriate role to Azure Data Factory based on your needs.
ALTER ROLE [db_owner] ADD MEMBER [name_of_the_adf];
Create a Linked service to Azure SQL Database in Azure Data Factory
We are all set to connect Azure SQL from Data Factory using managed identity. Follow the below steps to create a linked service in Azure Data Factory for Azure SQL using managed identity authentication.
- Navigate to Azure Data Factory and Click on New Linked Service.
- Search for Azure SQL Database.
- Click Continue.
- On the new linked service page, specify Azure subscription, Azure SQL Server name, and Database name.
- In the Authentication type, select System Assigned Managed Identity.
- Test the connection and click on Create.
Retrieve Azure SQL connection string from key vault.
If you want to use Azure key vault instead of choosing the Azure SQL database details from Azure subscription, you would need to:
1. Create a Azure key vault with a secret containing connection string of the Azure SQL database. Format of the connection string should be like below.
Server=tcp:<yourservername>.database.windows.net,1433;Initial Catalog=<yourdatabasename>;
2. Add access policy in Azure key vault for Azure data factory. Minimum access could be ‘get’, ‘list’ on secrets.
3. Now, on Data Factory linked service to Azure SQL database, you can choose Azure Key Vault instead of Connection string and choose the secret from the key vault as shown in the below image.
You can now use this linked service in ADF activities to perform various operations on the Azure SQL database.
Pro tips:
1. If you use version control (Azure DevOps or git integration) for Azure Data Factory, you need to use a Key vault-based connection string instead of manually specifying Azure SQL database details in the linked service.
2. While creating a user for Azure Data Factory in Azure SQL database, if you get an error ‘Principal ” could not be found or this principal type is not supported.’. This means Azure Data Factory does not have managed identity set. Please refer to this article to fix it.
3. Learn how to manage Azure Data Factory using Azure automation account using managed identity authentication.
See more
Kunal Rathi
With over 13 years of experience in data engineering and analytics, I've assisted countless clients in gaining valuable insights from their data. As a dedicated supporter of Data, Cloud and DevOps, I'm excited to connect with individuals who share my passion for this field. If my work resonates with you, we can talk and collaborate.