<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Databricks Archives - AzureOps</title>
	<atom:link href="https://azureops.org/articles/category/azure/databricks/feed/" rel="self" type="application/rss+xml" />
	<link>https://azureops.org/articles/category/azure/databricks/</link>
	<description>Notable things about Cloud, Data and DevOps.</description>
	<lastBuildDate>Sun, 12 Oct 2025 08:16:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	

<image>
	<url>https://i0.wp.com/azureops.org/wp-content/uploads/2021/04/cropped-android-chrome-512x512-1.png?fit=32%2C32&#038;ssl=1</url>
	<title>Databricks Archives - AzureOps</title>
	<link>https://azureops.org/articles/category/azure/databricks/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">190208641</site>	<item>
		<title>Automate Databricks Infrastructure as Code with Terraform</title>
		<link>https://azureops.org/articles/automate-databricks-infrastructure-as-code-with-terraform/</link>
		
		<dc:creator><![CDATA[James Sandy]]></dc:creator>
		<pubDate>Fri, 04 Apr 2025 18:47:31 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Databricks]]></category>
		<category><![CDATA[DevOps]]></category>
		<category><![CDATA[Terraform]]></category>
		<category><![CDATA[Databricks cicd]]></category>
		<category><![CDATA[IAAC]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=8414</guid>

					<description><![CDATA[<p>This article describes how to implement Infrastructure as Code for Databricks with Terraform.</p>
<p>The post <a href="https://azureops.org/articles/automate-databricks-infrastructure-as-code-with-terraform/">Automate Databricks Infrastructure as Code with Terraform</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="">Terraform can be used to provision, manage, and scale Databricks environments; it allows <a href="https://learn.microsoft.com/en-us/devops/deliver/what-is-infrastructure-as-code">infrastructure to be defined as code (IaC)</a>, enabling version control, reproducibility, and automation. Using Terraform to deploy a Databricks workspace streamlines resource provisioning, access management, and workspace configuration, ensuring consistency across environments and simplifying scaling and modifications. This article describes how to automate Databricks infrastructure with Terraform.</p>



<h2 class="wp-block-heading"><a></a>Terraform Environment Setup</h2>



<p class="">Databricks is a cloud-based platform built on Apache Spark, used in machine learning, big data analytics, and data engineering to enable AI model deployment, fast data processing, and collaboration. Its scalable Delta Lake infrastructure supports efficient, secure data management.</p>



<p class="">To manage Databricks infrastructure with Terraform, a few pieces must be installed and configured; the first step is installing Terraform and the Databricks provider. Download Terraform from the <a href="https://developer.hashicorp.com/terraform/downloads">official website</a>, ensure it is available in the system&#8217;s path, and then initialize a Terraform working directory by creating a new directory and running:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
terraform init
</pre></div>
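
<p class="">Before running init, the working directory needs a provider declaration so Terraform knows what to download. A minimal sketch (the version constraint shown is an assumption):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
terraform {
 required_providers {
   databricks = {
     source  = &quot;databricks/databricks&quot;
     version = &quot;~&gt; 1.0&quot;
   }
 }
}
</pre></div>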


<p class="">Authentication in Databricks is configured using credentials from the underlying cloud provider. For Azure, a service principal is used to set up authentication:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [2]; title: ; notranslate">
provider &quot;databricks&quot; {
 host  = &quot;https://&lt;your-databricks-instance&gt;&quot;
 azure_client_id     = var.azure_client_id
 azure_client_secret = var.azure_client_secret
 azure_tenant_id     = var.azure_tenant_id
}
</pre></div>


<p class="">An access key or a service account will be required for AWS or GCP, respectively.</p>
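
<p class="">Whichever cloud is used, credentials should be passed in as sensitive input variables rather than hard-coded. A minimal sketch (the variable name mirrors the Azure provider configuration shown earlier):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
variable &quot;azure_client_secret&quot; {
 type      = string
 sensitive = true   # keeps the value out of plan output
}
</pre></div>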



<h2 class="wp-block-heading"><a></a>Databricks Infrastructure with Terraform Setup</h2>



<p class="">Terraform configurations are written in HCL (HashiCorp Configuration Language). The primary configuration file, main.tf, declares a Databricks workspace:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1,2,3,4]; title: ; notranslate">
resource &quot;azurerm_databricks_workspace&quot; &quot;example&quot; {
 name                = &quot;example-workspace&quot;
 resource_group_name = &quot;example-rg&quot;
 location            = &quot;East US&quot;
 sku                 = &quot;standard&quot;
}
</pre></div>


<p class="">Configuring clusters within a workspace requires a few key settings, such as the Spark runtime version, node type, and autoscaling:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1,2]; title: ; notranslate">
resource &quot;databricks_cluster&quot; &quot;example&quot; {
 cluster_name            = &quot;example-cluster&quot;
 spark_version           = &quot;12.0.x-scala2.12&quot;
 node_type_id            = &quot;Standard_D3_v2&quot;
 autotermination_minutes = 20
 autoscale {
   min_workers = 2
   max_workers = 8
 }
}
</pre></div>


<h2 class="wp-block-heading"><a></a>Access Control and Permissions</h2>



<p class="">Terraform can manage user roles and access control lists via Databricks permissions, controlling user and group access:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1,2,5,6,7]; title: ; notranslate">
resource &quot;databricks_group&quot; &quot;data_scientists&quot; {
 display_name = &quot;Data Scientists&quot;
}

resource &quot;databricks_user&quot; &quot;analyst&quot; {
 user_name = &quot;analyst@example.com&quot;
}

resource &quot;databricks_group_member&quot; &quot;analyst_membership&quot; {
 group_id  = databricks_group.data_scientists.id
 member_id = databricks_user.analyst.id
}
</pre></div>


<p class="">Integrating Terraform with a cloud IAM system such as <a href="https://www.microsoft.com/en-us/security/business/identity-access/microsoft-entra-id">Microsoft Entra ID</a> ensures authentication and access policies are in line with enterprise governance best practices.</p>






<h2 class="wp-block-heading"><a></a>Deploying Databricks Jobs with Terraform</h2>



<p class="">Databricks jobs are used to automate batch and streaming workloads; the databricks_job resource defines the job execution parameters:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1,2,9]; title: ; notranslate">
resource &quot;databricks_job&quot; &quot;example_job&quot; {
 name = &quot;Example Job&quot;
 new_cluster {
   spark_version = &quot;12.0.x-scala2.12&quot;
   node_type_id  = &quot;Standard_D3_v2&quot;
   num_workers   = 4
 }
 notebook_task {
   notebook_path = &quot;/Shared/example_notebook&quot;
 }
}
</pre></div>


<p class="">Scheduling jobs and dependencies are managed within Terraform, allowing seamless automation of recurring tasks.</p>
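
<p class="">For illustration, a cron-based schedule can be attached to a job definition as sketched below (the cron expression and timezone are assumptions):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
resource &quot;databricks_job&quot; &quot;nightly&quot; {
 name = &quot;Nightly Example Job&quot;
 schedule {
   quartz_cron_expression = &quot;0 0 2 * * ?&quot;   # run daily at 02:00
   timezone_id            = &quot;UTC&quot;
 }
}
</pre></div>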



<h2 class="wp-block-heading"><a></a>Versioning with Terraform</h2>



<p class="">Terraform uses a state file to track all deployed resources; to enable collaboration and prevent conflicts, remote state storage must be configured.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [3,4,5,6]; title: ; notranslate">
terraform {
 backend &quot;azurerm&quot; {
   resource_group_name  = &quot;terraform-state-rg&quot;
   storage_account_name = &quot;tfstateaccount&quot;
   container_name       = &quot;tfstate&quot;
   key                  = &quot;databricks.tfstate&quot;
 }
}
</pre></div>


<p class="">This ensures infrastructure changes are reviewed with <em>terraform plan</em> before being rolled out with <em>terraform apply</em>, preventing unintended changes.</p>
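
<p class="">In practice, a safe workflow is to save the reviewed plan and then apply exactly that plan; a sketch:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
terraform plan -out=tfplan   # review the proposed changes
terraform apply tfplan       # apply exactly what was reviewed
</pre></div>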



<h2 class="wp-block-heading"><a></a>Automating Deployment with CI/CD</h2>



<p class="">To keep infrastructure consistent, Terraform should be integrated with CI/CD; a GitHub Actions pipeline can be used for automated deployments:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
name: Terraform Deployment
on: &#x5B;push]
jobs:
 deploy:
   runs-on: ubuntu-latest
   steps:
     - uses: actions/checkout@v3
     - name: Setup Terraform
       uses: hashicorp/setup-terraform@v1
     - name: Terraform Init
       run: terraform init
     - name: Terraform Apply
       run: terraform apply -auto-approve
</pre></div>


<p class="">This pipeline can also be configured in GitLab CI/CD or Azure DevOps for streamlined deployments.</p>
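
<p class="">As a comparison point, a minimal Azure DevOps pipeline sketch for the same flow (the pool image and step layout are assumptions):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
trigger: &#x5B; main ]
pool:
 vmImage: ubuntu-latest
steps:
 - script: |
     terraform init
     terraform apply -auto-approve
   displayName: Terraform Deploy
</pre></div>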



<h2 class="wp-block-heading"><a></a>Monitoring and Scaling Databricks Infrastructure</h2>



<p class="">Databricks environments require monitoring for performance and cost efficiency; Terraform can be used to configure monitoring solutions:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1,2]; title: ; notranslate">
resource &quot;databricks_instance_profile&quot; &quot;monitoring&quot; {
 instance_profile_arn = &quot;arn:aws:iam::123456789012:instance-profile/DatabricksMonitor&quot;
}
</pre></div>


<p class="">Autoscaling is enabled on the clusters to optimize resource utilization without over-provisioning:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
autoscale {
 min_workers = 4
 max_workers = 16
}
</pre></div>


<h2 class="wp-block-heading"><a></a>Errors and Debugging in Terraform Deployments</h2>



<p class="">Some common Terraform errors include authentication failures and state mismatches. Provider authentication issues can be resolved by verifying permissions and credentials. State mismatches often require manual state adjustments:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [1]; title: ; notranslate">
terraform state rm databricks_cluster.example
terraform apply
</pre></div>


<p class="">Quota restrictions and API rate limits can be handled by adjusting Terraform configurations to include retries and back-off mechanisms.</p>
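
<p class="">As one hedged example, the Databricks Terraform provider exposes a client-side <em>rate_limit</em> setting that can reduce throttling (the value shown is an assumption):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
provider &quot;databricks&quot; {
 host       = &quot;https://&lt;your-databricks-instance&gt;&quot;
 rate_limit = 10   # cap API requests per second issued by Terraform
}
</pre></div>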



<h2 class="wp-block-heading"><a></a>Optimizing Databricks Deployments</h2>



<p class="">Additional resources, such as <a href="https://www.databricks.com/product/unity-catalog">Unity Catalog</a> for data governance, can also be managed with Terraform:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; highlight: [2]; title: ; notranslate">
resource &quot;databricks_metastore&quot; &quot;unity&quot; {
 name = &quot;example-metastore&quot;
}
</pre></div>


<p class=""><a href="https://mlflow.org/">MLflow</a> tracking servers can also be set up to manage ML models and notebooks through Git integration, ensuring streamlined workflows.</p>
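
<p class="">For the Git integration piece, a Databricks Repo itself can be declared in Terraform; a hedged sketch (the URL and path are placeholders):</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: yaml; gutter: false; title: ; notranslate">
resource &quot;databricks_repo&quot; &quot;project&quot; {
 url  = &quot;https://github.com/example-org/example-repo.git&quot;
 path = &quot;/Repos/terraform/example-repo&quot;
}
</pre></div>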



<p class="">This tutorial has shown how to automate Databricks infrastructure with Terraform, covering infrastructure provisioning, access control, job automation, CI/CD integration, and monitoring; further optimizations such as cost management, security best practices, and additional Databricks features can be layered on top.</p>



<p class="has-background" style="background-color:#beefca"><strong>Pro tips:</strong><br>1. Follow this <a href="https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/" target="_blank" rel="noreferrer noopener">guide</a> to learn how to manage secret scopes in Databricks.</p>



<h2 class="wp-block-heading">See more</h2>



<iframe width="700" height="394" src="https://www.youtube.com/embed/t2h6xNVFQkc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>



<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="is-style-fill wp-block-button"><a class="wp-block-button__link has-white-color has-blush-light-purple-gradient-background has-text-color has-background has-link-color wp-element-button" href="https://azureops.org/product/ssis-catalog-migration-wizard-pro/" target="_blank" rel="noreferrer noopener">Download Now</a></div>
</div>



<p class=""></p>
<p>The post <a href="https://azureops.org/articles/automate-databricks-infrastructure-as-code-with-terraform/">Automate Databricks Infrastructure as Code with Terraform</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">8414</post-id>	</item>
		<item>
		<title>Databricks VACUUM Command</title>
		<link>https://azureops.org/articles/databricks-vacuum-command/</link>
		
		<dc:creator><![CDATA[Kunal Rathi]]></dc:creator>
		<pubDate>Tue, 17 Sep 2024 00:22:00 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Databricks]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=2860</guid>

					<description><![CDATA[<p>Databricks is a unified big data processing and analytics cloud platform for transforming and processing vast volumes of data. Apache Spark is the building block of Databricks, an in-memory analytics engine for big data and machine learning. In this article, we will see how to use the Databricks VACUUM command to remove unused files from [&#8230;]</p>
<p>The post <a href="https://azureops.org/articles/databricks-vacuum-command/">Databricks VACUUM Command</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="">Databricks is a unified big data processing and analytics cloud platform for transforming and processing vast volumes of data. Apache Spark is the building block of Databricks, an in-memory analytics engine for big data and machine learning. In this article, we will see how to use the Databricks VACUUM command to remove unused files from the delta table.</p>



<h2 class="wp-block-heading">What is VACUUM in the Delta table?</h2>



<p class="">VACUUM removes files from the table directory that are not managed by Delta, as well as data files that are no longer referenced by the latest state of the table&#8217;s transaction log and are older than a retention threshold.</p>
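
<p class="">In its simplest SQL form it looks like this (the table name is a placeholder; the default retention threshold is 7 days):</p>



<pre class="wp-block-preformatted">VACUUM my_schema.my_table RETAIN 168 HOURS
-- preview which files would be deleted without removing anything:
VACUUM my_schema.my_table DRY RUN</pre>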



<h2 class="wp-block-heading">How to use Databricks VACUUM on Databricks Delta tables</h2>



<p class="">The following PySpark snippet finds the matching databases, then for each table sets shorter retention properties, runs VACUUM, and recomputes table statistics:</p>



<pre class="wp-block-preformatted">database_names_filter = "20_silver_zendesk_eg"
dbs = spark.sql(f"SHOW DATABASES LIKE '{database_names_filter}'").select("databaseName").collect()
dbs = [row.databaseName for row in dbs]
for database_name in dbs:
    print(f"Found database: {database_name}, performing actions on all its tables..")
    tables = spark.sql(f"SHOW TABLES FROM {database_name}").select("tableName").collect()
    tables = [row.tableName for row in tables]
    for table_name in tables:
        print(f"Performing VACUUM on {table_name}")
        # shorten retention windows so VACUUM can reclaim more files
        spark.sql(f"ALTER TABLE {database_name}.{table_name} SET TBLPROPERTIES ('delta.logRetentionDuration'='interval 2 days', 'delta.deletedFileRetentionDuration'='interval 1 days')")
        spark.sql(f"VACUUM {database_name}.{table_name}")
        spark.sql(f"ANALYZE TABLE {database_name}.{table_name} COMPUTE STATISTICS")</pre>



<p class="">If you run&nbsp;<code>VACUUM</code>&nbsp;on a Delta table, you lose the ability to&nbsp;<a href="https://learn.microsoft.com/en-us/azure/databricks/delta/history">time-travel</a>&nbsp;back to a version older than the specified data retention period.</p>
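
<p class="">Before vacuuming, you can inspect the table history to see which versions will become unreachable (the table name is a placeholder):</p>



<pre class="wp-block-preformatted">DESCRIBE HISTORY my_schema.my_table
-- time travel fails once the underlying files have been vacuumed away:
SELECT * FROM my_schema.my_table VERSION AS OF 5</pre>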



<p>The post <a href="https://azureops.org/articles/databricks-vacuum-command/">Databricks VACUUM Command</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2860</post-id>	</item>
		<item>
		<title>Kafka Streaming vs Spark Streaming</title>
		<link>https://azureops.org/articles/kafka-streaming-vs-spark-streaming/</link>
		
		<dc:creator><![CDATA[Kunal Rathi]]></dc:creator>
		<pubDate>Wed, 15 Feb 2023 18:39:39 +0000</pubDate>
				<category><![CDATA[Confluent]]></category>
		<category><![CDATA[Databricks]]></category>
		<category><![CDATA[Apache Kafka]]></category>
		<category><![CDATA[Apache Spark]]></category>
		<category><![CDATA[kafka vs spark]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=4988</guid>

					<description><![CDATA[<p>Kafka Streams and Spark Streams are potent tools for real-time processing, Here are the key differences Kafka Streaming vs Spark Streaming.</p>
<p>The post <a href="https://azureops.org/articles/kafka-streaming-vs-spark-streaming/">Kafka Streaming vs Spark Streaming</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Kafka Streams and Spark Streaming are distributed computing frameworks that allow the processing of real-time data streams. In this article, you will see some differences between Kafka Streaming vs. Spark Streaming.</p>



<h2 class="wp-block-heading" id="h-what-is-data-streaming">What is Data Streaming?</h2>



<p>Data streaming is a method in which input is produced continuously and transformations are applied as the data arrives; the output is likewise emitted as a continuous data stream. This is often described as setting data in motion.</p>
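
<p>As a toy illustration in plain Python (not Kafka or Spark), a stream can be modeled as a generator whose records are transformed one at a time as they arrive:</p>



<pre class="wp-block-preformatted">def sensor_stream():
    # simulate an unbounded source that yields one reading at a time
    for reading in [21.5, 22.0, 23.7, 19.8]:
        yield {"temp_c": reading}

def to_fahrenheit(stream):
    # transform each record as it arrives -- the data stays "in motion"
    for record in stream:
        yield {"temp_f": record["temp_c"] * 9 / 5 + 32}

results = [r["temp_f"] for r in to_fahrenheit(sensor_stream())]
print(results)</pre>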



<h2 class="wp-block-heading">What is Kafka Stream?</h2>



<p>Kafka Streams is a library for building streaming applications that transform input Kafka topics into output Kafka topics. Kafka Streams (KStreams) internally uses the Kafka producer and consumer libraries. It is tightly coupled with Kafka, and the API lets you leverage Kafka&#8217;s capabilities, achieving data parallelism, fault tolerance, low latency, and much more.</p>



<h2 class="wp-block-heading">What is Spark Stream?</h2>



<p>Spark Streaming is an extension of the core Spark API that provides scalable, high-throughput, fault-tolerant stream processing of live data streams. It allows real-time data processing from various sources such as Kafka topics, Flume, and Amazon Kinesis. The processed data can be written to file systems, databases, live dashboards, and more.</p>



<p>The sections below describe the differences between the streaming components of Kafka and Spark.</p>



<h2 class="wp-block-heading">Key difference between Kafka streaming and Spark streaming</h2>



<figure class="wp-block-table is-style-stripes"><table class="has-fixed-layout"><thead><tr><th></th><th>Kafka Streaming</th><th>Spark Streaming</th></tr></thead><tbody><tr><td>Technology stack</td><td>Kafka Streams is a Java library built on Apache Kafka, a distributed messaging system for real-time data streams.</td><td>Spark Streaming is a part of the Apache Spark ecosystem, a general-purpose big data processing engine.</td></tr><tr><td>Initial release</td><td>2016</td><td>2013</td></tr><tr><td>Processing model</td><td>Kafka Streams is a stream processing library that processes data records/events one at a time as they arrive in a stream.&nbsp;The processing logic assumes an independent record and some contextual/state information about the record. That limits the type of algorithms/computations you can implement in real-time. </td><td>Spark Streaming uses a micro-batch processing model, which simultaneously processes small batches of data records collected over time. The processing logic assumes you have all the related records available in the batch, allowing you to implement a wide range of algorithms/computations. 
</td></tr><tr><td>Fault tolerance</td><td>Kafka Streams leverages the built-in fault tolerance features of Kafka</td><td>Spark Streaming uses RDD (Resilient Distributed Datasets) to achieve fault tolerance.</td></tr><tr><td>Ease of use</td><td>Kafka Streams is known for its ease of use, as it has a simple and lightweight API designed to be developer-friendly.</td><td>Spark Streaming can be more complex to set up and configure, but it offers more features and tools for data processing and analysis.</td></tr><tr><td>Data sources and destinations</td><td>Can handle data from Kafka topics</td><td>Can handle data from Kafka topics and other sources like HDFS, AWS S3, data lakes, etc.</td></tr><tr><td>Integration</td><td>Kafka Streams is designed to work specifically with Kafka and requires a Kafka cluster to be set up.</td><td>It can run on various platforms, including Hadoop, Kubernetes, and Apache Mesos.</td></tr><tr><td>Managed cloud providers</td><td><a href="https://www.confluent.io/" target="_blank" rel="noreferrer noopener">Confluent</a>, AWS MSK, Azure Event Hub, GCP Pub/Sub, etc.</td><td><a href="https://www.databricks.com/" target="_blank" rel="noreferrer noopener">DataBricks</a>, AWS EMR, Azure HDInsight, GCP Dataproc, etc.</td></tr><tr><td>No-Code Low-Code API</td><td>kSQL</td><td>Spark SQL</td></tr><tr><td>When to go for</td><td>If your streaming application requires low latency processing of data from Kafka topics and you don&#8217;t need to process data from other sources, </td><td>If you need to process data from multiple sources or require a larger ecosystem and latency is not critical for your application. 
</td></tr><tr><td><br>Real-world examples</td><td><strong>Airbnb</strong>: Airbnb uses Kafka Streams to process and analyze real-time data from their website, mobile applications, and other platforms to provide personalized recommendations to their users, optimize their operations, and detect fraudulent activities.<br><strong>Goldman Sachs</strong>: Goldman Sachs uses Kafka Streams to process and analyze real-time financial data from different sources to monitor their trading activities, detect anomalies, and optimize their trading strategies.</td><td><strong>Uber</strong>: Uber uses Spark Streaming to process real-time data from their ride-hailing platform to monitor and improve the quality of their service, detect fraudulent activities, and optimize their operations.<br><strong>Netflix</strong>: Netflix uses Spark Streaming to analyze real-time customer data, monitor their streaming service, and perform real-time personalization to recommend personalized content to users.</td></tr></tbody></table><figcaption class="wp-element-caption">Kafka Streaming vs Spark Streaming</figcaption></figure>



<h2 class="wp-block-heading">Summary</h2>



<p>Kafka Streams and Spark Streaming are potent tools for real-time data processing, but they have different strengths and weaknesses depending on the specific use case and requirements. All the above differences are based on my experiences and research and may not be accurate.</p>



<p>The post <a href="https://azureops.org/articles/kafka-streaming-vs-spark-streaming/">Kafka Streaming vs Spark Streaming</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">4988</post-id>	</item>
		<item>
		<title>Databricks Certified Data Engineer Associate</title>
		<link>https://azureops.org/articles/databricks-certified-data-engineer-associate/</link>
		
		<dc:creator><![CDATA[Kunal Rathi]]></dc:creator>
		<pubDate>Sun, 13 Nov 2022 22:50:47 +0000</pubDate>
				<category><![CDATA[Certification]]></category>
		<category><![CDATA[Databricks]]></category>
		<category><![CDATA[Certifications]]></category>
		<category><![CDATA[Databricks certification]]></category>
		<category><![CDATA[Databricks certified data engineer associate exam questions]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=4124</guid>

					<description><![CDATA[<p>In this post, We have documented how to conquer the Databricks Certified Data Engineer Associate certification in this article. Databricks has introduced the Data Engineering Associate exam which consists of 45 multiple-choice questions. The time slot allocated is 1.30 hr. The passing score is 32 which is 70%. High-level topics with their weightage in the exam are:</p>
<p>The post <a href="https://azureops.org/articles/databricks-certified-data-engineer-associate/">Databricks Certified Data Engineer Associate</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="">Databricks is a unified big data processing and analytics cloud platform that transforms and processes enormous volumes of data. Apache Spark is the building block of Databricks, an in-memory analytics engine for big data and machine learning. This article documents how to conquer the Databricks Certified Data Engineer Associate and the Microsoft <a href="https://www.edureka.co/microsoft-azure-data-engineering-certification-course" target="_blank" rel="noreferrer noopener sponsored nofollow">Azure data engineer certification</a> (DP-203).<br>Databricks has introduced the Data Engineer Associate exam, which consists of 45 multiple-choice questions. The allocated time is 90 minutes, and the passing score is 32 correct answers (about 70%). High-level topics and their weightage in the exam are:</p>



<figure class="wp-block-image size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" width="723" height="423" data-attachment-id="4153" data-permalink="https://azureops.org/articles/databricks-certified-data-engineer-associate/image-1-5/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?fit=723%2C423&amp;ssl=1" data-orig-size="723,423" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image-1" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?fit=300%2C176&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?fit=723%2C423&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?resize=723%2C423&#038;ssl=1" alt="Databricks Certified Data Engineer Associate exam topics with weightages" class="wp-image-4153" style="width:550px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?w=723&amp;ssl=1 723w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?resize=450%2C263&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?resize=600%2C351&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image-1.png?resize=300%2C176&amp;ssl=1 300w" sizes="auto, (max-width: 723px) 100vw, 723px" /></figure>



<h2 class="wp-block-heading">Learning Pathway</h2>



<figure class="wp-block-image size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" width="846" height="97" data-attachment-id="4131" data-permalink="https://azureops.org/articles/databricks-certified-data-engineer-associate/image-12/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?fit=846%2C97&amp;ssl=1" data-orig-size="846,97" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="image" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?fit=300%2C34&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?fit=846%2C97&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?resize=846%2C97&#038;ssl=1" alt="databricks certified data engineer associate certification roadmap." class="wp-image-4131" style="width:635px;height:73px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?w=846&amp;ssl=1 846w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?resize=450%2C52&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?resize=600%2C69&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?resize=300%2C34&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/11/image.png?resize=768%2C88&amp;ssl=1 768w" sizes="auto, (max-width: 846px) 100vw, 846px" /><figcaption class="wp-element-caption">Source Databricks</figcaption></figure>



<p class="has-pale-cyan-blue-background-color has-background"><strong>Before you begin</strong><br>1. You should be familiar with Databricks, data engineering concepts, Python, and SQL. <br>2. Databricks offers three certifications in the data engineering space. Before attempting the Data Engineer Associate certification, it is advisable to learn Databricks Lakehouse Fundamentals and pass the Databricks Lakehouse Fundamentals accreditation.<br>3. Exam and certification details are available on the Databricks website.<br>4. Databricks offers free exam vouchers to those who attend a three-day Data Lake seminar organized by Databricks. Watch for the next event <a href="https://www.databricks.com/events" target="_blank" rel="noreferrer noopener">here</a>. </p>



<h2 class="wp-block-heading">Prepare for the Exam</h2>



<ol class="wp-block-list">
<li class=""><a href="https://partner-academy.databricks.com/learn">Sign in</a> to the Databricks Partner Academy site with your work email ID for self-paced videos or a paid instructor-led course.</li>



<li class="">Choose&nbsp;<a href="https://partner-academy.databricks.com/learn/lp/10/data-engineer-learning-plan" target="_blank" rel="noreferrer noopener">this</a> course for this Databricks Certified Data Engineer Associate certification exam.</li>
</ol>



<h3 class="wp-block-heading">Study these items in detail:</h3>



<ul class="wp-block-list">
<li class="">Databricks Lakehouse Platform <br><a href="https://www.youtube.com/watch?v=wNo-pfWAzQk" target="_blank" rel="noreferrer noopener nofollow">What is Lakehouse</a></li>



<li class="">ELT with Spark SQL and Python<br>Learn to perform Extract Load and Transform on data using Spark SQL and Python in Databricks.<br><a href="https://www.youtube.com/watch?v=Ia6fDlhlKXQ" target="_blank" rel="noreferrer noopener nofollow">Learn ELT with spark and python</a></li>



<li class=""><a href="https://www.databricks.com/session/incremental-processing-on-large-analytical-datasets" target="_blank" rel="noreferrer noopener nofollow">Incremental Data Processing </a><br><a href="https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html" target="_blank" rel="noreferrer noopener nofollow">Structured Streaming</a><br><a href="https://www.youtube.com/watch?v=8a38Fv9cpd8" target="_blank" rel="noreferrer noopener nofollow">Auto Loader</a><br><a href="https://www.databricks.com/discover/pages/getting-started-with-delta-live-tables" target="_blank" rel="noreferrer noopener nofollow">Delta Live Tables</a></li>



<li class=""><a href="https://www.databricks.com/blog/2021/09/08/5-steps-to-implementing-intelligent-data-pipelines-with-delta-live-tables.html" target="_blank" rel="noreferrer noopener nofollow">Production Pipelines</a></li>



<li class=""><a href="https://atlan.com/databricks-governance/" target="_blank" rel="noreferrer noopener nofollow">Data Governance</a></li>
</ul>



<figure class="is-style-default wp-block-image size-large is-resized"><a href="https://marketplace.visualstudio.com/items?itemName=AzureOps.ssiscatalogerpro&amp;ssr=false#overview" target="_blank" rel="noopener"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="148" data-attachment-id="4839" data-permalink="https://azureops.org/articles/azure-data-studio-for-sql-developers/scmw-horizontal-ad/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1326%2C163&amp;ssl=1" data-orig-size="1326,163" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SCMW-horizontal-ad" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=300%2C37&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1200%2C148&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&#038;ssl=1" alt="" class="wp-image-4839" style="object-fit:cover;width:811px;height:99px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=450%2C55&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=600%2C74&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=300%2C37&amp;ssl=1 300w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=768%2C94&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?w=1326&amp;ssl=1 1326w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></a></figure>



<h2 class="wp-block-heading">Databricks Certified Data Engineer Associate Practice Exam</h2>



<p class="">To familiarize yourself with the Databricks Data Engineer Associate exam questions, attempt the <a href="https://files.training.databricks.com/assessments/practice-exams/PracticeExam-DataEngineerAssociate.pdf" target="_blank" rel="noreferrer noopener">Practice Exam</a> prepared by Databricks to get a feel for the actual exam&#8217;s content and difficulty level. </p>



<p class="">Once you are done with it and understand your preparation, I recommend practicing tests by <a href="https://www.udemy.com/course/databricks-certified-data-engineer-associate-practice-tests/" target="_blank" rel="noreferrer noopener">Akhil V</a> and <a href="https://www.udemy.com/course/databricks-certified-associate-data-engineer/" target="_blank" rel="noreferrer noopener">Certification Champs</a> on Udemy.</p>



<h2 class="wp-block-heading">Book Your Exam Slot</h2>



<p class="">If you consistently score at least 90% on practice tests, you can book your actual exam slot <a href="https://webassessor.com/databricks" target="_blank" rel="noreferrer noopener">here</a>. </p>



<p class="has-background" style="background-color:#bcefca"><strong>Pro tips:</strong><br>1. The <a href="https://community.cloud.databricks.com/login.html" target="_blank" rel="noreferrer noopener">Community Edition</a> of Databricks doesn’t cover all the topics for this exam. You may need to use paid Databricks with <a href="https://azure.microsoft.com/en-in/products/databricks/" target="_blank" rel="noreferrer noopener">Azure</a> or <a href="https://aws.amazon.com/quickstart/architecture/databricks/" target="_blank" rel="noreferrer noopener">AWS</a>. The smallest possible cluster size should be sufficient for learning.<br>2. If you’re aiming for the Developing Solutions for Microsoft Azure certification, take a look at&nbsp;<a href="https://azureops.org/articles/az-204-exam-reference/" target="_blank" rel="noreferrer noopener">these</a>&nbsp;helpful tips.</p>



<h2 class="wp-block-heading">See more</h2>



<iframe loading="lazy" width="700" height="394" src="https://www.youtube.com/embed/t2h6xNVFQkc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>



<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="is-style-fill wp-block-button"><a class="wp-block-button__link has-white-color has-blush-light-purple-gradient-background has-text-color has-background has-link-color wp-element-button" href="https://azureops.org/product/ssis-catalog-migration-wizard-pro/" target="_blank" rel="noreferrer noopener">Download Now</a></div>
</div>
<p>The post <a href="https://azureops.org/articles/databricks-certified-data-engineer-associate/">Databricks Certified Data Engineer Associate</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">4124</post-id>	</item>
		<item>
		<title>Databricks Secret Scopes – How to Create, Manage, and Use Securely</title>
		<link>https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/</link>
		
		<dc:creator><![CDATA[Pavan Bangad]]></dc:creator>
		<pubDate>Wed, 14 Sep 2022 20:42:54 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Databricks]]></category>
		<category><![CDATA[Key Vault]]></category>
		<category><![CDATA[Databricks secret scopes]]></category>
		<category><![CDATA[Secret scropes in Databricks]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=3266</guid>

					<description><![CDATA[<p>Databricks platform is used to connect to multiple applications. Databricks requires credentials or secrets to connect to these applications. Databricks or Azure Key Vault can store these secrets securely. Secret scopes are used to manage the secrets which are stored in Azure Key Vault or Databricks.</p>
<p>The post <a href="https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/">Databricks Secret Scopes – How to Create, Manage, and Use Securely</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="">Databricks is a unified big data processing and analytics cloud platform that transforms and processes huge volumes of data. It is built on Apache Spark, an in-memory analytics engine for big data and machine learning, and can connect to various sources for data ingestion. This article describes how to manage secret scopes in Databricks using the GUI.</p>



<p class="has-pale-cyan-blue-background-color has-background"><strong>Pre-requisites</strong>:<br>To follow along, you will need:<br>1. Databricks service in Azure, GCP, or AWS cloud.<br>2. A Databricks cluster.<br>3. Azure subscription with Azure Key Vault service created.</p>



<h2 class="wp-block-heading">What are Secret scopes in Databricks?</h2>



<p class="">When working with various applications, the Databricks platform comes in handy. To establish connections, credentials or secrets are necessary, which can be securely stored in Databricks or Azure Key Vault. Secret scopes are responsible for managing these secrets in either Azure Key Vault or Databricks.</p>



<p class=""><strong>Databricks supports two types of secret scopes:</strong><br>1. <a href="https://docs.microsoft.com/en-us/azure/key-vault/general/basic-concepts" target="_blank" rel="noreferrer noopener">Azure Key Vault</a>-backed scopes: to manage secrets stored in Azure Key Vault.<br>2. Databricks-backed scopes: to manage secrets stored in Databricks.</p>



<h2 class="wp-block-heading">Secret Scopes vs Key Vault-Backed Scopes</h2>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Feature</th><th>Secret Scope</th><th>Key Vault-Backed Scope</th></tr></thead><tbody><tr><td>Storage</td><td>Stored inside Databricks workspace</td><td>Stored in Azure Key Vault</td></tr><tr><td>Security</td><td>Basic workspace-level</td><td>Enterprise-grade (RBAC + auditing)</td></tr><tr><td>Ideal For</td><td>Simpler, internal use</td><td>Production, regulated environments</td></tr><tr><td>Creation</td><td><code>databricks secrets create-scope</code></td><td>Linked via Azure Key Vault URI</td></tr></tbody></table></figure>



<p class="">This article will focus on how to manage Azure Key Vault-backed secret scopes.</p>



<h2 class="wp-block-heading">Create an Azure Key Vault-backed scope</h2>



<p class="">Follow the below steps to create an Azure Key Vault-backed secret scope.</p>



<p class="">1. Open the <code>https://&lt;databricks-instance&gt;#secrets/createScope</code> URL in your browser.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="864" data-attachment-id="3476" data-permalink="https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/1-open-url/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?fit=1366%2C984&amp;ssl=1" data-orig-size="1366,984" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="1.-open-url" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?fit=300%2C216&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?fit=1200%2C864&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=1200%2C864&#038;ssl=1" alt="Manage Secret Scopes in Databricks" class="wp-image-3476" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=1200%2C864&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=450%2C324&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=600%2C432&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=300%2C216&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=861%2C620&amp;ssl=1 861w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?resize=768%2C553&amp;ssl=1 768w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/1.-open-url.jpg?w=1366&amp;ssl=1 1366w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>
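<p class="">Since this page is not linked from the workspace UI, it can help to derive the URL programmatically. The helper below is a minimal sketch; the workspace URL used in the example is made up:</p>

```python
def create_scope_url(workspace_url: str) -> str:
    # The hidden "create secret scope" page lives at the
    # #secrets/createScope fragment of the workspace URL.
    return workspace_url.rstrip("/") + "#secrets/createScope"

# Hypothetical Azure Databricks workspace URL
url = create_scope_url("https://adb-1234567890123456.7.azuredatabricks.net")
print(url)  # https://adb-1234567890123456.7.azuredatabricks.net#secrets/createScope
```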



<p class="has-text-align-left">2. Provide the below details:</p>



<p class=""><strong>Scope Name: </strong>&lt;Name of the scope&gt;</p>



<p class=""><strong>Manage Principal:</strong> This option specifies which users can manage the secret scope. You can select either &#8220;All Users&#8221; or &#8220;Creator&#8221;.</p>



<p class=""><strong>DNS Name and Resource ID:</strong> Both properties can be found on the Azure Key Vault service&#8217;s Properties page.</p>



<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="310" data-attachment-id="3477" data-permalink="https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/key-vault-properties/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?fit=1540%2C398&amp;ssl=1" data-orig-size="1540,398" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="key-vault-properties" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?fit=300%2C78&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?fit=1200%2C310&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=1200%2C310&#038;ssl=1" alt="key vault service properties." 
class="wp-image-3477" style="width:854px;height:221px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=1200%2C310&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=450%2C116&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=600%2C155&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=300%2C78&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=768%2C198&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?resize=1536%2C397&amp;ssl=1 1536w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/09/key-vault-properties.jpg?w=1540&amp;ssl=1 1540w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>
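<p class="">For a quick cross-check, the DNS name can be derived from the vault name embedded in the resource ID, since Azure Key Vault DNS names follow the <code>https://&lt;vault-name&gt;.vault.azure.net/</code> pattern. A minimal sketch, using a hypothetical resource ID:</p>

```python
def vault_dns_from_resource_id(resource_id: str) -> str:
    # Resource ID format:
    # /subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.KeyVault/vaults/<vault-name>
    vault_name = resource_id.rstrip("/").split("/")[-1]
    return f"https://{vault_name}.vault.azure.net/"

rid = ("/subscriptions/00000000-0000-0000-0000-000000000000"
       "/resourceGroups/my-rg/providers/Microsoft.KeyVault/vaults/my-kv")
print(vault_dns_from_resource_id(rid))  # https://my-kv.vault.azure.net/
```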



<p class="">3. Click Create. This creates the secret scope.</p>



<h2 class="wp-block-heading">Access a secret from the Azure Key Vault in Databricks</h2>



<p class="">We can access a secret in Databricks using the command below.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; gutter: false; title: ; notranslate">
password = dbutils.secrets.get(scope = &quot;&lt;name_of_scope&gt;&quot;, key = &quot;&lt;name_of_secret&gt;&quot;)
</pre></div>
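<p class="">A common follow-on is to splice the retrieved secret into a connection string. The sketch below is illustrative only: the server, database, and user names are hypothetical, and in a notebook the <code>password</code> value would come from <code>dbutils.secrets.get</code> as shown above.</p>

```python
def jdbc_url(server: str, database: str, user: str, password: str) -> str:
    # Builds an Azure SQL JDBC URL; the password should come from a
    # secret scope rather than being hard-coded in the notebook.
    return (f"jdbc:sqlserver://{server}.database.windows.net:1433;"
            f"database={database};user={user};password={password}")

# In Databricks: password = dbutils.secrets.get(scope="...", key="...")
print(jdbc_url("myserver", "mydb", "admin_user", "<from-secret-scope>"))
```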


<figure class="is-style-default wp-block-image size-large is-resized"><a href="https://marketplace.visualstudio.com/items?itemName=AzureOps.ssiscatalogerpro&amp;ssr=false#overview" target="_blank" rel="noopener"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="148" data-attachment-id="4839" data-permalink="https://azureops.org/articles/azure-data-studio-for-sql-developers/scmw-horizontal-ad/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1326%2C163&amp;ssl=1" data-orig-size="1326,163" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SCMW-horizontal-ad" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=300%2C37&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1200%2C148&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&#038;ssl=1" alt="" class="wp-image-4839" style="object-fit:cover;width:811px;height:99px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=450%2C55&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=600%2C74&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=300%2C37&amp;ssl=1 300w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=768%2C94&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?w=1326&amp;ssl=1 1326w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></a></figure>



<h2 class="wp-block-heading">Delete a secret scope from Databricks</h2>



<p class="">Unfortunately, it is not possible to delete a secret scope from the GUI. Instead, use the Databricks CLI or the Databricks REST API.</p>
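<p class="">The REST route can be exercised with nothing more than the Python standard library, via the <code>POST /api/2.0/secrets/scopes/delete</code> endpoint. A minimal sketch: the workspace hostname is a made-up example, and the request is built but not actually sent.</p>

```python
import json
import os
import urllib.request

def delete_scope_request(instance: str, scope: str, token: str) -> urllib.request.Request:
    # Databricks REST API: POST /api/2.0/secrets/scopes/delete
    url = f"https://{instance}/api/2.0/secrets/scopes/delete"
    body = json.dumps({"scope": scope}).encode("utf-8")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")

# Build the request; supply a real personal access token to actually delete.
req = delete_scope_request(
    "adb-1234567890123456.7.azuredatabricks.net",  # hypothetical workspace
    "my-scope",
    os.environ.get("DATABRICKS_TOKEN", ""),
)
# urllib.request.urlopen(req)  # uncomment to send the request
```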



<h2 class="wp-block-heading">FAQ</h2>


<div class="wp-block-uagb-faq uagb-faq__outer-wrap uagb-block-5fa68f8b uagb-faq-icon-row uagb-faq-layout-accordion uagb-faq-expand-first-true uagb-faq-inactive-other-true uagb-faq__wrap uagb-buttons-layout-wrap uagb-faq-equal-height     " data-faqtoggle="true" role="tablist"><div class="wp-block-uagb-faq-child uagb-faq-child__outer-wrap uagb-faq-item uagb-block-c9ebda65 " role="tab" tabindex="0"><div class="uagb-faq-questions-button uagb-faq-questions">			<span class="uagb-icon uagb-faq-icon-wrap">
								<svg xmlns="https://www.w3.org/2000/svg" viewBox= "0 0 448 512"><path d="M432 256c0 17.69-14.33 32.01-32 32.01H256v144c0 17.69-14.33 31.99-32 31.99s-32-14.3-32-31.99v-144H48c-17.67 0-32-14.32-32-32.01s14.33-31.99 32-31.99H192v-144c0-17.69 14.33-32.01 32-32.01s32 14.32 32 32.01v144h144C417.7 224 432 238.3 432 256z"></path></svg>
							</span>
						<span class="uagb-icon-active uagb-faq-icon-wrap">
								<svg xmlns="https://www.w3.org/2000/svg" viewBox= "0 0 448 512"><path d="M400 288h-352c-17.69 0-32-14.32-32-32.01s14.31-31.99 32-31.99h352c17.69 0 32 14.3 32 31.99S417.7 288 400 288z"></path></svg>
							</span>
			<span class="uagb-question">1. How to view secret scope values in Databricks?</span></div><div class="uagb-faq-content"><p>You cannot view actual secret values for security reasons. However, you can list the scope name and keys using: <code>databricks secrets list --scope my-scope</code></p></div></div><div class="wp-block-uagb-faq-child uagb-faq-child__outer-wrap uagb-faq-item uagb-block-e2e1e957 " role="tab" tabindex="0"><div class="uagb-faq-questions-button uagb-faq-questions">			<span class="uagb-icon uagb-faq-icon-wrap">
								<svg xmlns="https://www.w3.org/2000/svg" viewBox= "0 0 448 512"><path d="M432 256c0 17.69-14.33 32.01-32 32.01H256v144c0 17.69-14.33 31.99-32 31.99s-32-14.3-32-31.99v-144H48c-17.67 0-32-14.32-32-32.01s14.33-31.99 32-31.99H192v-144c0-17.69 14.33-32.01 32-32.01s32 14.32 32 32.01v144h144C417.7 224 432 238.3 432 256z"></path></svg>
							</span>
						<span class="uagb-icon-active uagb-faq-icon-wrap">
								<svg xmlns="https://www.w3.org/2000/svg" viewBox= "0 0 448 512"><path d="M400 288h-352c-17.69 0-32-14.32-32-32.01s14.31-31.99 32-31.99h352c17.69 0 32 14.3 32 31.99S417.7 288 400 288z"></path></svg>
							</span>
			<span class="uagb-question">Can I delete a Databricks secret scope?</span></div><div class="uagb-faq-content"><p>Yes, using: <code>databricks secrets delete-scope --scope my-scope</code></p></div></div></div>


<p class="has-background" style="background-color:#bcefca"><strong>Pro tips:</strong><br>1. Databricks provides a free Community Edition where you can learn and explore Databricks. You can sign up on the Databricks website.<br>2. By managing secret scopes in Databricks, you can keep your sensitive data secure while allowing authorized users and applications to access it when needed.<br>3. If you&#8217;re aiming for the Databricks Certified Data Engineer Associate certification, take a look at these helpful tips.<br>4. Learn how to mount and unmount Data Lake Gen2 storage in Databricks.<br>5. <a href="https://azureops.org/articles/automate-databricks-infrastructure-as-code-with-terraform/" target="_blank" rel="noreferrer noopener">Learn</a> how to automate Databricks infrastructure as code with Terraform.</p>



<iframe loading="lazy" width="700" height="394" src="https://www.youtube.com/embed/t2h6xNVFQkc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>



<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="is-style-fill wp-block-button"><a class="wp-block-button__link has-white-color has-blush-light-purple-gradient-background has-text-color has-background has-link-color wp-element-button" href="https://azureops.org/product/ssis-catalog-migration-wizard-pro/" target="_blank" rel="noreferrer noopener">Download Now</a></div>
</div>
<p>The post <a href="https://azureops.org/articles/manage-secret-scopes-in-databricks-using-gui/">Databricks Secret Scopes – How to Create, Manage, and Use Securely</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3266</post-id>	</item>
		<item>
		<title>Mount and Unmount Data Lake in Databricks</title>
		<link>https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/</link>
		
		<dc:creator><![CDATA[Pavan Bangad]]></dc:creator>
		<pubDate>Wed, 17 Aug 2022 08:00:00 +0000</pubDate>
				<category><![CDATA[Azure]]></category>
		<category><![CDATA[Data Lake]]></category>
		<category><![CDATA[Databricks]]></category>
		<category><![CDATA[databricks unmount]]></category>
		<category><![CDATA[databricks unmount storage account]]></category>
		<category><![CDATA[dbutils.fs.mount]]></category>
		<category><![CDATA[dbutils.fs.unmount]]></category>
		<category><![CDATA[mount unmount in databricks]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=3015</guid>

					<description><![CDATA[<p>Mounting object storage to the Databricks file system allows easy access to object storage as if it were on the local file system. In this article, we will see how to mount and unmount a data lake in Databricks.</p>
<p>The post <a href="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/">Mount and Unmount Data Lake in Databricks</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Databricks is a unified big data processing and analytics cloud platform that transforms and processes huge volumes of data. It is built on Apache Spark, an in-memory analytics engine for big data and machine learning, and can connect to various sources for data ingestion. This article will show how to mount and unmount a data lake in Databricks.</p>



<p class="has-pale-cyan-blue-background-color has-background"><strong>Pre-requisites</strong>:<br>To mount a location, you would need the following:<br>1. Databricks service in Azure, GCP, or AWS cloud.<br>2. A Databricks cluster.<br>3. A basic understanding of Databricks and how to create notebooks.</p>



<h3 class="wp-block-heading">What is Mounting in Databricks?</h3>



<p>Mounting object storage to DBFS allows easy access to object storage as if it were on the local file system. Once a location, e.g., Azure Blob Storage or an Amazon S3 bucket, is mounted, we can use the mount point to access the external storage.</p>



<p>Generally, we use the <code>dbutils.fs.mount()</code> command to mount a location in Databricks.</p>



<h3 class="wp-block-heading">How to mount a data lake in Databricks?</h3>



<p>Let us now see how to mount Azure Data Lake Gen2 storage in Databricks.</p>



<p>First things first, let&#8217;s create a blob storage account and a container. The storage account should look like the image below.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="332" data-attachment-id="3033" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/2-blob-storage/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?fit=1896%2C524&amp;ssl=1" data-orig-size="1896,524" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="2.blob-storage" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?fit=300%2C83&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?fit=1200%2C332&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=1200%2C332&#038;ssl=1" alt="" class="wp-image-3033" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=1200%2C332&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=450%2C124&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=600%2C166&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=300%2C83&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=768%2C212&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?resize=1536%2C425&amp;ssl=1 1536w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/2.blob-storage.jpg?w=1896&amp;ssl=1 1896w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<p>The new container should look like the image below.</p>



<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="350" data-attachment-id="3034" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/7-container_files/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?fit=1897%2C554&amp;ssl=1" data-orig-size="1897,554" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="7.-container_files" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?fit=300%2C88&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?fit=1200%2C350&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=1200%2C350&#038;ssl=1" alt="" class="wp-image-3034" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=1200%2C350&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=450%2C131&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=600%2C175&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=300%2C88&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=768%2C224&amp;ssl=1 768w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?resize=1536%2C449&amp;ssl=1 1536w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-container_files.jpg?w=1897&amp;ssl=1 1897w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<p>To mount ADLS Gen2 storage, we need the following details to connect to the location.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
ContainerName = &quot;yourcontainerName&quot;
azure_blobstorage_name = &quot;blobstoragename&quot;
mountpointname = &quot;/mnt/azureops&quot;
secret_key =&quot;xxxxxxxxxxx&quot;
</pre></div>


<p class="has-ast-global-color-3-color has-text-color">Once we have this information, we can use the below code snippet to connect the data lake to Databricks.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
dbutils.fs.mount(
    source = f&quot;wasbs://{ContainerName}@{azure_blobstorage_name}.blob.core.windows.net&quot;,
    mount_point = mountpointname,
    extra_configs = {&quot;fs.azure.account.key.&quot; + azure_blobstorage_name + &quot;.blob.core.windows.net&quot;: secret_key}
)
</pre></div>


<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="313" data-attachment-id="3029" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/4-mount_storage_account/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?fit=1810%2C472&amp;ssl=1" data-orig-size="1810,472" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="4.-mount_storage_account" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?fit=300%2C78&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?fit=1200%2C313&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=1200%2C313&#038;ssl=1" alt="Mount and unmount data lake in Databricks" class="wp-image-3029" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=1200%2C313&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=450%2C117&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=600%2C156&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=300%2C78&amp;ssl=1 300w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=768%2C200&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?resize=1536%2C401&amp;ssl=1 1536w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/4.-mount_storage_account.jpg?w=1810&amp;ssl=1 1810w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<h3 class="wp-block-heading">How to check all the mount points in Databricks?</h3>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
dbutils.fs.mounts()
</pre></div>


<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="205" data-attachment-id="3028" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/1-mounts_location/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?fit=1837%2C314&amp;ssl=1" data-orig-size="1837,314" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="1.mounts_location" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?fit=300%2C51&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?fit=1200%2C205&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=1200%2C205&#038;ssl=1" alt="Check mount points in Databricks" class="wp-image-3028" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=1200%2C205&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=450%2C77&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=600%2C103&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=300%2C51&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=768%2C131&amp;ssl=1 768w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?resize=1536%2C263&amp;ssl=1 1536w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/1.mounts_location.jpg?w=1837&amp;ssl=1 1837w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<h3 class="wp-block-heading">How to unmount a location?</h3>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
dbutils.fs.unmount(mount_point)
</pre></div>


<figure class="wp-block-image size-large"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="145" data-attachment-id="3030" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/6-unmount/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?fit=1802%2C217&amp;ssl=1" data-orig-size="1802,217" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="6.-unmount" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?fit=300%2C36&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?fit=1200%2C145&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=1200%2C145&#038;ssl=1" alt="Unmount data lake in Databricks" class="wp-image-3030" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=1200%2C145&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=450%2C54&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=600%2C72&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=300%2C36&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=768%2C92&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?resize=1536%2C185&amp;ssl=1 1536w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/6.-unmount.jpg?w=1802&amp;ssl=1 1802w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<h3 class="wp-block-heading">Let&#8217;s put all the above commands into action.</h3>



<p>The objective is to add a mount point if it does not exist.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
if all(mount.mountPoint != mountpointname for mount in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source = f&quot;wasbs://{ContainerName}@{azure_blobstorage_name}.blob.core.windows.net&quot;,
        mount_point = mountpointname,
        extra_configs = {&quot;fs.azure.account.key.&quot; + azure_blobstorage_name + &quot;.blob.core.windows.net&quot;: secret_key}
    )
</pre></div>
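<p>The existence check itself can be verified off-cluster by simulating the list that <code>dbutils.fs.mounts()</code> returns. A minimal sketch (the helper <code>needs_mount</code> is hypothetical, not part of dbutils):</p>

```python
def needs_mount(existing_mount_points, mount_point):
    """True when mount_point is not already mounted, given the mountPoint
    values from dbutils.fs.mounts() (simulated here as plain strings)."""
    return all(existing != mount_point for existing in existing_mount_points)

# Simulated [m.mountPoint for m in dbutils.fs.mounts()]:
existing = ["/databricks-datasets", "/mnt/other"]
print(needs_mount(existing, "/mnt/azureops"))  # True: safe to mount
print(needs_mount(existing, "/mnt/other"))     # False: already mounted
```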


<figure class="wp-block-image size-large is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" data-attachment-id="3037" data-permalink="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/7-usecase/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?fit=1819%2C339&amp;ssl=1" data-orig-size="1819,339" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="7.-usecase" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?fit=300%2C56&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?fit=1200%2C224&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=1200%2C224&#038;ssl=1" alt="" class="wp-image-3037" width="1200" height="224" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=1200%2C224&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=450%2C84&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=600%2C112&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=300%2C56&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=768%2C143&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?resize=1536%2C286&amp;ssl=1 1536w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/7.-usecase.jpg?w=1819&amp;ssl=1 1819w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></figure>



<figure class="is-style-default wp-block-image size-large is-resized"><a href="https://marketplace.visualstudio.com/items?itemName=AzureOps.ssiscatalogerpro&amp;ssr=false#overview" target="_blank" rel="noopener"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="148" data-attachment-id="4839" data-permalink="https://azureops.org/articles/azure-data-studio-for-sql-developers/scmw-horizontal-ad/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1326%2C163&amp;ssl=1" data-orig-size="1326,163" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SCMW-horizontal-ad" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=300%2C37&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1200%2C148&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&#038;ssl=1" alt="" class="wp-image-4839" style="object-fit:cover;width:811px;height:99px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=450%2C55&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=600%2C74&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=300%2C37&amp;ssl=1 300w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=768%2C94&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?w=1326&amp;ssl=1 1326w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></a></figure>



<p class="has-background" style="background-color:#bcefca"><strong>Pro tips:</strong><br>1. Instead of using a storage account key, we can also mount a location using a SAS token or a service principal.<br>2. Databricks provides a free community edition where you can learn and explore Databricks. You can sign up <a href="https://community.cloud.databricks.com/login.html" target="_blank" rel="noreferrer noopener">here</a>.<br>3. If you’re aiming to obtain the Databricks Certified Data Engineer Associate certification, take a look at&nbsp;<a href="https://azureops.org/articles/databricks-certified-data-engineer-associate/" target="_blank" rel="noreferrer noopener">these</a>&nbsp;helpful tips.</p>



<p>Notebook Reference</p>



<div class="wp-block-file"><a id="wp-block-file--media-74f16533-875a-4100-a0b9-bb8d329e82f0" href="https://azureops.org/wp-content/uploads/2022/08/mount_unmount.zip">mount_unmount</a><a href="https://azureops.org/wp-content/uploads/2022/08/mount_unmount.zip" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-74f16533-875a-4100-a0b9-bb8d329e82f0">Download</a></div>



<h2 class="wp-block-heading">See more</h2>



<iframe loading="lazy" width="700" height="394" src="https://www.youtube.com/embed/t2h6xNVFQkc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>



<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="is-style-fill wp-block-button"><a class="wp-block-button__link has-white-color has-blush-light-purple-gradient-background has-text-color has-background has-link-color wp-element-button" href="https://azureops.org/product/ssis-catalog-migration-wizard-pro/" target="_blank" rel="noreferrer noopener">Download Now</a></div>
</div>
<p>The post <a href="https://azureops.org/articles/mount-and-unmount-data-lake-in-databricks/">Mount and Unmount Data Lake in Databricks</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3015</post-id>	</item>
		<item>
		<title>Call a notebook from another notebook in Databricks</title>
		<link>https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/</link>
		
		<dc:creator><![CDATA[Pavan Bangad]]></dc:creator>
		<pubDate>Wed, 03 Aug 2022 12:43:55 +0000</pubDate>
				<category><![CDATA[Databricks]]></category>
		<category><![CDATA[%run databricks]]></category>
		<category><![CDATA[call databricks notebooks]]></category>
		<category><![CDATA[dbutils notebook run]]></category>
		<category><![CDATA[dbutils.notebook.run]]></category>
		<category><![CDATA[dbutils.notebook.run databricks]]></category>
		<guid isPermaLink="false">https://azureops.org/?p=2624</guid>

					<description><![CDATA[<p>In this article, we will see how to call a notebook from another notebook in Databricks and how to manage the execution context of a notebook.</p>
<p>The post <a href="https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/">Call a notebook from another notebook in Databricks</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="">Databricks is a unified big data processing and analytics cloud platform that transforms and processes enormous volumes of data. It is built on Apache Spark, an in-memory analytics engine for big data and machine learning. In this article, we will see how to call a notebook from another notebook in Databricks and how to manage the execution context of a notebook.</p>



<h2 class="wp-block-heading">What is Databricks notebook and execution context?</h2>



<p class="">Notebooks in Databricks are used to write Spark code to process and transform data. Notebooks support the Python, Scala, SQL, and R languages.</p>



<p class="">Whenever we execute a notebook in Databricks, it attaches a cluster (computation resource) to it and creates an execution context.</p>



<p class="has-pale-cyan-blue-background-color has-background"><strong>Pre-requisites</strong>:<br>If you want to run a Databricks notebook inside another notebook, you will need the following:<br>1. Databricks service in Azure, GCP, or AWS cloud.<br>2. A Databricks cluster.<br>3. A basic understanding of Databricks and how to create notebooks.</p>



<h2 class="wp-block-heading">Methods to call a notebook from another notebook in Databricks</h2>



<p class="">There are two methods to run a Databricks notebook inside another Databricks notebook.</p>



<h3 class="wp-block-heading">1. Using the %run command</h3>



<p class="">The %run command invokes the notebook in the same execution context, meaning any variable or function declared in the called notebook becomes available in the calling notebook.</p>



<p class="">The sample command would look like the one below.</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
%run &#x5B;notebook path] $parameter1=&quot;value1&quot; $parameterN=&quot;valueN&quot;
</pre></div>


<figure class="wp-block-image size-full"><img data-recalc-dims="1" loading="lazy" decoding="async" width="650" height="437" data-attachment-id="2809" data-permalink="https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/method1-v2/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?fit=650%2C437&amp;ssl=1" data-orig-size="650,437" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="method1-v2" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?fit=300%2C202&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?fit=650%2C437&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?resize=650%2C437&#038;ssl=1" alt="Call a notebook from another notebook in Databricks using %run method" class="wp-image-2809" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?w=650&amp;ssl=1 650w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?resize=450%2C303&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?resize=600%2C403&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method1-v2.gif?resize=300%2C202&amp;ssl=1 300w" sizes="auto, (max-width: 650px) 100vw, 650px" /><figcaption class="wp-element-caption">Example &#8211; Use the %run function to call a notebook inside another notebook.</figcaption></figure>



<p class="">This method is suitable when one notebook defines constant variables or a centralized shared function library that you want to refer to in the calling notebook.</p>



<p class="">What if we need to execute the child notebook in a different notebook context? The following method describes how to achieve this.</p>



<figure class="is-style-default wp-block-image size-large is-resized"><a href="https://marketplace.visualstudio.com/items?itemName=AzureOps.ssiscatalogerpro&amp;ssr=false#overview" target="_blank" rel="noopener"><img data-recalc-dims="1" loading="lazy" decoding="async" width="1200" height="148" data-attachment-id="4839" data-permalink="https://azureops.org/articles/azure-data-studio-for-sql-developers/scmw-horizontal-ad/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1326%2C163&amp;ssl=1" data-orig-size="1326,163" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="SCMW-horizontal-ad" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=300%2C37&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?fit=1200%2C148&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&#038;ssl=1" alt="" class="wp-image-4839" style="object-fit:cover;width:811px;height:99px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=1200%2C148&amp;ssl=1 1200w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=450%2C55&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=600%2C74&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=300%2C37&amp;ssl=1 300w, 
https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?resize=768%2C94&amp;ssl=1 768w, https://i0.wp.com/azureops.org/wp-content/uploads/2023/01/SCMW-horizontal-ad.png?w=1326&amp;ssl=1 1326w" sizes="auto, (max-width: 1200px) 100vw, 1200px" /></a></figure>



<h3 class="wp-block-heading">2. Using the dbutils.notebook.run() function</h3>



<p class="">This function will run the notebook in a new notebook context.</p>



<p class="">The syntax of the dbutils.notebook.run() function is:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
dbutils.notebook.run(notebook_path, timeout_in_seconds, parameters)
</pre></div>


<p class="">Here, </p>



<p class=""><strong>notebook_path</strong> -&gt; path of the target notebook.<br><strong>timeout_in_seconds</strong> -&gt; the run throws an exception if the notebook does not complete within the specified time.<br><strong>parameters</strong> -&gt; used to send parameters to the child notebook, as a dictionary of string keys and values,<br> e.g. {&#8216;parameter1&#8217;: &#8216;value1&#8217;, &#8216;parameter2&#8217;: &#8216;value2&#8217;}</p>
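<p>Because dbutils.notebook.run() only accepts string keys and values, a small helper (hypothetical, ours rather than a Databricks API) can coerce a parameter dictionary before the call. A sketch:</p>

```python
def to_notebook_params(params):
    """Coerce every key and value to str, since dbutils.notebook.run()
    rejects non-string parameter values."""
    return {str(key): str(value) for key, value in params.items()}

params = to_notebook_params({"retries": 3, "env": "dev"})
# Then, on a cluster: dbutils.notebook.run("/Shared/child", 600, params)
```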






<figure class="wp-block-image size-full is-resized"><img data-recalc-dims="1" loading="lazy" decoding="async" width="600" height="374" data-attachment-id="2810" data-permalink="https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/method2/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?fit=600%2C374&amp;ssl=1" data-orig-size="600,374" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="method2" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?fit=300%2C187&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?fit=600%2C374&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?resize=600%2C374&#038;ssl=1" alt="Call a notebook from another notebook in Databricks using the dbutils.notebook function " class="wp-image-2810" style="width:600px;height:374px" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?w=600&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?resize=450%2C281&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?resize=500%2C312&amp;ssl=1 500w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/method2.gif?resize=300%2C187&amp;ssl=1 300w" sizes="auto, (max-width: 600px) 100vw, 600px" /><figcaption class="wp-element-caption">Example &#8211; Use dbutils.notebook.run() to call a notebook inside another notebook.</figcaption></figure>



<p class="">We can call any number of notebooks by invoking this function for each one in the parent notebook.</p>



<figure class="wp-block-image size-full"><img data-recalc-dims="1" loading="lazy" decoding="async" width="828" height="199" data-attachment-id="2812" data-permalink="https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/mutiple-notebook-1/" data-orig-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?fit=828%2C199&amp;ssl=1" data-orig-size="828,199" data-comments-opened="0" data-image-meta="{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}" data-image-title="mutiple-notebook-1" data-image-description="" data-image-caption="" data-medium-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?fit=300%2C72&amp;ssl=1" data-large-file="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?fit=828%2C199&amp;ssl=1" src="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?resize=828%2C199&#038;ssl=1" alt="" class="wp-image-2812" srcset="https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?w=828&amp;ssl=1 828w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?resize=450%2C108&amp;ssl=1 450w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?resize=600%2C144&amp;ssl=1 600w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?resize=300%2C72&amp;ssl=1 300w, https://i0.wp.com/azureops.org/wp-content/uploads/2022/08/mutiple-notebook-1.jpg?resize=768%2C185&amp;ssl=1 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>



<p class="">This will run all the notebooks sequentially. </p>



<h2 class="wp-block-heading">Run Databricks notebooks in parallel</h2>



<p class="">You can use the Python concurrent.futures library to run multiple Databricks notebooks in parallel. Its ThreadPoolExecutor class creates multiple threads, each of which runs a notebook.</p>



<p class="">Import the library as follows:</p>


<div class="wp-block-syntaxhighlighter-code "><pre class="brush: python; title: ; notranslate">
from concurrent.futures import ThreadPoolExecutor
</pre></div>


<p class="">You can read more about ThreadPoolExecutor <a href="https://docs.python.org/3/library/concurrent.futures.html" target="_blank" rel="noreferrer noopener">here</a>. And <a href="https://www.codesexplorer.com/2020/03/run-databricks-notebooks-in-parallel-python.html" target="_blank" rel="noreferrer noopener">here</a> is sample code that Codes Explorer wrote for running notebooks in parallel.</p>
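<p>As a sketch of the threading pattern (with a stand-in function replacing dbutils.notebook.run(), which is only available on a cluster):</p>

```python
from concurrent.futures import ThreadPoolExecutor

def run_notebook(path):
    # Stand-in for dbutils.notebook.run(path, timeout_in_seconds, parameters);
    # replace the body with the real call when running on a Databricks cluster.
    return f"finished {path}"

notebook_paths = ["/Shared/child1", "/Shared/child2", "/Shared/child3"]

# Each notebook runs on its own thread; map() preserves the input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_notebook, notebook_paths))

print(results)
# ['finished /Shared/child1', 'finished /Shared/child2', 'finished /Shared/child3']
```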



<p class="">Attaching the same notebook used in this blog:</p>



<div class="wp-block-file"><a id="wp-block-file--media-3c756fc1-0a80-43f5-b292-05af9ef0e3bd" href="https://azureops.org/wp-content/uploads/2022/08/notebook_run.zip">notebook_run</a><a href="https://azureops.org/wp-content/uploads/2022/08/notebook_run.zip" class="wp-block-file__button wp-element-button" download aria-describedby="wp-block-file--media-3c756fc1-0a80-43f5-b292-05af9ef0e3bd">Download</a></div>



<p class="has-background" style="background-color:#bcefca"><strong>Pro tips:</strong><br>1. We can use Azure Data Factory to run notebooks in parallel. Refer to this <a href="https://docs.microsoft.com/en-us/azure/data-factory/transform-data-using-databricks-notebook" target="_blank" rel="noreferrer noopener">post</a> to learn more.<br>2. Jobs created using the&nbsp;dbutils.notebook API must complete within 30 days.<br>3. With the methods described in this article, we can only pass string parameters to the child notebook; objects are not allowed.<br>4. Databricks provides a free community edition where you can learn and explore Databricks. You can sign up <a href="https://community.cloud.databricks.com/login.html" target="_blank" rel="noreferrer noopener">here</a>.</p>



<h2 class="wp-block-heading">See more</h2>



<iframe loading="lazy" width="700" height="394" src="https://www.youtube.com/embed/t2h6xNVFQkc" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>



<div class="wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex">
<div class="is-style-fill wp-block-button"><a class="wp-block-button__link has-white-color has-blush-light-purple-gradient-background has-text-color has-background has-link-color wp-element-button" href="https://azureops.org/product/ssis-catalog-migration-wizard-pro/" target="_blank" rel="noreferrer noopener">Download Now</a></div>
</div>
<p>The post <a href="https://azureops.org/articles/call-a-notebook-from-another-notebook-in-databricks/">Call a notebook from another notebook in Databricks</a> appeared first on <a href="https://azureops.org">AzureOps</a>.</p>
]]></content:encoded>
					
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2624</post-id>	</item>
	</channel>
</rss>
