Data Engineering on Microsoft Azure: Mounting Entries in DBFS for Databricks Storage

How to Mount Entries in DBFS for Databricks Storage

Question

Shane is a data architect designing how storage is mounted in the Databricks File System (DBFS).

He has three storage accounts for Databricks: “analytics_demo”, “dscience_demo”, and “azml_demo”.

How should he mount entries in DBFS for each of these storage objects?

Answers

A. B. C. D.

Correct Answer: A.

Option A is correct because Microsoft recommends creating a separate mount entry for each storage object mounted in DBFS.

For example:

storage1 is mounted as /mnt/storage1

storage2 is mounted as /mnt/storage2

Nested mounts are not supported.

Option B is incorrect because nested mounts are not supported, so the following structure cannot be used:

“/mnt/analytics_demo/dscience_demo/azml_demo”

Option C is incorrect because the following nested mount structure is likewise not supported:

“/mnt/analytics_demo/…/dscience_demo/…/azml_demo”

The correct answer is A: “/mnt/analytics_demo”, “/mnt/dscience_demo”, “/mnt/azml_demo”.
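The difference between the flat layout in option A and a nested layout can be checked mechanically. The following is a minimal sketch in plain Python, not part of any Databricks API; the helper names mount_point_for and find_nested are hypothetical and exist only for this illustration.

```python
from pathlib import PurePosixPath


def mount_point_for(storage_name: str) -> str:
    """Return a flat mount point directly under /mnt for one storage object."""
    return f"/mnt/{storage_name}"


def find_nested(mount_points: list[str]) -> list[tuple[str, str]]:
    """Return (outer, inner) pairs where the inner path lies below the outer one."""
    paths = [PurePosixPath(p) for p in mount_points]
    return [
        (str(outer), str(inner))
        for outer in paths
        for inner in paths
        if outer != inner and outer in inner.parents
    ]


# Flat layout, as in option A: one entry per storage object.
storage_names = ["analytics_demo", "dscience_demo", "azml_demo"]
flat = [mount_point_for(name) for name in storage_names]

# A nested layout of the kind the explanation rules out.
nested = ["/mnt/analytics_demo", "/mnt/analytics_demo/dscience_demo"]

print(find_nested(flat))    # no nested pairs in the flat layout
print(find_nested(nested))  # reports the nested pair
```

A check like this could be run before any mount is created, so that a nested path is rejected early.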

Databricks File System (DBFS) is a distributed file system that is integrated with Azure Databricks and available on its clusters. It allows you to store and access files, data, and models in Databricks. Because mounted storage appears as ordinary paths in DBFS, it gives data and models a unified namespace regardless of file format.

Storage is mounted from a notebook with the dbutils.fs.mount command rather than through a workspace menu. A typical sequence is:

  1. In the Azure portal, create the storage account and a container in it to hold the data.
  2. Store the storage account key in a Databricks secret scope so that it does not appear in notebook code.
  3. In the Databricks workspace, open a notebook attached to a running cluster.
  4. Call dbutils.fs.mount with the container as the source and the mount point, which is the path where the storage will appear in DBFS. For example, "/mnt/analytics_demo" for the "analytics_demo" storage account.
  5. Pass the storage account key, read from the secret scope, in the extra_configs argument.
  6. Confirm the result with dbutils.fs.mounts(), which lists the current mount points.

Repeat these steps for each of the three storage accounts, giving each one its own mount point: "/mnt/analytics_demo", "/mnt/dscience_demo", and "/mnt/azml_demo". This is the layout described in option A.
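In practice, mounts are created from a notebook with dbutils.fs.mount, so the repetition across the three accounts can be written as a loop. The sketch below is one possible shape, under several assumptions: access is by storage account key over the wasbs scheme, the container name, secret scope, and the "<account>-key" secret naming are hypothetical, and dbutils is passed in as a parameter so the function can also be exercised outside a notebook. The account names are taken verbatim from the question; real Azure storage account names may contain only lowercase letters and digits.

```python
def mount_all(dbutils, accounts, container, secret_scope):
    """Mount one container per storage account at /mnt/<account>.

    Returns the list of mount points created by this call.
    """
    # Skip anything that is already mounted so the function can be re-run.
    already = {m.mountPoint for m in dbutils.fs.mounts()}
    mounted = []
    for account in accounts:
        mount_point = f"/mnt/{account}"  # flat, one entry per storage object
        if mount_point in already:
            continue
        host = f"{account}.blob.core.windows.net"
        # Assumption: each key is stored as a secret named "<account>-key".
        key = dbutils.secrets.get(scope=secret_scope, key=f"{account}-key")
        dbutils.fs.mount(
            source=f"wasbs://{container}@{host}",
            mount_point=mount_point,
            extra_configs={f"fs.azure.account.key.{host}": key},
        )
        mounted.append(mount_point)
    return mounted


# In a Databricks notebook, where dbutils is predefined, a call might look like:
# mount_all(dbutils, ["analytics_demo", "dscience_demo", "azml_demo"],
#           container="data", secret_scope="demo-scope")
```

Passing dbutils in explicitly is a design choice for this sketch only; inside a notebook the global dbutils object can be used directly.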