You have an Azure Databricks resource.
You need to log actions that relate to compute changes triggered by the Databricks resources.
Which Databricks services should you log?
A. workspace
B. SSH
C. DBFS
D. clusters
E. jobs

Correct Answer: D
An Azure Databricks cluster is a set of computation resources and configurations on which you run data engineering, data science, and data analytics workloads.
Incorrect Answers:
A: An Azure Databricks workspace is an environment for accessing all of your Azure Databricks assets. The workspace organizes objects (notebooks, libraries, and experiments) into folders, and provides access to data and computational resources such as clusters and jobs.
B: SSH allows you to log into Apache Spark clusters remotely.
C: Databricks File System (DBFS) is a distributed file system mounted into an Azure Databricks workspace and available on Azure Databricks clusters.
E: A job is a way of running a notebook or JAR either immediately or on a scheduled basis.
Reference: https://docs.microsoft.com/en-us/azure/databricks/clusters/
The question asks which Databricks services should be logged in order to track compute changes triggered by Databricks resources.
Databricks is a big data processing and analytics platform built on top of Apache Spark, and it includes several different services that work together to provide a unified data processing experience. Let's break down each of the possible answers and explain what they mean:
A. Workspace: The Databricks workspace is a web-based interface that allows users to interact with Databricks resources. It includes tools for creating and managing notebooks, clusters, jobs, and other resources. While it's important to log activity in the workspace, it's not directly related to compute changes triggered by Databricks resources.
B. SSH: SSH (Secure Shell) is a protocol used to remotely access and manage Databricks clusters. It allows users to log in to a cluster node running the Databricks runtime and execute commands. While logging SSH activity is important for security reasons, it's not directly related to compute changes triggered by Databricks resources.
C. DBFS: The Databricks File System (DBFS) is a distributed file system that allows users to store and manage data within Databricks. It's an important part of the Databricks ecosystem, but it's not directly related to compute changes triggered by Databricks resources.
D. Clusters: A Databricks cluster is a set of compute resources (virtual machines) used to process data within Databricks. When a user runs a notebook or a job, the code is executed on a cluster. Logging cluster activity is essential for tracking compute changes, as it allows you to see when clusters are created, resized, modified, or terminated.
E. Jobs: Databricks jobs are scheduled or on-demand processes that run on a Databricks cluster. They can be used to perform data processing tasks, model training, and other operations. Logging job activity helps track compute usage, as it allows you to see when jobs are created, started, and completed.
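As a hedged illustration of how these logs would be consumed, the sketch below filters exported diagnostic records down to the compute-related categories. The record shape and the specific operationName strings are assumptions modeled on the Azure diagnostic log common schema, not verified output from a real workspace:

```python
# Hypothetical sketch: isolate compute-change events from exported Azure
# Databricks diagnostic log records. Field names ("category", "operationName")
# follow the Azure diagnostic log common schema; sample values are assumptions.

COMPUTE_CATEGORIES = {"clusters", "jobs"}

def compute_events(records):
    """Keep only records emitted by the 'clusters' or 'jobs' log categories."""
    return [r for r in records if r.get("category") in COMPUTE_CATEGORIES]

sample = [
    {"category": "clusters", "operationName": "Microsoft.Databricks/clusters/create"},
    {"category": "jobs", "operationName": "Microsoft.Databricks/jobs/runSucceeded"},
    {"category": "dbfs", "operationName": "Microsoft.Databricks/dbfs/mount"},
]

for event in compute_events(sample):
    print(event["operationName"])  # only the cluster and job operations remain
```

The DBFS record is dropped because, as noted above, file-system activity does not represent a compute change.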
Given the information above, the correct answers to the question are D. clusters and E. jobs. These are the two Databricks services that are directly related to compute changes triggered by Databricks resources, and logging activity in these services will allow you to track changes to your compute environment.
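In practice, these logs are collected by adding a diagnostic setting on the Databricks workspace resource. The following is a minimal sketch using the Azure CLI; the resource IDs are placeholders, and the exact category names available depend on your workspace tier, so verify them against your subscription before use:

```shell
# Hedged sketch (config fragment, not verified against a live subscription):
# enable the compute-related diagnostic log categories on an Azure Databricks
# workspace and route them to a Log Analytics workspace. All IDs are placeholders.
az monitor diagnostic-settings create \
  --name databricks-compute-logs \
  --resource "/subscriptions/<sub-id>/resourceGroups/<rg>/providers/Microsoft.Databricks/workspaces/<workspace>" \
  --workspace "<log-analytics-workspace-resource-id>" \
  --logs '[{"category":"clusters","enabled":true},{"category":"jobs","enabled":true}]'
```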