Databricks Databricks-Certified-Data-Analyst-Associate: Databricks Certified Data Analyst Associate Exam Practice Test

Page: 1 / 7
Total 65 questions

Databricks Certified Data Analyst Associate Exam Questions and Answers

Question 1

Where in the Databricks SQL workspace can a data analyst configure a refresh schedule for a query when the query is not attached to a dashboard or alert?

Options:

A.

Data Explorer

B.

The Visualization editor

C.

The Query Editor

D.

The Dashboard Editor

Question 2

What describes the variance of a set of values?

Options:

A.

Variance is a measure of how far a single observed value is from a set of values.

B.

Variance is a measure of how far an observed value is from the variable's maximum or minimum value.

C.

Variance is a measure of central tendency of a set of values.

D.

Variance is a measure of how far a set of values is spread out from the set's central value.
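
As a rough illustration of the quantity this question asks about, the sketch below computes population variance by hand in Python and compares two made-up lists that share the same mean. The function name population_variance and the sample values are assumptions introduced for this example only; they do not come from the exam or from Databricks.

```python
from statistics import mean, pvariance

def population_variance(values):
    """Average squared distance of each value from the mean of the set."""
    center = mean(values)
    return sum((v - center) ** 2 for v in values) / len(values)

# Hypothetical data: both lists have a mean of 10.
tight = [9, 9, 11, 11]    # every value is 1 away from the mean
spread = [0, 0, 20, 20]   # every value is 10 away from the mean

print(population_variance(tight))   # 1.0
print(population_variance(spread))  # 100.0

# The standard library's pvariance should agree with the hand-rolled version.
print(population_variance(tight) == pvariance(tight))  # True
```

The two lists have the same central value, yet the second yields a variance one hundred times larger, because its values sit ten times farther from the mean and the distances are squared.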

Question 3

Data professionals with varying responsibilities use the Databricks Lakehouse Platform. Which role in the Databricks Lakehouse Platform uses Databricks SQL as its primary service?

Options:

A.

Data scientist

B.

Data engineer

C.

Platform architect

D.

Business analyst

Question 4

What describes Partner Connect in Databricks?

Options:

A.

It allows for free use of Databricks partner tools through a common API.

B.

It makes multi-directional connections between Databricks and Databricks partners easier.

C.

It exposes connection information to third-party tools via Databricks partners.

D.

It is a feature that runs Databricks partner tools on a Databricks SQL Warehouse (formerly known as a SQL endpoint).

Question 5

A data engineering team has created a Structured Streaming pipeline that processes data in micro-batches and populates gold-level tables. The microbatches are triggered every minute.

A data analyst has created a dashboard based on this gold-level data. The project stakeholders want to see the results in the dashboard updated within one minute or less of new data becoming available within the gold-level tables.

Which of the following cautions should the data analyst share prior to setting up the dashboard to complete this task?

Options:

A.

The required compute resources could be costly

B.

The gold-level tables are not appropriately clean for business reporting

C.

The streaming data is not an appropriate data source for a dashboard

D.

The streaming cluster is not fault tolerant

E.

The dashboard cannot be refreshed that quickly

Question 6

A data analyst has set up a SQL query to run every four hours on a SQL endpoint, but the SQL endpoint is taking too long to start up with each run.

Which of the following changes can the data analyst make to reduce the start-up time for the endpoint while managing costs?

Options:

A.

Reduce the SQL endpoint cluster size

B.

Increase the SQL endpoint cluster size

C.

Turn off the Auto stop feature

D.

Increase the minimum scaling value

E.

Use a Serverless SQL endpoint

Question 7

Which of the following should data analysts consider when working with personally identifiable information (PII) data?

Options:

A.

Organization-specific best practices for PII data

B.

Legal requirements for the area in which the data was collected

C.

None of these considerations

D.

Legal requirements for the area in which the analysis is being performed

E.

All of these considerations

Question 8

Which of the following is an advantage of using a Delta Lake-based data lakehouse over common data lake solutions?

Options:

A.

ACID transactions

B.

Flexible schemas

C.

Data deletion

D.

Scalable storage

E.

Open-source formats

Question 9

Which location can be used to determine the owner of a managed table?

Options:

A.

Review the Owner field in the table page using Catalog Explorer

B.

Review the Owner field in the database page using Data Explorer

C.

Review the Owner field in the schema page using Data Explorer

D.

Review the Owner field in the table page using the SQL Editor

Question 10

A data analyst creates a Databricks SQL Query where the result set has the following schema:

region STRING

number_of_customer INT

When the analyst clicks on the "Add visualization" button on the SQL Editor page, which of the following types of visualizations will be selected by default?

Options:

A.

Violin Chart

B.

Line Chart

C.

Bar Chart

D.

Histogram

E.

There is no default. The user must choose a visualization type.

Question 11

A data analyst needs to use the Databricks Lakehouse Platform to quickly create SQL queries and data visualizations. It is a requirement that the compute resources in the platform can be made serverless, and it is expected that data visualizations can be placed within a dashboard.

Which of the following Databricks Lakehouse Platform services/capabilities meets all of these requirements?

Options:

A.

Delta Lake

B.

Databricks Notebooks

C.

Tableau

D.

Databricks Machine Learning

E.

Databricks SQL

Question 12

How can a data analyst determine if query results were pulled from the cache?

Options:

A.

Go to the Query History tab and click on the text of the query. The slideout shows if the results came from the cache.

B.

Go to the Alerts tab and check the Cache Status alert.

C.

Go to the Queries tab and click on Cache Status. The status will be green if the results from the last run came from the cache.

D.

Go to the SQL Warehouse (formerly SQL Endpoints) tab and click on Cache. The Cache file will show the contents of the cache.

E.

Go to the Data tab and click Last Query. The details of the query will show if the results came from the cache.

Question 13

A data analyst wants to create a dashboard with three main sections: Development, Testing, and Production. They want all three sections on the same dashboard, but they want to clearly designate the sections using text on the dashboard.

Which of the following tools can the data analyst use to designate the Development, Testing, and Production sections using text?

Options:

A.

Separate endpoints for each section

B.

Separate queries for each section

C.

Markdown-based text boxes

D.

Direct text written into the dashboard in editing mode

E.

Separate color palettes for each section

Question 14

Which of the following is a benefit of Databricks SQL using ANSI SQL as its standard SQL dialect?

Options:

A.

It has increased customization capabilities

B.

It is easy to migrate existing SQL queries to Databricks SQL

C.

It allows for the use of Photon's computation optimizations

D.

It is more performant than other SQL dialects

E.

It is more compatible with Spark's interpreters

Question 15

An analyst writes a query that contains a query parameter. They then add an area chart visualization to the query. While adding the area chart visualization to a dashboard, the analyst chooses "Dashboard Parameter" for the query parameter associated with the area chart.

Which of the following statements is true?

Options:

A.

The area chart will use whatever is selected in the Dashboard Parameter while all of the other visualizations will remain unchanged regardless of their parameter use.

B.

The area chart will use whatever is selected in the Dashboard Parameter along with all of the other visualizations in the dashboard that use the same parameter.

C.

The area chart will use whatever value is chosen on the dashboard at the time the area chart is added to the dashboard.

D.

The area chart will use whatever value is input by the analyst when the visualization is added to the dashboard. The parameter cannot be changed by the user afterwards.

E.

The area chart will convert to a Dashboard Parameter.

Question 16

A data analyst is working with gold-layer tables to complete an ad-hoc project. A stakeholder has provided the analyst with an additional dataset that can be used to augment the gold-layer tables already in use.

Which of the following terms is used to describe this data augmentation?

Options:

A.

Data testing

B.

Ad-hoc improvements

C.

Last-mile

D.

Last-mile ETL

E.

Data enhancement

Question 17

Which of the following statements describes descriptive statistics?

Options:

A.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

B.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

C.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.

D.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

E.

A branch of statistics that uses quantitative variables that must take on an uncountable set of values.
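
For readers unfamiliar with the term "summary statistics" used in the options above, here is a minimal Python sketch that reduces a small list of numbers to a few commonly used summary figures. The variable names and the order_totals values are hypothetical and chosen only for illustration.

```python
from statistics import mean, median

# Hypothetical sample: five order totals.
order_totals = [12, 15, 15, 18, 40]

# A handful of numbers that summarize the sample.
summary = {
    "count": len(order_totals),
    "min": min(order_totals),
    "max": max(order_totals),
    "mean": mean(order_totals),
    "median": median(order_totals),
}

for name, value in summary.items():
    print(f"{name}: {value}")
```

Here the mean is 20 and the median is 15; the single large value of 40 pulls the mean above the median.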

Question 18

Which statement describes descriptive statistics?

Options:

A.

A branch of statistics that uses a variety of data analysis techniques to infer properties of an underlying distribution of probability.

B.

A branch of statistics that uses summary statistics to categorically describe and summarize data.

C.

A branch of statistics that uses summary statistics to quantitatively describe and summarize data.

D.

A branch of statistics that uses quantitative variables that must take on a finite or countably infinite set of values.

Question 19

Which of the following statements about a refresh schedule is incorrect?

Options:

A.

A query can be refreshed anywhere from 1 minute to 2 weeks.

B.

Refresh schedules can be configured in the Query Editor.

C.

A query being refreshed on a schedule does not use a SQL Warehouse (formerly known as SQL Endpoint).

D.

A refresh schedule is not the same as an alert.

E.

You must have workspace administrator privileges to configure a refresh schedule
