
CompTIA DA0-001 CompTIA Data+ Certification Exam Practice Test

Page: 1 / 36
Total 363 questions

CompTIA Data+ Certification Exam Questions and Answers

Question 1

Which of the following is the best description of discrete data types?

Options:

A.

Non-numeric data used to describe attributes of a population sample

B.

The frequency of the number of times each value occurs by using whole numbers

C.

Numeric values that can be measured on a continuous scale

D.

Non-numeric data used to describe attributes of a population sample ranked in a specific order

Question 2

Which one of the following programming languages is specifically designed for use in analytics applications?

Options:

A.

Python.

B.

R

C.

C++

D.

Java.

Question 3

Given the following data:

[Image for Question 3 not shown]

Which of the following BEST describes the data set?

Options:

A.

There is data bias.

B.

The data is incomplete.

C.

The data is inconsistent.

D.

The data is outliers.

Question 4

An analyst is currently working on a ticket for revamping a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views that are tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question 5

Which of the following reports can be used when insight into operational performance is needed each Wednesday?

Options:

A.

Static report

B.

Tactical report

C.

Recurring report

D.

Ad hoc report

Question 6

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

Options:

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question 7

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

Options:

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Question 8

A company notifies its employees that emails will be automatically moved to a cloud-based server in 180 days. Which of the following describes this concept?

Options:

A.

Data deletion

B.

Data processing

C.

Data retention

D.

Data constraints

Question 9

Which of the following is most likely to be used as a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 10

Which of the following is a common data analytics tool that is also used as an interpreted, high-level, general-purpose programming language?

Options:

A.

SAS

B.

Microsoft Power BI

C.

IBM SPSS

D.

Python

Question 11

An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:

[Image for Question 11 not shown]

Which of the following functions did the analyst use to generate the data in the Sales_indicator column?

Options:

A.

Aggregate

B.

Logical

C.

Date

D.

Sort

Question 12

A data analyst has been asked to organize the table below in the following ways:

By sales from high to low

By state in alphabetical order

[Image for Question 12 not shown]

Which of the following functions will allow the data analyst to organize the table in this manner?

Options:

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting

Question 13

Daniel is using Structured Query Language (SQL) to work with data stored in a relational database.

He would like to add several new rows to a database table.

What command should he use?

Options:

A.

SELECT.

B.

ALTER.

C.

INSERT.

D.

UPDATE.
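
As a minimal sketch of adding several rows to a table in one step, the example below uses Python's built-in sqlite3 module with an in-memory database. The customers table, its columns, and the sample names are hypothetical and are not part of the question.

```python
import sqlite3

# In-memory database so the example leaves nothing on disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")

# Hypothetical rows to add.
new_rows = [(1, "Ana"), (2, "Ben"), (3, "Chen")]

# One INSERT statement, executed once per row.
conn.executemany("INSERT INTO customers (id, name) VALUES (?, ?)", new_rows)
conn.commit()

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
print(row_count)
```

SELECT only reads rows, ALTER changes the table definition, and UPDATE modifies rows that already exist, so none of them adds new rows.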

Question 14

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 15

An analyst must obtain the average daily sales for the following week:

[Image for Question 15 not shown]

Which of the following must the analyst perform to obtain this value?

Options:

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question 16

An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)

Options:

A.

Retention

B.

Integrity

C.

Transmission

D.

Consistency

E.

Encryption

F.

Deletion

Question 17

An analyst is working on a project for a director. During this process, the analyst pulled the data, created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency.

D.

Complete a trend analysis to be included in the report.

Question 18

Angela is aggregating data from a CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issue is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 19

A data analyst has a set of data that shows the number of gallons of oil produced each day. The company would like to know the standard deviation for the data set. The variance for the data is 36 gallons. Which of the following is the standard deviation for gallons produced?

Options:

A.

1.16

B.

6

C.

36

D.

72
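
The standard deviation is the square root of the variance. A minimal sketch of that relationship, using the variance of 36 stated in the question:

```python
import math

variance = 36.0                 # variance given in the question
std_dev = math.sqrt(variance)   # standard deviation = sqrt(variance)
print(std_dev)  # 6.0
```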

Question 20

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

Options:

A.

Data identification

B.

Data processing

C.

Data reporting

D.

Data encryption

E.

Data masking

F.

Data removal

Question 21

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

[Image for Question 21 not shown]

Which of the following charts would be BEST to use?

Options:

A.

Histogram

B.

Pie

C.

Line

D.

Scatter plot

E.

Waterfall

Question 22

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

Options:

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question 23

Given the image below:

[Image for Question 23 not shown]

Which of the following file formats is depicted?

Options:

A.

JSON

B.

CSV

C.

XML

D.

HTML

Question 24

Given the following customer and order tables:

Which of the following describes the number of rows and columns of data that would be present after performing an INNER JOIN of the tables?

Options:

A.

Five rows, eight columns

B.

Seven rows, eight columns

C.

Eight rows, seven columns

D.

Nine rows, five columns

Question 25

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

[Image for Question 25 not shown]

Which of the following types of charts should be considered to BEST display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and the percentage of new customers data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 26

A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?

Options:

A.

A check constraint

B.

A primary key

C.

A foreign key

D.

A unique constraint

Question 27

A data analyst has been asked to merge the tables below, first performing an INNER JOIN and then a LEFT JOIN:

[Image for Question 27 not shown]

Customer Table -

In-store Transactions –

[Image for Question 27 not shown]

Which of the following describes the number of rows of data that can be expected after performing both joins in the order stated, considering the customer table as the main table?

Options:

A.

INNER: 6 rows; LEFT: 9 rows

B.

INNER: 9 rows; LEFT: 6 rows

C.

INNER: 9 rows; LEFT: 15 rows

D.

INNER: 15 rows; LEFT: 9 rows

Question 28

Which of the following file formats is best suited to start exploratory analysis within statistical software?

Options:

A.

CSV

B.

XLSM

C.

XML

D.

JSON

Question 29

Taylor wants to investigate how manufacturing, marketing, and sales expenditures impact overall profitability for her company.

Which of the following systems is the most appropriate?

Options:

A.

OLTP.

B.

OLAP.

C.

Data warehouse.

D.

Data mart.

Question 30

Given the customer table below:

[Image for Question 30 not shown]

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 31

Given the following table:

[Image for Question 31 not shown]

Which of the following describes the data quality issues with the age data?

Options:

A.

Completeness

B.

Consistency

C.

Accuracy

D.

Manipulation

Question 32

An analyst needs to summarize the number of people in Chicago in 2022 using the following set of data:

[Image for Question 32 not shown]

Which of the following steps should the analyst use to provide results? (Select two).

Options:

A.

Aggregation

B.

Sorting

C.

Filtering

D.

Indexing

E.

Cleaning

F.

Replacing

Question 33

Which of the following would be used to store unstructured data from different sources?

Options:

A.

A data lake

B.

A database management system

C.

A database

D.

A data warehouse

Question 34

A business intelligence engineer needs to reduce the size of a data model for reporting purposes. The data set contains more than one million rows, and the table has a date-time column named Date. Which of the following should the analyst do to complete this task?

Options:

A.

Change the data type of the Date column to text.

B.

Trim the date.

C.

Round the hour of the Date column to the start of the hour.

D.

Split the Date column into two columns—time and date.

Question 35

Which one of the following would not normally be considered a summary statistic?

Options:

A.

z-score.

B.

Mean.

C.

Variance.

D.

Standard deviation.

Question 36

Which of the following is a non-parametric test?

Options:

A.

One-sample t-test

B.

Two-way ANOVA

C.

Correlation coefficient

D.

Spearman's rank correlation

Question 37

Which of the following differentiates a flat text file from other data types?

Options:

A.

Data is separated by a delimiter.

B.

Data is stored in defined rows.

C.

Data is defined with key-value pairs.

D.

Data is housed in a markup language.

Question 38

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

[Image for Question 38 not shown]

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 39

What subset of Structured Query Language (SQL) is used to add, remove, modify, or retrieve the information stored within a relational database?

Options:

A.

DDL.

B.

DSL.

C.

DQL.

D.

DML.

Question 40

A cereal manufacturer wants to determine whether the sugar content of its cereal has increased over the years. Which of the following is the appropriate descriptive statistic to use?

Options:

A.

Frequency

B.

Percent change

C.

Variance

D.

Mean

Question 41

An analyst is reviewing the following data:

Car ID    Speed
1231      55
5664      36
5644      18
6505      67
5464      36
6456      38

Which of the following should the analyst include in the measures of central tendency for speed?

Options:

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5
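
The measures of central tendency are the mode, median, and mean; range, maximum, and minimum describe spread instead. A minimal sketch using Python's statistics module, assuming the six speed readings in the question are 55, 36, 18, 67, 36, and 38:

```python
import statistics

# Assumed speed readings, one per car in the question's table.
speeds = [55, 36, 18, 67, 36, 38]

mode_speed = statistics.mode(speeds)      # most frequent value
median_speed = statistics.median(speeds)  # middle of the sorted values
mean_speed = statistics.mean(speeds)      # arithmetic average

print(mode_speed, median_speed, round(mean_speed, 2))
```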

Question 42

Which of the following is an example of a strategy to reduce statistical errors?

Options:

A.

Removing outliers

B.

Adding more data

C.

Transformation

D.

Recoding data

Question 43

Which of the following should be accomplished NEXT after understanding a business requirement for a data analysis report?

Options:

A.

Rephrase the business requirement.

B.

Determine the data necessary for the analysis.

C.

Build a mock dashboard/presentation layout.

D.

Perform exploratory data analysis.

Question 44

Q3 2020 has just ended, and now a data analyst needs to create an ad-hoc sales report that demonstrates how well the Q3 2020 promotion went versus last year's Q3 promotion.

Which of the following date parameters should the analyst use?

Options:

A.

2019 vs. YTD 2020

B.

Q3 2019 vs. Q3 2020

C.

YTD 2019 vs. YTD 2020

D.

Q4 2019 vs. Q3 2020

Question 45

After the daily ETL jobs are completed, the data in the reports does not appear complete, and a lot of data seems to be missing. Which of the following concepts should be used to assess and investigate further?

Options:

A.

Cross-validation

B.

Data profiling

C.

Data integrity

D.

Data consistency

Question 46

Given the following table:

[Image for Question 46 not shown]

Which of the following methods is the best way to describe the changes in the values in the table?

Options:

A.

Average

B.

Range

C.

Standard deviation

D.

Median

Question 47

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

[Image for Question 47 not shown]

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 48

While reviewing survey data, a research analyst notices data is missing from all the responses to a single question. Which of the following methods would BEST address this issue?

Options:

A.

Replace missing data.

B.

Remove duplicate data.

C.

Replace redundant data.

D.

Remove invalid data.

Question 49

Which of the following data manipulation techniques should an analyst use to hide unnecessary data during analysis?

Options:

A.

Filtering

B.

Parametrization

C.

Sorting

D.

Indexing

Question 50

Under which of the following circumstances should the null hypothesis be accepted when α = 0.05?

Options:

A.

When p is 0.00003

B.

When p is 0.001

C.

When p is 0.04

D.

When p is 0.06
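
The usual decision rule compares the p-value with the significance level (alpha): the null hypothesis is rejected when p is below alpha and is not rejected otherwise. A minimal sketch of that rule applied to the four p-values listed, with a hypothetical helper named decide:

```python
ALPHA = 0.05  # significance level from the question


def decide(p_value, alpha=ALPHA):
    """Apply the standard rule: reject H0 only when p < alpha."""
    return "reject H0" if p_value < alpha else "fail to reject H0"


decisions = {p: decide(p) for p in (0.00003, 0.001, 0.04, 0.06)}
for p, decision in decisions.items():
    print(p, decision)
```

Statisticians usually say "fail to reject" rather than "accept" the null hypothesis, since a large p-value does not prove the null hypothesis is true.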

Question 51

During data cleansing, an analyst conducts measures of central tendency on a data set. Which of the following data is the analyst attempting to identify?

Options:

A.

Duplicate

B.

Missing

C.

Outlying

D.

Invalid

Question 52

A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months should the report cover?

Options:

A.

October 1, 2019 to October 31, 2020

B.

October 31, 2020 to November 1, 2021

C.

November 1, 2019 to October 31, 2020

D.

October 31, 2019 to October 31, 2020
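
A rolling 12-month window covers the twelve complete calendar months that end just before the publication date. A minimal sketch of that date arithmetic, assuming the window should contain only complete months; the function name rolling_12_month_window is hypothetical:

```python
from datetime import date, timedelta


def rolling_12_month_window(publish_date):
    """Return (start, end) of the 12 complete months before publish_date."""
    first_of_month = publish_date.replace(day=1)
    # Last day of the most recent complete month.
    end = first_of_month - timedelta(days=1)
    # Same month one year earlier, so the span is exactly twelve months.
    start = first_of_month.replace(year=first_of_month.year - 1)
    return start, end


window = rolling_12_month_window(date(2020, 11, 1))
print(window[0].isoformat(), window[1].isoformat())
```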

Question 53

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5,000,000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question 54

Which of the following is an example of a discrete data type?

Options:

A.

8in (20cm)

B.

5 kids

C.

2.5mi (4km)

D.

10.7lbs (4.9kg)

Question 55

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company's year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 56

A sales team wants visibility of current sales numbers, pipeline, and team performance. The team would also like to see calculations of individuals’ earned commissions and projected commissions based on sales, but they want that information to be kept confidential. Which of the following would be the BEST way to provide this visibility?

Options:

A.

Create a dashboard displaying a data refresh date so users know the current sales numbers and configure permissions to control access.

B.

Create a dashboard for sales numbers, pipeline, and team and individual performance for the management team.

C.

Create a dashboard with filters for the overall team, individuals, and management. Users can filter to see the data they want.

D.

Create a dashboard with views for team, individuals, and management. Configure permissions to control access.

Question 57

An analyst is working with the income data of suburban families in the United States. The data set has a lot of outliers, and the analyst needs to provide a measure that represents the typical income. Which of the following would BEST fulfill the analyst’s goal?

Options:

A.

Median

B.

Mean

C.

Mode

D.

Standard deviation
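
Outliers pull the mean toward them, while the median depends only on the middle of the sorted data. A minimal sketch of the difference, using made-up income figures that are not from the question:

```python
import statistics

# Hypothetical household incomes with one extreme outlier.
incomes = [48_000, 52_000, 55_000, 61_000, 2_500_000]

mean_income = statistics.mean(incomes)      # dragged upward by the outlier
median_income = statistics.median(incomes)  # middle value, unaffected by it

print(mean_income, median_income)
```

In this made-up sample the mean is far above what four of the five households earn, while the median stays close to them.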

Question 58

A data analyst received a large amount of third-party data that needs to be joined with in-house data files. After the data is joined, the analyst notices three columns all contain dates. Which of the following should the analyst do to maintain data consistency?

Options:

A.

Append all date columns and parse the strings.

B.

Impute all three date columns and then merge.

C.

Merge all date columns and unify the format.

D.

Separate the columns into a table and merge.

Question 59

Which of the following is the BEST reason to use database views instead of tables?

Options:

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data, whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Question 60

Which of the following is a characteristic of a relational database?

Options:

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question 61

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

Options:

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question 62

An analyst needs to conduct a quick analysis. Which of the following is the FIRST step the analyst should perform with the data?

Options:

A.

Conduct an exploratory analysis and use descriptive statistics.

B.

Conduct a trend analysis and use a scatter chart.

C.

Conduct a link analysis and illustrate the connection points.

D.

Conduct an initial analysis and use a Pareto chart.

Question 63

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

[Image for Question 63 not shown]

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 64

An analyst in a consumer bank department wants to showcase the concentration of accounts opened in the United States by ZIP Code to describe the effectiveness of the bank's marketing campaigns. Which of the following would be the best way to visualize the data?

Options:

A.

A stacked chart

B.

A tree map

C.

A waterfall chart

D.

A geographic map

Question 65

A business unit made the following modification to the values in a table:

[Image for Question 65 not shown]

Which of the following data quality dimensions was applied in this scenario?

Options:

A.

Integrity

B.

Consistency

C.

Completeness

D.

Accuracy

Question 66

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

[Image for Question 66 not shown]

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate
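
Flagging rows against a threshold is a conditional test on each value. A minimal sketch of such a rule, where the salesperson labels and quantities are hypothetical and the "No" value for the remaining rows is an assumption (the question only specifies "Yes"):

```python
def promotion_flag(quantity_sold, threshold=1_000_000):
    """Return "Yes" when quantity_sold is strictly above the threshold."""
    return "Yes" if quantity_sold > threshold else "No"


# Hypothetical totals per salesperson.
quantity_by_salesperson = {"Rep A": 1_250_000, "Rep B": 1_000_000, "Rep C": 400_000}

flags = {name: promotion_flag(qty) for name, qty in quantity_by_salesperson.items()}
print(flags)
```

Because the comparison is strictly greater than, a salesperson with exactly 1,000,000 is not flagged.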

Question 67

A data analyst received the information in the table below from a recently completed marketing campaign:

[Image for Question 67 not shown]

Which of the following is the total order conversion rate?

Options:

A.

13.2%

B.

14.8%

C.

22.3%

D.

85.2%

Question 68

A data analyst needs to perform a full outer join of a customer's orders using the tables below:

[Image for Question 68 not shown]

Which of the following is the mean of the order quantity?

Options:

A.

73.5

B.

76.5

C.

78.8

D.

81.5

Question 69

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 70

A sales director has requested that a report be developed for individual team members within the division. The director would like the report to be shared with all team members, but individual team members should not be identifiable within the report. Which of the following access requirements would support the director's needs?

Options:

A.

Create an acceptable use policy for the sales data.

B.

Release the report as user-group-based access and include data masking.

C.

Get a data use agreement from the individual team members.

D.

Provide the report based on role and include data encryption.

Question 71

Consider this dataset showing the retirement age of 11 people, in whole years:

54, 54, 54, 55, 56, 57, 57, 58, 58, 60, 60

This table shows a simple frequency distribution of the retirement age data.

[Image for Question 71 not shown]

Options:

A.

56

B.

55

C.

57

D.

54

Question 72

A sales manager requested a report that contains the first name, last name, and phone number of all the company’s customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

Options:

A.

FULL OUTER JOIN

B.

INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Question 73

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power BI

B.

R

C.

SQL

D.

Tableau

Question 74

Which of the following is an example of structured data?

Options:

A.

A credit card number

B.

An email

C.

A photo

D.

Social media correspondence

Question 75

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPIs

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 76

A data analyst has been asked to create one table that has each employee's first name, last name, sales, and address. The sales and addresses are listed in the tables below:

[Image for Question 76 not shown]

Which of the following steps should the analyst take to create the table?

Options:

A.

Transpose the first name and last name in both tables. Use lookup to pull the address field from Table 2 into Table 1.

B.

Use lookup with the first name or last name to pull the address field from Table 2 into Table 1.

C.

Use the append formula in both tables for the first name and last name. Use lookup to pull the address field from Table 2 into Table 1.

D.

Create a column that concatenates the first name and last name in each table. Use concatenate and lookup to bring the address field into Table 1.

Question 77

Which of the following concepts should be applied if a data set with 40 fields needs to be pared down to 20 fields and contains similar data across multiple fields?

Options:

A.

Duplication

B.

Consolidation

C.

Compliance

D.

Standardization

Question 78

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 79

Given the following:

[Image for Question 79 not shown]

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key.

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question 80

Given the diagram below:

[Image for Question 80 not shown]

Which of the following steps is missing?

Options:

A.

Remove redundant data.

B.

Validate the data types.

C.

Connect to the data API.

D.

Normalize the data.

Question 81

Which of the following analysis techniques is an unsupervised data mining process?

Options:

A.

Clustering

B.

Descriptive

C.

Regression

D.

Predictive

Question 82

An analyst needs to determine the appropriate data type for the following sample data:

sample data collected:

Which of the following data types should be used for this data?

Options:

A.

Text

B.

Float

C.

Alphanumeric

D.

Numeric

Question 83

An analyst reviews the following table:

[Image for Question 83 not shown]

Which of the following data types is represented in the values in the RefNo column?

Options:

A.

Numeric

B.

Real Number

C.

Currency

D.

Alphanumeric

Question 84

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 85

Given the image below:

[Image for Question 85 not shown]

The data should be cleaned because of the presence of:

Options:

A.

outliers.

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question 86

A web developer wants to ensure that malicious users can't type SQL statements when they are asked for input, such as their username or user ID.

Which of the following query optimization techniques would effectively prevent SQL Injection attacks?

Options:

A.

Indexing.

B.

Subset of records.

C.

Temporary table in the query set.

D.

Parametrization.
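
A parameterized query passes user input to the database as a bound value instead of splicing it into the SQL text. A minimal sketch using Python's built-in sqlite3 module, where the users table and the sample input are hypothetical and the string-built query is shown only for contrast:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.executemany("INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)])

# Input crafted to change the meaning of a string-built query.
malicious_input = "alice' OR '1'='1"

# Parameterized: the ? placeholder binds the input as a plain value.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?", (malicious_input,)
).fetchall()

# Unsafe contrast: the input becomes part of the SQL statement itself.
unsafe_rows = conn.execute(
    f"SELECT username FROM users WHERE username = '{malicious_input}'"
).fetchall()

print(len(safe_rows), len(unsafe_rows))
```

The parameterized query matches no user because the whole input is compared as a single string, while the string-built query returns every row.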

Question 87

An analyst is explaining the company’s financial systems and reporting tools to a new coworker. Which of the following data quality dimensions are the most important? (Select three).

Options:

A.

Data formatting

B.

Data accuracy

C.

Data maturity

D.

Data field

E.

Data completeness

F.

Data consistency

G.

Data diversity

Question 88

Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?

Options:

A.

Foreign key

B.

Function

C.

Stored procedure

D.

Clustered index

Question 89

A data analyst needs to calculate the mean for Q1 sales using the data set below:

[Image for Question 89 not shown]

Which of the following is the mean?

Options:

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question 90

A database administrator needs to increase performance on a large dimension table. Which of the following is the best way to accomplish this task?

Options:

A.

Sampling

B.

Partitioning

C.

Windowing

D.

Sorting

Question 91

An analyst needs to create an analytics dashboard for an employee intranet site to improve the search functionality, display relevant information, and maintain an updated FAQ page. Which of the following visualizations would best represent what employees are searching for?

Options:

A.

A word cloud

B.

A histogram

C.

A pie chart

D.

A scatter plot

Question 92

A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?

Options:

A.

Data content

B.

Frequency

C.

Filtering

D.

Views

Question 93

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

Options:

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question 94

Which of the following would be considered non-personally identifiable information?

Options:

A.

Cell phone device name

B.

Customer’s name

C.

Government ID number

D.

Telephone number

Question 95

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 96

A data engineer is creating a database field to capture whether a customer likes vanilla ice cream. Which of the following data types is the best to capture this information?

Options:

A.

Integer

B.

Boolean

C.

Categorical

D.

Numeric

Question 97

A company needs a report that provides executives an overview and regional managers with both an overview and specifics. Which of the following reporting elements will achieve these results?

Options:

A.

Observations and insights

B.

Live data feed

C.

Drill-down function

D.

Access permissions

Question 98

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

[Image for Question 98 not shown]

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

Options:

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question 99

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer.

D.

JSON is much more difficult to parse.

Question 100

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 101

A sales manager wants quarterly sales reports broken down by unit and week. Which of the following data output lists includes the most necessary information?

Options:

A.

Order number, salesperson, date shipped, recipient address, and price

B.

Item name, salesperson, recipient address, shipping cost, and date shipped

C.

Item number, item name, salesperson, date sold, and price

D.

Item name, salesperson, price, shipping cost, and date shipped

Question 102

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government business it deals with, some account information is not shown. Which of the following fields should be masked?

Options:

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question 103

Which of the following best describes a 95% confidence interval?

Options:

A.

There is a 95% probability that a sample is within one standard deviation of the mean.

B.

A stated range may contain 95% of the population mean, 95% of the time.

C.

A set of ranges contains the population mean with 95% certainty.

D.

A range contains 95% of the population mean.

Question 104

An analyst is reporting on the average income for a county and is reviewing the following data:

[Image for Question 104 not shown]

Which of the following is the reason the analyst would need to cleanse the data in this data set?

Options:

A.

Data completeness

B.

Data outliers

C.

Duplicate data

D.

Missing values

Question 105

Which of the following data types must be used when working with variables that require classification into two or more groups before analysis?

Options:

A.

Discrete

B.

Numerical

C.

Alphanumeric

D.

Categorical

Question 106

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the same people exists in two of the six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

Options:

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question 107

A data analyst has been asked to create a daily manufacturing report for the floor manager. Which of the following metrics should be included in the report?

Options:

A.

Tons of steel produced per hour

B.

Annual sales budget

C.

End-of-day stock price

D.

Daily corporate employee count

Question 108

An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?

Options:

A.

Performance analysis

B.

Exploratory analysis

C.

Link analysis

D.

Trend analysis
