CertNexus AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Practice Test

Page: 1 / 9
Total 92 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

Which of the following describes a neural network without an activation function?

Options:

A.

A form of a linear regression

B.

A form of a quantile regression

C.

An unsupervised learning technique

D.

A radial basis function kernel
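
As a minimal sketch of the idea behind this question, the following plain-Python snippet stacks two layers with no activation function and compares the result with a single matrix applied to the same input. The weights `W1` and `W2`, the input `x`, and the helper names are made up for illustration, and biases are left out to keep it short.

```python
# Hypothetical two-layer network with no activation function.
# All weights and inputs are invented for illustration; biases are omitted.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]  # first layer, 3x2
W2 = [[2.0, 0.0, 1.0]]                       # second layer, 1x3
x = [0.5, -2.0]

two_layer = matvec(W2, matvec(W1, x))    # pass x through both layers
collapsed = matvec(matmul(W2, W1), x)    # one combined weight matrix

print(two_layer, collapsed)
```

Both calls return the same value, because composing linear maps gives another linear map.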

Question 2

A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?

Options:

A.

Mean squared error

B.

Precision and accuracy

C.

Precision and recall

D.

Recall and explained variance
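
To make the 1% prevalence in this question concrete, here is a small sketch that scores a deliberately useless classifier, one that always predicts "no disease", on an invented dataset of 1,000 people. The data and variable names are assumptions for illustration only.

```python
# Invented dataset: 10 of 1,000 people (1%) have the disease.
y_true = [1] * 10 + [0] * 990
# A trivial classifier that always predicts "no disease".
y_pred = [0] * 1000

pairs = list(zip(y_true, y_pred))
tp = sum(t == 1 and p == 1 for t, p in pairs)
tn = sum(t == 0 and p == 0 for t, p in pairs)
fp = sum(t == 0 and p == 1 for t, p in pairs)
fn = sum(t == 1 and p == 0 for t, p in pairs)

accuracy = (tp + tn) / len(y_true)
# Guard against division by zero when there are no positive predictions.
precision = tp / (tp + fp) if (tp + fp) else 0.0
recall = tp / (tp + fn) if (tp + fn) else 0.0

print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

The classifier never finds a single sick person, yet its accuracy is 0.99, while its recall is 0.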

Question 3

Which two of the following criteria are essential for machine learning models to achieve before deployment? (Select two.)

Options:

A.

Complexity

B.

Data size

C.

Explainability

D.

Portability

E.

Scalability

Question 4

Which of the following describes a benefit of machine learning for solving business problems?

Options:

A.

Increasing the quantity of original data

B.

Increasing the speed of analysis

C.

Improving the constraint of the problem

D.

Improving the quality of original data

Question 5

Which of the following is the definition of accuracy?

Options:

A.

(True Positives + False Positives) / Total Predictions

B.

(True Positives + True Negatives) / Total Predictions

C.

True Positives / (True Positives + False Negatives)

D.

True Positives / (True Positives + False Positives)
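
The four expressions in the options can be evaluated side by side on one confusion matrix. The counts below are hypothetical and chosen only so the arithmetic is easy to follow.

```python
# Hypothetical confusion-matrix counts, chosen only for illustration.
tp, fp, tn, fn = 40, 10, 45, 5
total = tp + fp + tn + fn  # 100 predictions in all

option_a = (tp + fp) / total  # share of predictions that are positive
option_b = (tp + tn) / total  # share of predictions that are correct
option_c = tp / (tp + fn)     # share of actual positives that were found
option_d = tp / (tp + fp)     # share of positive predictions that are right

for name, value in [("A", option_a), ("B", option_b),
                    ("C", option_c), ("D", option_d)]:
    print(f"Option {name}: {value:.3f}")
```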

Question 6

What is the open framework designed to help detect, respond to, and remediate threats in ML systems?

Options:

A.

Adversarial ML Threat Matrix

B.

MITRE ATT&CK® Matrix

C.

OWASP Threat and Safeguard Matrix

D.

Threat Susceptibility Matrix

Question 7

Which of the following describes a typical use case of video tracking?

Options:

A.

Augmented dreaming

B.

Medical diagnosis

C.

Traffic monitoring

D.

Video composition

Question 8

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

Options:

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question 9

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low-quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

Options:

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
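
A quick way to get a feel for the logit-transformation defined above is to evaluate it at a few proportions and see how far a fixed step of 0.01 moves the transformed value. This is a sketch using the natural logarithm; the sample proportions are arbitrary.

```python
import math

def logit(p):
    """logit(p) = log(p / (1 - p)), using the natural logarithm."""
    return math.log(p / (1 - p))

for p in (0.01, 0.02, 0.50, 0.51, 0.98, 0.99):
    print(f"p={p:.2f}  logit(p)={logit(p):+.3f}")

# How far does a step of 0.01 in p move the transformed value?
edge_step = logit(0.02) - logit(0.01)  # near the lower boundary
mid_step = logit(0.51) - logit(0.50)   # near the middle of the range
print(f"step near 0.01: {edge_step:.3f}, step near 0.50: {mid_step:.3f}")
```

The same 0.01 change in p produces a much larger change in logit(p) near the ends of the range than near 0.5.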

Question 10

A dataset can contain a range of values that depict a certain characteristic, such as grades on tests in a class during the semester. A specific student has so far received the following grades: 76, 81, 78, 87, 75, and 72. There is one final test in the semester. What minimum grade would the student need to achieve on the last test to get an 80% average?

Options:

A.

82

B.

89

C.

91

D.

94
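
The arithmetic behind this question can be checked in a few lines: the target average times the number of tests gives the total needed, and subtracting the points already earned gives the grade required on the last test. Variable names are illustrative.

```python
grades = [76, 81, 78, 87, 75, 72]  # grades received so far
target_average = 80
total_tests = len(grades) + 1      # six tests taken plus the final one

points_needed = target_average * total_tests  # 80 * 7 = 560
points_so_far = sum(grades)                   # 469
required = points_needed - points_so_far

print(required)  # 91
```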

Question 11

Your dependent variable Y is a count, ranging from 0 to infinity. Because Y is approximately log-normally distributed, you decide to log-transform the data prior to performing a linear regression.

What should you do before log-transforming Y?

Options:

A.

Add 1 to all of the Y values.

B.

Divide all the Y values by the standard deviation of Y.

C.

Explore the data for outliers.

D.

Subtract the mean of Y from all the Y values.
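
Because a count can be exactly 0, it is worth seeing what the logarithm does at that value. The sketch below uses Python's standard `math` module on a made-up list of counts; `math.log1p(y)` computes log(1 + y).

```python
import math

counts = [0, 1, 4, 9, 99]  # made-up count data that includes a zero

# The logarithm is undefined at 0, so math.log raises an error there.
try:
    math.log(counts[0])
    log_of_zero_failed = False
except ValueError:
    log_of_zero_failed = True

# Shifting every value by 1 keeps all inputs to the log strictly positive.
shifted = [math.log1p(y) for y in counts]

print(log_of_zero_failed)
print([round(v, 3) for v in shifted])
```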

Question 12

Which of the following is a common negative side effect of not using regularization?

Options:

A.

Overfitting

B.

Slow convergence time

C.

Higher compute resources

D.

Low test accuracy

Question 13

An organization sells home security cameras and has asked its data scientists to implement a model to detect human faces, as distinguished from animals, so they can alert their customers only when a human gets close to their house.

Which of the following algorithms is an appropriate option with a correct reason?

Options:

A.

A decision tree algorithm, because the problem is a classification problem with a small number of features.

B.

k-means, because this is a clustering problem with a small number of features.

C.

Logistic regression, because this is a classification problem and our data is linearly separable.

D.

Neural network model, because this is a classification problem with a large number of features.

Question 14

Which of the following sentences is true about model evaluation and model validation in ML pipelines?

Options:

A.

Model evaluation and validation are the same.

B.

Model evaluation is defined as an external component.

C.

Model validation is defined as a set of tasks to confirm the model performs as expected.

D.

Model validation occurs before model evaluation.

Question 15

Which of the following are true about the transform-design pattern for a machine learning pipeline? (Select three.)

Options:

A.

It aims to separate inputs from features.

B.

It encapsulates the processing steps of ML pipelines.

C.

It ensures reproducibility.

D.

It represents steps in the pipeline with a directed acyclic graph (DAG).

E.

It seeks to isolate individual steps of ML pipelines.

F.

It transforms the output data after production.

Question 16

Which two encoders can be used to transform categorical data into numerical features? (Select two.)

Options:

A.

Count Encoder

B.

Log Encoder

C.

Mean Encoder

D.

Median Encoder

E.

One-Hot Encoder
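
For readers unfamiliar with the terminology, here is a plain-Python sketch of what two of the encoder names above usually mean in practice: count encoding replaces each category with how often it occurs, and one-hot encoding replaces it with a 0/1 indicator vector. The `colors` column is invented, and the snippet is purely illustrative of those two techniques.

```python
from collections import Counter

colors = ["red", "blue", "red", "green", "red", "blue"]  # invented column

# Count encoding: each category becomes its frequency in the column.
frequency = Counter(colors)
count_encoded = [frequency[c] for c in colors]

# One-hot encoding: each category becomes a 0/1 indicator vector,
# with one position per distinct category (sorted for a stable order).
categories = sorted(set(colors))
one_hot = [[int(c == cat) for cat in categories] for c in colors]

print(count_encoded)
print(categories)
for row in one_hot:
    print(row)
```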

Question 17

In general, models that perform their tasks:

Options:

A.

Less accurately are less robust against adversarial attacks.

B.

Less accurately are neither more nor less robust against adversarial attacks.

C.

More accurately are less robust against adversarial attacks.

D.

More accurately are neither more nor less robust against adversarial attacks.

Question 18

Word Embedding describes a task in natural language processing (NLP) where:

Options:

A.

Words are converted into numerical vectors.

B.

Words are featurized by taking a histogram of letter counts.

C.

Words are featurized by taking a matrix of bigram counts.

D.

Words are grouped together into clusters and then represented by word cluster membership.

Question 19

Normalization is the transformation of features:

Options:

A.

By subtracting from the mean and dividing by the standard deviation.

B.

Into the normal distribution.

C.

So that they are on a similar scale.

D.

To different scales from each other.
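
Two common rescaling schemes can be sketched with the standard library: min-max scaling maps values into the range 0 to 1, and z-score scaling subtracts the mean and divides by the standard deviation. The feature values and function names below are assumptions made for illustration.

```python
import statistics

ages = [20, 30, 40, 50, 60]                      # invented feature, small scale
incomes = [20_000, 35_000, 50_000, 80_000, 120_000]  # invented feature, large scale

def min_max(values):
    """Rescale values linearly so the smallest is 0 and the largest is 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_score(values):
    """Subtract the mean and divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

print(min_max(ages))
print(min_max(incomes))
print([round(z, 3) for z in z_score(incomes)])
```

After either transformation, the two features occupy comparable numeric ranges even though their raw values differ by several orders of magnitude.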

Question 20

Why do data skews happen in the ML pipeline?

Options:

A.

Test and evaluation data are designed incorrectly.

B.

There is a mismatch between live input data and offline data.

C.

There is a mismatch between live output data and offline data.

D.

There is insufficient training data for evaluation.

Question 21

Which of the following best describes distributed artificial intelligence?

Options:

A.

It does not require hyperparameter tuning because the distributed nature accounts for the bias.

B.

It intelligently pre-distributes the weight of starting a neural network.

C.

It relies on a distributed system that performs robust computations across a network of unreliable nodes.

D.

It uses a centralized system to speak to decentralized nodes.

Question 22

A healthcare company experiences a cyberattack, where the hackers were able to reverse-engineer a dataset to break confidentiality.

Which of the following is TRUE regarding the dataset parameters?

Options:

A.

The model is overfitted and trained on a high quantity of patient records.

B.

The model is overfitted and trained on a low quantity of patient records.

C.

The model is underfitted and trained on a high quantity of patient records.

D.

The model is underfitted and trained on a low quantity of patient records.

Question 23

Which database is designed to better anticipate and avoid risks of AI systems causing safety, fairness, or other ethical problems?

Options:

A.

Asset

B.

Code Repository

C.

Configuration Management

D.

Incident

Question 24

Which of the following is the correct definition of the quality criterion that describes completeness?

Options:

A.

The degree to which all required measures are known.

B.

The degree to which a set of measures are equivalent across systems.

C.

The degree to which a set of measures are specified using the same units of measure in all systems.

D.

The degree to which the measures conform to defined business rules or constraints.

Question 25

Which two techniques are used to build personas in the ML development lifecycle? (Select two.)

Options:

A.

Population estimates

B.

Population regression

C.

Population resampling

D.

Population triage

E.

Population variance

Question 26

You are building a prediction model to develop a tool that can diagnose a particular disease so that individuals with the disease can receive treatment. The treatment is cheap and has no side effects. Patients with the disease who don't receive treatment have a high risk of mortality.

It is of primary importance that your diagnostic tool has which of the following?

Options:

A.

High negative predictive value

B.

High positive predictive value

C.

Low false negative rate

D.

Low false positive rate

Question 27

You are developing a prediction model. Your team indicates they need an algorithm that is fast and requires low memory and low processing power. Assuming the following algorithms have similar accuracy on your data, which is most likely to be an ideal choice for the job?

Options:

A.

Deep learning neural network

B.

Random forest

C.

Ridge regression

D.

Support-vector machine
