CertNexus AIP-210 CertNexus Certified Artificial Intelligence Practitioner (CAIP) Exam Practice Test

Page: 1 / 9
Total 92 questions

CertNexus Certified Artificial Intelligence Practitioner (CAIP) Questions and Answers

Question 1

Which of the following describes a neural network without an activation function?

Options:

A.

A form of a linear regression

B.

A form of a quantile regression

C.

An unsupervised learning technique

D.

A radial basis function kernel
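
To make the scenario in this question concrete, the sketch below stacks two layers with no activation function and checks that they collapse into a single affine map of the form W x + b. The weights and the helper names (affine, two_layer, single_layer) are made-up assumptions for illustration, not part of the exam material.

```python
# Two stacked layers with no activation function (identity activation).
# All weights are hypothetical illustration values.

def affine(W, b, x):
    """Return W @ x + b for a list-of-rows matrix W and vectors b, x."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

W1, b1 = [[1.0, 2.0], [0.5, -1.0]], [0.0, 1.0]   # layer 1: 2 inputs -> 2 units
W2, b2 = [[2.0, -1.0]], [3.0]                    # layer 2: 2 units -> 1 output

def two_layer(x):
    """Apply layer 1, then layer 2, with nothing in between."""
    return affine(W2, b2, affine(W1, b1, x))

# Collapse both layers into one: W = W2 W1 and b = W2 b1 + b2.
W = [[sum(W2[i][k] * W1[k][j] for k in range(2)) for j in range(2)]
     for i in range(1)]
b = affine(W2, b2, b1)

def single_layer(x):
    """The equivalent single affine map."""
    return affine(W, b, x)

print(two_layer([3.0, -2.0]), single_layer([3.0, -2.0]))
```

Under these assumptions both functions return the same value for any input, because composing affine maps yields another affine map.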

Question 2

Why do data skews happen in the ML pipeline?

Options:

A.

Test and evaluation data are designed incorrectly.

B.

There is a mismatch between live input data and offline data.

C.

There is a mismatch between live output data and offline data.

D.

There is insufficient training data for evaluation.

Question 3

When working with textual data and trying to classify text into different languages, which approach to representing features makes the most sense?

Options:

A.

Bag of words model with TF-IDF

B.

Bag of bigrams (2 letter pairs)

C.

Word2Vec algorithm

D.

Clustering similar words and representing words by group membership
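
For reference, option B's "2 letter pairs" are character bigrams. The sketch below shows one way such features could be counted from raw text. The function name char_bigrams and the choice to ignore case, digits, punctuation, and pairs that span a space are assumptions made for illustration.

```python
from collections import Counter

def char_bigrams(text):
    """Count overlapping two-letter pairs within words, ignoring case."""
    # Keep only letters and spaces so that pairs never include punctuation.
    cleaned = "".join(c for c in text.lower() if c.isalpha() or c == " ")
    pairs = (cleaned[i:i + 2] for i in range(len(cleaned) - 1))
    # Drop pairs that straddle a word boundary.
    return Counter(p for p in pairs if " " not in p)

counts = char_bigrams("The theme")
print(counts.most_common(3))
```

The resulting counts can be turned into a fixed-length feature vector by choosing a vocabulary of bigrams and reading off each count.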

Question 4

Which of the following is a privacy-focused law that an AI practitioner should adhere to while designing and adapting an AI system that utilizes personal data?

Options:

A.

General Data Protection Regulation (GDPR)

B.

ISO/IEC 27001

C.

PCI DSS

D.

Sarbanes Oxley (SOX)

Question 5

The graph is an elbow plot showing inertia (the within-cluster sum of squares) on the y-axis and the number of clusters (K) on the x-axis. It shows how inertia changes as the number of clusters changes when using the k-means algorithm.

What would be an optimal value of K to ensure a good number of clusters?

Options:

A.

2

B.

3

C.

5

D.

9

Question 6

Which of the following is a common negative side effect of not using regularization?

Options:

A.

Overfitting

B.

Slow convergence time

C.

Higher compute resources

D.

Low test accuracy

Question 7

Your dependent variable data is a proportion. The observed range of your data is 0.01 to 0.99. The instrument used to generate the dependent variable data is known to generate low quality data for values close to 0 and close to 1. A colleague suggests performing a logit-transformation on the data prior to performing a linear regression. Which of the following is a concern with this approach?

Definition of logit-transformation

If p is the proportion: logit(p) = log(p / (1 - p))

Options:

A.

After logit-transformation, the data may violate the assumption of independence.

B.

Noisy data could become more influential in your model.

C.

The model will be more likely to violate the assumption of normality.

D.

Values near 0.5 before logit-transformation will be near 0 after.
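
As a numeric illustration of the definition above, the sketch below evaluates logit(p) at a few proportions and compares how much the same small change in p (0.005, an arbitrary value chosen for illustration) moves the transformed value near the edge of the range versus near the middle.

```python
import math

def logit(p):
    """logit(p) = log(p / (1 - p)), defined for 0 < p < 1."""
    return math.log(p / (1 - p))

for p in (0.01, 0.1, 0.5, 0.9, 0.99):
    print(p, round(logit(p), 3))

# Effect of the same small measurement change at two locations.
delta = 0.005  # hypothetical measurement error
shift_near_edge = logit(0.01 + delta) - logit(0.01)
shift_near_middle = logit(0.5 + delta) - logit(0.5)
print(shift_near_edge, shift_near_middle)
```

The transform is symmetric around p = 0.5 and stretches the scale increasingly as p approaches 0 or 1.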

Question 8

A product manager is designing an Artificial Intelligence (AI) solution and wants to do so responsibly, evaluating both positive and negative outcomes.

The team creates a shared taxonomy of potential negative impacts and conducts an assessment along vectors such as severity, impact, frequency, and likelihood.

Which modeling technique does this team use?

Options:

A.

Business

B.

Harms

C.

Process

D.

Threat

Question 9

Which of the following pieces of AI technology provides the ability to create fake videos?

Options:

A.

Generative adversarial networks (GAN)

B.

Long short-term memory (LSTM) networks

C.

Recurrent neural networks (RNN)

D.

Support-vector machines (SVM)

Question 10

Which of the following is a type 1 error in statistical hypothesis testing?

Options:

A.

The null hypothesis is false, but fails to be rejected.

B.

The null hypothesis is false and is rejected.

C.

The null hypothesis is true and fails to be rejected.

D.

The null hypothesis is true, but is rejected.

Question 11

Which of the following is the correct definition of the quality criteria that describes completeness?

Options:

A.

The degree to which all required measures are known.

B.

The degree to which a set of measures are equivalent across systems.

C.

The degree to which a set of measures are specified using the same units of measure in all systems.

D.

The degree to which the measures conform to defined business rules or constraints.

Question 12

You have a dataset with many features that you are using to classify a dependent variable. Because the sample size is small, you are worried about overfitting. Which algorithm is ideal to prevent overfitting?

Options:

A.

Decision tree

B.

Logistic regression

C.

Random forest

D.

XGBoost

Question 13

Which of the following unsupervised learning models can a bank use for fraud detection?

Options:

A.

Anomaly detection

B.

DBSCAN

C.

Hierarchical clustering

D.

k-means

Question 14

Which of the following are true about the transform-design pattern for a machine learning pipeline? (Select three.)

Options:

A.

It aims to separate inputs from features.

B.

It encapsulates the processing steps of ML pipelines.

C.

It ensures reproducibility.

D.

It represents steps in the pipeline with a directed acyclic graph (DAG).

E.

It seeks to isolate individual steps of ML pipelines.

F.

It transforms the output data after production.

Question 15

When should you use semi-supervised learning? (Select two.)

Options:

A.

A small set of labeled data is available but not representative of the entire distribution.

B.

A small set of labeled data is biased toward one class.

C.

Labeling data is challenging and expensive.

D.

There is a large amount of labeled data to be used for predictions.

E.

There is a large amount of unlabeled data to be used for predictions.

Question 16

A big data architect needs to be cautious about personally identifiable information (PII) that may be captured with their new IoT system. What is the final stage of the Data Management Life Cycle, which the architect must complete in order to implement data privacy and security appropriately?

Options:

A.

De-Duplicate

B.

Destroy

C.

Detain

D.

Duplicate

Question 17

Normalization is the transformation of features:

Options:

A.

By subtracting the mean and dividing by the standard deviation.

B.

Into the normal distribution.

C.

So that they are on a similar scale.

D.

To different scales from each other.
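
The terms in these options are used loosely in practice, so the sketch below contrasts two common rescaling operations: min-max scaling to the range 0 to 1, and standardization (subtract the mean, divide by the standard deviation). The function names and sample values are assumptions for illustration, and both functions assume the input values are not all identical.

```python
import statistics

def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

heights_cm = [150.0, 175.0, 200.0]   # hypothetical feature
print(min_max_scale(heights_cm))
print(standardize(heights_cm))
```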

Question 18

Which of the following sentences is true about model evaluation and model validation in ML pipelines?

Options:

A.

Model evaluation and validation are the same.

B.

Model evaluation is defined as an external component.

C.

Model validation is defined as a set of tasks to confirm the model performs as expected.

D.

Model validation occurs before model evaluation.

Question 19

A classifier has been implemented to predict whether or not someone has a specific type of disease. Considering that only 1% of the population in the dataset has this disease, which measures will work the BEST to evaluate this model?

Options:

A.

Mean squared error

B.

Precision and accuracy

C.

Precision and recall

D.

Recall and explained variance
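
To see how the listed measures behave at 1% prevalence, the sketch below computes precision, recall, and accuracy from confusion-matrix counts. The counts are hypothetical: 1,000 people, 10 of whom have the disease, and a classifier that predicts "no disease" for everyone.

```python
def precision_recall_accuracy(tp, fp, fn, tn):
    """Return (precision, recall, accuracy); a ratio is None when undefined."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else None
    recall = tp / (tp + fn) if (tp + fn) else None
    return precision, recall, accuracy

# Hypothetical classifier that always predicts the negative class.
always_negative = precision_recall_accuracy(tp=0, fp=0, fn=10, tn=990)
print(always_negative)
```

With these assumed counts the accuracy is 0.99 even though no diseased person is identified, while recall is 0 and precision is undefined.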

Question 20

For each of the last 10 years, your team has been collecting data from a group of subjects, including their age and numerous biomarkers collected from blood samples. You are tasked with creating a prediction model of age using the biomarkers as input. You start by performing a linear regression using all of the data over the 10-year period, with age as the dependent variable and the biomarkers as predictors.

Which assumption of linear regression is being violated?

Options:

A.

Equality of variance (Homoscedasticity)

B.

Independence

C.

Linearity

D.

Normality

Question 21

You train a neural network model with two layers, each layer having four nodes, and realize that the model is underfit. Which of the actions below will NOT work to fix this underfitting?

Options:

A.

Add features to training data

B.

Get more training data

C.

Increase the complexity of the model

D.

Train the model for more epochs

Question 22

You are building a prediction model to develop a tool that can diagnose a particular disease so that individuals with the disease can receive treatment. The treatment is cheap and has no side effects. Patients with the disease who don't receive treatment have a high risk of mortality.

It is of primary importance that your diagnostic tool has which of the following?

Options:

A.

High negative predictive value

B.

High positive predictive value

C.

Low false negative rate

D.

Low false positive rate

Question 23

Which of the following sentences is TRUE about the definition of cloud models for machine learning pipelines?

Options:

A.

Data as a Service (DaaS) can host the databases providing backups, clustering, and high availability.

B.

Infrastructure as a Service (IaaS) can provide CPU, memory, disk, network and GPU.

C.

Platform as a Service (PaaS) can provide some services within an application such as payment applications to create efficient results.

D.

Software as a Service (SaaS) can provide AI practitioner data science services such as Jupyter notebooks.

Question 24

You create a prediction model with 96% accuracy. While the model's true positive rate (TPR) is performing well at 99%, the true negative rate (TNR) is only 50%. Your supervisor tells you that the TNR needs to be higher, even if it decreases the TPR. Upon further inspection, you notice that the vast majority of your data is truly positive.

What method could help address your issue?

Options:

A.

Normalization

B.

Oversampling

C.

Principal components analysis

D.

Quality filtering
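
The figures in this question pin down the class balance, since accuracy = p * TPR + (1 - p) * TNR, where p is the share of truly positive cases. The sketch below solves for p and then shows a minimal random-oversampling routine for a binary dataset. The function names, the toy data, and the duplicate-until-balanced strategy are assumptions for illustration, not a prescribed method.

```python
import random

def positive_share(accuracy, tpr, tnr):
    """Solve accuracy = p * tpr + (1 - p) * tnr for p."""
    return (accuracy - tnr) / (tpr - tnr)

def random_oversample(rows, labels, minority_label, seed=0):
    """Duplicate random minority rows until both classes have equal counts.

    Assumes binary labels and that minority_label is the smaller class.
    """
    rng = random.Random(seed)
    minority = [r for r, y in zip(rows, labels) if y == minority_label]
    majority_count = sum(1 for y in labels if y != minority_label)
    extra = [rng.choice(minority) for _ in range(majority_count - len(minority))]
    return rows + extra, labels + [minority_label] * len(extra)

p = positive_share(accuracy=0.96, tpr=0.99, tnr=0.50)
print(round(p, 3))  # share of truly positive cases implied by the question

toy_rows, toy_labels = random_oversample([1, 2, 3, 4, 5], [1, 1, 1, 1, 0], 0)
print(toy_labels.count(0), toy_labels.count(1))
```

Oversampling of this kind is normally applied to the training split only, so that evaluation data keeps its original class balance.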

Question 25

Which of the following options is a correct approach for scheduling model retraining in a weather prediction application?

Options:

A.

As new resources become available

B.

Once a month

C.

When the input format changes

D.

When the input volume changes

Question 26

Which two techniques are used to build personas in the ML development lifecycle? (Select two.)

Options:

A.

Population estimates

B.

Population regression

C.

Population resampling

D.

Population triage

E.

Population variance

Question 27

Which of the following is the primary purpose of hyperparameter optimization?

Options:

A.

Controls the learning process of a given algorithm

B.

Makes models easier to explain to business stakeholders

C.

Improves model interpretability

D.

Increases recall over precision
