Labour Day Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: suredis

Microsoft DP-100 Designing and Implementing a Data Science Solution on Azure Exam Practice Test

Page: 1 / 41
Total 407 questions

Designing and Implementing a Data Science Solution on Azure Questions and Answers

Testing Engine

  • Product Type: Testing Engine
$47.25  $134.99

PDF Study Guide

  • Product Type: PDF Study Guide
$40.25  $114.99
Question 1

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Mutual information

B.

Mood’s median test

C.

Kendall correlation

D.

Permutation Feature Importance

Question 2

You need to replace the missing data in the AccessibilityToHighway columns.

How should you configure the Clean Missing Data module? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 3

You need to identify the methods for dividing the data according to the testing requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 4

You need to implement early stopping criteria as suited in the model training requirements.

Which three code segments should you use to develop the solution? To answer, move the appropriate code segments from the list of code segments to the answer area and arrange them in the correct order.

NOTE: More than one order of answer choices is correct. You will receive credit for any of the correct orders you select.

Options:

Question 5

You need to correct the model fit issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 6

You need to configure the Edit Metadata module so that the structure of the datasets match.

Which configuration options should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 7

You need to configure the Feature Based Feature Selection module based on the experiment requirements and datasets.

How should you configure the module properties? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 8

You need to visually identify whether outliers exist in the Age column and quantify the outliers before the outliers are removed.

Which three Azure Machine Learning Studio modules should you use in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 9

You need to set up the Permutation Feature Importance module according to the model training requirements.

Which properties should you select? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 10

You need to produce a visualization for the diagnostic test evaluation according to the data visualization requirements.

Which three modules should you recommend be used in sequence? To answer, move the appropriate modules from the list of modules to the answer area and arrange them in the correct order.

Options:

Question 11

You need to identify the methods for dividing the data according, to the testing requirements.

Which properties should you select? To answer, select the appropriate option-, m the answer area. NOTE: Each correct selection is worth one point.

Options:

Question 12

You need to configure the Permutation Feature Importance module for the model training requirements.

What should you do? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 13

You need to select a feature extraction method.

Which method should you use?

Options:

A.

Spearman correlation

B.

Mutual information

C.

Mann-Whitney test

D.

Pearson’s correlation

Question 14

You must store data in Azure Blob Storage to support Azure Machine Learning.

You need to transfer the data into Azure Blob Storage.

What are three possible ways to achieve the goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

Bulk Insert SQL Query

B.

AzCopy

C.

Python script

D.

Azure Storage Explorer

E.

Bulk Copy Program (BCP)

Question 15

You configure a Deep Learning Virtual Machine for Windows.

You need to recommend tools and frameworks to perform the following:

  • Build deep neural network (DNN) models
  • Perform interactive data exploration and visualization

Which tools and frameworks should you recommend? To answer, drag the appropriate tools to the correct tasks. Each tool may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Options:

Question 16

You are developing deep learning models to analyze semi-structured, unstructured, and structured data types.

You have the following data available for model building:

  • Video recordings of sporting events
  • Transcripts of radio commentary about events
  • Logs from related social media feeds captured during sporting events

You need to select an environment for creating the model.

Which environment should you use?

Options:

A.

Azure Cognitive Services

B.

Azure Data Lake Analytics

C.

Azure HDInsight with Spark MLib

D.

Azure Machine Learning Studio

Question 17

You are implementing hyperparameter tuning by using Bayesian sampling for an Azure ML Python SDK v2-based model training from a notebook. The notebook is in an Azure Machine Learning workspace. The notebook uses a training script that runs on a compute cluster with 20 nodes.

The code implements Bandit termination policy with slack_factor set to 02 and a sweep job with max_concurrent_trials set to 10.

You must increase effectiveness of the tuning process by improving sampling convergence.

You need to select which sampling convergence to use.

What should you select?

Options:

A.

Set the value of slack. factor of earty. termination policy to 0.1.

B.

Set the value of max_concurrent_trials to 4.

C.

Set the value of slack_factor of eartyjermination policy to 0.9.

D.

Set the value of max. concurrentjrials to 20.

Question 18

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You train and register an Azure Machine Learning model.

You plan to deploy the model to an online endpoint.

You need to ensure that applications will be able to use the authentication method with a non-expiring artifact to access the model.

Solution:

Create a managed online endpoint and set the value of its auto_mode parameter to key. Deploy the model to the inline endpoint.

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 19

You are using a Git repository to track work in an Azure Machine Learning workspace.

You need to authenticate a Git account by using SSH.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 20

You must use the Azure Machine Learning SDK to interact with data and experiments in the workspace.

You need to configure the config.json file to connect to the workspace from the Python environment.

Which two additional parameters must you add to the config.json file in order to connect to the workspace? Each correct answer presents part of the solution.

NOTE: Each correct selection is worth one point.

Options:

A.

subscription_Id

B.

Key

C.

resource_group

D.

region

E.

Login

Question 21

You plan to run a Python script as an Azure Machine Learning experiment.

The script must read files from a hierarchy of folders. The files will be passed to the script as a dataset argument.

You must specify an appropriate mode for the dataset argument.

Which two modes can you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

to_pandas_dataframe ()

B.

as_download()

C.

as_upload()

D.

as mount ()

Question 22

You create an Azure Machine Learning workspace. You are training a classification model with no-code AutoML in Azure Machine Learning studio.

The model must predict if a client of a financial institution will subscribe to a fixed-term deposit. You must preview the data profile in Azure Machine Learning studio once the dataset is created.

You need to train the model.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 23

You define a datastore named ml-data for an Azure Storage blob container. In the container, you have a folder named train that contains a file named data.csv. You plan to use the file to train a model by using the Azure Machine Learning SDK.

You plan to train the model by using the Azure Machine Learning SDK to run an experiment on local compute.

You define a DataReference object by running the following code:

You need to load the training data.

Which code segment should you use?

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

E.

Option E

Question 24

You create an experiment in Azure Machine Learning Studio. You add a training dataset that contains 10,000 rows. The first 9,000 rows represent class 0 (90 percent).

The remaining 1,000 rows represent class 1 (10 percent).

The training set is imbalances between two classes. You must increase the number of training examples for class 1 to 4,000 by using 5 data rows. You add the Synthetic Minority Oversampling Technique (SMOTE) module to the experiment.

You need to configure the module.

Which values should you use? To answer, select the appropriate options in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 25

You create an Azure Machine Learning workspace.

You must configure an event handler to send an email notification when data drift is detected in the workspace datasets. You must minimize development efforts.

You need to configure an Azure service to send the notification.

Which Azure service should you use?

Options:

A.

Azure Function apps

B.

Azure DevOps pipeline

C.

Azure Automation runbook

D.

Azure Logic Apps

Question 26

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You plan to use a Python script to run an Azure Machine Learning experiment. The script creates a reference to the experiment run context, loads data from a file, identifies the set of unique values for the label column, and completes the experiment run:

from azureml.core import Run

import pandas as pd

run = Run.get_context()

data = pd.read_csv('data.csv')

label_vals = data['label'].unique()

# Add code to record metrics here

run.complete()

The experiment must record the unique labels in the data as metrics for the run that can be reviewed later.

You must add code to the script to record the unique label values as run metrics at the point indicated by the comment.

Solution: Replace the comment with the following code:

for label_val in label_vals:

run.log('Label Values', label_val)

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 27

You are developing a machine learning model.

You must inference the machine learning model for testing.

You need to use a minimal cost compute target

Which two compute targets should you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point

Options:

A.

Local web service

B.

Remote VM

C.

Azure Databricks

D.

Azure Machine Learning Kubernetes

E.

Azure Container Instances

Question 28

You manage an Azure Machine Learning workspace named workspace1.

You must register an Azure Blob storage datastore in workspace1 by using an access key. You develop Python SDK v2 code to import all modules required to register the datastore.

You need to complete the Python SDK v2 code to define the datastore.

How should you complete the code? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 29

You use the following code to define the steps for a pipeline:

from azureml.core import Workspace, Experiment, Run

from azureml.pipeline.core import Pipeline

from azureml.pipeline.steps import PythonScriptStep

ws = Workspace.from_config()

. . .

step1 = PythonScriptStep(name="step1", ...)

step2 = PythonScriptsStep(name="step2", ...)

pipeline_steps = [step1, step2]

You need to add code to run the steps.

Which two code segments can you use to achieve this goal? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point.

Options:

A.

experiment = Experiment(workspace=ws,

name='pipeline-experiment')

run = experiment.submit(config=pipeline_steps)

B.

run = Run(pipeline_steps)

C.

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)

experiment = Experiment(workspace=ws,

name='pipeline-experiment')

run = experiment.submit(pipeline)

D.

pipeline = Pipeline(workspace=ws, steps=pipeline_steps)

run = pipeline.submit(experiment_name='pipeline-experiment')

Question 30

You train a machine learning model.

You must deploy the model as a real-time inference service for testing. The service requires low CPU utilization and less than 48 MB of RAM. The compute target for the deployed service must initialize automatically while minimizing cost and administrative overhead.

Which compute target should you use?

Options:

A.

Azure Kubernetes Service (AKS) inference cluster

B.

Azure Machine Learning compute cluster

C.

Azure Container Instance (ACI)

D.

attached Azure Databricks cluster

Question 31

You have an Azure Machine learning workspace. The workspace contains a dataset with data in a tabular form.

You plan to use the Azure Machine Learning SDK for Python vl to create a control script that will load the dataset into a pandas dataframe in preparation for model training The script will accept a parameter designating the dataset

You need to complete the script.

How should you complete the script? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 32

You need to implement source control for scripts in an Azure Machine Learning workspace. You use a terminal window in the Azure Machine Learning Notebook tab

You must authenticate your Git account with SSH.

You need to generate a new SSH key.

Which four actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them m the correct order.

Options:

Question 33

You are evaluating a Python NumPy array that contains six data points defined as follows:

data = [10, 20, 30, 40, 50, 60]

You must generate the following output by using the k-fold algorithm implantation in the Python Scikit-learn machine learning library:

train: [10 40 50 60], test: [20 30]

train: [20 30 40 60], test: [10 50]

train: [10 20 30 50], test: [40 60]

You need to implement a cross-validation to generate the output.

How should you complete the code segment? To answer, select the appropriate code segment in the dialog box in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 34

You are the owner of an Azure Machine Learning workspace.

You must prevent the creation or deletion of compute resources by using a custom role. You must allow all other operations inside the workspace.

You need to configure the custom role.

How should you complete the configuration? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 35

You create a training pipeline by using the Azure Machine Learning designer. You need to load data into a machine learning pipeline by using the Import Data component. Which two data sources could you use? Each correct answer presents a complete solution.

NOTE: Each correct selection is worth one point

Options:

A.

Azure Blob storage container through a registered datastore

B.

Azure SQL Database

C.

URL via HTTP

D.

Azure Data Lake Storage Gen2

E.

Registered dataset

Question 36

You have a binary classifier that predicts positive cases of diabetes within two separate age groups.

The classifier exhibits a high degree of disparity between the age groups.

You need to modify the output of the classifier to maximize its degree of fairness across the age groups and meet the following requirements:

• Eliminate the need to retrain the model on which the classifier is based.

• Minimize the disparity between true positive rates and false positive rates across age groups.

Which algorithm and panty constraint should you use? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.

Options:

Question 37

You previously deployed a model that was trained using a tabular dataset named training-dataset, which is based on a folder of CSV files.

Over time, you have collected the features and predicted labels generated by the model in a folder containing a CSV file for each month. You have created two tabular datasets based on the folder containing the inference data: one named predictions-dataset with a schema that matches the training data exactly, including the predicted label; and another named features-dataset with a schema containing all of the feature columns and a timestamp column based on the filename, which includes the day, month, and year.

You need to create a data drift monitor to identify any changing trends in the feature data since the model was trained. To accomplish this, you must define the required datasets for the data drift monitor.

Which datasets should you use to configure the data drift monitor? To answer, drag the appropriate datasets to the correct data drift monitor options. Each source may be used once, more than once, or not at all. You may need to drag the split bar between panes or scroll to view content.

NOTE: Each correct selection is worth one point.

Options:

Question 38

You are planning to register a trained model in an Azure Machine Learning workspace.

You must store additional metadata about the model in a key-value format. You must be able to add new metadata and modify or delete metadata after creation.

You need to register the model.

Which parameter should you use?

Options:

A.

description

B.

model_framework

C.

cags

D.

properties

Question 39

Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.

After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.

You are creating a new experiment in Azure Machine Learning Studio.

One class has a much smaller number of observations than the other classes in the training set.

You need to select an appropriate data sampling strategy to compensate for the class imbalance.

Solution: You use the Scale and Reduce sampling mode.

Does the solution meet the goal?

Options:

A.

Yes

B.

No

Question 40

You train a model and register it in your Azure Machine Learning workspace. You are ready to deploy the model as a real-time web service.

You deploy the model to an Azure Kubernetes Service (AKS) inference cluster, but the deployment fails because an error occurs when the service runs the entry script that is associated with the model deployment.

You need to debug the error by iteratively modifying the code and reloading the service, without requiring a re-deployment of the service for each code update.

What should you do?

Options:

A.

Register a new version of the model and update the entry script to load the new version of the model from its registered path.

B.

Modify the AKS service deployment configuration to enable application insights and re-deploy to AKS.

C.

Create an Azure Container Instances (ACI) web service deployment configuration and deploy the model on ACI.

D.

Add a breakpoint to the first line of the entry script and redeploy the service to AKS.

E.

Create a local web service deployment configuration and deploy the model to a local Docker container.

Question 41

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 42

You need to define a process for penalty event detection.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 43

You need to use the Python language to build a sampling strategy for the global penalty detection models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Question 44

You need to implement a scaling strategy for the local penalty detection data.

Which normalization type should you use?

Options:

A.

Streaming

B.

Weight

C.

Batch

D.

Cosine

Question 45

You need to implement a feature engineering strategy for the crowd sentiment local models.

What should you do?

Options:

A.

Apply an analysis of variance (ANOVA).

B.

Apply a Pearson correlation coefficient.

C.

Apply a Spearman correlation coefficient.

D.

Apply a linear discriminant analysis.

Question 46

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 47

You need to resolve the local machine learning pipeline performance issue. What should you do?

Options:

A.

Increase Graphic Processing Units (GPUs).

B.

Increase the learning rate.

C.

Increase the training iterations,

D.

Increase Central Processing Units (CPUs).

Question 48

You need to modify the inputs for the global penalty event model to address the bias and variance issue.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 49

You need to define an evaluation strategy for the crowd sentiment models.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 50

You need to select an environment that will meet the business and data requirements.

Which environment should you use?

Options:

A.

Azure HDInsight with Spark MLlib

B.

Azure Cognitive Services

C.

Azure Machine Learning Studio

D.

Microsoft Machine Learning Server

Question 51

You need to define a modeling strategy for ad response.

Which three actions should you perform in sequence? To answer, move the appropriate actions from the list of actions to the answer area and arrange them in the correct order.

Options:

Question 52

You need to implement a new cost factor scenario for the ad response models as illustrated in the

performance curve exhibit.

Which technique should you use?

Options:

A.

Set the threshold to 0.5 and retrain if weighted Kappa deviates +/- 5% from 0.45.

B.

Set the threshold to 0.05 and retrain if weighted Kappa deviates +/- 5% from 0.5.

C.

Set the threshold to 0.2 and retrain if weighted Kappa deviates +/- 5% from 0.6.

D.

Set the threshold to 0.75 and retrain if weighted Kappa deviates +/- 5% from 0.15.

Question 53

You need to implement a model development strategy to determine a user’s tendency to respond to an ad.

Which technique should you use?

Options:

A.

Use a Relative Expression Split module to partition the data based on centroid distance.

B.

Use a Relative Expression Split module to partition the data based on distance travelled to the event.

C.

Use a Split Rows module to partition the data based on distance travelled to the event.

D.

Use a Split Rows module to partition the data based on centroid distance.

Question 54

You need to build a feature extraction strategy for the local models.

How should you complete the code segment? To answer, select the appropriate options in the answer area.

NOTE: Each correct selection is worth one point.

Options:

Page: 1 / 41
Total 407 questions