March Sale Special Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: 70special

CompTIA DA0-001 CompTIA Data+ Certification Exam Exam Practice Test

Page: 1 / 26
Total 262 questions

CompTIA Data+ Certification Exam Questions and Answers

Testing Engine

  • Product Type: Testing Engine
$36  $119.99

PDF Study Guide

  • Product Type: PDF Study Guide
$31.5  $104.99
Question 1

A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:

Options:

A.

dependent data.

B.

duplicate data.

C.

invalid data

D.

redundant data

Question 2

A development company is constructing a new unit in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans, which of the following should be the price of the Rose unit?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 3

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 4

Which of the following is a difference between a primary key and a unique key?

Options:

A.

A unique key cannot take null values, whereas a primary key can take null values.

B.

There can be only one primary key in a data set, whereas there can be multiple unique keys.

C.

A primary key can take a value more than once, whereas a unique key cannot take a value more than once.

D.

A primary key cannot be a date variable, whereas a unique key can be.

Question 5

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 6

Standardized tests are given to students in the middle of each month, and the results are ready by the end of the month. The superintendent needs a quick view of test performance. Which of the following would be the best recommendation to meet the superintendent's requirements?

Options:

A.

A dashboard with a continuous data stream and saved searches

B.

A report of test scores by classroom, emailed to the superintendent at the end of the month

C.

A report of test scores with pie charts showing student performance

D.

A dashboard with a scheduled delivery, the ability to filter scores by school, and bar charts for comparison

Question 7

A data analyst must fulfill a request for information that is needed weekly and should be automatically emailed to a specific set of users. Which of the following types of reports should the analyst recommend?

Options:

A.

A self-service report

B.

A research report

C.

An ad hoc report

D.

An operational report

Question 8

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 9

Which of the following technologies would be best suited for creating a multiple linear regression model?

Options:

A.

Microsoft Power Bl

B.

R

C.

SQL

D.

Tableau

Question 10

Samantha needs to share a list of her organization's top 50 customers with the VP of sales.

She would like to include the name of the customer, the business they represent, their contact information, and their total sales over the past year.

The VP does not have any specialized analytics skills or software but would like to make some personal notes on the dataset.

What would be the best tool for Samantha to use to share this information?

Options:

A.

Power BI.

B.

Microsoft Excel.

C.

Minitab.

D.

SAS.

Question 11

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 12

An employer needs to maintain adequate office staffing during the winter and wants to track storm data. Which of the following data collection methods should the employer use?

Options:

A.

Web scraping

B.

Public databases

C.

Observations

D.

Weather surveys

Question 13

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

A Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 14

Which of the following is an example of a flat file?

Options:

A.

CSV file

B.

PDF file

C.

JSON file

D.

JPEG file

Question 15

Which of the ing is the correct ion for a tab-delimited spre file?

Options:

A.

tap

B.

tar

C.

sv

D.

az

Question 16

Each month an analyst needs to execute a data pull for the two prior months. Which of the following is the most efficient function for the analyst to use?

Options:

A.

Logical

B.

Date

C.

Aggregate

D.

System

Question 17

A financial analyst is creating a daily billing report for a company. One night, the company's data warehouse did not update the data, which caused the data to be reported incorrectly the next day. Which of the following documentation elements should the analyst add to catch this error?

Options:

A.

Version number

B.

Data refresh

C.

Frequently asked questions tab

D.

Summary

Question 18

An analyst is preparing a report that contains weather data. The temperatures are shown in Fahrenheit. but they must be reported in Celsius. Which of the following should the analyst do to fix this issue?

Options:

A.

Normalize the data.

B.

Standardize the data.

C.

Rescale the data.

D.

Aggregate the data.

Question 19

Which one of the following values will appear first if they are sorted in descending order?

Options:

A.

Aaron.

B.

Molly.

C.

Xavier.

D.

Adam.

Question 20

Given the diagram below:

Which of the following steps is missing?

Options:

A.

Remove redundant data.

B.

Validate the data types.

C.

Connect to the data API.

D.

Normalize the data.

Question 21

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

Options:

A.

394mm

B.

405mm

C.

493mm

D.

504mm

Question 22

Which one the following is not considered an aggregate function?

Options:

A.

SUM

B.

MIN

C.

SELECT

D.

MAX

Question 23

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 24

A recurring event is being stored in two databases that are housed in different geographical locations. A data analyst notices the event is being logged three hours earlier in one database than in the other database. Which of the following is the MOST likely cause of the issue?

Options:

A.

The data analyst is not querying the databases correctly.

B.

The databases are recording different events.

C.

The databases are recording the event in different time zones.

D.

The second database is logging incorrectly.

Question 25

You would like to measure how well an organization is achieving its goals.

What type of analysis should you perform?

Options:

A.

Performance analysis.

B.

Outlier analysis.

C.

Predictive analysis.

D.

Trend analysis.

Question 26

Which of the following are reasons to conduct data cleansing? (Select two).

Options:

A.

To perform web scraping

B.

To track KPls

C.

To improve accuracy

D.

To review data sets

E.

To increase the sample size

F.

To calculate trends

Question 27

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Which of the following charts would be BEST to use?

Options:

A.

Histogram

B.

Pie

C.

Line

D.

Scatter pot

E.

Waterfall

Question 28

Jenny wants to study the academic performance of undergraduate sophomores and wants to determine the average grade point average at different points during an academic year.

What best describes the data set she needs?

Options:

A.

Sample.

B.

Observation.

C.

Variable.

D.

Population.

Question 29

Which of the following descriptive statistical methods are measures of central tendency? (Choose two.)

Options:

A.

Mean

B.

Minimum

C.

Mode

D.

Variance

E.

Correlation

F.

Maximum

Question 30

An analyst has received the requirements for an internal user dashboard. The analyst confirms the data sources and then creates a wireframe. Which of the following is the NEXT step the analyst should take in the dashboard creation process?

Options:

A.

Optimize the dashboard.

B.

Create subscriptions.

C.

Get stakeholder approval.

D.

Deploy to production.

Question 31

A financial institution is reporting on sales performance to a company at the account level. Due to the sensitive nature of the government the does il with, some account information is not shown. Which of the following fields should be masked?

Options:

A.

Sales volume

B.

Start date

C.

Product name

D.

Customer name

Question 32

Mario works with a group of R programmers tasked with copying data from an accounting system into a data warehouse.

In what phase are the group's R skills most relevant?

Options:

A.

Extract.

B.

Load.

C.

Transform.

D.

Purge.

Question 33

Given the table below:

Which of the following variable types BEST describes the “Year” column?

Options:

A.

Numeric

B.

Date

C.

Alphanumeric

D.

Text

Question 34

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5.000.000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question 35

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

Options:

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question 36

Amanda needs to create a dashboard that will draw information from many other data sources and present it to business leaders.

Which one of the following tools is least likely to meet her needs?

Options:

A.

QuickSight.

B.

Tableau.

C.

Power BI.

D.

SPSS Modeler.

Question 37

Given the image below:

The data should be cleaned because of the presence of:

Options:

A.

outlier

B.

non-parametric data.

C.

multicollinearity.

D.

invalid data.

Question 38

Which of the following is the best description of the term "data governance"?

Options:

A.

Data governance governs the development of a data visualization dashboard in an organization.

B.

Data governance is the policy that protects against data breaches by cybercriminals.

C.

Data governance is the process of analyzing, manipulating, and reporting data in an organization.

D.

Data governance is the availability, usability, integrity, and security of data in an enterprise.

Question 39

A database consists of one fact table that is composed of multiple dimensions. Depending on the dimension, each one can be represented by a denormalized table or multiple normalized tables. This structure is an example of a:

Options:

A.

transactional schema.

B.

star schema.

C.

non-relational schema.

D.

snowflake schema.

Question 40

What SQL command is used to delete an entire table from a database?

Options:

A.

DROP.

B.

MODIFY.

C.

DELETE.

D.

ALTER.

Question 41

A data analyst needs to calculate the mean for Q1 sales using the data set below:

Which of the following is the mean?

Options:

A.

$2,466.18

B.

$2,667.60

C.

$3,082.72

D.

$12,330.88

Question 42

Given the table below:

Which of the following variables can be considered inconsistent, and how many distinct values should the variable have?

Options:

A.

Name, one

B.

Gender, two

C.

Level, three

D.

Code, four

E.

Region, five

Question 43

Given the following:

Which of the following is the most important thing for an analyst to do when transforming the table for a trend analysis?

Options:

A.

Fill in the missing cost where it is null.

B.

Separate the table into two tables and create a primary key

C.

Replace the extended cost field with a calculated field.

D.

Correct the dates so they have the same format.

Question 44

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate

Question 45

A data analyst is performing a data merge within a spreadsheet using the tables below:

The analyst is attempting to pull the addresses from Table 2 into Table 1 using the last names and is receiving an error message. Which of the following steps can the analyst perform to fix the error?

Options:

A.

Use concatenate to combine the tables.

B.

Ensure the formula is pulling from right to left.

C.

Sort the data by the last name field.

D.

Review the spelling and data type.

Question 46

Which of the following is an example of PII?

Options:

A.

Age

B.

Name

C.

Ethnicity

D.

Gender

Question 47

Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?

  • Python

Options:

A.

R

B.

Microsoft Power Bl

C.

SAS

Question 48

An analyst has generated a report that includes the number of months in the first two quarters of 2019 when sales exceeded $50,000:

Which of the following functions did the analyst use to generate the data in the Sales_indicator column?

Options:

A.

Aggregate

B.

Logical

C.

Date

D.

Sort

Question 49

A sales analyst needs to report how the sales team is performing to target. Which of the following files will be important in determining 2019 performance attainment?

Options:

A.

2018 goal data

B.

2018 actual revenue

C.

2019 goal data

D.

2019 commission plan

Question 50

After completing web scraping, which of the following file formats needs to be parsed?

Options:

A.

.html

B.

.txt

C.

.csv

D.

.tsv

Question 51

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 52

A development company is constructing a new Init in its apartment complex. The complex has the following floor plans:

Using the average cost per square foot of the original floor plans. which of the following should be the price of the Rose Init?

Options:

A.

$640,900

B.

$690,000

C.

$705,200

D.

$702,500

Question 53

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 54

Which of the following can be used to translate data into another form so it can only be read by a user who has a key or a password?

Options:

A.

Data encryption.

B.

Data transmission.

C.

Data protection.

D.

Data masking.

Question 55

Which of the following is a control measure for preventing a data breach?

Options:

A.

Data transmission

B.

Data attribution

C.

Data retention

D.

Data encryption

Question 56

Given the information in the following tables:

Which of the following describes merging these tables to create a master file that includes all transactions for both online and in-store sales?

Options:

A.

Data audit

B.

Data completeness

C.

Data validation

D.

Data consolidation

Question 57

An e-commerce company recently tested a new website layout. The website was tested by a test group of customers, and an old website was presented to a control group. The table below shows the percentage of users in each group who made purchases on the websites:

Which of the following conclusions is accurate at a 95% confidence interval?

Options:

A.

In Germany, the increase in conversion from the new layout was not significant.

B.

In France, the increase in conversion from the new layout was not significant.

C.

In general, users who visit the new website are more likely to make a purchase.

D.

The new layout has the lowest conversion rates in the United Kingdom.

Question 58

Five dogs have the following heights in millimeters:

300, 430, 170, 470, 600

Which of the following is the mean height for the five dogs?

Options:

A.

394mm

B.

405mm

C.

493mm

D.

504mm

Question 59

Which of the following would be the best way to identify multicollinear attributes in a data set?

Options:

A.

Correlation coefficient

B.

Chi-squared test

C.

Two-sample f-test

D.

Two-way ANOVA

Question 60

Which of the following are reasons to create and maintain a data dictionary? (Choose two.)

Options:

A.

To improve data acquisition

B.

To remember specifics about data fields

C.

To specify user groups for databases

D.

To provide continuity through personnel turnover

E.

To confine breaches of PHI data

F.

To reduce processing power requirements

Question 61

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 62

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the sample people exists in two of six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

Options:

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question 63

An analyst must obtain the average daily sales for the following week:

Which of the following must the analyst perform to obtain this value?

Options:

A.

Data normalization

B.

Data append

C.

Data aggregation

D.

Data blending

Question 64

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Choose two.)

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question 65

Which of the following database schemas features normalized dimension tables?

Options:

A.

Flat

B.

Snowflake

C.

Hierarchical

D.

Star

Question 66

A junior web developer is developing a new application where users can upload short videos. The first task is to create a homepage that shows the headline "Upload Your Short Videos" and a clickable button that says "upload now".

Which of the following HTML commands would help the developer to complete the task successfully?

Options:

A.

< span >Upload Your Short Videos< /span >< button >upload now< /button >

B.

< p >Upload Your Short Videos< /p >< p >upload now< /p >

C.

< hl >Upload Your Short Videos< /h1 >< button >upload now< /button >

D.

< hl >Upload Your Short Videos< /h1 >< hl >upload now< /h1 >

Question 67

Given the following data tables:

Which of the following MDM processes needs to take place FIRST?

Options:

A.

Creation of a data dictionary

B.

Compliance with regulations

C.

Standardization of data field names

D.

Consolidation of multiple data fields

Question 68

A data analyst is creating a report that will provide information about various regions, products, and time periods. Which of the following formats would be the MOST efficient way to deliver this report?

Options:

A.

A workbook with multiple tabs for each region

B.

A daily email with snapshots of regional summaries

C.

A static report with a different page for every filtered view

D.

A dashboard with filters at the top that the user can toggle

Question 69

An analyst develops an IT document and needs to describe the technical terms used in the document. Which of the following is where the analyst should include descriptions of the technical terms?

Options:

A.

Glossary

B.

System diagram

C.

User requirements

D.

Index

Question 70

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

non-relational schema.

B.

galaxy schema.

C.

snowflake schema.

D.

star schema.

Question 71

A data analyst needs to present the results of an online marketing campaign to the marketing manager. The manager wants to see the most important KPIs and measure the return on marketing investment. Which of the following should the data analyst use to BEST communicate this information to the manager?

Options:

A.

A real-time monitor that allows the manager to view performance the day the campaign was launched

B.

A sell-service dashboard that allows the manager to look at the company’s annual budget performance

C.

A spreadsheet of the raw data from all marketing campaigns and channels

D.

A summary with statistics, conclusions, and recommendations from the data analyst

Question 72

Which of the following actions should be taken when transmitting data to mitigate the chance of a data leak occurring? (Choose two.)

Options:

A.

Data identification

B.

Data processing

C.

Data Reporting

D.

Data encryption

E.

Data masking

F.

Fata removal

Question 73

Angela is aggregating data from CRM system with data from an employee system.

While performing an initial quality check, she realizes that her employee ID is not associated with her identifier in the CRM system.

What kind of issues is Angela facing?

Choose the best answer.

Options:

A.

ETL process.

B.

Record linkage.

C.

ELT process.

D.

System integration.

Question 74

A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?

Options:

A.

Data sources and attributes

B.

Frequently asked questions

C.

A report from the data source

D.

A comparison of data sets

Question 75

An analyst notices changes in sales ratios when analyzing a quarterly report. Which of the following is the analyst conducting?

Options:

A.

A gap analysis

B.

A link analysis

C.

A trend analysis

D.

A statistical analysis

Question 76

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

Options:

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question 77

Which of the following data cleansing issues will be fixed when a DISTINCT function is applied?

Options:

A.

Missing data

B.

Duplicate data

C.

Redundant data

D.

Invalid data

Question 78

Given the diagram below:

Which of the following types of sampling is depicted in the image?

Options:

A.

Stratified

B.

Random

C.

Cluster

D.

Systematic

Page: 1 / 26
Total 262 questions