Big Cyber Monday Sale Limited Time 70% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: 70special

CompTIA DA0-001 CompTIA Data+ Certification Exam Exam Practice Test

Page: 1 / 40
Total 396 questions

CompTIA Data+ Certification Exam Questions and Answers

Testing Engine

  • Product Type: Testing Engine
$37.5  $124.99

PDF Study Guide

  • Product Type: PDF Study Guide
$33  $109.99
Question 1

An analyst is designing a dashboard to determine which site has the highest percentage of new customers. The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered to best display the data?

Options:

A.

Include a bar chart using the site and the percentage of new customers data.

B.

Include a line chart using the site and the percentage of new customers data.

C.

Include a pie chart using the site and percentage of new custorners data.

D.

Include a scatter chart using the site and the percent of new customers data.

Question 2

An analyst runs a report on a daily basis, and the number of datapoints must be validated before the data can be analyzed. The number of datapoints increases each day by approximately 20% of the total number from the day before. On a given day, the number of datapoints was 8,798. Which of the following should be the total number of datapoints on the next day?

Options:

A.

7,038

B.

9,600

C.

10,600

D.

10,800

Question 3

A company wants to know how its customers interact with an e-commerce website based on clicks over items. Which of the following is the primary requirement for this report?

Options:

A.

Data content

B.

Frequency

C.

Filtering

D.

Views

Question 4

A data analyst has been asked to organize the table below in the following ways:

By sales from high to low -

By state in alphabetic order -

Which of the following functions will allow the data analyst to organize the table in this manner?

Options:

A.

Conditional formatting

B.

Grouping

C.

Filtering

D.

Sorting

Question 5

A research analyst collects ten data points from 1.000 specimens. The analyst will not need any additional data to complete the analysis and will not need to retrieve information by specifier. Which of the following is the best data structure for the analyst to use?

Options:

A.

NoSQL

B.

Flat file

C.

JSON

D.

Relational database

Question 6

‘Which of the following is the BEST reason to use database views instead of tables?

Options:

A.

Views reduce the need for repetitive, complex data joins.

B.

Views allow for the storage of temporary data. whereas tables do not.

C.

Views allow for the joining of multiple data sources, whereas tables do not.

D.

Views can be used to restrict sensitive information.

Question 7

A data analyst needs to create a data visualization that aids in un the cumulative impact of sequentially introduced values that are positive or negative. Which of the following

data visualization methods should the analyst use?

Options:

A.

A bubble chart

B.

A waterfall chart

C.

A scatter plot

D.

A line chart

Question 8

Which of the following best describes the process of examining data for statistics and information about the data?

    Cleansing

Options:

A.

search

B.

Profiling

C.

Governance

Question 9

Given the following report:

Which of the following components need to be added to ensure the report is point-in-time and static? (Choose two.)

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question 10

A data architect is designing a data solution for a retail clothing store chain. Each store has a database that tracks sales transactions. The data architect needs to create a summary table that will be used for a senior executive dashboard. The summary table should not contain duplicate store information. Which of the following should the data architect create?

Options:

A.

A check constraint

B.

A primary key

C.

A foreign key

D.

A unique constraint

Question 11

An analyst conducted a preliminary analysis for a data set and identified several patterns and anomalies. Which of the following analysis techniques did the analyst use?

Options:

A.

Performance analysis

B.

Exploratory analysis

C.

Link analysis

D.

Trend analysis

Question 12

A data analyst needs to apply quality control concepts to a data set for accuracy. Which of the following is the best way to do this?

Options:

A.

Standardization

B.

Parameterization

C.

Encryption

D.

Cross-validation

Question 13

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company’s year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

A Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 14

Which of the following best describes a difference between JSON and XML?

Options:

A.

JSON is quicker to read and write.

B.

JSON has to use an end tag.

C.

JSON strings are longer

D.

JSON is much more difficult to parse.

Question 15

A client wants a new report that will be automatically emailed to all global sales teams on a weekly basis. Each sales team must be able to view the sales for its region and the combined sales for all regions. Which of the following would be the most efficient method for meeting the requirements?

Options:

A.

Creating a single report with a region filter

B.

Creating report distribution lists for the sales teams in each region

C.

Creating a unique copy of the report for each sales team region

D.

Creating a unique copy of the report for each recipient

Question 16

A data analyst has been asked to create a sales report that calculates the rolling 12-month average for sales. If the report will be published on November 1, 2020, which of the following months shouts the report cover?

Options:

A.

October 1, 2019 to October 31, 2020

B.

October 31, 2020 to November 1, 2021

C.

November 1, 2019 to October 31, 2020

D.

October 31, 2019 to October 31, 2020

Question 17

A data analyst received the information in the table below from a recently completed marketing campaign:

Which of the following is the total order conversion rate?

Options:

A.

13.2%

B.

14.8%

C.

22.3%

D.

85.2%

Question 18

Which of the following techniques is used to quantify data?

Options:

A.

Decoding

B.

Enumeration

C.

Coding

D.

Structure

Question 19

You should always choose the analytics tool that is most appropriate for any given situation, even if that means acquiring a new tool.

Options:

A.

True.

B.

False.

Question 20

A data analyst has been asked to derive a new variable labeled “Promotion_flag” based on the total quantity sold by each salesperson. Given the table below:

Which of the following functions would the analyst consider appropriate to flag “Yes” for every salesperson who has a number above 1,000,000 in the Quantity_sold column?

Options:

A.

Date

B.

Mathematical

C.

Logical

D.

Aggregate

Question 21

A data analyst is reviewing SQL code and sees a query that uses terms such as MIN, SUM, and COUNT. Which of the following types of functions best describes these terms?

Options:

A.

Aggregate

B.

Logical

C.

Filtering

D.

System

Question 22

Which of the following is the first step an analyst should perform upon receiving a business request for analysis?

Options:

A.

Determine the data needs and sources for analysis.

B.

Initiate the analysis for exploratory data analysis.

C.

Review the business questions to understand the scope.

D.

Finalize the methodology to solve the problem.

Question 23

Which of the following best describes a business analytics tool with interactive visualization and business capabilities and an interface that is simple enough for end users to create their own reports and dashboards?

    Python

Options:

A.

R

B.

Microsoft Power Bl

C.

SAS

Question 24

An analyst is building a new dashboard for a user. After an initial conversation with the user. the analyst created a mock-up of the dashboard. Which of the following best explains why the analyst created the mock-up?

Options:

A.

To identify the dimensions and measures

B.

To send to the client after deploying the dashboard to production

C.

To confirm important details before dashboard development begins

D.

To receive client approval for the final dashboard design

Question 25

A data analyst is designing a dashboard that will provide a story of sales and determine which site is providing the highest sales volume per customer The analyst must choose an appropriate chart to include in the dashboard. The following data is available:

Which of the following types of charts should be considered?

Options:

A.

Include a line chart using the site and average sales per customer.

B.

Include a pie chart using the site and sales to average sales per customer.

C.

Include a scatter chart using sales volume and average sales per customer.

D.

Include a column chart using the site and sales to average sales per customer.

Question 26

Which of the following variable name formats would be problematic if used in the majority of data software programs?

Options:

A.

First_Name_

B.

FirstName

C.

First_Name

D.

First Name

Question 27

Joseph is interpreting a left skewed distribution of test scores. Joe scored at the mean, Alfonso scored at the median, and gaby scored and the end of the tail.

Who had the highest score?

Options:

A.

Joseph

B.

Joe

C.

Alfonso

D.

Gaby

Question 28

A data set was recorded using multimedia technology. Which of the following is a necessary step on the way to interpretation?

Options:

A.

Structural equation modeling

B.

Transcription

C.

Sequential analysis

D.

Sampling

Question 29

A data scientist wants to see which products make the most money and which products attract the most customer purchasing interest in their company.

Which of the following data manipulation techniques would he use to obtain this information?

Options:

A.

Data append

B.

Data blending

C.

Normalize data

D.

Data merge

Question 30

A salesperson who is prospecting potential clients collected the following data:

Which of the following is an issue with this data?

Options:

A.

Duplicate data

B.

Invalid data

C.

Missing value

D.

Redundant data

Question 31

Which of the following types of dashboards should a business intelligence engineer develop in order to provide information about failed data pipelines?

Options:

A.

Referencing

B.

Strategic

C.

Operational

D.

Technical

Question 32

What subset of Structured Query Language (SQL) is used to add, remove, modify, or retrieve the information stored within a relational database?

Options:

A.

DDL.

B.

DSL.

C.

DQL.

D.

DML.

Question 33

An analyst needs to provide a chart to identify the composition between the categories of the survey response data set:

Which of the following charts would be BEST to use?

Options:

A.

Histogram

B.

Pie

C.

Line

D.

Scatter pot

E.

Waterfall

Question 34

A sales manager requested a report that contains the first name, last name, and phone number of all the company’s customers and employees. The data engineer needs to return all the records from several tables, even duplicates. Which of the following is the best way to join the two tables?

Options:

A.

FULL OUTER JOIN

B.

INNER JOIN

C.

LEFT OUTER JOIN

D.

CROSS JOIN

Question 35

Which of the following contains alphanumeric values?

Options:

A.

10.1Ε²

B.

13.6

C.

1347

D.

A3J7

Question 36

Which of the following statements would be used to append two tables that have the same number of columns?

Options:

A.

UNION ALL

B.

MERGE

C.

GROUP BY

D.

JOIN

Question 37

Which of the following components need to be added to ensure the report is point-in-time and static? (Select two).

Options:

A.

A control group for the phrases

B.

A summary of the KPIs

C.

Filter buttons for the status

D.

The date when the report was last accessed

E.

The time period the report covers

F.

The date on which the report was run

Question 38

A report is scheduled to run and be distributed at the end of business each day. On Mondays, one of the recipients opens the previous week's reports and combines them to calculate the weekly totals and projections for the coming week. This is a tedious process, and the recipient asks an analyst for help. Which of the following should the analyst recommend?

Options:

A.

Add calculation fields to the daily report so the totals are built in.

B.

Create a new report with weekly totals set to run at the end of business on Friday.

C.

Provide a daily summary to the report with totals to save the user the effort of manual calculations.

D.

Reduce the frequency of the report to once a week and change the date range.

Question 39

Which of the following is the most likely reason for a data analyst to optimize a query using parameterization?

Options:

A.

To return a subset of records

B.

To insert a temporary table

C.

To prevent SQL injections

D.

To increase the query speed

Question 40

Which of the following is an example of a strategy to reduce statistical errors?

Options:

A.

Removing outliers

B.

Adding more data

C.

Transformation

D.

Recoding data

Question 41

A data analyst is working with a team to create a dashboard for a client who requires on-demand access. Which of the following is the best delivery method to support the clients’ requirement?

Options:

A.

Email

B.

Scheduled

C.

Subscription

D.

Static

Question 42

A data analyst needs to write a SOL query measuring last month's website visits and distribute a summary report to the marketing team. Which of the following is the analyst creating?

Options:

A.

Date range

B.

Distribution list

C.

Data content

D.

Report view

Question 43

An analyst has received the requirements for an internal user dashboard. The analyst confirms the data sources and then creates a wireframe. Which of the following is the NEXT step the analyst should take in the dashboard creation process?

Options:

A.

Optimize the dashboard.

B.

Create subscriptions.

C.

Get stakeholder approval.

D.

Deploy to production.

Question 44

A client has requested an analysis of all pet care items purchased by current customers and their social media connections in the past 12 months. Which of the following data analysis techniques would be the best choice given these requirements?

Options:

A.

Trend analysis

B.

Performance analysis

C.

Link analysis

D.

Exploratory data analysis

Question 45

A data analyst needs to create a dashboard using the company's yearly revenue data sets. Which of the following would be the best way to plot the information to show the top-performing region?

Options:

A.

A line chart

B.

A waterfall chart

C.

A heat map

D.

A stacked bar chart

Question 46

Which of the following is an example of a data-mining ETL tool?

Options:

A.

SSIS

B.

Stata

C.

SPSS

D.

Cognos

Question 47

Which of the following types of analysis is used when comparing last week's sales to the previous week's sales?

Options:

A.

Trend analysis

B.

Exploratory analysis

C.

Prescriptive analysis

D.

Link analysis

Question 48

An analyst is currently working on a ticket to revamp a company-wide dashboard that has been in use for five years. Which of the following should be the first step in the development process?

Options:

A.

Talk to the group that made the request to determine the desired goal.

B.

Make changes to a frequently used report that is already in production.

C.

Build an additional dashboard with fewer views tailored toward each specific team.

D.

Develop a more streamlined dashboard to roll out by the next delivery date.

Question 49

A survey asks participants to rate a company on a scale of one to ten. Which of the following best describes the rating variable?

Options:

A.

Continuous

B.

Ordinal

C.

Categorical

D.

Nominal

Question 50

Given the following tables:

Which of the following will be the dimensions from a FULL JOIN of the tables above?

Options:

A.

Two rows and three columns

B.

Three rows and four columns

C.

Four rows and two columns

D.

Four rows and four columns

Question 51

An analyst is working with a data set that lists individuals' first and last names in separate columns. Which of the following processes should the analyst use to combine the first and last names into a single spreadsheet cell?

Options:

A.

Transpose

B.

Blend

C.

Concatenate

D.

Merges

Question 52

A reporting analyst needs to create a report that refreshes automatically and is accessible to the entire sales organization. Which of the following tools is the most appropriate to use for this task?

Options:

A.

R

B.

Excel

C.

Tableau

D.

Python

Question 53

Which of the following is the best approach to use to gain a general understanding of a data set?

Options:

A.

Descriptive statistics

B.

Basic projections

C.

Gap analysis

D.

Trend analysis

Question 54

Which of the following is an example of a discrete variable?

Options:

A.

The temperature of a hot tub

B.

The height of a horse

C.

The time to complete a task

D.

The number of people in an office

Question 55

Which of the following data analysis tools increases the efficiency of data visualizations?

Options:

A.

SQL

B.

Microsoft Excel

C.

SAS

D.

RapidMiner

Question 56

A publishing group has requested a dashboard to track submissions before publication. A key requirement is that all changes are tracked, as multiple users will be checking out documents and editing them before submissions are considered final. Which of the following is the BEST way to meet this stakeholder requirement?

Options:

A.

Display the version number next to each submission on the dashboard.

B.

Present a data refresh date at the top of the dashboard.

C.

Confirm the dashboard is adhering to the corporate style guide.

D.

Use permissions to ensure users only see certain versions of the submissions.

Question 57

Which of the following is the best technique for transferring data from one database to another with some data manipulation?

Options:

A.

Application programming interfaces

B.

Delta load

C.

Extract, transform, load

D.

Export/import

Question 58

A database administrator needs to ensure only approved users can access specific database tables to perform financial functions. Which of the following is the best access control method for the administrator to use?

Options:

A.

Role-based

B.

Rule-based

C.

Discretionary

D.

Group-based

Question 59

You are working with a dataset and need to swap the values in rows with those in columns.

What action do you need to perform?

Options:

A.

Recording

B.

Filtering.

C.

Aggregation.

D.

Transposition.

Question 60

A user receives a large custom report to track company sales across various date ranges. The user then completes a series of manual calculations for each date range. Which of the following should an analyst suggest so the user has a dynamic, seamless experience?

Options:

A.

Create multiple reports, one for each needed date range.

B.

Build calculations into the report so they are done automatically.

C.

Add macros to the report to speed up the filtering and calculations process.

D.

Create a dashboard with a date range picker and calculations built in.

Question 61

Daniel is using the structured Query language to work with data stored in relational database.

He would like to add several new rows to a database table.

What command should he use?

Options:

A.

SELECT.

B.

ALTER.

C.

INSERT.

D.

UPDATE.

Question 62

An analyst compiled a high-level report that includes the following data points:

    Total dollars closed for the year

    Annual quota/goal

    Top 10 customers

    Average deal size

    Largest deals lost

Which of the following groups is the most likely audience for this report?

Options:

A.

External vendors

B.

General public

C.

Lower-level managers

D.

C-suite officers

Question 63

A data analyst for a media company needs to determine the most popular movie genre. Given the table below:

Which of the following must be done to the Genre column before this task can be completed?

Options:

A.

Append

B.

Merge

C.

Concatenate

D.

Delimit

Question 64

A marketing analytics team received customer transaction data from two different sources. The data is complete and accurate; however, the field names appear to be inconsistent. Given the following tables:

Which of the following is considered best practice if the team wants to consolidate the files and conduct further analysis?

Options:

A.

Standardize the field names.

B.

Recode the data values.

C.

Overwrite the field names in one of the tables.

D.

Edit the field names in the data dictionary.

Question 65

Given the following table:

Which of the following methods is the best way to describe the changes in the values in the table?

Options:

A.

Average

B.

Range

C.

Standard deviation

D.

Median

Question 66

Given the following athlete workout data (with inconsistent units or formats for time/distance), which of the following best describes the data quality issue?

Options:

A.

Duplicate data

B.

Data outlier

C.

Data inconsistency

D.

Invalid data

Question 67

An analyst has written the following code:

SELECT *

FROM Cust_table

WHERE age > 60 AND City = "New York"

Which of the following criteria is the analyst retrieving?

Options:

A.

All customers older than age 60 in New York state

B.

All customers aged 60 and older in New York state

C.

All customers older than age 60 in New York City

D.

All customers younger than age 60 in New York City

Question 68

An analyst is working for an organization that has a vast amount of data stored across various systems and databases. Employees have difficulty locating and understanding the currently available data assets. Which of the following tools should the analyst implement to best solve this issue?

Options:

A.

Data lake

B.

Business glossary

C.

Data catalog

D.

Operational data store

Question 69

A database consists of one fact table that is composed of multiple dimensions. Each dimension is represented by a denormalized table. This structure is an example of a:

Options:

A.

Non-relational schema

B.

Galaxy schema

C.

Snowflake schema

D.

Star schema

Question 70

Given the customer table below:

Which of the following chart types is the most appropriate to represent the average spending of active customers vs. inactive customers?

Options:

A.

Pie chart

B.

Heat graph

C.

Scatter plot

D.

Line chart

Question 71

The number of phone calls that the call center receives in a day is an example of:

Options:

A.

continuous data.

B.

categorical data.

C.

ordinal data.

D.

discrete data.

Question 72

An analyst wants to include a graph in a quarterly sales report that shows the comparison between two quantitative variables. Which of the following visual diagrams can the analyst use to most effectively represent this relationship?

Options:

A.

Bar graph

B.

Heat map

C.

Pie chart

D.

Scatter plot

Question 73

A database administrator needs to increase performance on a large dimension table. Which of the following is the best way to accomplish this task?

Options:

A.

Sampling

B.

Partitioning

C.

Windowing

D.

Sorting

Question 74

A table in a hospital database has a column for patient height in inches and a column for patient height in centimeters. This is an example of:

Options:

A.

dependent data.

B.

duplicate data.

C.

invalid data

D.

redundant data

Question 75

Which of the following data types best describe 4Ac1? (Select two).

Options:

A.

Alphanumeric

B.

Symbolic

C.

Numeric

D.

Float

E.

Boolean

F.

String

Question 76

A data analyst has been asked to create a daily manufacturing report for the floor manager Which of the following metrics should be included in the report?

Options:

A.

Tons of steel produced per hour

B.

Annual sales budget

C.

End-of-day stock price

D.

Daily corporate employee count

Question 77

Which of the following tools would be best to use to calculate the interquartile range, median, mean, and standard deviation of a column in a table that has 5.000.000 rows?

Options:

A.

Microsoft Excel

B.

R

C.

Snowflake

D.

SQL

Question 78

Randy scored 76 on a math test, Katie scored 86 on a science test, Ralph scored 80 on a history test, and Jean scored 80 on an English test. The table below contains the mean and standard deviation of the scores for each of the courses:

Using this information, which of the following students had the BEST score?

Options:

A.

Randy

B.

Katie

C.

Ralph

D.

Jean

Question 79

Which of the following is the best variable formal to store a customer's age using the least possible amount of storage data?

Options:

A.

Int

B.

Float

C.

Char

D.

Double

Question 80

What category of data stewardship work is focused on ensuring that the organization respects the wishes of data subjects?

Options:

A.

Data quality.

B.

Data privacy.

C.

Data security.

D.

Regulatory compliance.

Question 81

Which of the following describes the method of sampling in which elements of data are selected randomly from each of the small subgroups within a population?

Options:

A.

Simple random

B.

Cluster

C.

Systematic

D.

Stratified

Question 82

Which of the following summary statements upholds integrity in data reporting?

Options:

A.

Sales are approximately equal for Product A and Product B across all strategies.

B.

Strategy 4 provides the best sales in comparison to other strategies.

C.

While Strategy 2 does not result in the highest sales of Product D. over all products it appears to be the most effective.

D.

Product D should be promoted more than the other products in all strategies.

Question 83

Which one of the following is a common data warehouse schema?

Options:

A.

Snowflake.

B.

Square.

C.

Spiral.

D.

Sphere.

Question 84

Which of the following is an object associated with a table that sorts and stores table row data in a key-value pair?

Options:

A.

Foreign key

B.

Function

C.

Stored procedure

D.

Clustered index

Question 85

Given the following grocery store orders:

If a query is made to the table with the following logic:

Order_Total > 132 OR (Order Total >= 25 AND Order_Total < 74)

Which of the following is the number of orders that will be returned by the query?

Options:

A.

Four

B.

Five

C.

Six

D.

Seven

Question 86

What analytics suite is offered by Microsoft and directly integrates with SQL Server Databases?

Options:

A.

Qlik.

B.

Power BI.

C.

Domo.

D.

Dataroma.

Question 87

Which of the following value is the measure of dispersion "range" between the scores of ten students in a test.

The scores of ten students in a test are 17, 23, 30, 36, 45, 51, 58, 66, 72, 77.

Options:

A.

90

B.

60

C.

70

D.

80

Question 88

Different people manually type a series of handwritten surveys into an online database. Which of the following issues will MOST likely arise with this data? (Choose two.)

Options:

A.

Data accuracy

B.

Data constraints

C.

Data attribute limitations

D.

Data bias

E.

Data consistency

F.

Data manipulation

Question 89

The current date is July 14, 2020. A data analyst has been asked to create a report that shows the company's year-over-year Q2 2020 sales. Which of the following reports should the analyst compare?

Options:

A.

Q2 2020 and Q4 2019

B.

YTD 2020 and YTD 2019

C.

Q2 2020 and Q2 2019

D.

Q2 2020 and Q2 2021

Question 90

A data governance analyst who is reviewing a retailer's data set notices that sales data is captured at the regional level but not at the individual store level. Which of the following best describes the issue with this data set?

Options:

A.

Data attribute limitations

B.

Data accuracy

C.

Data integrity

D.

Data consistency

Question 91

Which of the following is a control measure for preventing a data breach?

Options:

A.

Data transmission

B.

Data attribution

C.

Data retention

D.

Data encryption

Question 92

After a merger, an analyst needs to enhance a very complicated quarterly report so that it is more user friendly for new team members. Which of the following elements would help reduce questions?

Options:

A.

Version details

B.

Appendix

C.

Reference data sources

D.

FAQs

Question 93

What would be an example of an acceptable form of primary identification for the Data+ exam?

Options:

A.

Passport.

B.

School ID card.

C.

Employee ID card.

D.

Credit card with photo and signature.

Question 94

Which of the following is concatenate typically used to combine?

Options:

A.

Rows

B.

Columns

C.

Tables

D.

Databases

Question 95

Jhon is working on an ELT process that sources data from six different source systems.

Looking at the source data, he finds that data about the sample people exists in two of six systems.

What does he have to make sure he checks for in his ELT process?

Choose the best answer.

Options:

A.

Duplicate Data.

B.

Redundant Data.

C.

Invalid Data.

D.

Missing Data.

Question 96

An analyst modified a data set that had a number of issues. Given the original and modified versions:

Which of the following data manipulation techniques did the analyst use?

Options:

A.

Imputation

B.

Recoding

C.

Parsing

D.

Deriving

Question 97

A data analyst needs to create a dashboard to help identify trends in the data sets. Which of the following is an appropriate consideration for dashboard development?

Options:

A.

Data sources and attributes

B.

Frequently asked questions

C.

A report from the data source

D.

A comparison of data sets

Question 98

An analyst is working on a project for a director. During this process. the analyst pulled the data. created summarized tables and graphs with descriptions, created a report summary, and inserted all items into a report. After writing the report, which of the following would be the most appropriate next step?

Options:

A.

Complete an audit on the data pulled for the report.

B.

Complete a check for quality in the report.

C.

Complete a review of the data and a check for consistency

D.

Complete a trend analysis to be included in the report.

Question 99

An analyst is reviewing the following data:

Car IDSpeed

123155

566436

564418

650567

546436

645638

Which of the following should the analyst include in the measures of central tendency for speed?

Options:

A.

Mode = 38 Range = 31 Mean = 42.5

B.

Range = 49 Max = 67 Min = 18

C.

Mode = 36 Max = 67 Min = 18

D.

Mode = 36 Median = 37 Mean = 41.5

Question 100

A company's human resources department has asked a data analyst to categorize the income of all employees into five salary bands:

Which of the following types of functions would be the most appropriate to use?

Options:

A.

Statistical

B.

Aggregate

C.

Logical

D.

Mathematical

Question 101

Given the diagram below:

Which of the following types of sampling is depicted in the image?

Options:

A.

Stratified

B.

Random

C.

Cluster

D.

Systematic

Question 102

Which of the following is the median of the number set:3, 7, 5, 6, 9?

Options:

A.

5

B.

6

C.

7

D.

9

Question 103

Exhibit.

Which of the following logical statements results in Table B?

A)

B)

C)

D)

Options:

A.

Option A

B.

Option B

C.

Option C

D.

Option D

Question 104

While reviewing survey data, a research analyst notices data is missing from all the responses to a single question. Which of the following methods would BEST address this issue?

Options:

A.

Replace missing data.

B.

Remove duplicate data.

C.

Replace redundant data.

D.

Remove invalid data.

Question 105

Which of the following would be the best way to identify multicollinear attributes in a data set?

Options:

A.

Correlation coefficient

B.

Chi-squared test

C.

Two-sample f-test

D.

Two-way ANOVA

Question 106

A site reliability team wants to monitor the stability of their website. so they can proactively diagnose issues when they occur Which of the following deliverables would best suit their needs?

Options:

A.

A self-serve dashboard of website performance that updates in real time

B.

A weekly log report of site visits and user actions

C.

A portal that is refreshed daily and reports errors classified by type

D.

A daily summary email indicating website outages for the previous day

Question 107

Which of the following types of analysis would be best for an analyst to use to examine the relationships between authors who cited other authors in a library of research papers?

Options:

A.

Linguistic analysis

B.

Trend analysis

C.

Link analysis

D.

Performance analysis

Question 108

Which of the following is used for calculations and pivot tables?

Options:

A.

IBM SPSS

B.

SAS

C.

Microsoft Excel

D.

Domo

Question 109

An analyst wants to create a historical data set for the past five years with each year in its own data set. Which of the following methods is the best way to create this historical data set?

Options:

A.

Data transpose

B.

Data concatenation

C.

Data append

D.

Data normalization

Question 110

A collections manager has a team calling customers who are past due on their accounts in an attempt to collect payments. The manager receives the call list in the form of a printed report that is generated by the accounting department at the beginning of each week. Consequently, the collections team calls some customers who have made payments in the time since the report was last printed. Which of the following reporting enhancements could the accounting department implement to best reduce the number of calls on current accounts?

Options:

A.

Modify the date range on the report

B.

Include a time stamp on the report.

C.

Increase the frequency of report generation.

D.

Add a report run date to the report.

Question 111

Which of the following query statements would be used when filtering data in a relational database management system? (Select two).

Options:

A.

ORDER BY

B.

HAVING

C.

WHERE

D.

SELECT

E.

INSERT

F.

GROUP BY

Question 112

A user imports a data file into the accounts payable system each day. On a regular basis. the field input is not what the system is expecting. so it results in an error for the row and a broken import process. To resolve the issue, the user opens the file, finds the error in the row, and manually corrects it before attempting the import again. The import sometimes breaks on subsequent attempts. though. Which of the following changes should be made to this process to reduce the number of errors?

Options:

A.

Delete all incorrect inputs and upload the corrected file.

B.

Have the user manually review the file for data completeness before loading it

C.

Create a data field to data type validator to run the file through prior to import.

D.

Spot-check the file prior to import to catch and correct field errors.

Question 113

A data profiling rule checks the quality of all email addresses in a database. The rule returns a value with the number of email addresses that conformed to the rule. Which of the following options describes this value?

Options:

A.

Columns passed

B.

Rows passed

C.

Rows failed

D.

Columns failed

Question 114

Which of the following is a characteristic of a relational database?

Options:

A.

It utilizes key-value pairs.

B.

It has undefined fields.

C.

It is structured in nature.

D.

It uses minimal memory.

Question 115

An analyst has been asked to validate data quality. Which of the following are the BEST reasons to validate data for quality control purposes? (Choose two.)

Options:

A.

Retention

B.

Integrity

C.

Transmission

D.

Consistency

E.

Encryption

F.

Deletion

Question 116

Which of the following are reasons to create and maintain a data dictionary? (Choose two.)

Options:

A.

To improve data acquisition

B.

To remember specifics about data fields

C.

To specify user groups for databases

D.

To provide continuity through personnel turnover

E.

To confine breaches of PHI data

F.

To reduce processing power requirements

Question 117

An analyst for a concert venue is analyzing the number of tickets sold for a recent event. Which of the following types of data is the number of sold tickets an example of?

Options:

A.

Ordinal

B.

Continuous

C.

Nominal

D.

Discrete

Question 118

A column is being used to store strings of variable lengths. Performance is a concern, so the column needs to use as little space as possible. Which of the following data types best meets these requirements?

Options:

A.

char

B.

nchar

C.

varchar

D.

nvarchar

Page: 1 / 40
Total 396 questions