Data Warehousing and Data Mining MCQ Quiz with Answers
The quiz consists of questions tailored to cover the fundamental concepts of data warehousing and data mining. Each question is followed by its answer key so that you can check your understanding of the topics.
Data Warehousing and Data Mining MCQs
1. Data Mining is also referred to as ___.
a. Knowledge discovery in databases
b. Data Cleaning
c. Data extraction
d. Data management
Ans: A
2. Oracle 10g provides software called ___, which is a data mining tool.
a. Data miner 10g
b. Dolphin
c. Data miner
d. Darwin
Ans: D
3. Data about data is called ___.
a. Table
b. Database
c. Metadata
d. Integration
Ans: C
4. To represent any n-dimensional data we need a series of ___-dimensional cubes.
a. (n–1)
b. n
c. n+1
d. n+2
Ans: A
5. Which of the following schemas contains multiple fact tables?
a. Star schema
b. Snowflake schema
c. Fact constellation
d. None of the above
Ans: C
6. The ___ operation performs a selection on one dimension of the given cube, resulting in a subcube.
a. Pivot
b. Slice
c. Roll-up
d. Drill-down
Ans: B
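The slice operation from question 6 can be illustrated with a minimal sketch. The cube contents, the dimension names, and the helper `slice_cube` are hypothetical and chosen only for illustration; real OLAP servers expose this operation through their own query languages.

```python
# Hypothetical sales cube: (quarter, item, city) -> units sold.
cube = {
    ("Q1", "phone", "Delhi"): 120,
    ("Q1", "laptop", "Delhi"): 45,
    ("Q2", "phone", "Delhi"): 150,
    ("Q2", "phone", "Mumbai"): 90,
}

# Assumed order of the dimensions inside each key tuple.
DIMENSIONS = ("quarter", "item", "city")


def slice_cube(cube, dimension, value):
    """Keep cells where `dimension` equals `value` and drop that dimension."""
    axis = DIMENSIONS.index(dimension)
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


# Slicing on quarter = "Q1" leaves a two-dimensional subcube (item, city).
q1 = slice_cube(cube, "quarter", "Q1")
print(q1)
```

Fixing a single dimension to one value is what distinguishes a slice from a dice, which selects on two or more dimensions.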
7. ___ servers support multidimensional views of data through array-based multidimensional storage engines.
a. ROLAP
b. MOLAP
c. Data warehouse
d. Database
Ans: B
8. ___ is used to refer to systems and technologies that provide the business with the means for decision-makers to extract personalized meaningful information about their business and industry.
a. Business intelligence
b. Data warehouse
c. Database
d. All the above
Ans: A
9. The ___ software gives the user the opportunity to look at the data from a variety of different dimensions.
a. Query Tools
b. Multidimensional Analysis Software
c. Data Mining Tools
d. None of the above
Ans: B
10. Which of the following layers is concerned with producing metadata about the ETL process?
a. Data integration layer
b. Applications layer
c. Access layer
d. None of the above
Ans: A
11. ___ methods smooth a sorted data value by consulting its “neighborhood”, that is, the values around it.
a. Binning
b. Clustering
c. Combined computer and human inspection
d. Regression
Ans: A
12. ___ techniques can be used to reduce the number of values for a given continuous attribute, by dividing the range of the attribute into intervals.
a. Discretization
b. Transformation
c. Smoothing
d. Generalization
Ans: A
13. ___ can be used to help avoid errors in schema integration.
a. User data
b. System administrator
c. Metadata
d. All the above
Ans: C
14. A frequent set is a ___ if it is a frequent set and no superset of this is a frequent set.
a. Border set
b. Minimal frequent set
c. Maximal frequent set
d. None of the above
Ans: C
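The definition in question 14 can be checked mechanically. The sketch below assumes the frequent sets have already been found by some mining algorithm; the function name `maximal_frequent_sets` and the sample itemsets are hypothetical.

```python
def maximal_frequent_sets(frequent_sets):
    """Return the frequent sets that have no frequent proper superset."""
    sets = [frozenset(s) for s in frequent_sets]
    # `s < other` is True only when s is a proper subset of other.
    return [s for s in sets if not any(s < other for other in sets)]


# Assumed output of a frequent-itemset miner on some transaction database.
frequent = [{"a"}, {"b"}, {"c"}, {"a", "b"}, {"a", "c"}]
maximal = maximal_frequent_sets(frequent)
print(maximal)
```

Here only {a, b} and {a, c} are maximal, because each single item is contained in a larger frequent set.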
15. ___ is the oldest and most well-known statistical technique that the Data Mining community utilizes.
a. Regression
b. Clustering
c. Decision Trees
d. None of the above
Ans: A
16. An ___ is an information-processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information.
a. Decision tree
b. Prediction network
c. Artificial Neural Network
d. All the above
Ans: C
17. FP–Tree Growth Algorithm can be implemented in ___ Phases.
a. One
b. Two
c. Three
d. Four
Ans: B
18. The pincer–search algorithm is based on ___ search.
a. Bi-directional
b. Single-directional
c. Random
d. Sequential
Ans: A
19. The ___ algorithm works like a train running over the data, with stops at intervals M between transactions. When the train reaches the end of the transaction file, it completes one pass.
a. FP–Tree Growth
b. Partition Algorithm
c. Dynamic Itemset Counting
d. Pincer-Search
Ans: C
20. It is difficult to find strong associations among data items at low or primitive levels of abstraction due to the ___ of data in multidimensional space.
a. Sparsity
b. Scarcity
c. Plenty
d. Excess
Ans: A
21. The process of partitioning the ranges of quantitative attributes into intervals is called ___.
a. Splitting
b. Grouping
c. Binning
d. None of the above
Ans: C
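Partitioning a quantitative attribute into intervals, as in question 21, might look like the following sketch. It uses equal-width intervals, which is only one of several possible binning strategies; the function name `equal_width_bins` and the sample ages are hypothetical.

```python
def equal_width_bins(values, n_bins):
    """Map each value to the index of its equal-width interval (0-based)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so everything falls into the first bin.
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum value would land on index n_bins, so clamp it to the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


ages = [22, 25, 37, 41, 58, 63]
labels = equal_width_bins(ages, 3)
print(labels)
```

Each age is replaced by an interval index, which reduces a continuous attribute to a small number of discrete values.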
22. ___ attributes are numeric and have an implicit ordering among values.
a. Quantitative
b. Nominal
c. Categorical
d. None of the above
Ans: A
23. In the K-means clustering algorithm, the distance from each cluster centroid to each object is calculated using the ___ method.
a. Euclidean distance
b. Clustering distance
c. Central distance
d. Cluster width
Ans: A
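The distance computation in question 23 can be sketched as follows. This shows only the assignment step of K-means, not the full algorithm; the helper names and the sample centroids are hypothetical.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point`."""
    distances = [euclidean(point, c) for c in centroids]
    return distances.index(min(distances))


# Assumed current centroids of two clusters.
centroids = [(1.0, 1.0), (5.0, 7.0)]
print(nearest_centroid((2.0, 2.0), centroids))
print(nearest_centroid((6.0, 6.0), centroids))
```

In a full K-means run, this assignment step alternates with recomputing each centroid as the mean of its assigned objects until the assignments stop changing.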
24. ___ techniques are more commonly used in hierarchical clustering and this is the method implemented in XLMiner™.
a. Agglomerative
b. Divisive
c. K-means
d. None of the above
Ans: A
25. Hierarchical clustering may be represented by a two-dimensional diagram known as ___.
a. Dendrogram
b. Cladogram
c. Histogram
d. None of the above
Ans: A
26. The basic algorithm for decision tree induction is a ___ algorithm.
a. Step-by-step
b. Procedural
c. Greedy
d. None of these
Ans: C
27. ___ is the process of removing attributes in the data that are irrelevant to the classification or prediction task.
a. Relevance analysis
b. Data cleaning
c. Data transformation
d. Normalization
Ans: A
28. ___ is the ability of the model to make correct predictions given noisy data or data with missing values.
a. Speed
b. Predictive accuracy
c. Robustness
d. Interpretability
Ans: C
29. ___ technologies are the right solutions for knowledge discovery on the web.
a. Data mining
b. Knowledge mining
c. Web mining
d. All the above
Ans: C
30. ___ focuses on the analysis of the link structure of the web, and one of its purposes is to identify preferable documents.
a. Web content mining
b. Web usage mining
c. Web Structure Mining
d. None of these
Ans: C
31. ___ are simple text files that are automatically generated every time someone accesses a website.
a. Server session
b. Logfile
c. User session
d. None of the above
Ans: B
32. In the inverted index text retrieval method, the ___ list is a list of terms (or pointers to terms) that occur in the document, sorted according to some relevance measure.
a. Term
b. Index
c. Posting
d. None of these
Ans: C
33. Which of the following are the stop words?
a. A
b. The
c. Of
d. All the above
Ans: D
34. Which of the following is the first step in text retrieval systems?
a. Stemming
b. Term words finding
c. Tokenization
d. Replacing the null data with keywords
Ans: C
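Questions 33 and 34 describe the first two steps of a typical text retrieval pipeline: tokenization, then stop-word removal. The following is a minimal sketch; the stop list contains only the three words from question 33, whereas real systems use much longer lists, and the regular expression is one simple assumption about what counts as a token.

```python
import re

# Tiny illustrative stop list taken from question 33.
STOP_WORDS = {"a", "the", "of"}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop list."""
    return [t for t in tokens if t not in STOP_WORDS]


tokens = tokenize("The basics of a data warehouse")
terms = remove_stop_words(tokens)
print(tokens)
print(terms)
```

Stemming, mentioned as option a in question 34, would normally be applied to the remaining terms after this step.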
35. FAMS stands for ___.
a. Fraud and Abuse Management System
b. Faith and Abuse Monitoring System
c. Fast and Accurate Monitoring System
d. None of the above
Ans: A
36. Most IDSs are based on ___ that are developed by the manual encoding of expert knowledge.
a. System generated
b. Automated signatures
c. Handcrafted signatures
d. None of these
Ans: C
37. IDS stands for ___.
a. Intrusion detection system
b. Independent detection system
c. Independent Data System
d. Intrusion Data system
Ans: A
38. ___ is the process of determining which evidence taken from raw audit data is most useful for analysis.
a. Data cleaning
b. Feature extraction
c. Data cleansing
d. None of these
Ans: B
39. ___ method is useful for finding patterns or associations between attributes.
a. PRIM
b. WIS
c. Induction
d. Relativity
Ans: B
40. PRIM stands for ___.
a. Patient Rule Induction Method
b. Patent Rule Induction Method
c. Patient Rule Increment Method
d. Patent Rule Increment Method
Ans: A
41. Which of the following statements are correct?
A. Data mining is concerned with finding hidden relationships present in business data to allow businesses to make predictions for future use.
B. Modeling is simply the act of building a model based on data from situations where the answer is known and then applying the model to other situations where the answers are not known.
a. Only A is True
b. Only B is True
c. Both A and B are True.
d. Both are wrong
Ans: C
42. Which of the following statements are correct?
A. A data warehouse refers to a database that is maintained separately from an organization’s operational databases.
B. Data in the data warehouse are stored to provide information from a historical perspective.
a. Only A is True
b. Only B is True
c. Both A and B are True.
d. Both are wrong
Ans: C
43. Which of the following are features of an OLTP system?
1. Adopts an entity-relationship (ER) data model.
2. Application-oriented database design.
3. Uses Star or Snowflake data model.
4. Subject–oriented database design.
a. Only 1 and 4
b. Only 1 and 2
c. Only 3 and 4
d. All the above
Ans: B
44. BI serves two main purposes: it monitors the ___ and ___ of the organization.
a. Financial, operational health
b. Infrastructure, client
c. Client, Manager
d. Financial, client
Ans: A
45. Choose the correct statements:
1. The most visible layer of the business intelligence infrastructure is the applications layer, which delivers the information to business users.
2. Business intelligence has a significant role in knowing about the customer.
a. Statements 1 & 2 are true
b. Statements 1 & 2 are false
c. Statement 1 is true and 2 is false
d. Statement 1 is false and 2 is true
Ans: A
46. Which of the following statements are correct?
A. BI can help companies to share selected strategic information with business partners.
B. Data mining is not an “intelligence” tool or framework.
a. Only A is True
b. Only B is True
c. Both A and B are True.
d. Both are wrong
Ans: C
47. Consider a scenario where a bin contains the values 4, 8, and 15. If the smoothing by bin means method is applied to clean the data, then each of the original values in the bin will be replaced by ___.
a. 8
b. 9
c. 15
d. 4
Ans: B
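The arithmetic behind question 47 is (4 + 8 + 15) / 3 = 27 / 3 = 9. A minimal sketch of smoothing by bin means, with the hypothetical helper name `smooth_by_bin_means`, could look like this:

```python
def smooth_by_bin_means(bin_values):
    """Replace every value in the bin with the mean of the bin."""
    mean = sum(bin_values) / len(bin_values)
    return [mean] * len(bin_values)


smoothed = smooth_by_bin_means([4, 8, 15])
print(smoothed)  # [9.0, 9.0, 9.0]
```

In practice the sorted data are first partitioned into several bins, and the same replacement is applied to each bin separately.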
48. Which of the following statements are correct?
A. In Numerosity reduction, the data are replaced or estimated by alternative, smaller data representations.
B. A cube for the highest level of abstraction is the apex cuboid.
a. Only A is True
b. Only B is True
c. Both A and B are True.
d. Both are wrong
Ans: C
49. Two fundamental goals of Data Mining are ___ and ___.
a. Analysis and Description
b. Prediction and Description
c. Data cleaning and organizing the data
d. Data cleaning and summarization
Ans: B
50. Choose the correct statements:
1. An artificial neuron is simply an electronically modeled biological neuron.
2. Artificial Neural networks are composed of a large number of highly interconnected processing elements (neurons) working in unison to solve specific problems.
a. Statements 1 & 2 are true
b. Statements 1 & 2 are false
c. Statement 1 is true and 2 is false
d. Statement 1 is false and 2 is true
Ans: A
Conclusion
Understanding the basics of data warehousing and data mining is essential for answering the questions in this MCQ quiz. Taking the quiz can help you assess your knowledge of these topics and show you which areas you may need to brush up on.