Key Data Mining Concepts and Techniques

The document contains a list of important questions related to data mining, covering definitions, applications, analysis techniques, and statistical methods. Each question is assigned a specific mark value, indicating the weight of the question in an assessment context. Topics include data types, data preprocessing, similarity measures, and various data mining functions.

Uploaded by

rohithsd0222

Important Questions

Sr. No Questions Marks

1. Define Data Mining. 2M

2. Why is data mining required? 2M

3. Enlist the applications of Data Mining. 2M

4. What is Cluster Analysis? 2M

5. What is Outlier Analysis? 2M

6. Define Data, Information, and Knowledge. 2M

7. Define Correlation and Covariance. 2M

8. Compute the similarity between Chicken and Bird using the simple matching coefficient (SMC) for the given data. 2M

Chicken = {0,1,1,0,1,0,0,1,1,1}

Bird = {0,1,1,0,0,0,0,1,0,1}
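
As a rough check on question 8, the sketch below computes the simple matching coefficient as the number of matching positions divided by the total number of attributes. The function name `simple_matching_coefficient` is illustrative only, and the two sets are assumed to be ordered binary vectors of equal length.

```python
def simple_matching_coefficient(a, b):
    """Fraction of positions where two equal-length binary vectors agree."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # Count both 1-1 and 0-0 agreements.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


chicken = [0, 1, 1, 0, 1, 0, 0, 1, 1, 1]
bird = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1]

smc = simple_matching_coefficient(chicken, bird)
print(smc)  # 0.8, since 8 of the 10 positions match
```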

9. Define Time Series Data 2M

10. Define the following: (I) Object (II) Attribute. 2M

11. List data reduction techniques in data mining. 2M

12. Define Ordinal data Attribute. 2M

13. Enlist types of Datasets 2M

14. Define Qualitative Data and Quantitative data 2M

15. Define Data Redundancy 2M

16. Define Data scrubbing, Data auditing 2M

17. Which tools are used for Data Migration? 2M

18. Explain Ordered Data 2M

19. Explain the knowledge discovery in databases (KDD) process with a neat diagram. 5M

20. Discuss the different Data Mining Functions in detail. 5M

21. Explain the Multidimensional view of data mining. 5M

22. Explain how data mining works. 5M

23. Explain the role of Data Mining in Business Intelligence. 5M

24. Illustrate 5 applications of data mining that have been used to solve specific problems. 5M

25. List and explain the goals of data mining. 5M

26. Discuss the confluence of multiple disciplines in Data Mining. 5M

27. Illustrate the typical view in ML and statistics with a neat diagram 5M

28. Illustrate 5 applications of data mining that have been used to solve specific problems. 5M

29. How to search for knowledge and interesting patterns in data? 5M

30. Discuss the major issues of Data mining 5M

31. Compare quantitative data and qualitative data. 5M

32. Explain Attribute subset selection methods with an example 5M

33. How is correlation analysis between categorical variables performed using the chi-square test? 5M

34. A survey of car owners conducted in 2011 determined that 60% of car owners have only one car, 28% have two cars, and 12% have three or more. Suppose you conduct your own survey and find that, out of 129 car owners, 73 had one car and 38 had two cars. Determine whether your data supports the results of the study. Use a significance level of 0.05; the critical chi-square value for df = 2 is 5.99. Apply the chi-square test for nominal data. 5M
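
One way to approach question 34 is a chi-square goodness-of-fit test, sketched below. The count of owners with three or more cars is not stated in the question; the code assumes it is the remainder, 129 - 73 - 38 = 18. The function name `chi_square_statistic` is illustrative.

```python
def chi_square_statistic(observed, expected_proportions):
    """Goodness-of-fit statistic: sum of (O - E)^2 / E over all categories."""
    total = sum(observed)
    expected = [total * p for p in expected_proportions]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Assumption: the third category is whatever remains of the 129 owners.
observed = [73, 38, 129 - 73 - 38]
proportions = [0.60, 0.28, 0.12]   # percentages reported by the 2011 study
critical_value = 5.99              # given in the question for df = 2, alpha = 0.05

chi2 = chi_square_statistic(observed, proportions)
supports_study = chi2 < critical_value

print(round(chi2, 3), supports_study)
```

Under that assumption the statistic is roughly 0.76, well below 5.99, so the sample gives no reason to reject the proportions reported by the study.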

35. Suppose two stocks A and B have the following values in one week: (2, 5), (3, 8), (5, 10), (4, 11), (6, 14). If the stocks are affected by the same industry trends, use covariance to determine whether their prices rise or fall together. 5M
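
The covariance in question 35 can be sketched as below. The code assumes each pair is (price of A, price of B) and uses the population form, dividing by n; dividing by n - 1 instead would change the magnitude but not the sign. The function name is illustrative.

```python
def population_covariance(xs, ys):
    """Mean of the products of paired deviations from each mean."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


pairs = [(2, 5), (3, 8), (5, 10), (4, 11), (6, 14)]
stock_a = [a for a, _ in pairs]
stock_b = [b for _, b in pairs]

cov_ab = population_covariance(stock_a, stock_b)
print(round(cov_ab, 2))  # about 4.0; a positive value means the prices move together
```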

36. What is dimensionality reduction? Explain the methods used for reducing dimensionality. 5M

37. Illustrate why data preprocessing is a major step in data mining. 5M

38. Consider the following salaries: 5M

25, 30, 28, 55, 60, 42, 70, 75, 50, 48

Apply the binning technique to remove noisy data.
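
Question 38 does not fix a bin size or a smoothing rule, so the sketch below assumes equal-depth bins of 5 values each and shows two common variants, smoothing by bin means and smoothing by bin boundaries. All function names are illustrative.

```python
def equal_depth_bins(values, depth):
    """Sort the values and cut them into consecutive bins of `depth` items."""
    ordered = sorted(values)
    return [ordered[i:i + depth] for i in range(0, len(ordered), depth)]


def smooth_by_means(bins):
    """Replace every value in a bin with the bin mean."""
    return [[sum(b) / len(b)] * len(b) for b in bins]


def smooth_by_boundaries(bins):
    """Replace every value with the closer of the bin minimum or maximum."""
    smoothed = []
    for b in bins:
        low, high = b[0], b[-1]  # bins are already sorted
        smoothed.append([low if v - low <= high - v else high for v in b])
    return smoothed


salaries = [25, 30, 28, 55, 60, 42, 70, 75, 50, 48]
bins = equal_depth_bins(salaries, depth=5)  # depth=5 is an assumption

by_means = smooth_by_means(bins)
by_bounds = smooth_by_boundaries(bins)
print(bins)
print(by_means)
print(by_bounds)
```

A different depth (for example 2, giving five bins) is equally valid and yields different smoothed values.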

39. Explain the quality measures of data preprocessing. 5M

40. Illustrate similarity, dissimilarity and their properties 5M

41. Define noisy data. Explain how noisy data can be handled in data mining. 5M

42. Calculate the cosine similarity between vectors d1 and d2. 5M

d1 = (3, 2, 0, 5, 0, 0, 0, 2, 0, 0)
d2 = (1, 0, 0, 0, 0, 0, 0, 1, 0, 2)
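
The cosine similarity in question 42 is the dot product of the two vectors divided by the product of their Euclidean norms. A minimal sketch, with an illustrative function name:

```python
import math


def cosine_similarity(u, v):
    """Dot product of u and v divided by the product of their lengths."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]

cos_sim = cosine_similarity(d1, d2)
print(round(cos_sim, 4))  # about 0.315 = 5 / (sqrt(42) * sqrt(6))
```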

43. Illustrate why data preprocessing is a major step in data mining. 5M

44. Describe quality measures of data preprocessing. 5M

45. List and explain the major tasks in data preprocessing. 5M

46. Normalize the following group of data: 200, 300, 400, 600, 1000 using: 5M

i. Min-Max
ii. Z-Score
iii. Decimal Scaling
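
The three normalizations in question 46 can be sketched as follows. The assumptions are labeled in the code: min-max targets the range [0, 1], z-score uses the population standard deviation, and decimal scaling picks the smallest j for which every scaled value is strictly below 1 in absolute value. Function names are illustrative.

```python
import math


def min_max(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values to [new_min, new_max] (assumed [0, 1] here)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) * (new_max - new_min) + new_min for v in values]


def z_score(values):
    """Subtract the mean and divide by the population standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


def decimal_scaling(values):
    """Divide by 10^j, with j the smallest integer giving max |v'| < 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


data = [200, 300, 400, 600, 1000]
mm = min_max(data)
zs = z_score(data)
ds = decimal_scaling(data)
print(mm)
print([round(z, 4) for z in zs])
print(ds)
```

With the strict "less than 1" rule, the maximum 1000 forces j = 4. Some worked solutions divide by 1000 instead, which maps the maximum to exactly 1.0.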

47. Explain Data Cube Aggregation 5M

48. The dataset below describes the rate of economic growth (ai) and the rate of return on the S&P 500 (bi). Using the covariance formula, determine whether economic growth and S&P 500 returns have a positive or negative relationship. 5M

Economic Growth % (ai)    S&P 500 Returns % (bi)
2.1                       8
2.5                       12
4.0                       14
3.6                       10
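
For question 48, the sketch below treats the four rows as a sample and divides by n - 1; this choice is an assumption, and dividing by n gives a smaller value with the same sign. The function name is illustrative.

```python
def sample_covariance(xs, ys):
    """Sum of products of paired deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


growth = [2.1, 2.5, 4.0, 3.6]    # ai, economic growth %
returns = [8, 12, 14, 10]        # bi, S&P 500 returns %

cov_growth_returns = sample_covariance(growth, returns)
relationship = "positive" if cov_growth_returns > 0 else "negative"
print(round(cov_growth_returns, 2), relationship)
```

The deviation products sum to 4.6, so the sample covariance is about 1.53 and the relationship is positive.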

49. Explain Data Discretization in detail, including Supervised and Unsupervised Discretization. 5M

50. Describe Binarization with an example. 5M

51. Explain the linear relationship between variables. 5M

52. Describe Similarity and Dissimilarity in detail. 5M

53. Apply entropy-based discretization to the given set S = (16, n), (0, y), (4, y), (12, y), (16, n), (26, n), (18, y), (24, n), (28, n). S is to be partitioned into 2 intervals, S1 and S2, with 2 possible split points, 14 and 21. Find the best split point. 10M
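
A sketch of the comparison in question 53: for each candidate split, compute the size-weighted entropy of the two resulting intervals and prefer the split with the lower value, which is the one with the higher information gain. The convention that values less than or equal to the split go to S1 is an assumption, and the function names are illustrative.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def weighted_entropy(samples, split):
    """Size-weighted entropy of S1 (value <= split) and S2 (value > split)."""
    s1 = [label for value, label in samples if value <= split]
    s2 = [label for value, label in samples if value > split]
    n = len(samples)
    return len(s1) / n * entropy(s1) + len(s2) / n * entropy(s2)


S = [(16, "n"), (0, "y"), (4, "y"), (12, "y"), (16, "n"),
     (26, "n"), (18, "y"), (24, "n"), (28, "n")]

info = {split: weighted_entropy(S, split) for split in (14, 21)}
best_split = min(info, key=info.get)
print({k: round(v, 4) for k, v in info.items()}, best_split)
```

Splitting at 14 leaves S1 pure (all y), giving a weighted entropy of about 0.433 against about 0.612 for 21, so 14 comes out as the better split under these assumptions.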

54. Calculate the Minkowski distance and the Euclidean distance between the following pairs of points to determine their dissimilarity: 10M

Point    X    Y
p1       0    2
p2       2    0
p3       3    1
p4       5    1
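
Question 54 does not state the order r of the Minkowski distance, so the sketch below takes r as a parameter and prints r = 1 (Manhattan) and r = 2 (Euclidean) for every pair; the choice of those two orders is an assumption. The function name is illustrative.

```python
import math
from itertools import combinations


def minkowski(p, q, r):
    """Minkowski distance of order r; r = math.inf gives the supremum distance."""
    if math.isinf(r):
        return max(abs(a - b) for a, b in zip(p, q))
    return sum(abs(a - b) ** r for a, b in zip(p, q)) ** (1 / r)


points = {"p1": (0, 2), "p2": (2, 0), "p3": (3, 1), "p4": (5, 1)}

for (name_a, a), (name_b, b) in combinations(points.items(), 2):
    manhattan = minkowski(a, b, 1)
    euclidean = minkowski(a, b, 2)
    print(name_a, name_b, manhattan, round(euclidean, 3))
```

For example, p1 and p2 are 4 apart under r = 1 and about 2.828 apart under r = 2.
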
55. Explain data Reduction methods in Detail 10M

56. Calculate the entropy-based discretization for the following data set. S is to be partitioned into 2 intervals, S1 and S2, with 2 possible split points, 14 and 17. Find the best split point. 10M

Value    0    4    12    16    16    18    24    26    28
Class    Y    Y    Y     N     N     Y     N     N     N
