Title Lab 7 - About WEKA
Course Data Mining
Institution The University of the South Pacific
IS 328 – DATA MINING

LAB 7

Task 1: Understanding Overfitting by revisiting classification algorithms

Step 1: Simple Classifier

1. Opened WEKA and opened a weather data set.

2. Classified the data set with the ZeroR algorithm.
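
What ZeroR does can be sketched in a few lines of Python. This is only an illustration, not WEKA's implementation; it assumes the data set is the standard 14-instance weather data shipped with WEKA, and the names `labels` and `zero_r` are made up for this sketch.

```python
from collections import Counter

# Assumption: class labels ("play") of the 14 instances in the standard weather data.
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]

def zero_r(train_labels):
    """Return the most frequent class; ZeroR ignores every attribute."""
    return Counter(train_labels).most_common(1)[0][0]

majority = zero_r(labels)
# Fraction of instances that the constant prediction gets right.
baseline_accuracy = sum(label == majority for label in labels) / len(labels)
print(majority, round(baseline_accuracy, 4))
```

Because ZeroR always predicts the majority class, its accuracy serves as the baseline that any other classifier should beat.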

3. Classified the dataset with the OneR algorithm and interpreted the results.
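
To help interpret a OneR result, the following is a minimal sketch of the idea for nominal attributes: build one rule per attribute (each value predicts its majority class) and keep the attribute with the fewest training errors. It assumes the standard nominal weather data; `ROWS`, `ATTRIBUTES` and `one_r` are illustrative names, and ties are broken by attribute order, which may differ from WEKA.

```python
from collections import Counter, defaultdict

ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
# Assumption: the standard 14-instance nominal weather data; last field is the class.
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]

def one_r(rows, n_attributes):
    """Return (attribute index, value -> class rule, training hits) of the best single-attribute rule."""
    best = None
    for i in range(n_attributes):
        counts = defaultdict(Counter)  # attribute value -> class frequencies
        for row in rows:
            counts[row[i]][row[-1]] += 1
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        correct = sum(c[rule[value]] for value, c in counts.items())
        # Strict ">" keeps the earlier attribute when two attributes tie.
        if best is None or correct > best[2]:
            best = (i, rule, correct)
    return best

index, rule, correct = one_r(ROWS, len(ATTRIBUTES))
print(ATTRIBUTES[index], rule, f"{correct}/{len(ROWS)}")
```

On this data the sketch selects outlook, with humidity achieving the same number of training hits.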

Step 2: Overfitting Problem

1. Opened weather.numeric.arff in WEKA.

2. Explored the data set.

3. Classified the data with the OneR classifier and checked the performance.

Result: 10 out of 14 instances classified correctly.

4. Removed the outlook attribute and classified the data with OneR again.

5. Changed the minBucketSize value in the OneR configuration panel to 1 and ran the test again.
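
Why a bucket size of 1 invites overfitting can be sketched as follows. With such a small bucket, a rule on a numeric attribute can devote an interval to almost every distinct value, which amounts to memorizing the training data. The sketch approximates this by storing the majority class per distinct temperature value; it is not WEKA's discretization procedure, and it assumes the standard temperature values of weather.numeric.arff.

```python
from collections import Counter, defaultdict

# Assumption: temperature and class of the 14 instances in weather.numeric.arff.
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]

def memorize(values, labels):
    """Map each distinct numeric value to the majority class seen with it."""
    counts = defaultdict(Counter)
    for value, label in zip(values, labels):
        counts[value][label] += 1
    return {value: c.most_common(1)[0][0] for value, c in counts.items()}

rule = memorize(temperature, play)
# Evaluate on the same data the rule was built from.
train_correct = sum(rule[v] == y for v, y in zip(temperature, play))
print(f"{train_correct}/{len(play)} correct on the training data")
```

Only the value 72, which occurs once with each class, forces an error, so the rule looks nearly perfect on the training data while encoding no general pattern.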

Result: 13 out of 14 instances classified correctly (overfitting), but the testing phase produced a poor accuracy of 35.71%.

Step 3:

1. Opened the Diabetics.arff dataset to understand fitting problems.

2. Classified the data with ZeroR and checked the baseline accuracy.

3. Classified the data with OneR and checked the accuracy.

4. Changed the bucket size and ran the OneR classifier again.

5. Ran the test again with the “Use Training Set” test option.
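
The gap between evaluating on the training set and evaluating on held-out data can be sketched with a toy example. The sketch below does not use the diabetes file; it assumes the 14 temperature values of the weather data, a classifier that simply memorizes each value (with a majority-class fallback for unseen values), and leave-one-out as an extreme form of cross-validation. All names are illustrative.

```python
from collections import Counter, defaultdict

# Assumption: toy data (temperature, class) from the numeric weather data set.
values = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
labels = ["no", "no", "yes", "yes", "yes", "no", "yes",
          "no", "yes", "yes", "yes", "yes", "yes", "no"]

def fit(xs, ys):
    """Memorize the majority class per value; remember the overall majority as fallback."""
    counts = defaultdict(Counter)
    for x, y in zip(xs, ys):
        counts[x][y] += 1
    table = {x: c.most_common(1)[0][0] for x, c in counts.items()}
    fallback = Counter(ys).most_common(1)[0][0]
    return table, fallback

def predict(model, x):
    table, fallback = model
    return table.get(x, fallback)

# "Use training set": test on the very data the model was fitted to.
model = fit(values, labels)
train_correct = sum(predict(model, x) == y for x, y in zip(values, labels))

# Leave-one-out: fit on 13 instances, test on the one held out, repeat for all 14.
loo_correct = 0
for i in range(len(values)):
    held_out_model = fit(values[:i] + values[i + 1:], labels[:i] + labels[i + 1:])
    loo_correct += predict(held_out_model, values[i]) == labels[i]

print(f"training set: {train_correct}/14, leave-one-out: {loo_correct}/14")
```

The training-set figure is optimistic because the model has already seen every test instance; the held-out figure is the more honest estimate.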

Step 4: Using Probability Theorem

1. Opened the weather.nominal.arff data set and chose Naïve Bayes to classify the data with the default test option.
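
The probability calculation behind Naïve Bayes can be sketched as below: multiply the class prior by the per-attribute likelihoods and normalize. The sketch assumes the standard nominal weather data and uses raw frequency estimates without the smoothing a full implementation would typically apply, so its numbers need not match WEKA's output. `ROWS` and `naive_bayes_posterior` are illustrative names.

```python
from collections import Counter

# Assumption: the standard 14-instance nominal weather data; last field is the class.
ROWS = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]

def naive_bayes_posterior(rows, query):
    """Return normalized P(class | query), treating attributes as independent given the class."""
    class_counts = Counter(row[-1] for row in rows)
    scores = {}
    for cls, n_cls in class_counts.items():
        score = n_cls / len(rows)  # prior P(class)
        for i, value in enumerate(query):
            matches = sum(1 for row in rows if row[-1] == cls and row[i] == value)
            score *= matches / n_cls  # unsmoothed likelihood P(value | class)
        scores[cls] = score
    total = sum(scores.values())
    return {cls: s / total for cls, s in scores.items()}

posterior = naive_bayes_posterior(ROWS, ("sunny", "cool", "high", "TRUE"))
print({cls: round(p, 3) for cls, p in posterior.items()})
```

For this query (a sunny, cool, humid, windy day) the sketch favours "no" over "yes".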

Task 2: Understand the user classifier and the different test options, and explore Naïve Bayes and KNN (lazy IBk) in WEKA

Step 5: Be a classifier.

1. Installed userClassifier from the Package Manager in the Tools menu.

2. Opened the segment-challenge.arff file in WEKA.

3. Selected the Classify panel and chose the userClassifier from the tree.

4. Clicked the Set button next to Supplied Test Set and opened the file segment-test.arff.

5. Clicked the Start button, which opened a window with two visualizers: Tree Visualizer and Data Visualizer.

6. Chose the region-centroid-row and intensity-mean attributes for the X and Y axes of the Data Visualizer.

7. Repeatedly selected regions with the same pattern using the “Rectangle” tool.

8. Checked the tree generated from the selected data in the Tree Visualizer.

9. Accepted the tree.

10. Checked the classifier output and its performance.
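
Each rectangle drawn in the Data Visualizer effectively becomes a test on two attributes. A minimal sketch of one such test is shown below; the bounds and the class names "sky" and "other" are hypothetical values chosen for illustration and are not taken from the lab's results.

```python
def make_rectangle_rule(x_min, x_max, y_min, y_max, inside_class, outside_class):
    """Build a two-attribute test: points inside the rectangle get one class, the rest another."""
    def classify(x, y):
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        return inside_class if inside else outside_class
    return classify

# Hypothetical rectangle on (region-centroid-row, intensity-mean); bounds are made up.
rule = make_rectangle_rule(0, 120, 100, 150, "sky", "other")

print(rule(50, 120))   # point inside the rectangle
print(rule(200, 20))   # point outside the rectangle
```

Selecting further rectangles on the remaining data corresponds to nesting more tests of this kind, which is what the tree in the Tree Visualizer records.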

Step 6: Training and Testing

1. Chose the J48 classifier from the Classify panel, started again and checked accuracy.

2. Selected the test option as “Use Training Set”, started the test and checked accuracy.

3. Selected the test option “Percentage Split”.
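
A percentage split can be sketched as a shuffle followed by a cut: the first part trains the classifier and the rest tests it. The sketch below is an illustration only; it uses 66% for the training share (WEKA's default for this option), and the function name, seed and rounding are assumptions that may differ from what WEKA does internally.

```python
import random

def percentage_split(instances, train_percent=66, seed=1):
    """Shuffle a copy of the instances and cut it into training and test parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]

# Stand-in data: 100 instance indices instead of real rows.
train, test = percentage_split(range(100))
print(len(train), len(test))
```

Unlike “Use Training Set”, no instance appears in both parts, so the test accuracy is measured on data the classifier has not seen.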
