Title | Lab 7 - About WEKA |
---|---|
Course | Data Mining |
Institution | The University of the South Pacific |
IS 328 – DATA MINING
LAB 7
Task 1: Understanding Overfitting by Revisiting Classification Algorithms
Step 1: Simple Classifier
1. Opened WEKA and loaded a weather data set.
2. Classified the data set with the ZeroR algorithm.
3. Classified the data set with the OneR algorithm and interpreted the results.
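The two classifiers used in Step 1 can be sketched in a few lines of plain Python. This is only an illustration, not WEKA's implementation: it assumes the data set is the standard 14-instance nominal weather data, and the names `WEATHER`, `ATTRIBUTES`, `zero_r`, and `one_r` are made up for this sketch. ZeroR predicts the majority class and ignores all attributes; OneR builds one rule per attribute and keeps the attribute whose rule makes the fewest training errors.

```python
from collections import Counter, defaultdict

# Standard nominal weather data (outlook, temperature, humidity, windy, play).
# Assumption: this matches the weather data set opened in step 1.
ATTRIBUTES = ["outlook", "temperature", "humidity", "windy"]
WEATHER = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def zero_r(rows):
    """Return the majority class; ZeroR ignores every attribute."""
    return Counter(row[-1] for row in rows).most_common(1)[0][0]


def one_r(rows, n_attributes):
    """Return (attribute index, rule, training errors) for the best one-attribute rule."""
    best = None
    for i in range(n_attributes):
        counts = defaultdict(Counter)  # attribute value -> class counts
        for row in rows:
            counts[row[i]][row[-1]] += 1
        # Each attribute value predicts its most frequent class.
        rule = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(rule[row[i]] != row[-1] for row in rows)
        if best is None or errors < best[2]:  # ties keep the earlier attribute
            best = (i, rule, errors)
    return best


majority = zero_r(WEATHER)
baseline_correct = sum(row[-1] == majority for row in WEATHER)
attr, rule, errors = one_r(WEATHER, len(ATTRIBUTES))
print(f"ZeroR predicts '{majority}': {baseline_correct}/{len(WEATHER)} correct")
print(f"OneR uses '{ATTRIBUTES[attr]}': {len(WEATHER) - errors}/{len(WEATHER)} correct on training data")
```

Both counts here are measured on the training data itself, so they are optimistic compared with a cross-validated estimate.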
Step 2: Overfitting Problem
1. Opened weather.numeric.arff in WEKA.
2. Explored the data set.
3. Classified the data with the OneR classifier and checked the performance.
Result: 10 out of 14 instances classified correctly.
4. Removed the outlook attribute and classified the data with OneR again.
5. Changed the minBucketSize value in the OneR configuration panel to 1 and ran the test again.
Result: 13 out of 14 instances were classified correctly (overfitting), while the testing phase produced a poor accuracy of 35.71%.
Step 3:
1. Opened the Diabetics.arff data set to understand fitting problems.
2. Classified the data with ZeroR and checked the baseline accuracy.
3. Classified the data with OneR and checked the accuracy.
4. Changed the bucket size and ran the OneR classifier again.
5. Ran the test again with Use Training Set.
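The gap between training-set accuracy and held-out accuracy seen in Steps 2 and 3 can be reproduced in miniature. The sketch below is a simplified stand-in for OneR with a bucket size of 1, not WEKA's discretization: it memorizes one prediction per distinct numeric value and falls back to the majority class for unseen values. The `(temperature, play)` pairs are assumed to match the standard weather.numeric.arff, and every function name is illustrative.

```python
from collections import Counter, defaultdict

# (temperature, play) pairs; assumed to match the standard weather.numeric.arff.
DATA = [(85, "no"), (80, "no"), (83, "yes"), (70, "yes"), (68, "yes"),
        (65, "no"), (64, "yes"), (72, "no"), (69, "yes"), (75, "yes"),
        (75, "yes"), (72, "yes"), (81, "yes"), (71, "no")]


def fit_lookup(rows):
    """Memorize the majority class for every distinct value (one bucket per value)."""
    by_value = defaultdict(Counter)
    for value, label in rows:
        by_value[value][label] += 1
    table = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    return table, default


def predict(model, value):
    table, default = model
    return table.get(value, default)  # unseen values fall back to the majority class


def training_correct(rows):
    """Count correct predictions when testing on the data used for training."""
    model = fit_lookup(rows)
    return sum(predict(model, v) == y for v, y in rows)


def leave_one_out_correct(rows):
    """Count correct predictions when each instance is held out in turn."""
    correct = 0
    for i, (v, y) in enumerate(rows):
        model = fit_lookup(rows[:i] + rows[i + 1:])
        correct += predict(model, v) == y
    return correct


train_hits = training_correct(DATA)
loo_hits = leave_one_out_correct(DATA)
print(f"training set:  {train_hits}/{len(DATA)} correct")
print(f"leave-one-out: {loo_hits}/{len(DATA)} correct")
```

Leave-one-out is used here instead of WEKA's 10-fold cross-validation to keep the sketch deterministic, so the held-out figure is not expected to equal the 35.71% reported above. The pattern is the same: the memorizing rule looks far better on the data it was built from than on instances it has not seen.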
Step 4: Using Probability Theorem
1. Opened the weather.nominal.arff data set and chose Naïve Bayes to classify the data with the default test option.
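As a rough illustration of what Naïve Bayes computes in Step 4, the sketch below multiplies a class prior by per-attribute conditional probabilities and normalizes the result. It assumes the standard nominal weather data and applies add-one (Laplace) smoothing to both priors and conditionals; that smoothing choice and all names (`train_naive_bayes`, `posterior`) are assumptions of this sketch rather than a description of WEKA's NaiveBayes class.

```python
from collections import Counter, defaultdict

# Standard nominal weather data (outlook, temperature, humidity, windy, play);
# assumed to match the file used in this step.
WEATHER = [
    ("sunny", "hot", "high", "FALSE", "no"),
    ("sunny", "hot", "high", "TRUE", "no"),
    ("overcast", "hot", "high", "FALSE", "yes"),
    ("rainy", "mild", "high", "FALSE", "yes"),
    ("rainy", "cool", "normal", "FALSE", "yes"),
    ("rainy", "cool", "normal", "TRUE", "no"),
    ("overcast", "cool", "normal", "TRUE", "yes"),
    ("sunny", "mild", "high", "FALSE", "no"),
    ("sunny", "cool", "normal", "FALSE", "yes"),
    ("rainy", "mild", "normal", "FALSE", "yes"),
    ("sunny", "mild", "normal", "TRUE", "yes"),
    ("overcast", "mild", "high", "TRUE", "yes"),
    ("overcast", "hot", "normal", "FALSE", "yes"),
    ("rainy", "mild", "high", "TRUE", "no"),
]


def train_naive_bayes(rows):
    """Collect class counts, per-class value counts, and each attribute's value set."""
    n_attrs = len(rows[0]) - 1
    class_counts = Counter(row[-1] for row in rows)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = [set() for _ in range(n_attrs)]
    for row in rows:
        for i in range(n_attrs):
            value_counts[(i, row[-1])][row[i]] += 1
            domains[i].add(row[i])
    return class_counts, value_counts, domains


def posterior(model, instance):
    """Return normalized class probabilities for one instance (add-one smoothing)."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = (count + 1) / (total + len(class_counts))  # smoothed prior
        for i, value in enumerate(instance):
            # Smoothed P(value | class), treating attributes as independent.
            score *= (value_counts[(i, label)][value] + 1) / (count + len(domains[i]))
        scores[label] = score
    norm = sum(scores.values())
    return {label: s / norm for label, s in scores.items()}


model = train_naive_bayes(WEATHER)
probs = posterior(model, ("sunny", "cool", "high", "TRUE"))
prediction = max(probs, key=probs.get)
print(prediction, {label: round(p, 3) for label, p in probs.items()})
```

The queried day (sunny, cool, high humidity, windy) does not appear in the data, which is the point of the method: it combines evidence from each attribute separately instead of looking for an exact match.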
Task 2: Understand the User Classifier and Different Test Options, and Explore Naïve Bayes and k-NN (Lazy IBk) in WEKA
Step 5: Be a Classifier
1. Installed userClassifier from the Package Manager in the Tools menu.
2. Opened segment-challenge.arff file in WEKA.
3. Selected the Classify panel and chose userClassifier from the classifier tree.
4. Clicked on the Set button next to Supplied Test Set and opened file segment-test.arff.
5. Clicked the Start button, which opened a window with two visualizers: Tree Visualizer and Data Visualizer.
6. Chose the region-centroid-row and intensity-mean attributes for the X and Y axes of the Data Visualizer.
7. Repeatedly selected regions with the same pattern using the “Rectangle” selection tool.
8. Checked the tree generated from the selected data in the Tree Visualizer.
9. Accepted the tree.
10. Checked the classifier output and its performance.
Step 6: Training and Testing
1. Chose the J48 classifier from the Classify panel, started again and checked accuracy.
2. Selected the test option as “Use Training Set”, started the test and checked accuracy.
3. Selected the “Percentage Split” option.
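The difference between the “Use Training Set” and “Percentage Split” test options can be sketched as follows. This is a toy illustration under stated assumptions: the 66% split and seed 1 are chosen to resemble WEKA's defaults, the labels are invented (9 "yes" and 5 "no"), and a majority-class predictor stands in for J48, which is not reimplemented here. The helper names are hypothetical.

```python
import random
from collections import Counter


def percentage_split(rows, train_percent=66, seed=1):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_percent / 100)
    return shuffled[:cut], shuffled[cut:]


def majority_class(labels):
    """Stand-in 'model': always predict the most frequent training label."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted_label, labels):
    """Fraction of labels that equal the single predicted label."""
    return sum(label == predicted_label for label in labels) / len(labels)


# Toy labels only: 9 "yes" and 5 "no".
LABELS = ["yes"] * 9 + ["no"] * 5

train, test = percentage_split(LABELS)
model = majority_class(train)
print(f"training-set accuracy: {accuracy(model, train):.2f}")
print(f"held-out accuracy:     {accuracy(model, test):.2f}")
```

Evaluating on `train` reuses the instances the model was built from, whereas evaluating on `test` uses instances the model never saw, which is why the second figure is the more trustworthy estimate.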