HW9 Solns PDF

Title HW9 Solns
Author jj oi
Course Statistical Methods For Data Mining
Institution Northwestern University
Pages 10
File Size 264.4 KB
File Type PDF

Summary

hw9...


Description

HW 9 Solutions

1) Revisit the Prob 1 worksheet of HW7_data.xls, which contains data from a prostate cancer study. The goal of the study was to understand the relationship between prostate-specific antigen (PSA, the response y) and a number of clinical measurement variables (the predictor variables) in men with advanced prostate cancer. Data for n = 97 subjects are included. There were seven predictor variables, but this data set contains only three: cancer volume (x1), prostate weight (x2), and capsular penetration (x3). For this problem, you will use the gbm package and function in R to fit a boosted tree to the prostate cancer data.

a) Use the built-in CV (via the gbm and gbm.perf functions together) to find the best number of trees to include in the final boosted tree. For consistency, use n.trees = 5000, shrinkage = 0.02, interaction.depth = 3, bag.fraction = 0.5, train.fraction = 1, n.minobsinnode = 3, cv.folds = 10. Repeat the gbm and gbm.perf functions a few times to see how much the results change from replicate to replicate, and discuss what you see. Averaging the results across the multiple replicates, roughly what is the best number of trees for the final boosted tree, and roughly what is the corresponding CV r^2? PRO...
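
The replicate-and-average procedure in part a) could be sketched in R roughly as follows. This is a minimal sketch under stated assumptions, not the posted solution: HW7_data.xls is not reproduced here, so the data frame `dat` is simulated placeholder data with n = 97 rows and hypothetical column names `y`, `x1`, `x2`, `x3`. The helper `fit_once`, the choice of three replicates, and the use of `1 - cv.error / var(y)` as the CV r^2 are also assumptions made for illustration.

```r
library(gbm)

# Placeholder data (assumption): simulated stand-in for the Prob 1 worksheet,
# with the same roles as the real columns (response y, predictors x1, x2, x3).
set.seed(1)
n <- 97
dat <- data.frame(x1 = rexp(n, 1 / 7),
                  x2 = rnorm(n, 45, 10),
                  x3 = rexp(n, 1 / 2))
dat$y <- 2 + 3 * dat$x1 + 1.5 * dat$x3 + rnorm(n, sd = 10)

# One replicate: fit the boosted tree with the settings from the problem
# statement, then let gbm.perf pick the number of trees that minimizes CV error.
fit_once <- function(data) {
  fit <- gbm(y ~ x1 + x2 + x3, data = data, distribution = "gaussian",
             n.trees = 5000, shrinkage = 0.02, interaction.depth = 3,
             bag.fraction = 0.5, train.fraction = 1, n.minobsinnode = 3,
             cv.folds = 10, verbose = FALSE)
  best <- gbm.perf(fit, method = "cv", plot.it = FALSE)
  # CV r^2 taken here as 1 - (CV squared-error loss at best) / var(y)
  c(best.trees = best, cv.r2 = 1 - fit$cv.error[best] / var(data$y))
}

# Repeat a few times; the CV fold assignment and bagging are random,
# so each replicate generally gives a different answer.
reps <- t(replicate(3, fit_once(dat)))
print(reps)            # one row per replicate
print(colMeans(reps))  # rough averages across replicates
```

With the real worksheet, `dat` would be replaced by the data read from HW7_data.xls. The spread between rows of `reps` gives a sense of the replicate-to-replicate variability the problem asks about.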

