9:00 am: Arrivals
9:15 am: Error Analysis and Tree/Forest Challenges
10:15 am: SVMs
11:00 am: Challenges + Work on McNulty
1:30 pm: Work on McNulty
w5d1_SVMs.pdf (2.5 MB)
A tutorial on SVMs
Another tutorial on SVMs
An Idiot's Guide to SVMs
How to tune SVM Parameters
Preprocessing data in sklearn
SVMs in sklearn
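As a starting point for the tuning and preprocessing links above, here is a minimal sketch of scaling features and grid-searching an RBF-kernel SVM in sklearn. The synthetic data, the parameter grid values, and the variable names are illustrative assumptions, not recommended settings. It uses the `sklearn.model_selection` module of recent sklearn releases.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): swap in your own features and labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# SVMs are sensitive to feature scale, so standardize inside the pipeline.
# That way the scaler is fit only on the training folds during the search.
pipeline = make_pipeline(StandardScaler(), SVC(kernel="rbf"))

# Illustrative grid (assumption): C controls regularization strength,
# gamma controls the width of the RBF kernel.
param_grid = {
    "svc__C": [0.1, 1, 10],
    "svc__gamma": [0.01, 0.1, 1],
}

search = GridSearchCV(pipeline, param_grid, cv=5)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)
print(best_params, test_accuracy)
```

Keeping the scaler inside the pipeline avoids leaking test-fold statistics into the cross-validated search.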
We will go back to the original Supervised Learning Challenges.
For the house representatives data set, calculate the accuracy, precision, recall and f1 scores of each classifier you built (on the test set).
For each, draw the ROC curve and calculate the AUC.
Calculate the same metrics as in Challenge 1, but this time in a cross-validation scheme with the cross_val_score function (as in Challenge 9).
For your movie classifiers, calculate the precision and recall for each class.
Draw the ROC curve (and calculate the AUC) for the logistic regression classifier from Challenge 12.
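The metrics in the challenges above can be sketched as follows. This is one possible approach, not the reference solution: the synthetic data and the logistic regression model are stand-ins (assumptions) for your own data sets and classifiers, and the imports use the `sklearn.model_selection` module of recent sklearn releases.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, classification_report, f1_score,
                             precision_score, recall_score, roc_auc_score,
                             roc_curve)
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in data (assumption): swap in your own features and labels.
X, y = make_classification(n_samples=400, n_features=16, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = clf.predict(X_test)               # hard class labels
y_score = clf.predict_proba(X_test)[:, 1]  # probability of the positive class

# Test-set metrics for one classifier.
metrics = {
    "accuracy": accuracy_score(y_test, y_pred),
    "precision": precision_score(y_test, y_pred),
    "recall": recall_score(y_test, y_pred),
    "f1": f1_score(y_test, y_pred),
    "auc": roc_auc_score(y_test, y_score),
}
print(metrics)

# Points of the ROC curve; plot tpr against fpr to draw it,
# e.g. with matplotlib's plt.plot(fpr, tpr).
fpr, tpr, thresholds = roc_curve(y_test, y_score)

# Precision and recall for each class in one table.
print(classification_report(y_test, y_pred))

# The same metrics under 5-fold cross-validation.
cv_scores = {
    name: cross_val_score(LogisticRegression(max_iter=1000), X, y,
                          cv=5, scoring=name).mean()
    for name in ("accuracy", "precision", "recall", "f1")
}
print(cv_scores)
```

Note that the ROC curve and AUC need scores or probabilities (`predict_proba` or `decision_function`), not the hard labels from `predict`.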
Note: If you already installed pydot but it's not working, uninstall it first:
pip uninstall pydot
Otherwise, you can start here:
pip uninstall pyparsing
pip install -Iv https://pypi.python.org/packages/source/p/pyparsing/pyparsing-1.5.7.tar.gz#md5=9be0fcdcc595199c646ab317c1d9a709
pip install pydot
brew install graphviz
Note: If you're trying to draw a tree and you get an error about not finding
Try the following and it should be fixed:
pip install pyparsing==1.5.7
Tree / Forest Challenges
For the house representatives data set, fit and plot a decision tree classifier
Fit and draw a decision tree classifier for your movie dataset
Tackle the Titanic Survivors Kaggle competition with decision trees. Look at your splits: how does your tree decide?
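A minimal sketch of fitting a decision tree and exporting it for drawing, assuming synthetic stand-in data and hypothetical feature and class names. It produces Graphviz DOT source as a string, which you can render with the graphviz `dot` tool or load with pydot.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# Synthetic stand-in data (assumption): swap in your own features and labels.
X, y = make_classification(n_samples=300, n_features=6, n_informative=3,
                           random_state=0)
# Hypothetical column names, used only to label the drawn nodes.
feature_names = ["feature_%d" % i for i in range(X.shape[1])]

# A shallow tree keeps the drawing readable.
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_source = export_graphviz(
    tree,
    out_file=None,
    feature_names=feature_names,
    class_names=["class_0", "class_1"],  # hypothetical labels
    filled=True,
)

# Inspect the first split directly: which feature and threshold the root uses.
root_feature = feature_names[tree.tree_.feature[0]]
root_threshold = tree.tree_.threshold[0]
print("root split: %s <= %.3f" % (root_feature, root_threshold))
```

To get an image, save `dot_source` to a file such as tree.dot and run `dot -Tpng tree.dot -o tree.png`.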