Chapter 5: Building Predictive Models Using Penalized Linear Methods

Python Packages for Penalized Linear Regression

Multivariable Regression: Predicting Wine Taste

Building and Testing a Model to Predict Wine Taste

CODE

  • Listing 5-1: Using Cross-Validation to Estimate Out-of-Sample Error with Lasso Modeling Wine Taste—wineLassoCV.py
    • Figure 5-1: ... un-normalized Y
    • Figure 5-2: ... normalized Y
    • Figure 5-3: ... un-normalized X and Y

Training on the Whole Data Set before Deployment

CODE

  • Listing 5-2: Lasso Training on Full Data Set—wineLassoCoefCurves.py
    • Figure 5-4: Coefficient curves for Lasso trained to predict wine quality
    • Figure 5-5: Coefficient curves for Lasso trained on un-normalized Xs

Basis Expansion: Improving Performance by Creating New Variables from Old Ones

CODE

  • Listing 5-3: Using Out-of-Sample Error to Evaluate New Attributes for Predicting Wine Quality—wineExpandedLassoCV.py
    • Figure 5-6: Cross-validation error curves for Lasso trained on wine quality data with expanded feature set

Binary Classification: Using Penalized Linear Regression to Detect Unexploded Mines

CODE

  • Listing 5-4: Using ElasticNet Regression to Build a Binary (Two-Class) Classifier—rocksVMinesENetRegCV.py
    • Figure 5-7: Out-of-sample classifier misclassification performance
    • Figure 5-8: Out-of-sample classifier AUC performance
    • Figure 5-9: Receiver operating characteristic for best performing classifier

Build a Rocks versus Mines Classifier for Deployment

CODE

  • Listing 5-5: Coefficient Trajectories for ElasticNet Trained on Rocks versus Mines Data—rocksVMinesCoefCurves.py
    • Figure 5-10: Coefficient curves for ElasticNet trained on rocks versus mines data
  • Listing 5-6: Penalized Logistic Regression Trained on Rocks versus Mines Data—rocksVMinesGlmnet.py
    • Figure 5-11: Coefficient curves for ElasticNet penalized logistic regression trained on rocks versus mines data

Multiclass Classification: Classifying Crime Scene Glass Samples

CODE

  • Listing 5-7: Multiclass Classification with Penalized Linear Regression - Classifying Crime Scene Glass Samples—glassENetRegCV.py
    • Figure 5-12: Misclassification error rates using penalized linear regression for glass classification