Machine Learning with caret in R
Max Kuhn - DataCamp
Course Description
Machine learning is the study and application of algorithms that learn
from and make predictions on data. From search results to self-driving
cars, it has manifested itself in all areas of our lives and is one of
the most exciting and fast growing fields of research in the world of
data science. This course teaches the big ideas in machine learning: how
to build and evaluate predictive models, how to tune them for optimal
performance, how to preprocess data for better results, and much more.
The popular caret
R package, which provides a consistent
interface to all of R’s most powerful machine learning facilities, is
used throughout the course.
1 Regression models: fitting them and evaluating their performance
In the first chapter of this course, you’ll fit regression models with train()
and evaluate their out-of-sample performance using cross-validation and root-mean-square error (RMSE).
1.1 Welcome to the Toolbox
1.1.1 In-sample RMSE for linear regression
RMSE is commonly calculated in-sample on your training set. What’s a potential drawback to calculating training set error?
- There’s no potential drawback to calculating training set error, but you should calculate R² instead of RMSE.
- You have no idea how well your model generalizes to new data (i.e. overfitting).
- You should manually inspect your model to validate its coefficients and calculate RMSE.
1.1.2 In-sample RMSE for linear regression on diamonds
As you saw in the video, included in the course is the diamonds
dataset, which is a classic dataset from the ggplot2
package. The dataset contains physical attributes of diamonds as well
as the price they sold for. One interesting modeling challenge is
predicting diamond price based on their attributes using something like a
linear regression.
Recall that to fit a linear regression, you use the lm()
function in the following format:
mod <- lm(y ~ x, my_data)
To make predictions using mod
on the original data, you call the predict()
function:
pred <- predict(mod, my_data)
Fit a linear model on the diamonds dataset predicting price using all other variables as predictors (i.e. price ~ .). Save the result to model.
library(tidyverse)
data("diamonds")
# Fit lm model: model
model <- lm(price ~ ., diamonds)
Make predictions using model on the full original dataset and save the result to p.
# Predict on full data: p
p <- predict(model, diamonds)
Compute the errors (predictions minus actual prices) and save the result to error.
# Compute errors: error
error <- p - diamonds[["price"]]
# Calculate RMSE
sqrt(mean(error ^ 2))
## [1] 1129.843
Great work! Now you know how to manually calculate RMSE for your model’s predictions!
1.2 Out-of-sample error measures
1.2.1 Out-of-sample RMSE for linear regression
What is the advantage of using a train/test split rather than just validating your model in-sample on the training set?
- It takes less time to calculate error on the test set, since it is smaller than the training set.
- There is no advantage to using a test set. You can just use adjusted R² on your training set.
- It gives you an estimate of how well your model performs on new data.
1.2.2 Randomly order the data frame
One way you can take a train/test split of a dataset is to order the dataset randomly, then divide it into the two sets. This ensures that the training set and test set are both random samples and that any biases in the ordering of the dataset (e.g. if it had originally been ordered by price or size) are not retained in the samples we take for training and testing your models. You can think of this like shuffling a brand new deck of playing cards before dealing hands.
First, you set a random seed so that your work is reproducible and you get the same random split each time you run your script:
set.seed(42)
Next, you use the sample()
function to shuffle the row indices of the diamonds
dataset. You can later use these indices to reorder the dataset.
rows <- sample(nrow(diamonds))
Finally, you can use this random vector to reorder the diamonds dataset:
diamonds <- diamonds[rows, ]
# Set seed
set.seed(42)
Use the sample() function to shuffle the row indices of the diamonds dataset. Save the result to rows.
# Shuffle row indices: rows
rows <- sample(nrow(diamonds))
Use this random vector to reorder the diamonds data frame, assigning to shuffled_diamonds.
# Randomly order data: shuffled_diamonds
shuffled_diamonds <- diamonds[rows, ]
Great job! Randomly ordering your dataset is important for many machine learning methods.
1.2.3 Try an 80/20 split
Now that your dataset is randomly ordered, you can split the first 80% of it into a training set, and the last 20% into a test set. You can do this by choosing a split point approximately 80% of the way through your data:
split <- round(nrow(mydata) * 0.80)
You can then use this point to break off the first 80% of the dataset as a training set:
mydata[1:split, ]
And then you can use that same point to determine the test set:
mydata[(split + 1):nrow(mydata), ]
Choose a row index to split on so that the split point is approximately 80% of the way through the diamonds dataset. Call this index split.
# Determine row to split on: split
split <- round(nrow(diamonds) * 0.80)
Create a training set called train using that index.
# Create train
train <- diamonds[1:split, ]
Create a test set called test using that index.
# Create test
test <- diamonds[(split + 1):nrow(diamonds), ]
Well done! Because you already randomly ordered your dataset, it’s easy to split off a random test set.
1.2.4 Predict on test set
Now that you have a randomly split training set and test set, you can use the lm()
function as you did in the first exercise to fit a model to your
training set, rather than the entire dataset. Recall that you can use
the formula interface to the linear regression function to fit a model
with a specified target variable using all other variables in the
dataset as predictors:
mod <- lm(y ~ ., training_data)
You can use the predict()
function to make predictions from
that model on new data. The new dataset must have all of the columns
from the training data, but they can be in a different order with
different values. Here, rather than re-predicting on the training set,
you can predict on the test set, which you did not use for training the
model. This will allow you to determine the out-of-sample error for the
model in the next exercise:
p <- predict(model, new_data)
Fit an lm() model called model to predict price using all other variables as covariates. Be sure to use the training set, train.
# Fit lm model on train: model
model <- lm(price ~ ., train)
Predict on the test set, test, using predict(). Store these values in a vector called p.
# Predict on test: p
p <- predict(model, test)
Excellent work! R makes it very easy to predict with a model on new data.
1.2.5 Calculate test set RMSE by hand
Now that you have predictions on the test set, you can use these predictions to calculate an error metric (in this case RMSE) on the test set and see how the model performs out-of-sample, rather than in-sample as you did in the first exercise. You first do this by calculating the errors between the predicted diamond prices and the actual diamond prices by subtracting the predictions from the actual values.
Once you have an error vector, calculating RMSE is as simple as squaring it, taking the mean, then taking the square root:
sqrt(mean(error^2))
test, model, and p are loaded in your workspace.
Compute the errors (predictions minus actual prices) and save the result to error.
# Compute errors: error
error <- p - test[["price"]]
# Calculate RMSE
sqrt(mean(error^2))
## [1] 796.8922
Good Job! Calculating RMSE on a test set is exactly the same as calculating it on a training set.
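As an aside, once caret is loaded it also provides helpers for this calculation; a minimal sketch, assuming the p and test objects from the exercise above are still in your workspace:
# Same RMSE via caret helpers (equivalent to sqrt(mean(error^2)))
library(caret)
RMSE(p, test[["price"]])
# postResample() returns RMSE, Rsquared, and MAE in one call
postResample(pred = p, obs = test[["price"]])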
1.2.6 Comparing out-of-sample RMSE to in-sample RMSE
Why is the test set RMSE higher than the training set RMSE?
- Because you overfit the training set and the test set contains data the model hasn’t seen before.
- Because you should not use a test set at all and instead just look at error on the training set.
- Because the test set has a smaller sample size than the training set and thus the mean error is lower.
1.3 Cross-validation
1.3.1 Advantage of cross-validation
What is the advantage of cross-validation over a single train/test split?
- There is no advantage to cross-validation, just as there is no advantage to a single train/test split. You should be validating your models in-sample with a metric like adjusted R².
- You can pick the best test set to minimize the reported RMSE of your model.
- It gives you multiple estimates of out-of-sample error, rather than a single estimate.
1.3.2 10-fold cross-validation
As you saw in the video, a better approach to validating models is to
use multiple systematic test sets, rather than a single random
train/test split. Fortunately, the caret
package makes this very easy to do:
model <- train(y ~ ., my_data)
caret
supports many types of cross-validation, and you can
specify which type of cross-validation and the number of
cross-validation folds with the trainControl()
function, which you pass to the trControl
argument in train()
:
model <- train(
y ~ .,
my_data,
method = "lm",
trControl = trainControl(
method = "cv",
number = 10,
verboseIter = TRUE
)
)
It’s important to note that you pass the method for modeling to the main train()
function and the method for cross-validation to the trainControl()
function.
Fit a linear regression to model price using all other variables in the diamonds dataset as predictors. Use the train() function and 10-fold cross-validation. (Note that we’ve taken a subset of the full diamonds dataset to speed up this operation, but it’s still named diamonds.)
library(caret)
# Fit lm model using 10-fold CV: model
model <- train(
price ~ .,
diamonds,
method = "lm",
trControl = trainControl(
method = "cv",
number = 10,
verboseIter = TRUE
)
)
## + Fold01: intercept=TRUE
## - Fold01: intercept=TRUE
## + Fold02: intercept=TRUE
## - Fold02: intercept=TRUE
## + Fold03: intercept=TRUE
## - Fold03: intercept=TRUE
## + Fold04: intercept=TRUE
## - Fold04: intercept=TRUE
## + Fold05: intercept=TRUE
## - Fold05: intercept=TRUE
## + Fold06: intercept=TRUE
## - Fold06: intercept=TRUE
## + Fold07: intercept=TRUE
## - Fold07: intercept=TRUE
## + Fold08: intercept=TRUE
## - Fold08: intercept=TRUE
## + Fold09: intercept=TRUE
## - Fold09: intercept=TRUE
## + Fold10: intercept=TRUE
## - Fold10: intercept=TRUE
## Aggregating results
## Fitting final model on full training set
# Print model to console
model
## Linear Regression
##
## 53940 samples
## 9 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 48547, 48546, 48546, 48547, 48545, 48547, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 1131.015 0.9196398 740.6117
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Good job! Caret does all the work of splitting test sets and calculating RMSE for you!
1.3.3 5-fold cross-validation
In this course, you will use a wide variety of datasets to explore the full flexibility of the caret
package. Here, you will use the famous Boston housing dataset, where
the goal is to predict median home values in various Boston suburbs.
You can use exactly the same code as in the previous exercise, but change the dataset used by the model:
model <- train(
medv ~ .,
Boston, # <- new!
method = "lm",
trControl = trainControl(
method = "cv",
number = 10,
verboseIter = TRUE
)
)
Next, you can reduce the number of cross-validation folds from 10 to 5 using the number
argument to the trainControl()
argument:
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE
)
Fit an lm() model to the Boston housing dataset, such that medv is the response variable and all other variables are explanatory variables.
library(mlbench)
data(BostonHousing)
Boston <- BostonHousing
# Fit lm model using 5-fold CV: model
model <- train(
medv ~ .,
Boston,
method = "lm",
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE
)
)
## + Fold1: intercept=TRUE
## - Fold1: intercept=TRUE
## + Fold2: intercept=TRUE
## - Fold2: intercept=TRUE
## + Fold3: intercept=TRUE
## - Fold3: intercept=TRUE
## + Fold4: intercept=TRUE
## - Fold4: intercept=TRUE
## + Fold5: intercept=TRUE
## - Fold5: intercept=TRUE
## Aggregating results
## Fitting final model on full training set
# Print model to console
model
## Linear Regression
##
## 506 samples
## 13 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 405, 405, 406, 403, 405
## Resampling results:
##
## RMSE Rsquared MAE
## 4.860247 0.7209221 3.398114
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Great work! Caret makes it easy to try different validation schemes with the same model and compare RMSE.
1.3.4 5 x 5-fold cross-validation
You can do more than just one iteration of cross-validation: repeating the entire cross-validation procedure gives you a better estimate of the test-set error. This takes longer, but gives you many more out-of-sample datasets to look at and much more precise assessments of how well the model performs.
One of the awesome things about the train()
function in caret
is how easy it is to run very different models or methods of
cross-validation just by tweaking a few simple arguments to the function
call. For example, you could repeat your entire cross-validation
procedure 5 times for greater confidence in your estimates of the
model’s out-of-sample accuracy, e.g.:
trControl = trainControl(
method = "repeatedcv",
number = 5,
repeats = 5,
verboseIter = TRUE
)
Re-fit the linear regression model to the Boston housing dataset, this time using 5 repeats of 5-fold cross-validation.
# Fit lm model using 5 x 5-fold CV: model
model <- train(
medv ~ .,
Boston,
method = "lm",
trControl = trainControl(
method = "repeatedcv",
number = 5,
repeats = 5,
verboseIter = TRUE
)
)
## + Fold1.Rep1: intercept=TRUE
## - Fold1.Rep1: intercept=TRUE
## + Fold2.Rep1: intercept=TRUE
## - Fold2.Rep1: intercept=TRUE
## + Fold3.Rep1: intercept=TRUE
## - Fold3.Rep1: intercept=TRUE
## + Fold4.Rep1: intercept=TRUE
## - Fold4.Rep1: intercept=TRUE
## + Fold5.Rep1: intercept=TRUE
## - Fold5.Rep1: intercept=TRUE
## + Fold1.Rep2: intercept=TRUE
## - Fold1.Rep2: intercept=TRUE
## + Fold2.Rep2: intercept=TRUE
## - Fold2.Rep2: intercept=TRUE
## + Fold3.Rep2: intercept=TRUE
## - Fold3.Rep2: intercept=TRUE
## + Fold4.Rep2: intercept=TRUE
## - Fold4.Rep2: intercept=TRUE
## + Fold5.Rep2: intercept=TRUE
## - Fold5.Rep2: intercept=TRUE
## + Fold1.Rep3: intercept=TRUE
## - Fold1.Rep3: intercept=TRUE
## + Fold2.Rep3: intercept=TRUE
## - Fold2.Rep3: intercept=TRUE
## + Fold3.Rep3: intercept=TRUE
## - Fold3.Rep3: intercept=TRUE
## + Fold4.Rep3: intercept=TRUE
## - Fold4.Rep3: intercept=TRUE
## + Fold5.Rep3: intercept=TRUE
## - Fold5.Rep3: intercept=TRUE
## + Fold1.Rep4: intercept=TRUE
## - Fold1.Rep4: intercept=TRUE
## + Fold2.Rep4: intercept=TRUE
## - Fold2.Rep4: intercept=TRUE
## + Fold3.Rep4: intercept=TRUE
## - Fold3.Rep4: intercept=TRUE
## + Fold4.Rep4: intercept=TRUE
## - Fold4.Rep4: intercept=TRUE
## + Fold5.Rep4: intercept=TRUE
## - Fold5.Rep4: intercept=TRUE
## + Fold1.Rep5: intercept=TRUE
## - Fold1.Rep5: intercept=TRUE
## + Fold2.Rep5: intercept=TRUE
## - Fold2.Rep5: intercept=TRUE
## + Fold3.Rep5: intercept=TRUE
## - Fold3.Rep5: intercept=TRUE
## + Fold4.Rep5: intercept=TRUE
## - Fold4.Rep5: intercept=TRUE
## + Fold5.Rep5: intercept=TRUE
## - Fold5.Rep5: intercept=TRUE
## Aggregating results
## Fitting final model on full training set
# Print model to console
model
## Linear Regression
##
## 506 samples
## 13 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold, repeated 5 times)
## Summary of sample sizes: 405, 406, 405, 403, 405, 405, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 4.845724 0.7277269 3.402735
##
## Tuning parameter 'intercept' was held constant at a value of TRUE
Fantastic work! You can use caret to do some very complicated cross-validation schemes.
1.3.5 Making predictions on new data
Finally, the model you fit with the train()
function has the exact same predict()
interface as the linear regression models you fit earlier in this chapter.
After fitting a model with train()
, you can simply call predict()
with new data, e.g:
predict(my_model, new_data)
Use the predict()
function to make predictions with model
on the full Boston
housing dataset. Print the result to the console.
# Predict on full Boston dataset
predict(model, Boston)
## 1 2 3 4 5 6 7
## 30.0038434 25.0255624 30.5675967 28.6070365 27.9435242 25.2562845 23.0018083
## 8 9 10 11 12 13 14
## 19.5359884 11.5236369 18.9202621 18.9994965 21.5867957 20.9065215 19.5529028
## 15 16 17 18 19 20 21
## 19.2834821 19.2974832 20.5275098 16.9114013 16.1780111 18.4061360 12.5238575
## 22 23 24 25 26 27 28
## 17.6710367 15.8328813 13.8062853 15.6783383 13.3866856 15.4639765 14.7084743
## 29 30 31 32 33 34 35
## 19.5473729 20.8764282 11.4551176 18.0592329 8.8110574 14.2827581 13.7067589
## 36 37 38 39 40 41 42
## 23.8146353 22.3419371 23.1089114 22.9150261 31.3576257 34.2151023 28.0205641
## 43 44 45 46 47 48 49
## 25.2038663 24.6097927 22.9414918 22.0966982 20.4232003 18.0365509 9.1065538
## 50 51 52 53 54 55 56
## 17.2060775 21.2815254 23.9722228 27.6558508 24.0490181 15.3618477 31.1526495
## 57 58 59 60 61 62 63
## 24.8568698 33.1091981 21.7753799 21.0849356 17.8725804 18.5111021 23.9874286
## 64 65 66 67 68 69 70
## 22.5540887 23.3730864 30.3614836 25.5305651 21.1133856 17.4215379 20.7848363
## 71 72 73 74 75 76 77
## 25.2014886 21.7426577 24.5574496 24.0429571 25.5049972 23.9669302 22.9454540
## 78 79 80 81 82 83 84
## 23.3569982 21.2619827 22.4281737 28.4057697 26.9948609 26.0357630 25.0587348
## 85 86 87 88 89 90 91
## 24.7845667 27.7904920 22.1685342 25.8927642 30.6746183 30.8311062 27.1190194
## 92 93 94 95 96 97 98
## 27.4126673 28.9412276 29.0810555 27.0397736 28.6245995 24.7274498 35.7815952
## 99 100 101 102 103 104 105
## 35.1145459 32.2510280 24.5802202 25.5941347 19.7901368 20.3116713 21.4348259
## 106 107 108 109 110 111 112
## 18.5399401 17.1875599 20.7504903 22.6482911 19.7720367 20.6496586 26.5258674
## 113 114 115 116 117 118 119
## 20.7732364 20.7154831 25.1720888 20.4302559 23.3772463 23.6904326 20.3357836
## 120 121 122 123 124 125 126
## 20.7918087 21.9163207 22.4710778 20.5573856 16.3666198 20.5609982 22.4817845
## 127 128 129 130 131 132 133
## 14.6170663 15.1787668 18.9386859 14.0557329 20.0352740 19.4101340 20.0619157
## 134 135 136 137 138 139 140
## 15.7580767 13.2564524 17.2627773 15.8784188 19.3616395 13.8148390 16.4488147
## 141 142 143 144 145 146 147
## 13.5714193 3.9888551 14.5949548 12.1488148 8.7282236 12.0358534 15.8208206
## 148 149 150 151 152 153 154
## 8.5149902 9.7184414 14.8045137 20.8385815 18.3010117 20.1228256 17.2860189
## 155 156 157 158 159 160 161
## 22.3660023 20.1037592 13.6212589 33.2598270 29.0301727 25.5675277 32.7082767
## 162 163 164 165 166 167 168
## 36.7746701 40.5576584 41.8472817 24.7886738 25.3788924 37.2034745 23.0874875
## 169 170 171 172 173 174 175
## 26.4027396 26.6538211 22.5551466 24.2908281 22.9765722 29.0719431 26.5219434
## 176 177 178 179 180 181 182
## 30.7220906 25.6166931 29.1374098 31.4357197 32.9223157 34.7244046 27.7655211
## 183 184 185 186 187 188 189
## 33.8878732 30.9923804 22.7182001 24.7664781 35.8849723 33.4247672 32.4119915
## 190 191 192 193 194 195 196
## 34.5150995 30.7610949 30.2893414 32.9191871 32.1126077 31.5587100 40.8455572
## 197 198 199 200 201 202 203
## 36.1277008 32.6692081 34.7046912 30.0934516 30.6439391 29.2871950 37.0714839
## 204 205 206 207 208 209 210
## 42.0319312 43.1894984 22.6903480 23.6828471 17.8544721 23.4942899 17.0058772
## 211 212 213 214 215 216 217
## 22.3925110 17.0604275 22.7389292 25.2194255 11.1191674 24.5104915 26.6033477
## 218 219 220 221 222 223 224
## 28.3551871 24.9152546 29.6865277 33.1841975 23.7745666 32.1405196 29.7458199
## 225 226 227 228 229 230 231
## 38.3710245 39.8146187 37.5860575 32.3995325 35.4566524 31.2341151 24.4844923
## 232 233 234 235 236 237 238
## 33.2883729 38.0481048 37.1632863 31.7138352 25.2670557 30.1001074 32.7198716
## 239 240 241 242 243 244 245
## 28.4271706 28.4294068 27.2937594 23.7426248 24.1200789 27.4020841 16.3285756
## 246 247 248 249 250 251 252
## 13.3989126 20.0163878 19.8618443 21.2883131 24.0798915 24.2063355 25.0421582
## 253 254 255 256 257 258 259
## 24.9196401 29.9456337 23.9722832 21.6958089 37.5110924 43.3023904 36.4836142
## 260 261 262 263 264 265 266
## 34.9898859 34.8121151 37.1663133 40.9892850 34.4463409 35.8339755 28.2457430
## 267 268 269 270 271 272 273
## 31.2267359 40.8395575 39.3179239 25.7081791 22.3029553 27.2034097 28.5116947
## 274 275 276 277 278 279 280
## 35.4767660 36.1063916 33.7966827 35.6108586 34.8399338 30.3519266 35.3098070
## 281 282 283 284 285 286 287
## 38.7975697 34.3312319 40.3396307 44.6730834 31.5968909 27.3565923 20.1017415
## 288 289 290 291 292 293 294
## 27.0420667 27.2136458 26.9139584 33.4356331 34.4034963 31.8333982 25.8178324
## 295 296 297 298 299 300 301
## 24.4298235 28.4576434 27.3626700 19.5392876 29.1130984 31.9105461 30.7715945
## 302 303 304 305 306 307 308
## 28.9427587 28.8819102 32.7988723 33.2090546 30.7683179 35.5622686 32.7090512
## 309 310 311 312 313 314 315
## 28.6424424 23.5896583 18.5426690 26.8788984 23.2813398 25.5458025 25.4812006
## 316 317 318 319 320 321 322
## 20.5390990 17.6157257 18.3758169 24.2907028 21.3252904 24.8868224 24.8693728
## 323 324 325 326 327 328 329
## 22.8695245 19.4512379 25.1178340 24.6678691 23.6807618 19.3408962 21.1741811
## 330 331 332 333 334 335 336
## 24.2524907 21.5926089 19.9844661 23.3388800 22.1406069 21.5550993 20.6187291
## 337 338 339 340 341 342 343
## 20.1609718 19.2849039 22.1667232 21.2496577 21.4293931 30.3278880 22.0473498
## 344 345 346 347 348 349 350
## 27.7064791 28.5479412 16.5450112 14.7835964 25.2738008 27.5420512 22.1483756
## 351 352 353 354 355 356 357
## 20.4594409 20.5460542 16.8806383 25.4025351 14.3248663 16.5948846 19.6370469
## 358 359 360 361 362 363 364
## 22.7180661 22.2021889 19.2054806 22.6661611 18.9319262 18.2284680 20.2315081
## 365 366 367 368 369 370 371
## 37.4944739 14.2819073 15.5428625 10.8316232 23.8007290 32.6440736 34.6068404
## 372 373 374 375 376 377 378
## 24.9433133 25.9998091 6.1263250 0.7777981 25.3071306 17.7406106 20.2327441
## 379 380 381 382 383 384 385
## 15.8333130 16.8351259 14.3699483 18.4768283 13.4276828 13.0617751 3.2791812
## 386 387 388 389 390 391 392
## 8.0602217 6.1284220 5.6186481 6.4519857 14.2076474 17.2122518 17.2988727
## 393 394 395 396 397 398 399
## 9.8911664 20.2212419 17.9418118 20.3044578 19.2955908 16.3363278 6.5516232
## 400 401 402 403 404 405 406
## 10.8901678 11.8814587 17.8117451 18.2612659 12.9794878 7.3781636 8.2111586
## 407 408 409 410 411 412 413
## 8.0662619 19.9829479 13.7075637 19.8526845 15.2230830 16.9607198 1.7185181
## 414 415 416 417 418 419 420
## 11.8057839 -4.2813107 9.5837674 13.3666081 6.8956236 6.1477985 14.6066179
## 421 422 423 424 425 426 427
## 19.6000267 18.1242748 18.5217713 13.1752861 14.6261762 9.9237498 16.3459065
## 428 429 430 431 432 433 434
## 14.0751943 14.2575624 13.0423479 18.1595569 18.6955435 21.5272830 17.0314186
## 435 436 437 438 439 440 441
## 15.9609044 13.3614161 14.5207938 8.8197601 4.8675110 13.0659131 12.7060970
## 442 443 444 445 446 447 448
## 17.2955806 18.7404850 18.0590103 11.5147468 11.9740036 17.6834462 18.1269524
## 449 450 451 452 453 454 455
## 17.5183465 17.2274251 16.5227163 19.4129110 18.5821524 22.4894479 15.2800013
## 456 457 458 459 460 461 462
## 15.8208934 12.6872558 12.8763379 17.1866853 18.5124761 19.0486053 20.1720893
## 463 464 465 466 467 468 469
## 19.7740732 22.4294077 20.3191185 17.8861625 14.3747852 16.9477685 16.9840576
## 470 471 472 473 474 475 476
## 18.5883840 20.1671944 22.9771803 22.4558073 25.5782463 16.3914763 16.1114628
## 477 478 479 480 481 482 483
## 20.5348160 11.5427274 19.2049630 21.8627639 23.4687887 27.0988732 28.5699430
## 484 485 486 487 488 489 490
## 21.0839878 19.4551620 22.2222591 19.6559196 21.3253610 11.8558372 8.2238669
## 491 492 493 494 495 496 497
## 3.6639967 13.7590854 15.9311855 20.6266205 20.6124941 16.8854196 14.0132079
## 498 499 500 501 502 503 504
## 19.1085414 21.2980517 18.4549884 20.4687085 23.5333405 22.3757189 27.6274261
## 505 506
## 26.1279668 22.3442123
Awesome job! Predicting with a caret model is as easy as predicting with a regular model!
2 Classification models: fitting them and evaluating their performance
In this chapter, you’ll fit classification models with train()
and evaluate their out-of-sample performance using cross-validation and area under the curve (AUC).
2.1 Logistic regression on sonar
2.1.1 Why a train/test split?
What is the point of making a train/test split for binary classification problems?
- To make the problem harder for the model by reducing the dataset size.
- To evaluate your models out-of-sample, on new data.
- To reduce the dataset size, so your models fit faster.
- There is no real reason; it is no different than evaluating your models in-sample.
2.1.2 Try a 60/40 split
As you saw in the video, you’ll be working with the Sonar
dataset in this chapter, using a 60% training set and a 40% test set.
We’ll practice making a train/test split one more time, just to be sure
you have the hang of it. Recall that you can use the sample()
function to get a random permutation of the row indices in a dataset, to use when making train/test splits, e.g.:
n_obs <- nrow(my_data)
permuted_rows <- sample(n_obs)
And then use those row indices to randomly reorder the dataset, e.g.:
my_data <- my_data[permuted_rows, ]
Once your dataset is randomly ordered, you can split off the first 60% as a training set and the last 40% as a test set.
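As an aside (the exercise below sticks to the manual approach), caret also offers createDataPartition() for making stratified splits; a minimal sketch, using train_alt and test_alt as hypothetical names so they don’t clash with the exercise objects:
# Optional alternative: a stratified 60/40 split with caret's createDataPartition()
library(caret)
set.seed(42)
in_train <- createDataPartition(Sonar$Class, p = 0.6, list = FALSE)
train_alt <- Sonar[in_train, ]   # ~60% of rows, class balance preserved
test_alt <- Sonar[-in_train, ]   # remaining ~40%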
Get the number of observations (rows) in Sonar, assigning to n_obs.
data(Sonar)
# Get the number of observations
n_obs <- nrow(Sonar)
Shuffle the row indices of Sonar and store the result in permuted_rows.
# Shuffle row indices: permuted_rows
permuted_rows <- sample(n_obs)
Use permuted_rows to randomly reorder the rows of Sonar, saving as Sonar_shuffled.
# Randomly order data: Sonar_shuffled
Sonar_shuffled <- Sonar[permuted_rows, ]
Identify the proper row to split on for a 60/40 split, and store this index as split.
# Identify row to split on: split
split <- round(n_obs * 0.6)
Save the first 60% of Sonar_shuffled as a training set, train.
# Create train
train <- Sonar_shuffled[1:split, ]
Save the last 40% of Sonar_shuffled as the test set, test.
# Create test
test <- Sonar_shuffled[(split + 1):n_obs, ]
Excellent work! Randomly shuffling your data makes it easy to manually create a train/test split.
2.1.3 Fit a logistic regression model
Once you have your random training and test sets you can fit a logistic regression model to your training set using the glm()
function. glm()
is a more advanced version of lm()
that allows for more varied types of regression models, aside from plain vanilla ordinary least squares regression.
Be sure to pass the argument family = “binomial”
to glm()
to specify that you want to do logistic (rather than linear) regression. For example:
glm(Target ~ ., family = "binomial", dataset)
Don’t worry about warnings like glm.fit: algorithm did not converge
or glm.fit: fitted probabilities numerically 0 or 1 occurred
. These are common on smaller datasets and usually don’t cause any issues. They typically mean your dataset is perfectly separable, which can cause problems for the math behind the model, but R’s glm()
function is almost always robust enough to handle this case with no problems.
Once you have a glm()
model fit to your dataset, you can predict the outcome (e.g. rock or mine) on the test
set using the predict()
function with the argument type = “response”
:
predict(my_model, test, type = "response")
Fit a logistic regression called model to predict Class using all other variables as predictors. Use the training set for Sonar.
# Fit glm model: model
model <- glm(Class ~ ., family = "binomial", train)
Predict on the test set using that model. Call the result p, like you’ve done before.
# Predict on test: p
p <- predict(model, test, type = "response")
Great work! Manually fitting a glm model in R is very similar to fitting an lm model.
2.2 Confusion matrix
2.2.1 Confusion matrix takeaways
What information does a confusion matrix provide?
- True positive rates
- True negative rates
- False positive rates
- False negative rates
- All of the above
2.2.2 Calculate a confusion matrix
As you saw in the video, a confusion matrix is a very useful tool for calibrating the output of a model and examining all possible outcomes of your predictions (true positive, true negative, false positive, false negative).
Before you make your confusion matrix, you need to “cut” your predicted
probabilities at a given threshold to turn probabilities into a factor
of class predictions. Combine ifelse()
with factor()
as follows:
pos_or_neg <- ifelse(probability_prediction > threshold, positive_class, negative_class)
p_class <- factor(pos_or_neg, levels = levels(test_values))
confusionMatrix()
in caret
improves on table()
from base R by adding lots of useful ancillary statistics in addition
to the base rates in the table. You can calculate the confusion matrix
(and the associated statistics) using the predicted outcomes as well as
the actual outcomes, e.g.:
confusionMatrix(p_class, test_values)
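For comparison, the raw counts alone can be produced with base R’s table(); a minimal sketch, assuming the p_class factor and test set created in the exercise below:
# Base R cross-tabulation: counts only, no ancillary statistics
table(predicted = p_class, actual = test[["Class"]])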
Use ifelse() to create a character vector, m_or_r, that is the positive class, “M”, when p is greater than 0.5, and the negative class, “R”, otherwise.
# If p exceeds threshold of 0.5, M else R: m_or_r
m_or_r <- ifelse(p > 0.5, "M", "R")
Convert m_or_r to be a factor, p_class, with levels the same as those of test[[“Class”]].
# Convert to factor: p_class
p_class <- factor(m_or_r, levels = levels(test[["Class"]]))
Make a confusion matrix with confusionMatrix(), passing p_class and the “Class” column from the test dataset.
# Create confusion matrix
confusionMatrix(p_class, test[["Class"]])
## Confusion Matrix and Statistics
##
## Reference
## Prediction M R
## M 7 31
## R 32 13
##
## Accuracy : 0.241
## 95% CI : (0.1538, 0.3473)
## No Information Rate : 0.5301
## P-Value [Acc > NIR] : 1
##
## Kappa : -0.5258
##
## Mcnemar's Test P-Value : 1
##
## Sensitivity : 0.17949
## Specificity : 0.29545
## Pos Pred Value : 0.18421
## Neg Pred Value : 0.28889
## Prevalence : 0.46988
## Detection Rate : 0.08434
## Detection Prevalence : 0.45783
## Balanced Accuracy : 0.23747
##
## 'Positive' Class : M
##
Great work! The confusionMatrix function is a very easy way to get a detailed summary of your model’s accuracy.
2.2.3 Calculating accuracy
Use confusionMatrix(p_class, test[[“Class”]])
to calculate a confusion matrix on the test set.
What is the test set accuracy of this model (rounded to the nearest percent)?
- 58%
- 83%
- 70%
- 51%
Nice one! This is the model’s accuracy.
2.2.4 Calculating true positive rate
Use confusionMatrix(p_class, test[[“Class”]])
to calculate a confusion matrix on the test set.
What is the test set true positive rate (or sensitivity) of this model (rounded to the nearest percent)?
- 58%
- 83%
- 70%
- 51%
Nice one!
2.2.5 Calculating true negative rate
Use confusionMatrix(p_class, test[[“Class”]])
to calculate a confusion matrix on the test set.
What is the test set true negative rate (or specificity) of this model (rounded to the nearest percent)?
- 58%
- 83%
- 70%
- 51%
Good job!
2.3 Class probabilities and predictions
2.3.1 Probabilities and classes
What’s the relationship between the predicted probabilities and the predicted classes?
- You determine the predicted probabilities by looking at the average accuracy of the predicted classes.
- There is no relationship; they’re completely different things.
- Predicted classes are based off of predicted probabilities plus a classification threshold.
2.3.2 Try another threshold
In the previous exercises, you used a threshold of 0.50 to cut your predicted probabilities to make class predictions (rock vs mine). However, this classification threshold does not always align with the goals for a given modeling problem.
For example, pretend you want to identify the objects you are really certain are mines. In this case, you might want to use a probability threshold of 0.90 to get fewer predicted mines, but with greater confidence in each prediction.
The code pattern for cutting probabilities into predicted classes, then calculating a confusion matrix, was shown in section 2.2.2 above.
Use ifelse() to create a character vector, m_or_r, that is the positive class, “M”, when p is greater than 0.9, and the negative class, “R”, otherwise.
# If p exceeds threshold of 0.9, M else R: m_or_r
m_or_r <- ifelse(p > 0.9, "M", "R")
Convert m_or_r to be a factor, p_class, with levels the same as those of test[[“Class”]].
# Convert to factor: p_class
p_class <- factor(m_or_r, levels = levels(test[["Class"]]))
Make a confusion matrix with confusionMatrix(), passing p_class and the “Class” column from the test dataset.
# Create confusion matrix
confusionMatrix(p_class, test[["Class"]])
## Confusion Matrix and Statistics
##
## Reference
## Prediction M R
## M 7 31
## R 32 13
##
## Accuracy : 0.241
## 95% CI : (0.1538, 0.3473)
## No Information Rate : 0.5301
## P-Value [Acc > NIR] : 1
##
## Kappa : -0.5258
##
## Mcnemar's Test P-Value : 1
##
## Sensitivity : 0.17949
## Specificity : 0.29545
## Pos Pred Value : 0.18421
## Neg Pred Value : 0.28889
## Prevalence : 0.46988
## Detection Rate : 0.08434
## Detection Prevalence : 0.45783
## Balanced Accuracy : 0.23747
##
## 'Positive' Class : M
##
Amazing! A higher threshold normally yields fewer predicted mines, each held with greater confidence. In this run the confusion matrix is unchanged from the 0.50 threshold, likely because the model’s predicted probabilities sit very close to 0 or 1 (recall the glm warnings about fitted probabilities numerically 0 or 1).
2.3.3 From probabilities to confusion matrix
Conversely, say you want to be really certain that your model correctly identifies all the mines as mines. In this case, you might use a prediction threshold of 0.10, instead of 0.90.
The code pattern for cutting probabilities into predicted classes, then calculating a confusion matrix, was shown in section 2.2.2 above.
Use ifelse() to create a character vector, m_or_r, that is the positive class, “M”, when p is greater than 0.1, and the negative class, “R”, otherwise.
# If p exceeds threshold of 0.1, M else R: m_or_r
m_or_r <- ifelse(p > 0.1, "M", "R")
Convert m_or_r to be a factor, p_class, with levels the same as those of test[[“Class”]].
# Convert to factor: p_class
p_class <- factor(m_or_r, levels = levels(test[["Class"]]))
Make a confusion matrix with confusionMatrix(), passing p_class and the “Class” column from the test dataset.
# Create confusion matrix
confusionMatrix(p_class, test[["Class"]])
## Confusion Matrix and Statistics
##
## Reference
## Prediction M R
## M 7 31
## R 32 13
##
## Accuracy : 0.241
## 95% CI : (0.1538, 0.3473)
## No Information Rate : 0.5301
## P-Value [Acc > NIR] : 1
##
## Kappa : -0.5258
##
## Mcnemar's Test P-Value : 1
##
## Sensitivity : 0.17949
## Specificity : 0.29545
## Pos Pred Value : 0.18421
## Neg Pred Value : 0.28889
## Prevalence : 0.46988
## Detection Rate : 0.08434
## Detection Prevalence : 0.45783
## Balanced Accuracy : 0.23747
##
## 'Positive' Class : M
##
Awesome! A lower threshold normally yields more predicted mines, catching more of the true mines at the cost of more false alarms. As with the 0.90 threshold, the confusion matrix here is unchanged from the 0.50 threshold, likely because the predicted probabilities sit very close to 0 or 1.
2.4 Introducing the ROC curve
2.4.1 What’s the value of a ROC curve?
What is the primary value of an ROC curve?
- It has a cool acronym.
- It can be used to determine the true positive and false positive rates for a particular classification threshold.
- It evaluates all possible thresholds for splitting predicted probabilities into predicted classes.
2.4.2 Plot an ROC curve
As you saw in the video, an ROC curve is a really useful shortcut for summarizing the performance of a classifier over all possible thresholds. This saves you a lot of tedious work computing class predictions for many different thresholds and examining the confusion matrix for each.
My favorite package for computing ROC curves is caTools
, which contains a function called colAUC()
.
This function is very user-friendly and can actually calculate ROC
curves for multiple predictors at once. In this case, you only need to
calculate the ROC curve for one predictor, e.g.:
colAUC(predicted_probabilities, actual, plotROC = TRUE)
The function will return a score called AUC (more on that later) and the plotROC = TRUE
argument will return the plot of the ROC curve for visual inspection.
model, test, and train from the last exercise using the sonar data are loaded in your workspace.
Predict probabilities (i.e. type = “response”) on the test set, then store the result as p.
# Predict on test: p
p <- predict(model, test, type = "response")
library(caTools)
# Make ROC curve
colAUC(p, test[["Class"]], plotROC = TRUE)
## [,1]
## M vs. R 0.8161422
Great work! The colAUC function makes plotting an ROC curve as easy as calculating a confusion matrix.
2.5 Area under the curve (AUC)
2.5.1 Model, ROC, and AUC
What is the AUC of a perfect model?
- 0.00
- 0.50
- 1.00
2.5.2 Customizing trainControl
As you saw in the video, area under the ROC curve is a very useful, single-number summary of a model’s ability to discriminate the positive from the negative class (e.g. mines from rocks). An AUC of 0.5 is no better than random guessing, an AUC of 1.0 is a perfectly predictive model, and an AUC of 0.0 is perfectly anti-predictive (which rarely happens).
This is often a much more useful metric than simply ranking models by their accuracy at a set threshold, as different models might require different calibration steps (looking at a confusion matrix at each step) to find the optimal classification threshold for that model.
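To make those reference points concrete, here is a small illustrative sketch (not part of the course exercises) that scores a perfectly separating predictor and a random predictor with colAUC(); the labels and probabilities are made up for illustration:
# Illustrative only: AUC reference points on simulated labels
library(caTools)
set.seed(1)
actual <- factor(rep(c("M", "R"), each = 50))
perfect <- ifelse(actual == "M", 0.9, 0.1)  # probabilities that separate the classes perfectly
random <- runif(100)                        # probabilities unrelated to the labels
colAUC(perfect, actual)                     # 1.0: perfect discrimination
colAUC(random, actual)                      # near 0.5: no better than guessing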
You can use the trainControl()
function in caret
to use AUC (instead of accuracy) to tune the parameters of your models. The twoClassSummary()
convenience function allows you to do this easily.
When using twoClassSummary()
, be sure to always include the argument classProbs = TRUE
or your model will throw an error! (You cannot calculate AUC with just
class predictions. You need to have class probabilities as well.)
Customize the trainControl object to use twoClassSummary rather than defaultSummary.
Tell trainControl() to return class probabilities.
# Create trainControl object: myControl
myControl <- trainControl(
method = "cv",
number = 10,
summaryFunction = twoClassSummary,
classProbs = TRUE, # IMPORTANT!
verboseIter = TRUE
)
Great work! Don’t forget the classProbs argument to train control, especially if you’re going to calculate AUC or logloss.
2.5.3 Using custom trainControl
Now that you have a custom trainControl
object, it’s easy to fit caret
models that use AUC rather than accuracy to tune and evaluate the model. You can just pass your custom trainControl
object to the train()
function via the trControl
argument, e.g.:
train(<standard arguments here>, trControl = myControl)
This syntax gives you a convenient way to store a lot of custom modeling
parameters and then use them across multiple different calls to train()
. You will make extensive use of this trick in Chapter 5.
Use train() to predict Class from all other variables in the Sonar data (that is, Class ~ .). It should be a glm model (that is, set method to “glm”) using your custom trainControl object, myControl. Save the result to model.
# Train glm with custom trainControl: model
model <- train(
Class ~ .,
Sonar,
method = "glm",
trControl = myControl
)
## + Fold01: parameter=none
## - Fold01: parameter=none
## + Fold02: parameter=none
## - Fold02: parameter=none
## + Fold03: parameter=none
## - Fold03: parameter=none
## + Fold04: parameter=none
## - Fold04: parameter=none
## + Fold05: parameter=none
## - Fold05: parameter=none
## + Fold06: parameter=none
## - Fold06: parameter=none
## + Fold07: parameter=none
## - Fold07: parameter=none
## + Fold08: parameter=none
## - Fold08: parameter=none
## + Fold09: parameter=none
## - Fold09: parameter=none
## + Fold10: parameter=none
## - Fold10: parameter=none
## Aggregating results
## Fitting final model on full training set
# Print model to console
model
## Generalized Linear Model
##
## 208 samples
## 60 predictor
## 2 classes: 'M', 'R'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 187, 187, 188, 187, 188, 188, ...
## Resampling results:
##
## ROC Sens Spec
## 0.7255051 0.775 0.66
Great work! Note that fitting a glm with caret often produces warnings about convergence or probabilities. These warnings can almost always be safely ignored, as you can use the glm’s predictions to validate whether the model is accurate enough for your task.
3 Tuning model parameters to improve performance
In this chapter, you will use the train()
function to tweak model parameters through cross-validation and grid search.
3.1 Random forests and wine
3.1.1 Random forests vs. linear models
What’s the primary advantage of random forests over linear models?
- They make you sound cooler during job interviews.
- You can’t understand what’s going on inside of a random forest model, so you don’t have to explain it to anyone.
- A random forest is a more flexible model than a linear model, but just as easy to fit.
3.1.2 Fit a random forest
As you saw in the video, random forest models are much more flexible than linear models, and can model complicated nonlinear effects as well as automatically capture interactions between variables. They tend to give very good results on real world data, so let’s try one out on the wine quality dataset, where the goal is to predict the human-evaluated quality of a batch of wine, given some of the machine-measured chemical and physical properties of that batch.
Fitting a random forest model is exactly the same as fitting a
generalized linear regression model, as you did in the previous chapter.
You simply change the method
argument in the train
function to be “ranger”
. The ranger
package is a rewrite of R’s classic randomForest
package and fits models much faster, but gives almost exactly the same results. We suggest that all beginners use the ranger
package for random forest modeling.
Train a random forest called model on the wine quality dataset, wine, such that quality is the response variable and all other variables are explanatory variables.
Use method = “ranger”.
Use a tuneLength of 1.
wine <- readRDS("/Users/cliex159/Documents/Rstudio/DataCamp/MachineLearningwithcaretinR/datasets/wine_100.RDS")
# Fit random forest: model
model <- train(
quality ~ .,
tuneLength = 1,
data = wine,
method = "ranger",
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE
)
)
## + Fold1: mtry=4, min.node.size=5, splitrule=variance
## - Fold1: mtry=4, min.node.size=5, splitrule=variance
## + Fold1: mtry=4, min.node.size=5, splitrule=extratrees
## - Fold1: mtry=4, min.node.size=5, splitrule=extratrees
## + Fold2: mtry=4, min.node.size=5, splitrule=variance
## - Fold2: mtry=4, min.node.size=5, splitrule=variance
## + Fold2: mtry=4, min.node.size=5, splitrule=extratrees
## - Fold2: mtry=4, min.node.size=5, splitrule=extratrees
## + Fold3: mtry=4, min.node.size=5, splitrule=variance
## - Fold3: mtry=4, min.node.size=5, splitrule=variance
## + Fold3: mtry=4, min.node.size=5, splitrule=extratrees
## - Fold3: mtry=4, min.node.size=5, splitrule=extratrees
## + Fold4: mtry=4, min.node.size=5, splitrule=variance
## - Fold4: mtry=4, min.node.size=5, splitrule=variance
## + Fold4: mtry=4, min.node.size=5, splitrule=extratrees
## - Fold4: mtry=4, min.node.size=5, splitrule=extratrees
## + Fold5: mtry=4, min.node.size=5, splitrule=variance
## - Fold5: mtry=4, min.node.size=5, splitrule=variance
## + Fold5: mtry=4, min.node.size=5, splitrule=extratrees
## - Fold5: mtry=4, min.node.size=5, splitrule=extratrees
## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 4, splitrule = variance, min.node.size = 5 on full training set
Print model to the console.
# Print model to console
model
## Random Forest
##
## 100 samples
## 12 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 81, 80, 79, 80, 80
## Resampling results across tuning parameters:
##
## splitrule RMSE Rsquared MAE
## variance 0.6423140 0.3314318 0.4940912
## extratrees 0.6785689 0.2637406 0.5106034
##
## Tuning parameter 'mtry' was held constant at a value of 4
## Tuning
## parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 4, splitrule = variance
## and min.node.size = 5.
Awesome job! Fitting a random forest is just as easy as fitting a glm. Caret makes it very easy to try out many different models.
3.2 Explore a wider model space
3.2.1 Advantage of a longer tune length
What’s the advantage of a longer tuneLength
?
- You explore more potential models and can potentially find a better model.
- Your models take less time to fit.
- There’s no advantage; you’ll always end up with the same final model.
3.2.2 Try a longer tune length
Recall from the video that random forest models have a primary tuning parameter of mtry
,
which controls how many variables are exposed to the splitting search
routine at each split. For example, suppose that a tree has a total of
10 splits and mtry = 2
. This means that a random sample of 2 predictors is drawn and evaluated at each of those 10 splits.
Use a larger tuning grid this time, but stick to the defaults provided by the train()
function. Try a tuneLength
of 3, rather than 1, to explore some more potential models, and plot the resulting model using the plot
function.
Fit a random forest model, model, using the wine dataset on the quality variable with all other variables as explanatory variables. (This will take a few seconds to run, so be patient!)
# Fit random forest: model
model <- train(
quality ~ .,
tuneLength = 3,
data = wine,
method = "ranger",
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE
)
)
## + Fold1: mtry= 2, min.node.size=5, splitrule=variance
## - Fold1: mtry= 2, min.node.size=5, splitrule=variance
## + Fold1: mtry= 7, min.node.size=5, splitrule=variance
## - Fold1: mtry= 7, min.node.size=5, splitrule=variance
## + Fold1: mtry=12, min.node.size=5, splitrule=variance
## - Fold1: mtry=12, min.node.size=5, splitrule=variance
## + Fold1: mtry= 2, min.node.size=5, splitrule=extratrees
## - Fold1: mtry= 2, min.node.size=5, splitrule=extratrees
## + Fold1: mtry= 7, min.node.size=5, splitrule=extratrees
## - Fold1: mtry= 7, min.node.size=5, splitrule=extratrees
## + Fold1: mtry=12, min.node.size=5, splitrule=extratrees
## - Fold1: mtry=12, min.node.size=5, splitrule=extratrees
## + Fold2: mtry= 2, min.node.size=5, splitrule=variance
## - Fold2: mtry= 2, min.node.size=5, splitrule=variance
## + Fold2: mtry= 7, min.node.size=5, splitrule=variance
## - Fold2: mtry= 7, min.node.size=5, splitrule=variance
## + Fold2: mtry=12, min.node.size=5, splitrule=variance
## - Fold2: mtry=12, min.node.size=5, splitrule=variance
## + Fold2: mtry= 2, min.node.size=5, splitrule=extratrees
## - Fold2: mtry= 2, min.node.size=5, splitrule=extratrees
## + Fold2: mtry= 7, min.node.size=5, splitrule=extratrees
## - Fold2: mtry= 7, min.node.size=5, splitrule=extratrees
## + Fold2: mtry=12, min.node.size=5, splitrule=extratrees
## - Fold2: mtry=12, min.node.size=5, splitrule=extratrees
## + Fold3: mtry= 2, min.node.size=5, splitrule=variance
## - Fold3: mtry= 2, min.node.size=5, splitrule=variance
## + Fold3: mtry= 7, min.node.size=5, splitrule=variance
## - Fold3: mtry= 7, min.node.size=5, splitrule=variance
## + Fold3: mtry=12, min.node.size=5, splitrule=variance
## - Fold3: mtry=12, min.node.size=5, splitrule=variance
## + Fold3: mtry= 2, min.node.size=5, splitrule=extratrees
## - Fold3: mtry= 2, min.node.size=5, splitrule=extratrees
## + Fold3: mtry= 7, min.node.size=5, splitrule=extratrees
## - Fold3: mtry= 7, min.node.size=5, splitrule=extratrees
## + Fold3: mtry=12, min.node.size=5, splitrule=extratrees
## - Fold3: mtry=12, min.node.size=5, splitrule=extratrees
## + Fold4: mtry= 2, min.node.size=5, splitrule=variance
## - Fold4: mtry= 2, min.node.size=5, splitrule=variance
## + Fold4: mtry= 7, min.node.size=5, splitrule=variance
## - Fold4: mtry= 7, min.node.size=5, splitrule=variance
## + Fold4: mtry=12, min.node.size=5, splitrule=variance
## - Fold4: mtry=12, min.node.size=5, splitrule=variance
## + Fold4: mtry= 2, min.node.size=5, splitrule=extratrees
## - Fold4: mtry= 2, min.node.size=5, splitrule=extratrees
## + Fold4: mtry= 7, min.node.size=5, splitrule=extratrees
## - Fold4: mtry= 7, min.node.size=5, splitrule=extratrees
## + Fold4: mtry=12, min.node.size=5, splitrule=extratrees
## - Fold4: mtry=12, min.node.size=5, splitrule=extratrees
## + Fold5: mtry= 2, min.node.size=5, splitrule=variance
## - Fold5: mtry= 2, min.node.size=5, splitrule=variance
## + Fold5: mtry= 7, min.node.size=5, splitrule=variance
## - Fold5: mtry= 7, min.node.size=5, splitrule=variance
## + Fold5: mtry=12, min.node.size=5, splitrule=variance
## - Fold5: mtry=12, min.node.size=5, splitrule=variance
## + Fold5: mtry= 2, min.node.size=5, splitrule=extratrees
## - Fold5: mtry= 2, min.node.size=5, splitrule=extratrees
## + Fold5: mtry= 7, min.node.size=5, splitrule=extratrees
## - Fold5: mtry= 7, min.node.size=5, splitrule=extratrees
## + Fold5: mtry=12, min.node.size=5, splitrule=extratrees
## - Fold5: mtry=12, min.node.size=5, splitrule=extratrees
## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 7, splitrule = variance, min.node.size = 5 on full training set
Use method = “ranger”.
Set tuneLength to 3.
Print model to the console.
# Print model to console
model
## Random Forest
##
## 100 samples
## 12 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 79, 80, 81, 80, 80
## Resampling results across tuning parameters:
##
## mtry splitrule RMSE Rsquared MAE
## 2 variance 0.6493381 0.3234349 0.4966282
## 2 extratrees 0.6846140 0.2431224 0.5172347
## 7 variance 0.6246233 0.3767655 0.4770864
## 7 extratrees 0.6706236 0.2665631 0.5062392
## 12 variance 0.6264390 0.3771307 0.4856709
## 12 extratrees 0.6648015 0.2836460 0.5073460
##
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 7, splitrule = variance
## and min.node.size = 5.
# Plot model
plot(model)
Excellent! You can adjust the tuneLength variable to make a trade-off between runtime and how deep you want to grid-search the model.
3.3 Custom tuning grids
3.3.1 Advantages of a custom tuning grid
Why use a custom tuneGrid
?
- There’s no advantage; you’ll always end up with the same final model.
- It gives you more fine-grained control over the tuning parameters that are explored.
- It always makes your models run faster.
3.3.2 Fit a random forest with custom tuning
Now that you’ve explored the default tuning grids provided by the train()
function, let’s customize your models a bit more.
You can provide any number of values for mtry
, from 2 up to the number of columns in the dataset. In practice, there are diminishing returns for much larger values of mtry
, so you will use a custom tuning grid that explores 2 simple models (mtry = 2
and mtry = 3
) as well as one more complicated model (mtry = 7
).
Define a custom tuning grid:
- Set the number of variables to possibly split at each node, .mtry, to a vector of 2, 3, and 7.
- Set the rule to split on, .splitrule, to “variance”.
- Set the minimum node size, .min.node.size, to 5.
# From previous step
tuneGrid <- data.frame(
.mtry = c(2, 3, 7),
.splitrule = "variance",
.min.node.size = 5
)
Fit another random forest model, model, using the wine dataset on the quality variable with all other variables as explanatory variables.
Use method = “ranger”.
Use the custom tuneGrid.
# Fit random forest: model
model <- train(
quality ~ .,
tuneGrid = tuneGrid,
data = wine,
method = "ranger",
trControl = trainControl(
method = "cv",
number = 5,
verboseIter = TRUE
)
)
## + Fold1: mtry=2, splitrule=variance, min.node.size=5
## - Fold1: mtry=2, splitrule=variance, min.node.size=5
## + Fold1: mtry=3, splitrule=variance, min.node.size=5
## - Fold1: mtry=3, splitrule=variance, min.node.size=5
## + Fold1: mtry=7, splitrule=variance, min.node.size=5
## - Fold1: mtry=7, splitrule=variance, min.node.size=5
## + Fold2: mtry=2, splitrule=variance, min.node.size=5
## - Fold2: mtry=2, splitrule=variance, min.node.size=5
## + Fold2: mtry=3, splitrule=variance, min.node.size=5
## - Fold2: mtry=3, splitrule=variance, min.node.size=5
## + Fold2: mtry=7, splitrule=variance, min.node.size=5
## - Fold2: mtry=7, splitrule=variance, min.node.size=5
## + Fold3: mtry=2, splitrule=variance, min.node.size=5
## - Fold3: mtry=2, splitrule=variance, min.node.size=5
## + Fold3: mtry=3, splitrule=variance, min.node.size=5
## - Fold3: mtry=3, splitrule=variance, min.node.size=5
## + Fold3: mtry=7, splitrule=variance, min.node.size=5
## - Fold3: mtry=7, splitrule=variance, min.node.size=5
## + Fold4: mtry=2, splitrule=variance, min.node.size=5
## - Fold4: mtry=2, splitrule=variance, min.node.size=5
## + Fold4: mtry=3, splitrule=variance, min.node.size=5
## - Fold4: mtry=3, splitrule=variance, min.node.size=5
## + Fold4: mtry=7, splitrule=variance, min.node.size=5
## - Fold4: mtry=7, splitrule=variance, min.node.size=5
## + Fold5: mtry=2, splitrule=variance, min.node.size=5
## - Fold5: mtry=2, splitrule=variance, min.node.size=5
## + Fold5: mtry=3, splitrule=variance, min.node.size=5
## - Fold5: mtry=3, splitrule=variance, min.node.size=5
## + Fold5: mtry=7, splitrule=variance, min.node.size=5
## - Fold5: mtry=7, splitrule=variance, min.node.size=5
## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 7, splitrule = variance, min.node.size = 5 on full training set
Print model to the console.
# Print model to console
model
## Random Forest
##
## 100 samples
## 12 predictor
##
## No pre-processing
## Resampling: Cross-Validated (5 fold)
## Summary of sample sizes: 80, 79, 80, 81, 80
## Resampling results across tuning parameters:
##
## mtry RMSE Rsquared MAE
## 2 0.6709358 0.3161420 0.5148394
## 3 0.6662775 0.3087695 0.5154802
## 7 0.6483430 0.3447539 0.4921923
##
## Tuning parameter 'splitrule' was held constant at a value of variance
##
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 7, splitrule = variance
## and min.node.size = 5.
Plot the model after fitting it, using plot().
# Plot model
plot(model)
Great work! Model tuning plots can be very useful for understanding caret models.
3.4 Introducing glmnet
3.4.1 Advantage of glmnet
What’s the advantage of glmnet
over regular glm
models?
- glmnet models automatically find interaction variables.
- glmnet models don’t provide p-values or confidence intervals on predictions.
- glmnet models place constraints on your coefficients, which helps prevent overfitting.
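To see that last point in action, here is an illustrative sketch (using the glmnet package directly rather than caret, on made-up data) showing coefficients shrinking toward zero as the penalty grows:
# Illustrative only: lasso coefficients shrink as the penalty lambda grows
library(glmnet)
set.seed(1)
x <- matrix(rnorm(100 * 10), ncol = 10)  # 100 rows, 10 made-up predictors
y <- rnorm(100)
fit <- glmnet(x, y, alpha = 1)           # alpha = 1 is the lasso penalty
coef(fit, s = 0.01)                      # small penalty: many small, nonzero coefficients
coef(fit, s = 0.5)                       # large penalty: most coefficients exactly zero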
3.4.2 Make a custom trainControl
The wine quality dataset was a regression problem, but now you are looking at a classification problem. This is a simulated dataset based on the “don’t overfit” competition on Kaggle a number of years ago.
Classification problems are a little more complicated than regression problems because you have to provide a custom summaryFunction
to the train()
function to use the AUC
metric to rank your models. Start by making a custom trainControl
, as you did in the previous chapter. Be sure to set classProbs = TRUE
, otherwise the twoClassSummary
for summaryFunction
will break.
Make a custom trainControl called myControl for classification using the trainControl() function.
- Use 10 CV folds.
- Use twoClassSummary for the summaryFunction.
- Be sure to set classProbs = TRUE.
# Create custom trainControl: myControl
myControl <- trainControl(
method = "cv",
number = 10,
summaryFunction = twoClassSummary,
classProbs = TRUE, # IMPORTANT!
verboseIter = TRUE
)
Great work! Creating a custom trainControl gives you much finer control over how caret searches for models.
3.4.3 Fit glmnet with custom trainControl
Now that you have a custom trainControl
object, fit a glmnet
model to the “don’t overfit” dataset. Recall from the video that glmnet
is an extension of the generalized linear regression model (or glm
)
that places constraints on the magnitude of the coefficients to prevent
overfitting. This is more commonly known as “penalized” regression
modeling and is a very useful technique on datasets with many predictors
and few values.
glmnet
is capable of fitting two different kinds of penalized models, controlled by the alpha
parameter:
- Ridge regression (or alpha = 0)
- Lasso regression (or alpha = 1)
You’ll now fit a glmnet
model to the “don’t overfit” dataset using the defaults provided by the caret
package.
Train a glmnet model called model on the overfit data. Use the custom trainControl from the previous exercise (myControl). The variable y is the response variable and all other variables are explanatory variables.
overfit <- read.csv("https://assets.datacamp.com/production/repositories/223/datasets/0bd5f7c30d9aec3e1f1fa677a19bee3af407453a/overfit.csv")
# Fit glmnet model: model
model <- train(
y ~ .,
overfit,
method = "glmnet",
trControl = myControl
)
## + Fold01: alpha=0.10, lambda=0.01013
## - Fold01: alpha=0.10, lambda=0.01013
## + Fold01: alpha=0.55, lambda=0.01013
## - Fold01: alpha=0.55, lambda=0.01013
## + Fold01: alpha=1.00, lambda=0.01013
## - Fold01: alpha=1.00, lambda=0.01013
## + Fold02: alpha=0.10, lambda=0.01013
## - Fold02: alpha=0.10, lambda=0.01013
## + Fold02: alpha=0.55, lambda=0.01013
## - Fold02: alpha=0.55, lambda=0.01013
## + Fold02: alpha=1.00, lambda=0.01013
## - Fold02: alpha=1.00, lambda=0.01013
## + Fold03: alpha=0.10, lambda=0.01013
## - Fold03: alpha=0.10, lambda=0.01013
## + Fold03: alpha=0.55, lambda=0.01013
## - Fold03: alpha=0.55, lambda=0.01013
## + Fold03: alpha=1.00, lambda=0.01013
## - Fold03: alpha=1.00, lambda=0.01013
## + Fold04: alpha=0.10, lambda=0.01013
## - Fold04: alpha=0.10, lambda=0.01013
## + Fold04: alpha=0.55, lambda=0.01013
## - Fold04: alpha=0.55, lambda=0.01013
## + Fold04: alpha=1.00, lambda=0.01013
## - Fold04: alpha=1.00, lambda=0.01013
## + Fold05: alpha=0.10, lambda=0.01013
## - Fold05: alpha=0.10, lambda=0.01013
## + Fold05: alpha=0.55, lambda=0.01013
## - Fold05: alpha=0.55, lambda=0.01013
## + Fold05: alpha=1.00, lambda=0.01013
## - Fold05: alpha=1.00, lambda=0.01013
## + Fold06: alpha=0.10, lambda=0.01013
## - Fold06: alpha=0.10, lambda=0.01013
## + Fold06: alpha=0.55, lambda=0.01013
## - Fold06: alpha=0.55, lambda=0.01013
## + Fold06: alpha=1.00, lambda=0.01013
## - Fold06: alpha=1.00, lambda=0.01013
## + Fold07: alpha=0.10, lambda=0.01013
## - Fold07: alpha=0.10, lambda=0.01013
## + Fold07: alpha=0.55, lambda=0.01013
## - Fold07: alpha=0.55, lambda=0.01013
## + Fold07: alpha=1.00, lambda=0.01013
## - Fold07: alpha=1.00, lambda=0.01013
## + Fold08: alpha=0.10, lambda=0.01013
## - Fold08: alpha=0.10, lambda=0.01013
## + Fold08: alpha=0.55, lambda=0.01013
## - Fold08: alpha=0.55, lambda=0.01013
## + Fold08: alpha=1.00, lambda=0.01013
## - Fold08: alpha=1.00, lambda=0.01013
## + Fold09: alpha=0.10, lambda=0.01013
## - Fold09: alpha=0.10, lambda=0.01013
## + Fold09: alpha=0.55, lambda=0.01013
## - Fold09: alpha=0.55, lambda=0.01013
## + Fold09: alpha=1.00, lambda=0.01013
## - Fold09: alpha=1.00, lambda=0.01013
## + Fold10: alpha=0.10, lambda=0.01013
## - Fold10: alpha=0.10, lambda=0.01013
## + Fold10: alpha=0.55, lambda=0.01013
## - Fold10: alpha=0.55, lambda=0.01013
## + Fold10: alpha=1.00, lambda=0.01013
## - Fold10: alpha=1.00, lambda=0.01013
## Aggregating results
## Selecting tuning parameters
## Fitting alpha = 0.1, lambda = 0.0101 on full training set
# Print model to console
model
## glmnet
##
## 250 samples
## 200 predictors
## 2 classes: 'class1', 'class2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 225, 225, 225, 224, 225, 225, ...
## Resampling results across tuning parameters:
##
## alpha lambda ROC Sens Spec
## 0.10 0.0001012745 0.4564312 0.1 0.9617754
## 0.10 0.0010127448 0.4524457 0.0 0.9786232
## 0.10 0.0101274483 0.4677536 0.0 0.9916667
## 0.55 0.0001012745 0.4137681 0.1 0.9615942
## 0.55 0.0010127448 0.4310688 0.1 0.9574275
## 0.55 0.0101274483 0.4398551 0.0 0.9789855
## 1.00 0.0001012745 0.4009058 0.1 0.9273551
## 1.00 0.0010127448 0.3989130 0.1 0.9360507
## 1.00 0.0101274483 0.4476449 0.1 0.9748188
##
## ROC was used to select the optimal model using the largest value.
## The final values used for the model were alpha = 0.1 and lambda = 0.01012745.
Use the max() function to find the maximum of the ROC statistic contained somewhere in model[["results"]].
# Print maximum ROC statistic
max(model[["results"]][["ROC"]])
## [1] 0.4677536
Awesome job! This glmnet model will use AUC rather than accuracy to select the final model parameters.
3.5 glmnet with custom tuning grid
3.5.1 Why a custom tuning grid?
Why use a custom tuning grid for a glmnet
model?
-
There’s no reason to use a custom grid; the default is always the best.
-
The default tuning grid is very small and there are many more potential
glmnet
models you want to explore. -
glmnet
models are really slow, so you should never try more than a few tuning parameters.
3.5.2 glmnet with custom trainControl and tuning
As you saw in the video, the glmnet
model actually fits
many models at once (one of the great things about the package). You can
exploit this by passing a large number of lambda
values, which control the amount of penalization in the model. train()
is smart enough to only fit one model per alpha
value and pass all of the lambda
values at once for simultaneous fitting.
My favorite tuning grid for glmnet
models is:
expand.grid(
alpha = 0:1,
lambda = seq(0.0001, 1, length = 100)
)
This grid explores a large number of lambda
values (100, in fact), from a very small one to a very large one. (You could increase the maximum lambda
to 10, but in this exercise 1 is a good upper bound.)
If you want to explore fewer models, you can use a shorter lambda sequence. For example, lambda = seq(0.0001, 1, length = 10)
would fit 10 models per value of alpha.
You also look at the two forms of penalized models with this tuneGrid
: ridge regression and lasso regression. alpha = 0
is pure ridge regression, and alpha = 1
is pure lasso regression. You can fit a mixture of the two models (i.e. an elastic net) using an alpha
between 0 and 1. For example, alpha = 0.05
would be 95% ridge regression and 5% lasso regression.
In this problem you’ll just explore the 2 extremes – pure ridge and pure lasso regression – for the purpose of illustrating their differences.
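For illustration only, here is a hedged sketch of a grid that would also cover intermediate alpha values and therefore true elastic net mixtures (the name enet_grid is made up and is not used in the exercise below):
# Hypothetical grid mixing ridge (alpha = 0), lasso (alpha = 1) and elastic nets in between
enet_grid <- expand.grid(
  alpha = c(0, 0.25, 0.5, 0.75, 1),
  lambda = seq(0.0001, 1, length = 20)
)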
Train a glmnet model on the overfit data such that y is the response variable and all other variables are explanatory variables. Make sure to use your custom trainControl from the previous exercise (myControl). Also, use a custom tuneGrid to explore alpha = 0:1 and 20 values of lambda between 0.0001 and 1 per value of alpha.
# Train glmnet with custom trainControl and tuning: model
model <- train(
  y ~ .,
  overfit,
  tuneGrid = expand.grid(
    alpha = 0:1,
    lambda = seq(0.0001, 1, length = 20)
  ),
  method = "glmnet",
  trControl = myControl
)
## + Fold01: alpha=0, lambda=1
## - Fold01: alpha=0, lambda=1
## + Fold01: alpha=1, lambda=1
## - Fold01: alpha=1, lambda=1
## + Fold02: alpha=0, lambda=1
## - Fold02: alpha=0, lambda=1
## + Fold02: alpha=1, lambda=1
## - Fold02: alpha=1, lambda=1
## + Fold03: alpha=0, lambda=1
## - Fold03: alpha=0, lambda=1
## + Fold03: alpha=1, lambda=1
## - Fold03: alpha=1, lambda=1
## + Fold04: alpha=0, lambda=1
## - Fold04: alpha=0, lambda=1
## + Fold04: alpha=1, lambda=1
## - Fold04: alpha=1, lambda=1
## + Fold05: alpha=0, lambda=1
## - Fold05: alpha=0, lambda=1
## + Fold05: alpha=1, lambda=1
## - Fold05: alpha=1, lambda=1
## + Fold06: alpha=0, lambda=1
## - Fold06: alpha=0, lambda=1
## + Fold06: alpha=1, lambda=1
## - Fold06: alpha=1, lambda=1
## + Fold07: alpha=0, lambda=1
## - Fold07: alpha=0, lambda=1
## + Fold07: alpha=1, lambda=1
## - Fold07: alpha=1, lambda=1
## + Fold08: alpha=0, lambda=1
## - Fold08: alpha=0, lambda=1
## + Fold08: alpha=1, lambda=1
## - Fold08: alpha=1, lambda=1
## + Fold09: alpha=0, lambda=1
## - Fold09: alpha=0, lambda=1
## + Fold09: alpha=1, lambda=1
## - Fold09: alpha=1, lambda=1
## + Fold10: alpha=0, lambda=1
## - Fold10: alpha=0, lambda=1
## + Fold10: alpha=1, lambda=1
## - Fold10: alpha=1, lambda=1
## Aggregating results
## Selecting tuning parameters
## Fitting alpha = 1, lambda = 0.0527 on full training set
Print model to the console.
# Print model to console
model
## glmnet
##
## 250 samples
## 200 predictors
## 2 classes: 'class1', 'class2'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 225, 224, 226, 224, 225, 225, ...
## Resampling results across tuning parameters:
##
## alpha lambda ROC Sens Spec
## 0 0.00010000 0.4558877 0.0 0.9742754
## 0 0.05272632 0.4473732 0.0 0.9958333
## 0 0.10535263 0.4538949 0.0 1.0000000
## 0 0.15797895 0.4667572 0.0 1.0000000
## 0 0.21060526 0.4753623 0.0 1.0000000
## 0 0.26323158 0.4797101 0.0 1.0000000
## 0 0.31585789 0.4797101 0.0 1.0000000
## 0 0.36848421 0.4797101 0.0 1.0000000
## 0 0.42111053 0.4860507 0.0 1.0000000
## 0 0.47373684 0.4795290 0.0 1.0000000
## 0 0.52636316 0.4837862 0.0 1.0000000
## 0 0.57898947 0.4837862 0.0 1.0000000
## 0 0.63161579 0.4859601 0.0 1.0000000
## 0 0.68424211 0.4859601 0.0 1.0000000
## 0 0.73686842 0.4881341 0.0 1.0000000
## 0 0.78949474 0.4837862 0.0 1.0000000
## 0 0.84212105 0.4816123 0.0 1.0000000
## 0 0.89474737 0.4857790 0.0 1.0000000
## 0 0.94737368 0.4836051 0.0 1.0000000
## 0 1.00000000 0.4836051 0.0 1.0000000
## 1 0.00010000 0.4336051 0.1 0.9443841
## 1 0.05272632 0.5086957 0.0 1.0000000
## 1 0.10535263 0.5000000 0.0 1.0000000
## 1 0.15797895 0.5000000 0.0 1.0000000
## 1 0.21060526 0.5000000 0.0 1.0000000
## 1 0.26323158 0.5000000 0.0 1.0000000
## 1 0.31585789 0.5000000 0.0 1.0000000
## 1 0.36848421 0.5000000 0.0 1.0000000
## 1 0.42111053 0.5000000 0.0 1.0000000
## 1 0.47373684 0.5000000 0.0 1.0000000
## 1 0.52636316 0.5000000 0.0 1.0000000
## 1 0.57898947 0.5000000 0.0 1.0000000
## 1 0.63161579 0.5000000 0.0 1.0000000
## 1 0.68424211 0.5000000 0.0 1.0000000
## 1 0.73686842 0.5000000 0.0 1.0000000
## 1 0.78949474 0.5000000 0.0 1.0000000
## 1 0.84212105 0.5000000 0.0 1.0000000
## 1 0.89474737 0.5000000 0.0 1.0000000
## 1 0.94737368 0.5000000 0.0 1.0000000
## 1 1.00000000 0.5000000 0.0 1.0000000
##
## ROC was used to select the optimal model using the largest value.
## The final values used for the model were alpha = 1 and lambda = 0.05272632.
Print the max() of the ROC statistic in model[["results"]]. You can access it using model[["results"]][["ROC"]].
# Print maximum ROC statistic
max(model[["results"]][["ROC"]])
## [1] 0.5086957
Excellent work! I use this custom tuning grid for all my glmnet models – it’s a great place to start!
3.5.3 Interpreting glmnet plots
Here’s the tuning plot for the custom tuned glmnet
model you created in the last exercise. For the overfit
dataset, which value of alpha
is better?
-
alpha = 0
(ridge) -
alpha = 1
(lasso)
Correct! For this dataset, alpha = 1
(or lasso) is better.
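The tuning plot itself is not reproduced here. Assuming the model object from the previous exercise is still in your workspace, a minimal sketch to regenerate it is:
# Plot cross-validated ROC against lambda, one curve per alpha value
plot(model)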
4 Preprocessing your data
In this chapter, you will practice using train()
to preprocess data before fitting models, improving your ability to make accurate predictions.
4.1 Median imputation
4.1.1 Median imputation vs. omitting rows
What’s the value of median imputation?
-
It removes some variance from your data, making it easier to model.
-
It lets you model data with missing values.
-
It’s useless; you should just throw out rows of data with any missing values.
4.1.2 Apply median imputation
In this chapter, you’ll be using a version of the Wisconsin Breast Cancer dataset. This dataset presents a classic binary classification problem: 50% of the samples are benign, 50% are malignant, and the challenge is to identify which are which.
This dataset is interesting because many of the predictors contain
missing values and most rows of the dataset have at least one missing
value. This presents a modeling challenge, because most machine learning
algorithms cannot handle missing values out of the box. For example,
your first instinct might be to fit a logistic regression model to this
data, but prior to doing this you need a strategy for handling the NA
s.
Fortunately, the train()
function in caret
contains an argument called preProcess
,
which allows you to specify that median imputation should be used to
fill in the missing values. In previous chapters, you created models
with the train()
function using formulas such as y ~ .
. An alternative way is to specify the x
and y
arguments to train()
, where x
is an object with samples in rows and features in columns and y
is a numeric or factor vector containing the outcomes. Said differently, x
is a matrix or data frame that contains the whole dataset you’d use for the data
argument to the lm()
call, for example, but excludes the response variable column; y
is a vector that contains just the response variable column.
For this exercise, the argument x
to train()
is loaded in your workspace as breast_cancer_x
and y
as breast_cancer_y
.
Use the train() function to fit a glm model called median_model to the breast cancer dataset. Use preProcess = "medianImpute" to handle the missing values.
load("/Users/cliex159/Documents/Rstudio/DataCamp/MachineLearningwithcaretinR/datasets/BreastCancer.RData")
# Apply median imputation: median_model
<- train(
median_model x = breast_cancer_x,
y = breast_cancer_y,
method = "glm",
trControl = myControl,
preProcess = "medianImpute"
)
## + Fold01: parameter=none
## - Fold01: parameter=none
## + Fold02: parameter=none
## - Fold02: parameter=none
## + Fold03: parameter=none
## - Fold03: parameter=none
## + Fold04: parameter=none
## - Fold04: parameter=none
## + Fold05: parameter=none
## - Fold05: parameter=none
## + Fold06: parameter=none
## - Fold06: parameter=none
## + Fold07: parameter=none
## - Fold07: parameter=none
## + Fold08: parameter=none
## - Fold08: parameter=none
## + Fold09: parameter=none
## - Fold09: parameter=none
## + Fold10: parameter=none
## - Fold10: parameter=none
## Aggregating results
## Fitting final model on full training set
Print median_model to the console.
# Print median_model to console
median_model
## Generalized Linear Model
##
## 699 samples
## 9 predictor
## 2 classes: 'benign', 'malignant'
##
## Pre-processing: median imputation (9)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 629, 629, 629, 628, 629, 630, ...
## Resampling results:
##
## ROC Sens Spec
## 0.9913969 0.9694203 0.9461667
Fantastic job! Caret makes it very easy to include model preprocessing in your model validation.
4.2 KNN imputation
4.2.1 Comparing KNN imputation to median imputation
Will KNN imputation always be better than median imputation?
-
No, you should try both options and keep the one that gives more accurate models.
-
Yes, KNN is a more complicated model than medians, so it’s always better.
-
No, medians are more statistically valid than KNN and should always be used.
4.2.2 Use KNN imputation
In the previous exercise, you used median imputation to fill in missing values in the breast cancer dataset, but that is not the only possible method for dealing with missing data.
An alternative to median imputation is k-nearest neighbors, or KNN,
imputation. This is a more advanced form of imputation where missing
values are replaced with values from other rows that are similar to the
current row. While this is a lot more complicated to implement in
practice than simple median imputation, it is very easy to explore in caret
using the preProcess
argument to train()
. You can simply use preProcess = "knnImpute"
to change the method of imputation used prior to model fitting.
breast_cancer_x
and breast_cancer_y
are loaded in your workspace.
Use the train() function to fit a glm model called knn_model to the breast cancer dataset.
library(RANN)

# Apply KNN imputation: knn_model
knn_model <- train(
  x = breast_cancer_x,
  y = breast_cancer_y,
  method = "glm",
  trControl = myControl,
  preProcess = "knnImpute"
)
## + Fold01: parameter=none
## - Fold01: parameter=none
## + Fold02: parameter=none
## - Fold02: parameter=none
## + Fold03: parameter=none
## - Fold03: parameter=none
## + Fold04: parameter=none
## - Fold04: parameter=none
## + Fold05: parameter=none
## - Fold05: parameter=none
## + Fold06: parameter=none
## - Fold06: parameter=none
## + Fold07: parameter=none
## - Fold07: parameter=none
## + Fold08: parameter=none
## - Fold08: parameter=none
## + Fold09: parameter=none
## - Fold09: parameter=none
## + Fold10: parameter=none
## - Fold10: parameter=none
## Aggregating results
## Fitting final model on full training set
# Print knn_model to console
knn_model
## Generalized Linear Model
##
## 699 samples
## 9 predictor
## 2 classes: 'benign', 'malignant'
##
## Pre-processing: nearest neighbor imputation (9), centered (9), scaled (9)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 630, 629, 629, 629, 630, 628, ...
## Resampling results:
##
## ROC Sens Spec
## 0.9909863 0.9715459 0.9376667
Good work! As you can see, you can easily try out different imputation methods.
4.2.3 Compare KNN and median imputation
All of the preprocessing steps in the train()
function
happen in the training set of each cross-validation fold, so the error
metrics reported include the effects of the preprocessing.
This includes the imputation method used (e.g. knnImpute
or medianImpute
).
This is useful because it allows you to compare different methods of
imputation and choose the one that performs the best out-of-sample.
median_model
and knn_model
are available in your workspace, as is resamples
, which contains the resampled results of both models. Look at the results of the models by calling
dotplot(resamples, metric = "ROC")
and choose the one that performs the best out-of-sample. Which method of
imputation yields the highest out-of-sample ROC score for your glm
model?
-
KNN imputation is much better than median imputation.
-
KNN imputation is slightly better than median imputation.
-
Median imputation is much better than KNN imputation.
-
Median imputation is slightly better than KNN imputation.
Nice!
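The code that builds the resamples object is not shown above; a minimal sketch, assuming median_model and knn_model were fit with comparable resampling schemes, uses caret's resamples() and the lattice dotplot() method:
# Collect the cross-validated results of both models
resamples <- resamples(list(median = median_model, knn = knn_model))

# Compare out-of-sample ROC for the two imputation strategies
dotplot(resamples, metric = "ROC")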
4.3 Multiple preprocessing methods
4.3.1 Order of operations
Which comes first in caret
’s preProcess()
function: median imputation or centering and scaling of variables?
-
Median imputation comes before centering and scaling.
-
Centering and scaling come before median imputation.
4.3.2 Combining preprocessing methods
The preProcess
argument to train()
doesn’t just limit you to imputing missing values. It also includes a wide variety of other preProcess
techniques to make your life as a data scientist much easier. You can read a full list of them by typing ?preProcess
and reading the help page for this function.
One set of preprocessing functions that is particularly useful for fitting regression models is standardization: centering and scaling. You first center by subtracting the mean of each column from each value in that column, then you scale by dividing by the standard deviation.
Standardization transforms your data such that for each column, the mean is 0 and the standard deviation is 1. This makes it easier for regression models to find a good solution.
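As a quick numeric check (a sketch with made-up numbers, not course data), centering and scaling by hand agrees with base R's scale():
# Center by subtracting the mean, then scale by dividing by the standard deviation
x <- c(2, 4, 6, 8)
x_std <- (x - mean(x)) / sd(x)

# scale() performs the same transformation
all.equal(as.numeric(scale(x)), x_std)
## [1] TRUE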
breast_cancer_x
and breast_cancer_y
are loaded in your workspace. Fit a logistic regression model using median imputation called model
to the breast cancer data, then print it to the console.
# Fit glm with median imputation
model <- train(
  x = breast_cancer_x,
  y = breast_cancer_y,
  method = "glm",
  trControl = myControl,
  preProcess = "medianImpute"
)
## + Fold01: parameter=none
## - Fold01: parameter=none
## + Fold02: parameter=none
## - Fold02: parameter=none
## + Fold03: parameter=none
## - Fold03: parameter=none
## + Fold04: parameter=none
## - Fold04: parameter=none
## + Fold05: parameter=none
## - Fold05: parameter=none
## + Fold06: parameter=none
## - Fold06: parameter=none
## + Fold07: parameter=none
## - Fold07: parameter=none
## + Fold08: parameter=none
## - Fold08: parameter=none
## + Fold09: parameter=none
## - Fold09: parameter=none
## + Fold10: parameter=none
## - Fold10: parameter=none
## Aggregating results
## Fitting final model on full training set
# Print model
model
## Generalized Linear Model
##
## 699 samples
## 9 predictor
## 2 classes: 'benign', 'malignant'
##
## Pre-processing: median imputation (9)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 629, 629, 629, 628, 629, 630, ...
## Resampling results:
##
## ROC Sens Spec
## 0.992494 0.9695169 0.9458333
Update the model to include two more pre-processing steps: centering and scaling.
# Update model with standardization
model <- train(
  x = breast_cancer_x,
  y = breast_cancer_y,
  method = "glm",
  trControl = myControl,
  preProcess = c("medianImpute", "center", "scale")
)
## + Fold01: parameter=none
## - Fold01: parameter=none
## + Fold02: parameter=none
## - Fold02: parameter=none
## + Fold03: parameter=none
## - Fold03: parameter=none
## + Fold04: parameter=none
## - Fold04: parameter=none
## + Fold05: parameter=none
## - Fold05: parameter=none
## + Fold06: parameter=none
## - Fold06: parameter=none
## + Fold07: parameter=none
## - Fold07: parameter=none
## + Fold08: parameter=none
## - Fold08: parameter=none
## + Fold09: parameter=none
## - Fold09: parameter=none
## + Fold10: parameter=none
## - Fold10: parameter=none
## Aggregating results
## Fitting final model on full training set
# Print updated model
model
## Generalized Linear Model
##
## 699 samples
## 9 predictor
## 2 classes: 'benign', 'malignant'
##
## Pre-processing: median imputation (9), centered (9), scaled (9)
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 629, 628, 629, 629, 629, 630, ...
## Resampling results:
##
## ROC Sens Spec
## 0.9910105 0.9716425 0.9291667
Great work! You can combine many different preprocessing methods with caret.
4.4 Handling low-information predictors
4.4.1 Why remove near zero variance predictors?
What’s the best reason to remove near zero variance predictors from your data before building a model?
-
Because they are guaranteed to have no effect on your model.
-
Because their p-values in a linear regression will always be low.
-
To reduce model-fitting time without reducing model accuracy.
4.4.2 Remove near zero variance predictors
As you saw in the video, for the next set of exercises, you’ll be using the blood-brain dataset. This is a biochemical dataset in which the task is to predict the following value for a set of biochemical compounds:
log((concentration of compound in brain) /
(concentration of compound in blood))
This gives a quantitative metric of the compound’s ability to cross the blood-brain barrier, and is useful for understanding the biological properties of that barrier.
One interesting aspect of this dataset is that it contains many variables and many of these variables have extremely low variances. This means that there is very little information in these variables because they mostly consist of a single value (e.g. zero).
Fortunately, caret
contains a utility function called nearZeroVar()
for removing such variables to save time during modeling.
nearZeroVar()
takes in data x
, then looks at the ratio of the most common value to the second most common value, freqCut
, and the percentage of distinct values out of the number of total samples, uniqueCut
. By default, caret
uses freqCut = 19
and uniqueCut = 10
, which is fairly conservative. I like to be a little more aggressive and use freqCut = 2
and uniqueCut = 20
when calling nearZeroVar()
.
bloodbrain_x
and bloodbrain_y
are loaded in your workspace.
Run nearZeroVar() on the blood-brain dataset. Store the result as an object called remove_cols. Use freqCut = 2 and uniqueCut = 20 in the call to nearZeroVar().
load("/Users/cliex159/Documents/Rstudio/DataCamp/MachineLearningwithcaretinR/datasets/BloodBrain.Rdata")
# Identify near zero variance predictors: remove_cols
<- nearZeroVar(bloodbrain_x, names = TRUE,
remove_cols freqCut = 2, uniqueCut = 20)
Use names() to create a vector containing all column names of bloodbrain_x. Call this all_cols.
# Get all column names from bloodbrain_x: all_cols
all_cols <- names(bloodbrain_x)
Make a new data frame called bloodbrain_x_small with the near-zero variance variables removed. Use setdiff() to isolate the column names that you wish to keep (i.e. that you don’t want to remove).
# Remove from data: bloodbrain_x_small
bloodbrain_x_small <- bloodbrain_x[, setdiff(all_cols, remove_cols)]
Great work! Near zero variance variables can cause issues during cross-validation.
4.4.3 preProcess() and nearZeroVar()
Can you use the preProcess
argument in caret
to remove near-zero variance predictors? Or do you have to do this by hand, prior to modeling, using the nearZeroVar()
function?
-
Yes! Set the
preProcess
argument equal to "nzv"
. -
No, unfortunately. You have to do this by hand.
Yes!
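A minimal sketch of that option, assuming bloodbrain_x and bloodbrain_y are still loaded (the name model_nzv is made up for illustration):
# Drop near zero variance predictors inside train() via preProcess
model_nzv <- train(
  x = bloodbrain_x,
  y = bloodbrain_y,
  method = "glm",
  preProcess = "nzv"
)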
4.4.4 Fit model on reduced blood-brain data
Now that you’ve reduced your dataset, you can fit a glm
model to it using the train()
function. This model will run faster than using the full dataset and will yield very similar predictive accuracy.
Furthermore, zero variance variables can cause problems with cross-validation (e.g. if one fold ends up with only a single unique value for that variable), so removing them prior to modeling means you are less likely to get errors during the fitting process.
bloodbrain_x, bloodbrain_y, remove_cols, and bloodbrain_x_small are loaded in your workspace.
Fit a glm model using the train() function and the reduced blood-brain dataset you created in the previous exercise.
# Fit model on reduced data: model
model <- train(
  x = bloodbrain_x_small,
  y = bloodbrain_y,
  method = "glm"
)
# Print model to console
model
## Generalized Linear Model
##
## 208 samples
## 112 predictors
##
## No pre-processing
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 208, 208, 208, 208, 208, 208, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 1082.142 0.1091276 257.9763
Excellent job! As discussed previously, glm generates a lot of warnings about convergence, but they’re never a big deal and you can use the out-of-sample accuracy to make sure your model makes good predictions.
4.5 Principal components analysis (PCA)
4.5.1 Using PCA as an alternative to nearZeroVar()
An alternative to removing low-variance predictors is to run PCA on your dataset. This is sometimes preferable because it does not throw out all of your data: many different low variance predictors may end up combined into one high variance PCA variable, which might have a positive impact on your model’s accuracy.
This is an especially good trick for linear models: the pca
option in the preProcess
argument will center and scale your data, combine low variance
variables, and ensure that all of your predictors are orthogonal. This
creates an ideal dataset for linear regression modeling, and can often
improve the accuracy of your models.
bloodbrain_x
and bloodbrain_y
are loaded in your workspace.
Fit a glm model to the full blood-brain dataset using the "pca" option to preProcess.
# Fit glm model using PCA: model
model <- train(
  x = bloodbrain_x,
  y = bloodbrain_y,
  method = "glm",
  preProcess = "pca"
)
# Print model to console
model
## Generalized Linear Model
##
## 208 samples
## 132 predictors
##
## Pre-processing: principal component signal extraction (132), centered
## (132), scaled (132)
## Resampling: Bootstrapped (25 reps)
## Summary of sample sizes: 208, 208, 208, 208, 208, 208, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 0.6095159 0.4174699 0.4614212
Great work! Note that the PCA model’s accuracy is slightly higher than the nearZeroVar()
model from the previous exercise. PCA is generally a better method for
handling low-information predictors than throwing them out entirely.
5 Selecting models: a case study in churn prediction
In the final chapter of this course, you’ll learn how to use resamples()
to compare multiple models and select (or ensemble) the best one(s).
5.1 Reusing a trainControl
5.1.1 Why reuse a trainControl?
Why reuse a trainControl
?
-
So you can use the same
summaryFunction
and tuning parameters for multiple models. -
So you don’t have to repeat code when fitting multiple models.
-
So you can compare models on the exact same training and test data.
-
All of the above.
5.1.2 Make custom train/test indices
As you saw in the video, for this chapter you will focus on a real-world dataset that brings together all of the concepts discussed in the previous chapters.
The churn dataset contains data on a variety of telecom customers and the modeling challenge is to predict which customers will cancel their service (or churn).
In this chapter, you will be exploring two different types of predictive models: glmnet
and rf
, so the first order of business is to create a reusable trainControl
object you can use to reliably compare them.
churn_x
and churn_y
are loaded in your workspace.
Use createFolds() to create 5 CV folds on churn_y, your target variable for this exercise.
load("/Users/cliex159/Documents/Rstudio/DataCamp/MachineLearningwithcaretinR/datasets/Churn.RData")
# Create custom indices: myFolds
<- createFolds(churn_y, k = 5)
myFolds
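These folds become reusable once you pass them to trainControl() through its index argument. A hedged sketch of what that looks like (the summaryFunction and classProbs settings are assumptions carried over from the earlier classification setup):
# Reuse the exact same folds for every model you want to compare
myControl <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE,
  index = myFolds
)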
churn_x
## stateAK stateAL stateAR stateAZ stateCA stateCO stateCT stateDC stateDE
## 4575 0 0 0 0 0 0 0 0 0
## 4685 0 0 0 0 0 0 0 0 0
## 1431 0 0 0 0 0 0 0 0 0
## 4150 0 0 0 0 0 0 0 0 1
## 3207 0 0 0 0 0 0 0 0 0
## 2593 0 0 0 0 0 0 0 0 0
## 3679 0 0 0 0 0 0 0 0 0
## 673 0 0 0 0 0 0 0 0 0
## 3280 0 0 0 0 0 0 0 0 0
## 3519 0 0 0 0 0 0 0 1 0
## 2285 0 0 0 0 0 0 0 0 0
## 3588 0 0 0 0 0 0 0 0 0
## 4663 0 0 0 0 0 0 0 0 0
## 1274 0 0 0 0 0 0 0 0 0
## 2305 0 0 0 0 0 0 0 0 0
## 4686 0 0 0 0 0 0 0 0 0
## 4876 0 0 0 0 0 0 0 0 0
## 586 0 0 1 0 0 0 0 0 0
## 2367 0 0 0 0 0 0 0 0 0
## 2792 0 0 0 0 0 0 0 0 0
## 4503 0 0 0 0 0 0 1 0 0
## 691 0 0 0 0 0 0 0 0 0
## 4923 0 0 0 0 0 0 0 0 0
## 4712 0 0 0 0 0 0 0 0 0
## 411 0 0 0 0 0 0 0 0 0
## 2559 0 0 1 0 0 0 0 0 0
## 1941 0 0 0 0 0 0 0 0 0
## 4505 0 1 0 0 0 0 0 0 0
## 2223 1 0 0 0 0 0 0 0 0
## 4156 0 0 0 0 0 0 0 0 0
## 3666 0 0 0 0 0 0 0 0 0
## 4031 0 0 0 0 0 0 0 0 0
## 1929 0 0 0 0 0 0 0 0 0
## 3404 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 0 0 0
## 4136 0 0 0 0 0 0 0 0 0
## 37 1 0 0 0 0 0 0 0 0
## 1031 0 0 0 0 0 0 0 0 0
## 4499 0 0 0 0 0 0 0 0 0
## 3036 0 0 0 0 0 0 0 0 0
## 1883 0 0 0 0 0 0 0 0 0
## 2161 0 0 0 0 0 0 0 0 0
## 186 0 0 0 0 0 0 0 0 0
## 4826 0 0 0 0 0 0 0 0 0
## 2140 0 0 0 0 0 0 0 0 0
## 4745 0 0 0 0 0 0 0 0 0
## 4398 0 0 0 0 0 0 0 0 0
## 3170 0 0 0 0 0 0 0 0 0
## 4809 0 0 0 0 0 0 0 0 0
## 3064 0 0 0 0 0 0 1 0 0
## 1651 0 0 0 0 0 0 0 0 0
## 1717 0 0 0 0 0 0 0 0 0
## 1972 0 0 0 0 0 0 0 0 0
## 3882 0 0 0 0 0 0 0 0 0
## 193 0 0 0 0 0 0 0 0 0
## 3703 0 0 0 0 0 0 0 0 0
## 3349 0 0 0 0 0 0 0 0 0
## 847 0 0 0 0 0 0 0 0 0
## 1291 0 0 0 0 1 0 0 0 0
## 2542 0 0 0 0 0 0 0 0 0
## 3338 0 0 0 0 0 0 0 0 0
## 4855 0 0 0 0 0 0 0 0 0
## 3751 0 0 0 0 0 0 0 0 0
## 2797 0 0 0 0 0 0 0 0 0
## 4195 0 0 0 0 0 0 0 0 0
## 936 0 0 0 0 0 0 0 0 0
## 1339 0 0 0 0 0 0 0 0 0
## 4086 0 0 0 0 0 0 0 0 0
## 3419 0 0 0 0 0 0 0 0 0
## 1187 0 0 0 0 0 0 0 0 0
## 212 0 0 0 1 0 0 0 0 0
## 693 0 0 0 0 0 0 0 0 0
## 1067 0 0 0 0 0 0 0 0 0
## 2362 0 0 0 0 0 0 0 0 0
## 973 0 0 0 0 0 0 0 0 0
## 3543 0 0 0 0 0 0 0 0 0
## 39 1 0 0 0 0 0 0 0 0
## 1849 0 0 0 0 0 0 0 0 0
## 2532 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 0
## 2862 0 0 0 0 1 0 0 0 0
## 777 0 1 0 0 0 0 0 0 0
## 1766 0 0 0 0 0 0 0 0 0
## 3175 0 0 0 0 0 0 0 0 0
## 3814 0 0 0 0 0 0 0 0 0
## 2771 0 0 0 0 0 0 0 0 0
## 1149 0 0 0 0 0 0 0 0 0
## 443 0 0 1 0 0 0 0 0 0
## 421 0 0 0 0 0 0 0 0 0
## 1499 0 0 0 0 0 0 0 0 0
## 3278 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0
## 1024 0 0 1 0 0 0 0 0 0
## 4579 0 0 1 0 0 0 0 0 0
## 4542 0 0 0 0 0 0 0 0 0
## 3601 0 0 0 0 0 0 0 0 0
## 1634 0 0 0 0 0 0 0 0 0
## 2526 0 0 0 0 0 0 0 0 0
## 3647 0 0 0 0 0 0 0 0 0
## 3035 0 0 0 0 0 0 0 0 0
## 3069 0 0 0 0 0 0 0 0 0
## 1064 0 0 0 0 0 0 0 0 0
## 1061 0 0 0 0 0 0 0 0 0
## 1905 0 0 0 0 0 0 0 0 0
## 4615 0 0 0 0 0 0 0 0 0
## 4977 0 0 0 0 0 0 0 0 0
## 3621 0 0 0 0 0 0 0 0 0
## 4989 0 0 0 0 0 0 0 0 0
## 2621 0 0 0 0 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0
## 2978 0 0 0 0 0 0 0 0 0
## 4092 0 0 0 0 0 0 0 0 0
## 3674 0 0 0 0 0 0 0 0 0
## 2213 0 1 0 0 0 0 0 0 0
## 2618 0 0 0 0 0 0 0 0 0
## 2626 0 0 1 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0
## 1737 0 0 0 0 0 0 0 0 0
## 2989 0 0 0 0 0 0 0 0 0
## 4047 0 0 0 0 0 0 0 0 0
## 1741 0 0 0 0 0 0 0 0 0
## 2004 0 0 0 0 0 0 0 0 0
## 2798 0 0 0 0 0 0 0 0 0
## 2876 0 0 0 0 0 0 0 0 0
## 3510 0 0 0 0 0 0 0 1 0
## 1926 0 0 0 0 0 0 0 0 0
## 4481 0 0 0 0 0 0 0 0 0
## 4691 0 0 0 0 0 0 0 0 0
## 1138 0 0 0 0 0 0 0 0 0
## 3530 0 0 0 0 0 0 0 0 1
## 4401 0 0 0 0 0 0 0 0 0
## 2939 0 0 0 0 0 0 0 0 0
## 3075 0 0 0 0 0 0 0 0 0
## 4563 0 0 0 0 0 0 0 0 0
## 4139 0 0 0 0 0 0 0 0 0
## 2821 0 0 0 0 0 0 0 0 0
## 3996 0 0 0 0 0 0 0 0 0
## 554 0 0 0 0 0 0 0 0 0
## 3718 0 0 0 0 0 0 0 0 0
## 3032 0 0 0 0 0 0 0 0 0
## 722 0 0 0 0 0 0 0 0 0
## 391 0 0 0 0 0 0 0 0 0
## 2255 0 0 0 0 0 0 0 0 0
## 3786 0 0 0 0 0 0 0 0 0
## 3563 0 0 0 0 0 0 0 0 0
## 3968 0 0 0 0 0 0 0 0 0
## 826 0 0 0 0 0 0 0 0 0
## 4585 0 0 0 0 0 0 0 0 0
## 1425 0 0 0 0 0 0 0 0 0
## 724 0 0 0 0 0 0 0 0 0
## 3489 0 0 0 0 0 0 0 0 0
## 1572 0 0 0 0 0 0 0 0 0
## 3776 0 0 0 0 0 0 0 0 0
## 1912 0 0 0 0 0 1 0 0 0
## 3289 0 0 0 0 0 0 0 0 0
## 3759 0 0 0 0 0 0 0 0 0
## 911 0 0 0 0 0 0 0 0 0
## 141 0 0 0 0 0 0 0 0 1
## 658 1 0 0 0 0 0 0 0 0
## 3293 0 0 0 0 0 0 0 0 0
## 4525 1 0 0 0 0 0 0 0 0
## 2664 0 0 0 0 0 0 0 0 0
## 2912 0 0 0 0 0 0 0 0 0
## 953 0 0 0 0 0 0 0 0 0
## 2589 0 0 0 0 0 0 0 0 0
## 869 0 0 0 0 0 0 0 0 0
## 2185 0 0 0 0 0 0 0 0 0
## 1533 0 0 0 0 1 0 0 0 0
## 562 0 0 0 0 0 0 0 0 0
## 900 0 0 0 0 0 0 0 0 0
## 3525 0 0 0 0 0 0 0 0 0
## 1989 0 0 0 1 0 0 0 0 0
## 2000 0 0 0 0 0 0 0 0 0
## 2319 0 0 0 0 0 0 0 0 0
## 2064 0 0 0 0 0 0 0 0 0
## 659 0 0 0 0 0 0 0 0 0
## 3979 0 0 0 0 0 0 0 0 0
## 2857 0 0 0 0 0 0 0 0 0
## 3831 0 0 0 0 0 0 0 0 1
## 3708 0 0 0 0 0 0 0 0 0
## 4426 0 0 0 0 0 0 0 0 0
## 4158 0 0 0 0 0 0 0 0 0
## 1528 0 0 0 0 0 0 0 0 0
## 1249 0 0 0 0 0 0 0 0 0
## 3575 0 0 0 0 0 0 0 0 0
## 3599 0 0 0 0 0 0 0 0 0
## 4419 0 0 0 0 0 0 0 0 0
## 3818 0 0 0 0 0 0 0 0 0
## 642 0 0 0 0 0 0 0 0 0
## 1385 0 0 0 0 0 0 1 0 0
## 937 0 0 0 0 0 0 0 0 0
## 3771 0 0 0 0 0 0 0 0 0
## 620 0 0 0 0 0 0 0 0 0
## 621 0 0 0 0 0 0 0 0 0
## 348 0 0 0 0 0 0 0 0 0
## 256 0 0 0 0 0 0 0 0 0
## 2556 0 0 0 0 0 0 0 0 0
## 540 0 0 0 0 0 0 0 0 0
## 3569 0 0 0 0 0 0 0 0 0
## 3512 0 0 0 0 0 0 0 0 1
## 4249 0 0 0 0 0 0 0 0 0
## 2482 0 0 0 0 0 0 0 0 0
## 4088 0 0 0 0 0 0 0 0 0
## 2125 0 0 0 0 0 0 0 0 0
## 758 0 0 0 0 0 0 0 0 0
## 2121 0 0 0 0 0 0 0 0 0
## 4640 0 0 0 0 1 0 0 0 0
## 2323 0 0 0 0 0 0 0 0 0
## 1210 0 0 0 0 0 0 1 0 0
## 1245 0 0 0 0 0 0 0 0 0
## 2597 0 0 0 0 0 0 0 1 0
## 3113 0 0 1 0 0 0 0 0 0
## 1611 0 0 0 0 0 0 0 0 0
## 292 0 0 0 0 0 0 0 0 0
## 2160 0 0 0 0 0 0 0 0 0
## 4014 0 0 0 0 0 0 0 0 0
## 2750 0 1 0 0 0 0 0 0 0
## 1691 0 0 0 0 0 0 0 0 0
## 4886 0 0 0 0 0 0 0 0 0
## 4269 0 0 0 1 0 0 0 0 0
## 2343 0 0 0 0 0 0 0 0 0
## 821 0 0 0 0 0 0 0 0 0
## 2595 0 0 0 0 0 0 0 0 0
## 4593 0 0 0 0 0 0 0 0 0
## 4911 0 0 0 0 0 0 0 0 0
## 3918 0 0 0 0 0 0 0 0 0
## 1466 0 0 0 0 0 1 0 0 0
## 886 0 0 0 0 0 0 0 0 0
## 231 0 0 0 0 0 0 0 0 0
## 1173 0 0 0 0 0 0 0 0 0
## 1675 0 0 0 0 0 0 0 0 0
## 759 0 0 0 0 0 0 0 0 0
## 1450 0 0 0 0 0 1 0 0 0
## 84 0 0 0 0 0 0 0 0 0
## 4750 0 0 0 0 0 0 0 0 0
## 3833 0 0 0 0 0 0 0 0 0
## 413 0 0 0 0 0 0 0 0 0
## 4144 0 0 0 0 0 0 0 0 0
## 2641 0 0 0 0 0 0 0 0 0
## 2007 0 0 1 0 0 0 0 0 0
## 322 0 0 0 0 0 0 0 0 0
## 2672 0 0 0 0 0 0 0 0 0
## 337 0 0 0 0 0 0 0 0 0
## 1006 0 0 0 0 0 0 0 0 0
## 2614 0 0 0 0 0 0 0 0 0
## 2292 0 0 0 0 0 0 0 0 0
## 4769 0 0 0 0 0 0 0 0 0
## 711 0 0 0 0 0 0 0 0 0
## 2373 0 0 0 0 0 0 0 0 0
## 4469 0 0 0 0 0 0 0 0 0
## stateFL stateGA stateHI stateIA stateID stateIL stateIN stateKS stateKY
## 4575 0 0 0 0 0 0 0 0 0
## 4685 0 0 0 0 0 0 0 0 0
## 1431 0 0 0 0 0 0 0 0 0
## 4150 0 0 0 0 0 0 0 0 0
## 3207 0 0 0 0 0 0 0 0 0
## 2593 0 0 0 0 0 0 0 0 0
## 3679 0 0 0 0 0 0 0 0 0
## 673 0 0 0 0 0 1 0 0 0
## 3280 0 0 0 0 0 0 0 0 0
## 3519 0 0 0 0 0 0 0 0 0
## 2285 0 0 0 0 0 0 0 0 0
## 3588 0 0 0 0 0 0 0 0 0
## 4663 0 0 0 0 0 1 0 0 0
## 1274 0 0 0 0 0 0 0 0 0
## 2305 0 0 0 0 0 0 0 0 0
## 4686 0 0 0 0 0 0 0 0 0
## 4876 0 0 0 0 0 0 0 0 0
## 586 0 0 0 0 0 0 0 0 0
## 2367 0 0 0 0 0 0 0 0 0
## 2792 0 0 0 0 1 0 0 0 0
## 4503 0 0 0 0 0 0 0 0 0
## 691 0 0 0 0 0 0 0 1 0
## 4923 0 0 0 0 0 0 0 0 0
## 4712 0 0 0 0 0 0 0 0 0
## 411 0 0 0 0 0 0 0 0 0
## 2559 0 0 0 0 0 0 0 0 0
## 1941 0 0 0 0 0 0 0 0 0
## 4505 0 0 0 0 0 0 0 0 0
## 2223 0 0 0 0 0 0 0 0 0
## 4156 0 0 0 0 0 0 0 0 0
## 3666 0 0 0 0 0 0 0 0 0
## 4031 0 0 0 0 1 0 0 0 0
## 1929 0 0 0 0 0 0 0 0 0
## 3404 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 0 0 0
## 4136 0 0 0 0 0 0 0 1 0
## 37 0 0 0 0 0 0 0 0 0
## 1031 0 0 0 0 0 0 0 0 0
## 4499 0 0 0 0 0 0 0 0 0
## 3036 0 0 0 0 0 0 0 0 0
## 1883 0 0 1 0 0 0 0 0 0
## 2161 0 0 0 0 0 0 0 0 0
## 186 0 0 0 0 0 0 0 0 0
## 4826 0 0 0 1 0 0 0 0 0
## 2140 0 0 0 0 0 0 0 0 0
## 4745 0 0 0 0 1 0 0 0 0
## 4398 0 0 0 0 0 0 0 0 0
## 3170 0 0 0 0 1 0 0 0 0
## 4809 0 0 0 0 1 0 0 0 0
## 3064 0 0 0 0 0 0 0 0 0
## 1651 0 0 0 0 0 0 0 0 0
## 1717 0 0 0 0 0 0 0 0 0
## 1972 0 0 0 0 0 0 0 1 0
## 3882 0 1 0 0 0 0 0 0 0
## 193 0 0 0 0 0 0 0 0 1
## 3703 0 0 0 0 0 0 0 0 0
## 3349 0 0 0 0 0 0 0 0 0
## 847 0 0 0 0 0 0 0 0 0
## 1291 0 0 0 0 0 0 0 0 0
## 2542 0 0 0 0 0 0 0 0 1
## 3338 0 0 0 0 0 0 0 0 0
## 4855 0 0 0 0 0 0 0 0 0
## 3751 0 0 0 0 0 0 0 0 0
## 2797 0 0 1 0 0 0 0 0 0
## 4195 0 0 0 0 0 0 0 0 0
## 936 0 0 0 0 0 0 0 0 0
## 1339 0 0 0 0 0 0 0 0 0
## 4086 0 0 0 0 0 0 0 0 0
## 3419 0 0 0 0 0 0 0 0 0
## 1187 0 0 0 0 0 0 0 0 0
## 212 0 0 0 0 0 0 0 0 0
## 693 0 0 0 0 0 0 0 0 0
## 1067 0 0 0 0 0 0 0 1 0
## 2362 0 0 0 0 0 0 0 0 0
## 973 0 0 0 0 0 0 0 0 0
## 3543 0 0 0 0 0 0 0 0 0
## 39 0 0 0 0 0 0 0 0 0
## 1849 1 0 0 0 0 0 0 0 0
## 2532 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 0
## 2862 0 0 0 0 0 0 0 0 0
## 777 0 0 0 0 0 0 0 0 0
## 1766 0 0 0 0 0 0 0 0 0
## 3175 0 0 0 0 0 0 0 0 0
## 3814 0 0 0 0 0 0 0 0 0
## 2771 0 0 0 0 0 0 0 0 0
## 1149 0 0 0 0 0 0 0 0 0
## 443 0 0 0 0 0 0 0 0 0
## 421 0 0 0 0 0 0 0 0 0
## 1499 0 0 0 0 0 0 0 0 0
## 3278 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0
## 1024 0 0 0 0 0 0 0 0 0
## 4579 0 0 0 0 0 0 0 0 0
## 4542 0 0 0 0 0 0 0 0 0
## 3601 0 0 0 0 0 0 0 0 0
## 1634 0 0 0 0 0 0 0 0 0
## 2526 0 0 0 0 0 0 0 0 0
## 3647 0 0 0 0 0 0 0 0 0
## 3035 0 0 0 0 0 0 0 0 0
## 3069 0 0 0 0 0 0 0 0 0
## 1064 0 0 1 0 0 0 0 0 0
## 1061 0 0 0 0 0 0 0 0 0
## 1905 0 0 0 0 0 0 0 0 0
## 4615 0 0 0 0 0 0 0 0 0
## 4977 0 0 0 0 0 0 0 0 1
## 3621 0 0 0 0 0 0 0 0 0
## 4989 0 0 0 0 0 0 0 0 0
## 2621 0 0 0 0 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0
## 2978 0 0 0 0 0 0 0 0 0
## 4092 0 0 0 0 0 0 0 0 0
## 3674 0 0 0 0 0 0 0 0 0
## 2213 0 0 0 0 0 0 0 0 0
## 2618 0 0 0 0 0 0 0 0 0
## 2626 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0
## 1737 0 0 0 0 0 0 0 0 0
## 2989 0 0 0 0 0 0 0 0 0
## 4047 0 0 0 0 0 0 0 0 0
## 1741 0 0 0 0 0 0 0 0 0
## 2004 0 0 0 0 0 0 0 0 0
## 2798 0 0 0 0 0 0 0 0 0
## 2876 0 0 0 0 0 0 0 0 0
## 3510 0 0 0 0 0 0 0 0 0
## 1926 0 0 0 0 0 0 0 0 0
## 4481 0 0 0 0 0 0 0 0 0
## 4691 0 0 0 0 0 0 0 0 0
## 1138 0 0 0 0 0 0 0 0 0
## 3530 0 0 0 0 0 0 0 0 0
## 4401 0 0 0 0 0 0 0 0 0
## 2939 0 0 0 0 0 0 0 0 0
## 3075 0 0 0 0 0 0 0 0 0
## 4563 0 0 0 0 0 0 0 0 0
## 4139 0 0 0 0 0 0 0 0 1
## 2821 0 0 0 0 0 1 0 0 0
## 3996 0 0 0 0 0 0 0 0 1
## 554 0 0 0 0 0 0 0 0 0
## 3718 1 0 0 0 0 0 0 0 0
## 3032 0 0 0 0 0 0 0 0 1
## 722 0 0 0 0 0 0 0 0 0
## 391 0 0 0 0 0 0 0 0 0
## 2255 1 0 0 0 0 0 0 0 0
## 3786 0 0 0 0 0 0 0 0 0
## 3563 0 0 0 0 0 0 0 0 0
## 3968 0 0 0 0 0 0 0 0 0
## 826 0 0 0 0 0 0 0 0 0
## 4585 0 0 0 0 0 0 0 0 0
## 1425 0 0 0 0 0 0 0 0 0
## 724 0 0 0 0 0 0 0 0 0
## 3489 0 0 0 0 0 0 0 0 1
## 1572 0 0 0 0 0 1 0 0 0
## 3776 0 0 0 0 0 0 0 0 0
## 1912 0 0 0 0 0 0 0 0 0
## 3289 0 0 0 0 0 0 0 0 0
## 3759 0 0 0 0 1 0 0 0 0
## 911 0 0 0 0 0 0 0 0 0
## 141 0 0 0 0 0 0 0 0 0
## 658 0 0 0 0 0 0 0 0 0
## 3293 0 0 0 0 0 0 1 0 0
## 4525 0 0 0 0 0 0 0 0 0
## 2664 0 0 0 0 0 0 0 0 0
## 2912 0 0 0 0 0 0 0 0 0
## 953 0 0 0 0 0 0 0 0 0
## 2589 0 0 0 0 0 0 0 0 0
## 869 0 0 0 0 0 0 0 0 0
## 2185 0 0 0 0 1 0 0 0 0
## 1533 0 0 0 0 0 0 0 0 0
## 562 0 0 0 0 0 0 0 0 0
## 900 0 0 0 0 0 0 0 0 0
## 3525 0 0 0 0 0 0 0 0 0
## 1989 0 0 0 0 0 0 0 0 0
## 2000 0 0 0 0 0 0 0 0 0
## 2319 0 0 0 0 0 0 0 0 0
## 2064 0 0 0 0 0 0 0 0 0
## 659 0 0 0 0 0 0 0 0 0
## 3979 0 1 0 0 0 0 0 0 0
## 2857 0 0 0 1 0 0 0 0 0
## 3831 0 0 0 0 0 0 0 0 0
## 3708 0 0 0 0 0 0 0 0 0
## 4426 0 0 0 0 0 0 0 0 0
## 4158 0 0 0 0 0 0 0 0 0
## 1528 0 0 0 1 0 0 0 0 0
## 1249 0 0 0 0 0 0 0 0 0
## 3575 0 0 0 0 0 0 0 0 0
## 3599 0 0 0 0 0 0 0 0 0
## 4419 0 0 0 0 0 0 0 0 0
## 3818 0 0 0 0 1 0 0 0 0
## 642 0 0 0 0 0 0 0 0 0
## 1385 0 0 0 0 0 0 0 0 0
## 937 0 0 0 0 0 0 0 0 0
## 3771 0 0 0 0 0 0 0 0 0
## 620 0 0 0 0 0 0 0 1 0
## 621 0 0 0 0 0 0 0 1 0
## 348 0 0 0 0 0 0 0 0 0
## 256 1 0 0 0 0 0 0 0 0
## 2556 0 0 0 0 0 0 0 0 0
## 540 0 0 0 0 0 0 0 0 0
## 3569 0 0 0 0 0 0 0 1 0
## 3512 0 0 0 0 0 0 0 0 0
## 4249 0 0 0 0 0 0 0 0 0
## 2482 0 0 0 0 0 0 0 0 0
## 4088 0 0 0 0 0 0 0 0 0
## 2125 0 0 0 0 0 0 0 1 0
## 758 0 0 0 0 0 0 0 0 0
## 2121 0 0 0 0 0 0 0 0 0
## 4640 0 0 0 0 0 0 0 0 0
## 2323 0 1 0 0 0 0 0 0 0
## 1210 0 0 0 0 0 0 0 0 0
## 1245 0 0 0 0 0 0 0 0 0
## 2597 0 0 0 0 0 0 0 0 0
## 3113 0 0 0 0 0 0 0 0 0
## 1611 0 0 0 0 0 0 0 0 0
## 292 0 0 0 0 0 0 0 0 0
## 2160 0 0 0 0 0 0 0 1 0
## 4014 0 0 0 0 0 0 0 0 0
## 2750 0 0 0 0 0 0 0 0 0
## 1691 0 0 0 0 0 0 0 0 0
## 4886 0 0 0 0 0 0 0 0 0
## 4269 0 0 0 0 0 0 0 0 0
## 2343 0 0 0 0 0 0 0 0 0
## 821 0 0 0 0 0 0 0 0 0
## 2595 0 0 0 0 0 0 0 0 0
## 4593 0 0 0 0 0 0 0 0 0
## 4911 0 0 0 0 0 0 0 0 0
## 3918 0 0 0 0 0 0 0 0 0
## 1466 0 0 0 0 0 0 0 0 0
## 886 0 0 0 0 0 0 0 0 0
## 231 0 0 0 0 0 0 0 0 0
## 1173 0 0 0 0 0 0 0 0 0
## 1675 0 0 0 0 0 0 0 0 0
## 759 0 0 0 0 0 0 0 0 0
## 1450 0 0 0 0 0 0 0 0 0
## 84 0 1 0 0 0 0 0 0 0
## 4750 0 0 0 0 0 0 0 0 0
## 3833 0 0 0 0 0 0 0 0 0
## 413 0 0 0 0 0 0 0 0 0
## 4144 0 0 0 0 0 0 0 0 0
## 2641 0 0 0 0 0 0 0 0 1
## 2007 0 0 0 0 0 0 0 0 0
## 322 0 0 0 0 0 0 0 0 0
## 2672 0 0 0 0 0 0 0 0 0
## 337 0 0 0 0 0 0 0 0 0
## 1006 0 0 0 0 0 0 0 0 0
## 2614 0 0 0 0 0 0 0 0 0
## 2292 0 0 0 0 0 0 0 0 0
## 4769 0 0 0 0 0 0 0 0 0
## 711 0 0 0 0 0 0 0 0 0
## 2373 0 0 0 0 0 0 0 0 0
## 4469 0 0 0 0 1 0 0 0 0
## stateLA stateMA stateMD stateME stateMI stateMN stateMO stateMS stateMT
## 4575 0 0 0 0 0 0 0 0 0
## 4685 0 0 0 0 0 0 1 0 0
## 1431 0 0 0 0 0 0 0 0 0
## 4150 0 0 0 0 0 0 0 0 0
## 3207 0 0 0 0 0 0 0 0 0
## 2593 0 0 0 0 0 0 0 0 0
## 3679 0 0 0 0 0 0 0 0 0
## 673 0 0 0 0 0 0 0 0 0
## 3280 0 0 0 0 0 0 0 0 0
## 3519 0 0 0 0 0 0 0 0 0
## 2285 0 0 0 0 0 0 0 0 0
## 3588 0 0 0 0 0 0 0 0 0
## 4663 0 0 0 0 0 0 0 0 0
## 1274 0 0 0 0 0 0 0 0 1
## 2305 0 0 0 0 0 0 0 0 0
## 4686 0 0 0 0 0 0 0 0 0
## 4876 0 0 0 0 0 0 0 0 0
## 586 0 0 0 0 0 0 0 0 0
## 2367 0 0 0 0 0 0 0 1 0
## 2792 0 0 0 0 0 0 0 0 0
## 4503 0 0 0 0 0 0 0 0 0
## 691 0 0 0 0 0 0 0 0 0
## 4923 0 0 0 0 0 1 0 0 0
## 4712 0 0 0 0 0 0 0 0 0
## 411 0 0 0 0 0 0 0 0 0
## 2559 0 0 0 0 0 0 0 0 0
## 1941 0 0 0 0 0 0 0 0 0
## 4505 0 0 0 0 0 0 0 0 0
## 2223 0 0 0 0 0 0 0 0 0
## 4156 0 0 0 0 0 0 0 0 0
## 3666 0 0 0 0 0 0 0 0 0
## 4031 0 0 0 0 0 0 0 0 0
## 1929 0 0 0 0 0 0 0 0 0
## 3404 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 0 0 0
## 4136 0 0 0 0 0 0 0 0 0
## 37 0 0 0 0 0 0 0 0 0
## 1031 0 0 0 0 0 0 0 0 0
## 4499 0 0 0 0 0 0 0 0 0
## 3036 0 0 0 1 0 0 0 0 0
## 1883 0 0 0 0 0 0 0 0 0
## 2161 0 0 0 0 0 0 0 0 0
## 186 0 0 0 0 0 0 0 0 0
## 4826 0 0 0 0 0 0 0 0 0
## 2140 0 0 0 0 0 0 0 0 0
## 4745 0 0 0 0 0 0 0 0 0
## 4398 0 0 0 0 0 0 0 0 0
## 3170 0 0 0 0 0 0 0 0 0
## 4809 0 0 0 0 0 0 0 0 0
## 3064 0 0 0 0 0 0 0 0 0
## 1651 0 0 0 0 0 0 0 0 0
## 1717 0 0 0 1 0 0 0 0 0
## 1972 0 0 0 0 0 0 0 0 0
## 3882 0 0 0 0 0 0 0 0 0
## 193 0 0 0 0 0 0 0 0 0
## 3703 0 0 0 0 0 0 0 0 0
## 3349 0 0 0 0 0 1 0 0 0
## 847 0 0 0 0 0 0 0 0 0
## 1291 0 0 0 0 0 0 0 0 0
## 2542 0 0 0 0 0 0 0 0 0
## 3338 0 0 0 0 0 0 0 0 0
## 4855 0 0 0 0 0 0 0 0 0
## 3751 0 0 0 0 0 0 0 0 0
## 2797 0 0 0 0 0 0 0 0 0
## 4195 0 0 0 0 0 0 1 0 0
## 936 0 0 1 0 0 0 0 0 0
## 1339 0 0 0 0 0 0 0 0 0
## 4086 0 0 0 0 0 0 0 0 0
## 3419 0 0 0 0 0 0 0 0 0
## 1187 0 0 0 0 0 0 0 0 0
## 212 0 0 0 0 0 0 0 0 0
## 693 0 0 0 0 0 0 0 0 0
## 1067 0 0 0 0 0 0 0 0 0
## 2362 0 1 0 0 0 0 0 0 0
## 973 1 0 0 0 0 0 0 0 0
## 3543 0 0 0 0 0 1 0 0 0
## 39 0 0 0 0 0 0 0 0 0
## 1849 0 0 0 0 0 0 0 0 0
## 2532 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 1 0 0
## 2862 0 0 0 0 0 0 0 0 0
## 777 0 0 0 0 0 0 0 0 0
## 1766 0 0 0 0 0 0 0 0 0
## 3175 0 0 0 0 0 0 0 0 0
## 3814 0 0 0 0 0 0 0 0 0
## 2771 0 0 0 0 0 0 0 0 0
## 1149 0 0 0 0 0 0 0 0 0
## 443 0 0 0 0 0 0 0 0 0
## 421 0 0 0 0 0 0 0 0 0
## 1499 0 0 0 0 0 0 0 0 0
## 3278 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 0
## 1024 0 0 0 0 0 0 0 0 0
## 4579 0 0 0 0 0 0 0 0 0
## 4542 1 0 0 0 0 0 0 0 0
## 3601 0 0 0 0 0 0 1 0 0
## 1634 0 0 0 0 0 0 0 0 0
## 2526 0 0 0 0 0 0 0 0 0
## 3647 0 0 0 0 0 1 0 0 0
## 3035 0 0 0 0 0 0 0 0 0
## 3069 0 0 0 0 0 0 0 0 0
## 1064 0 0 0 0 0 0 0 0 0
## 1061 0 0 0 0 0 0 0 0 0
## 1905 0 0 0 0 0 0 0 0 0
## 4615 0 0 0 0 0 0 0 0 0
## 4977 0 0 0 0 0 0 0 0 0
## 3621 0 0 0 0 0 0 0 0 0
## 4989 0 0 0 0 0 0 0 0 0
## 2621 0 0 0 0 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0
## 2978 0 0 0 0 0 0 0 0 1
## 4092 0 0 0 0 0 0 0 0 1
## 3674 0 0 0 0 0 0 0 0 0
## 2213 0 0 0 0 0 0 0 0 0
## 2618 0 0 0 0 0 0 0 0 0
## 2626 0 0 0 0 0 0 0 0 0
## 7 0 1 0 0 0 0 0 0 0
## 1737 0 0 0 0 0 0 0 0 0
## 2989 0 0 0 0 0 0 0 0 0
## 4047 0 0 0 0 0 0 0 0 1
## 1741 0 0 0 0 0 0 0 0 0
## 2004 0 0 0 0 0 0 0 0 0
## 2798 0 0 0 0 0 0 0 0 0
## 2876 1 0 0 0 0 0 0 0 0
## 3510 0 0 0 0 0 0 0 0 0
## 1926 0 0 0 0 0 0 0 0 0
## 4481 0 0 0 0 0 0 0 0 0
## 4691 0 0 0 0 0 1 0 0 0
## 1138 0 0 0 0 0 0 0 0 0
## 3530 0 0 0 0 0 0 0 0 0
## 4401 0 0 0 0 0 0 0 0 0
## 2939 0 0 0 0 0 0 0 0 0
## 3075 0 0 0 0 0 0 0 0 0
## 4563 0 0 0 0 0 0 0 0 0
## 4139 0 0 0 0 0 0 0 0 0
## 2821 0 0 0 0 0 0 0 0 0
## 3996 0 0 0 0 0 0 0 0 0
## 554 0 0 0 0 0 0 0 0 0
## 3718 0 0 0 0 0 0 0 0 0
## 3032 0 0 0 0 0 0 0 0 0
## 722 0 0 0 0 0 0 0 0 0
## 391 0 0 0 0 0 0 0 0 0
## 2255 0 0 0 0 0 0 0 0 0
## 3786 0 0 0 0 0 0 0 0 0
## 3563 0 0 0 0 0 0 0 0 0
## 3968 0 0 0 0 0 0 0 0 0
## 826 0 0 0 0 0 0 1 0 0
## 4585 0 0 0 0 0 0 0 0 0
## 1425 0 0 0 0 0 0 0 0 0
## 724 0 0 0 0 1 0 0 0 0
## 3489 0 0 0 0 0 0 0 0 0
## 1572 0 0 0 0 0 0 0 0 0
## 3776 0 0 0 0 0 0 0 0 0
## 1912 0 0 0 0 0 0 0 0 0
## 3289 0 0 0 0 0 0 0 0 0
## 3759 0 0 0 0 0 0 0 0 0
## 911 0 0 0 0 0 0 0 0 0
## 141 0 0 0 0 0 0 0 0 0
## 658 0 0 0 0 0 0 0 0 0
## 3293 0 0 0 0 0 0 0 0 0
## 4525 0 0 0 0 0 0 0 0 0
## 2664 0 0 0 0 0 0 0 0 0
## 2912 0 0 0 0 0 0 0 0 0
## 953 0 0 0 0 0 0 0 0 0
## 2589 0 0 0 0 1 0 0 0 0
## 869 0 0 1 0 0 0 0 0 0
## 2185 0 0 0 0 0 0 0 0 0
## 1533 0 0 0 0 0 0 0 0 0
## 562 0 0 0 0 0 0 0 0 0
## 900 0 0 0 0 0 0 0 0 0
## 3525 0 0 0 0 0 0 0 0 0
## 1989 0 0 0 0 0 0 0 0 0
## 2000 0 0 0 0 0 0 0 0 0
## 2319 0 0 0 0 0 0 0 0 0
## 2064 0 0 1 0 0 0 0 0 0
## 659 0 0 0 0 0 0 0 0 0
## 3979 0 0 0 0 0 0 0 0 0
## 2857 0 0 0 0 0 0 0 0 0
## 3831 0 0 0 0 0 0 0 0 0
## 3708 0 0 0 0 0 0 0 0 0
## 4426 0 0 0 0 0 1 0 0 0
## 4158 0 0 0 0 0 0 0 0 0
## 1528 0 0 0 0 0 0 0 0 0
## 1249 0 0 0 0 0 0 0 0 0
## 3575 0 0 0 0 0 0 0 0 0
## 3599 0 0 0 0 0 0 0 0 0
## 4419 0 0 0 0 0 0 0 0 0
## 3818 0 0 0 0 0 0 0 0 0
## 642 0 0 0 0 0 0 0 0 0
## 1385 0 0 0 0 0 0 0 0 0
## 937 0 0 0 0 0 0 0 0 0
## 3771 0 0 0 0 0 0 0 0 0
## 620 0 0 0 0 0 0 0 0 0
## 621 0 0 0 0 0 0 0 0 0
## 348 0 0 0 0 0 0 0 0 0
## 256 0 0 0 0 0 0 0 0 0
## 2556 0 0 0 0 0 0 0 0 0
## 540 0 0 0 0 0 0 0 0 0
## 3569 0 0 0 0 0 0 0 0 0
## 3512 0 0 0 0 0 0 0 0 0
## 4249 0 0 0 0 0 0 1 0 0
## 2482 0 0 0 0 0 0 0 0 0
## 4088 0 0 0 0 0 0 0 0 0
## 2125 0 0 0 0 0 0 0 0 0
## 758 0 0 0 0 0 0 0 0 0
## 2121 0 0 0 0 0 0 0 0 0
## 4640 0 0 0 0 0 0 0 0 0
## 2323 0 0 0 0 0 0 0 0 0
## 1210 0 0 0 0 0 0 0 0 0
## 1245 0 0 0 0 0 0 0 0 0
## 2597 0 0 0 0 0 0 0 0 0
## 3113 0 0 0 0 0 0 0 0 0
## 1611 0 0 0 0 0 0 0 0 0
## 292 0 0 0 0 0 0 0 0 0
## 2160 0 0 0 0 0 0 0 0 0
## 4014 0 0 0 0 0 0 0 0 0
## 2750 0 0 0 0 0 0 0 0 0
## 1691 0 0 0 0 0 0 0 0 0
## 4886 0 0 0 1 0 0 0 0 0
## 4269 0 0 0 0 0 0 0 0 0
## 2343 0 0 0 0 1 0 0 0 0
## 821 0 0 0 0 0 0 0 0 0
## 2595 0 0 0 0 0 0 0 0 0
## 4593 0 0 0 0 0 0 0 0 0
## 4911 0 0 0 0 0 0 0 0 0
## 3918 0 0 0 0 0 0 0 0 0
## 1466 0 0 0 0 0 0 0 0 0
## 886 0 0 0 1 0 0 0 0 0
## 231 0 0 1 0 0 0 0 0 0
## 1173 0 1 0 0 0 0 0 0 0
## 1675 0 0 0 0 0 0 0 0 0
## 759 1 0 0 0 0 0 0 0 0
## 1450 0 0 0 0 0 0 0 0 0
## 84 0 0 0 0 0 0 0 0 0
## 4750 0 0 0 0 0 0 0 0 0
## 3833 0 0 0 0 0 0 0 0 0
## 413 0 0 0 0 0 0 0 0 0
## 4144 0 0 0 0 0 0 0 0 0
## 2641 0 0 0 0 0 0 0 0 0
## 2007 0 0 0 0 0 0 0 0 0
## 322 0 0 0 0 0 0 0 0 0
## 2672 0 0 0 0 0 0 0 0 0
## 337 0 0 0 0 0 0 0 0 0
## 1006 0 0 0 0 0 0 0 0 0
## 2614 0 0 0 0 1 0 0 0 0
## 2292 0 0 0 0 0 0 0 0 0
## 4769 0 0 0 1 0 0 0 0 0
## 711 0 0 0 0 0 0 0 0 0
## 2373 0 0 0 0 0 0 0 0 0
## 4469 0 0 0 0 0 0 0 0 0
## stateNC stateND stateNE stateNH stateNJ stateNM stateNV stateNY stateOH
## 4575 0 0 0 0 0 0 0 0 0
## 4685 0 0 0 0 0 0 0 0 0
## 1431 0 0 0 0 0 0 0 0 0
## 4150 0 0 0 0 0 0 0 0 0
## 3207 0 0 0 0 0 0 0 0 0
## 2593 0 0 0 0 0 0 0 0 0
## 3679 0 0 0 0 0 0 1 0 0
## 673 0 0 0 0 0 0 0 0 0
## 3280 0 0 0 0 1 0 0 0 0
## 3519 0 0 0 0 0 0 0 0 0
## 2285 0 0 0 0 0 0 0 0 0
## 3588 0 0 0 0 0 0 0 0 0
## 4663 0 0 0 0 0 0 0 0 0
## 1274 0 0 0 0 0 0 0 0 0
## 2305 0 0 0 0 0 0 0 0 0
## 4686 0 0 0 0 0 0 0 0 0
## 4876 0 0 0 0 0 0 0 0 0
## 586 0 0 0 0 0 0 0 0 0
## 2367 0 0 0 0 0 0 0 0 0
## 2792 0 0 0 0 0 0 0 0 0
## 4503 0 0 0 0 0 0 0 0 0
## 691 0 0 0 0 0 0 0 0 0
## 4923 0 0 0 0 0 0 0 0 0
## 4712 0 0 0 0 0 0 0 0 0
## 411 0 0 0 0 0 0 0 0 0
## 2559 0 0 0 0 0 0 0 0 0
## 1941 0 0 0 0 0 0 0 0 0
## 4505 0 0 0 0 0 0 0 0 0
## 2223 0 0 0 0 0 0 0 0 0
## 4156 0 0 0 0 0 0 0 0 0
## 3666 0 0 0 0 0 0 0 0 0
## 4031 0 0 0 0 0 0 0 0 0
## 1929 0 0 0 0 0 0 0 0 0
## 3404 0 0 0 0 0 0 0 0 0
## 20 0 0 0 0 0 0 0 0 0
## 4136 0 0 0 0 0 0 0 0 0
## 37 0 0 0 0 0 0 0 0 0
## 1031 0 0 0 0 0 0 0 0 0
## 4499 0 0 0 0 0 0 0 1 0
## 3036 0 0 0 0 0 0 0 0 0
## 1883 0 0 0 0 0 0 0 0 0
## 2161 0 0 0 0 0 0 1 0 0
## 186 0 0 0 0 0 1 0 0 0
## 4826 0 0 0 0 0 0 0 0 0
## 2140 0 0 0 0 0 0 0 0 0
## 4745 0 0 0 0 0 0 0 0 0
## 4398 0 0 0 0 0 0 0 0 1
## 3170 0 0 0 0 0 0 0 0 0
## 4809 0 0 0 0 0 0 0 0 0
## 3064 0 0 0 0 0 0 0 0 0
## 1651 0 0 0 0 0 0 0 0 0
## 1717 0 0 0 0 0 0 0 0 0
## 1972 0 0 0 0 0 0 0 0 0
## 3882 0 0 0 0 0 0 0 0 0
## 193 0 0 0 0 0 0 0 0 0
## 3703 0 0 0 0 0 0 0 0 0
## 3349 0 0 0 0 0 0 0 0 0
## 847 0 0 1 0 0 0 0 0 0
## 1291 0 0 0 0 0 0 0 0 0
## 2542 0 0 0 0 0 0 0 0 0
## 3338 0 0 0 0 0 0 0 0 0
## 4855 0 0 0 0 0 0 0 0 0
## 3751 0 1 0 0 0 0 0 0 0
## 2797 0 0 0 0 0 0 0 0 0
## 4195 0 0 0 0 0 0 0 0 0
## 936 0 0 0 0 0 0 0 0 0
## 1339 0 0 0 0 0 0 0 0 0
## 4086 0 0 0 0 0 0 0 0 0
## 3419 0 0 0 0 0 0 0 0 0
## 1187 0 0 0 0 0 0 0 0 0
## 212 0 0 0 0 0 0 0 0 0
## 693 0 0 1 0 0 0 0 0 0
## 1067 0 0 0 0 0 0 0 0 0
## 2362 0 0 0 0 0 0 0 0 0
## 973 0 0 0 0 0 0 0 0 0
## 3543 0 0 0 0 0 0 0 0 0
## 39 0 0 0 0 0 0 0 0 0
## 1849 0 0 0 0 0 0 0 0 0
## 2532 0 0 0 0 0 0 0 0 0
## 8 0 0 0 0 0 0 0 0 0
## 2862 0 0 0 0 0 0 0 0 0
## 777 0 0 0 0 0 0 0 0 0
## 1766 0 0 0 0 0 1 0 0 0
## 3175 0 0 0 0 0 0 0 0 0
## 3814 0 0 0 0 0 0 0 0 0
## 2771 0 0 0 1 0 0 0 0 0
## 1149 0 0 0 0 0 0 0 1 0
## 443 0 0 0 0 0 0 0 0 0
## 421 0 0 0 0 0 1 0 0 0
## 1499 0 0 0 1 0 0 0 0 0
## 3278 0 0 0 0 0 0 0 0 0
## 2 0 0 0 0 0 0 0 0 1
## 1024 0 0 0 0 0 0 0 0 0
## 4579 0 0 0 0 0 0 0 0 0
## 4542 0 0 0 0 0 0 0 0 0
## 3601 0 0 0 0 0 0 0 0 0
## 1634 0 0 0 0 0 0 0 1 0
## 2526 0 0 0 0 0 0 0 0 0
## 3647 0 0 0 0 0 0 0 0 0
## 3035 0 0 0 0 1 0 0 0 0
## 3069 0 0 0 0 0 0 0 0 0
## 1064 0 0 0 0 0 0 0 0 0
## 1061 0 0 0 0 0 0 0 0 1
## 1905 0 0 0 0 0 0 0 0 0
## 4615 0 0 0 0 0 0 0 0 0
## 4977 0 0 0 0 0 0 0 0 0
## 3621 0 0 0 0 0 0 0 0 1
## 4989 0 0 0 0 0 0 0 0 0
## 2621 0 0 0 0 0 0 0 0 0
## 12 0 0 0 0 0 0 0 0 0
## 2978 0 0 0 0 0 0 0 0 0
## 4092 0 0 0 0 0 0 0 0 0
## 3674 0 0 0 1 0 0 0 0 0
## 2213 0 0 0 0 0 0 0 0 0
## 2618 0 0 0 0 0 0 0 0 0
## 2626 0 0 0 0 0 0 0 0 0
## 7 0 0 0 0 0 0 0 0 0
## 1737 0 0 0 0 0 0 1 0 0
## 2989 0 0 0 0 0 0 0 0 0
## 4047 0 0 0 0 0 0 0 0 0
## 1741 0 1 0 0 0 0 0 0 0
## 2004 0 0 0 0 0 0 0 0 0
## 2798 1 0 0 0 0 0 0 0 0
## 2876 0 0 0 0 0 0 0 0 0
## 3510 0 0 0 0 0 0 0 0 0
## 1926 0 0 0 0 1 0 0 0 0
## 4481 0 0 0 0 0 0 0 1 0
## 4691 0 0 0 0 0 0 0 0 0
## 1138 0 0 0 0 0 0 0 0 0
## 3530 0 0 0 0 0 0 0 0 0
## 4401 0 0 0 0 0 0 0 0 0
## 2939 0 0 1 0 0 0 0 0 0
## 3075 0 0 0 0 0 0 0 0 0
## 4563 0 0 0 0 0 1 0 0 0
## 4139 0 0 0 0 0 0 0 0 0
## 2821 0 0 0 0 0 0 0 0 0
## 3996 0 0 0 0 0 0 0 0 0
## 554 0 0 0 0 0 0 0 0 0
## 3718 0 0 0 0 0 0 0 0 0
## 3032 0 0 0 0 0 0 0 0 0
## 722 0 0 0 0 0 0 0 0 0
## 391 0 0 0 0 0 0 0 0 0
## 2255 0 0 0 0 0 0 0 0 0
## 3786 0 0 0 0 1 0 0 0 0
## 3563 0 0 0 0 0 0 0 0 0
## 3968 0 0 0 0 0 0 0 0 0
## 826 0 0 0 0 0 0 0 0 0
## 4585 0 0 0 0 0 0 0 0 0
## 1425 0 0 0 0 0 0 0 0 0
## 724 0 0 0 0 0 0 0 0 0
## 3489 0 0 0 0 0 0 0 0 0
## 1572 0 0 0 0 0 0 0 0 0
## 3776 0 0 0 0 1 0 0 0 0
## 1912 0 0 0 0 0 0 0 0 0
## 3289 0 0 0 0 0 0 0 0 0
## 3759 0 0 0 0 0 0 0 0 0
## 911 0 0 0 0 1 0 0 0 0
## 141 0 0 0 0 0 0 0 0 0
## 658 0 0 0 0 0 0 0 0 0
## 3293 0 0 0 0 0 0 0 0 0
## 4525 0 0 0 0 0 0 0 0 0
## 2664 0 0 0 0 0 0 0 0 0
## 2912 0 0 0 0 0 1 0 0 0
## 953 0 0 0 0 0 0 0 0 0
## 2589 0 0 0 0 0 0 0 0 0
## 869 0 0 0 0 0 0 0 0 0
## 2185 0 0 0 0 0 0 0 0 0
## 1533 0 0 0 0 0 0 0 0 0
## 562 0 0 0 0 0 0 0 0 0
## 900 0 0 0 0 0 0 0 0 0
## 3525 0 0 0 0 0 0 0 0 0
## 1989 0 0 0 0 0 0 0 0 0
## 2000 0 0 0 0 0 0 0 0 0
## 2319 0 0 0 0 0 0 0 0 0
## 2064 0 0 0 0 0 0 0 0 0
## 659 0 0 0 0 0 0 0 0 0
## 3979 0 0 0 0 0 0 0 0 0
## 2857 0 0 0 0 0 0 0 0 0
## 3831 0 0 0 0 0 0 0 0 0
## 3708 0 0 0 1 0 0 0 0 0
## 4426 0 0 0 0 0 0 0 0 0
## 4158 0 0 0 0 0 0 0 0 0
## 1528 0 0 0 0 0 0 0 0 0
## 1249 0 0 0 0 0 0 0 0 0
## 3575 0 0 0 0 0 0 0 0 0
## 3599 0 0 0 0 1 0 0 0 0
## 4419 0 0 1 0 0 0 0 0 0
## 3818 0 0 0 0 0 0 0 0 0
## 642 0 0 0 0 0 0 0 0 0
## 1385 0 0 0 0 0 0 0 0 0
## 937 0 0 1 0 0 0 0 0 0
## 3771 0 0 0 0 0 0 0 1 0
## 620 0 0 0 0 0 0 0 0 0
## 621 0 0 0 0 0 0 0 0 0
## 348 0 0 0 0 0 0 0 0 0
## 256 0 0 0 0 0 0 0 0 0
## 2556 1 0 0 0 0 0 0 0 0
## 540 0 0 0 0 0 0 0 1 0
## 3569 0 0 0 0 0 0 0 0 0
## 3512 0 0 0 0 0 0 0 0 0
## 4249 0 0 0 0 0 0 0 0 0
## 2482 0 0 0 0 0 0 0 0 0
## 4088 0 0 0 0 0 0 0 0 1
## 2125 0 0 0 0 0 0 0 0 0
## 758 0 0 0 0 0 0 0 0 0
## 2121 0 0 0 0 0 0 0 0 0
## 4640 0 0 0 0 0 0 0 0 0
## 2323 0 0 0 0 0 0 0 0 0
## 1210 0 0 0 0 0 0 0 0 0
## 1245 1 0 0 0 0 0 0 0 0
## 2597 0 0 0 0 0 0 0 0 0
## 3113 0 0 0 0 0 0 0 0 0
## 1611 0 0 0 0 0 0 0 0 0
## 292 0 0 1 0 0 0 0 0 0
## 2160 0 0 0 0 0 0 0 0 0
## 4014 0 0 0 0 0 0 0 0 0
## 2750 0 0 0 0 0 0 0 0 0
## 1691 0 0 0 0 0 0 0 0 0
## 4886 0 0 0 0 0 0 0 0 0
## 4269 0 0 0 0 0 0 0 0 0
## 2343 0 0 0 0 0 0 0 0 0
## 821 0 0 0 0 0 0 0 0 0
## 2595 0 0 0 0 0 0 0 0 1
## 4593 0 0 0 0 0 0 0 0 0
## 4911 0 0 0 0 0 0 1 0 0
## 3918 0 0 0 0 0 0 0 0 0
## 1466 0 0 0 0 0 0 0 0 0
## 886 0 0 0 0 0 0 0 0 0
## 231 0 0 0 0 0 0 0 0 0
## 1173 0 0 0 0 0 0 0 0 0
## 1675 0 0 0 0 0 0 1 0 0
## 759 0 0 0 0 0 0 0 0 0
## 1450 0 0 0 0 0 0 0 0 0
## 84 0 0 0 0 0 0 0 0 0
## [ remaining rows and columns of the churn_x print-out omitted for brevity:
##   250 observations of 70 dummy-coded and numeric predictors.
##   See str(churn_x) below for the full column listing. ]
str(churn_x)
## 'data.frame': 250 obs. of 70 variables:
## $ stateAK : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateAL : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateAR : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateAZ : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateCA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateCO : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateCT : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateDC : int 0 0 0 0 0 0 0 0 0 1 ...
## $ stateDE : int 0 0 0 1 0 0 0 0 0 0 ...
## $ stateFL : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateGA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateHI : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateIA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateID : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateIL : int 0 0 0 0 0 0 0 1 0 0 ...
## $ stateIN : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateKS : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateKY : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateLA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMD : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateME : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMI : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMN : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMO : int 0 1 0 0 0 0 0 0 0 0 ...
## $ stateMS : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateMT : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateNC : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateND : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateNE : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateNH : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateNJ : int 0 0 0 0 0 0 0 0 1 0 ...
## $ stateNM : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateNV : int 0 0 0 0 0 0 1 0 0 0 ...
## $ stateNY : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateOH : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateOK : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateOR : int 0 0 0 0 0 0 0 0 0 0 ...
## $ statePA : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateRI : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateSC : int 1 0 0 0 0 0 0 0 0 0 ...
## $ stateSD : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateTN : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateTX : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateUT : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateVA : int 0 0 0 0 0 1 0 0 0 0 ...
## $ stateVT : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateWA : int 0 0 0 0 1 0 0 0 0 0 ...
## $ stateWI : int 0 0 0 0 0 0 0 0 0 0 ...
## $ stateWV : int 0 0 1 0 0 0 0 0 0 0 ...
## $ stateWY : int 0 0 0 0 0 0 0 0 0 0 ...
## $ account_length : int 137 83 48 67 143 163 100 151 139 17 ...
## $ area_codearea_code_415 : int 0 1 1 1 0 1 1 0 1 0 ...
## $ area_codearea_code_510 : int 1 0 0 0 1 0 0 0 0 1 ...
## $ international_planyes : int 0 0 0 0 0 0 0 0 0 0 ...
## $ voice_mail_planyes : int 0 0 1 0 0 0 1 0 1 1 ...
## $ number_vmail_messages : int 0 0 34 0 0 0 39 0 43 30 ...
## $ total_day_minutes : num 110 197 198 164 133 ...
## $ total_day_calls : int 112 117 70 79 107 100 74 106 85 101 ...
## $ total_day_charge : num 18.7 33.4 33.7 28 22.7 ...
## $ total_eve_minutes : num 224 272 274 110 224 ...
## $ total_eve_calls : int 88 89 121 108 117 46 80 87 82 85 ...
## $ total_eve_charge : num 19 23.12 23.26 9.38 19.03 ...
## $ total_night_minutes : num 248 200 218 204 180 ...
## $ total_night_calls : int 96 62 71 102 85 116 89 88 105 130 ...
## $ total_night_charge : num 11.14 9 9.81 9.18 8.12 ...
## $ total_intl_minutes : num 17.8 10.1 7.6 9.8 10.2 12.8 11.2 11.8 8.3 10.3 ...
## $ total_intl_calls : int 2 11 4 2 13 3 4 5 5 2 ...
## $ total_intl_charge : num 4.81 2.73 2.05 2.65 2.75 3.46 3.02 3.19 2.24 2.78 ...
## $ number_customer_service_calls: int 1 3 1 1 1 5 2 0 2 3 ...
Use trainControl() to create a reusable trainControl object for comparing models.
# Create reusable trainControl object: myControl
myControl <- trainControl(
  summaryFunction = twoClassSummary,
  classProbs = TRUE, # IMPORTANT!
  verboseIter = TRUE,
  savePredictions = TRUE,
  index = myFolds
)
Great work! By saving the fold indices in the train control object, we can fit many models using the same CV folds (a sketch of how such folds can be created is shown below).
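For reference, here is a minimal sketch of how a set of custom folds like myFolds could be created. It assumes churn_y (the factor of churn outcomes used below) is available in the workspace; the folds used earlier in the course may have been generated with different settings.
library(caret)
# Sketch only: trainControl()'s index argument expects, for each resample,
# the row numbers used for *training*, which returnTrain = TRUE provides.
set.seed(42)
myFolds <- createFolds(churn_y, k = 5, returnTrain = TRUE)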
5.2 Reintroducing glmnet
5.2.1 glmnet as a baseline model
What makes glmnet
a good baseline model?
-
It’s simple, fast, and easy to interpret.
-
It always gives poor predictions, so your other models will look good by comparison.
-
Linear models with penalties on their coefficients always give better results.
5.2.2 Fit the baseline model
Now that you have a reusable trainControl object called myControl, you can start fitting different predictive models to your churn dataset and evaluating their predictive accuracy.
You'll start with one of my favorite models, glmnet, which penalizes linear and logistic regression models on the size and number of coefficients to help prevent overfitting.
Fit a glmnet model to the churn dataset and save it as model_glmnet. Make sure to use myControl, which you created in the first exercise and which is available in your workspace, as the trainControl object.
# Fit glmnet model: model_glmnet
model_glmnet <- train(
  x = churn_x,
  y = churn_y,
  metric = "ROC",
  method = "glmnet",
  trControl = myControl
)
## + Fold1: alpha=0.10, lambda=0.01821
## - Fold1: alpha=0.10, lambda=0.01821
## + Fold1: alpha=0.55, lambda=0.01821
## - Fold1: alpha=0.55, lambda=0.01821
## + Fold1: alpha=1.00, lambda=0.01821
## - Fold1: alpha=1.00, lambda=0.01821
## + Fold2: alpha=0.10, lambda=0.01821
## - Fold2: alpha=0.10, lambda=0.01821
## + Fold2: alpha=0.55, lambda=0.01821
## - Fold2: alpha=0.55, lambda=0.01821
## + Fold2: alpha=1.00, lambda=0.01821
## - Fold2: alpha=1.00, lambda=0.01821
## + Fold3: alpha=0.10, lambda=0.01821
## - Fold3: alpha=0.10, lambda=0.01821
## + Fold3: alpha=0.55, lambda=0.01821
## - Fold3: alpha=0.55, lambda=0.01821
## + Fold3: alpha=1.00, lambda=0.01821
## - Fold3: alpha=1.00, lambda=0.01821
## + Fold4: alpha=0.10, lambda=0.01821
## - Fold4: alpha=0.10, lambda=0.01821
## + Fold4: alpha=0.55, lambda=0.01821
## - Fold4: alpha=0.55, lambda=0.01821
## + Fold4: alpha=1.00, lambda=0.01821
## - Fold4: alpha=1.00, lambda=0.01821
## + Fold5: alpha=0.10, lambda=0.01821
## - Fold5: alpha=0.10, lambda=0.01821
## + Fold5: alpha=0.55, lambda=0.01821
## - Fold5: alpha=0.55, lambda=0.01821
## + Fold5: alpha=1.00, lambda=0.01821
## - Fold5: alpha=1.00, lambda=0.01821
## Aggregating results
## Selecting tuning parameters
## Fitting alpha = 0.55, lambda = 0.0182 on full training set
Great work! This model uses our custom CV folds and will be easy to compare to other models.
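If you want more control over the values of alpha and lambda that glmnet searches, you can also supply a custom tuneGrid. The grid below is purely illustrative (these are not the course's values) and reuses the same myControl folds so the result stays comparable.
# Illustrative tuning grid for glmnet (values chosen for demonstration only)
glmnet_grid <- expand.grid(
  alpha = c(0, 0.5, 1),                       # 0 = ridge, 1 = lasso
  lambda = seq(0.0001, 0.1, length.out = 10)  # penalty strength
)

model_glmnet_tuned <- train(
  x = churn_x,
  y = churn_y,
  metric = "ROC",
  method = "glmnet",
  tuneGrid = glmnet_grid,
  trControl = myControl
)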
5.3 Reintroducing random forest
5.3.1 Random forest drawback
What’s the drawback of using a random forest model for churn prediction?
-
Tree-based models are usually less accurate than linear models.
-
You no longer have model coefficients to help interpret the model.
-
Nobody else uses random forests to predict churn.
5.3.2 Random forest with custom trainControl
Another one of my favorite models is the random forest, which combines an ensemble of non-linear decision trees into a highly flexible (and usually quite accurate) model.
Rather than using the classic randomForest
package, you’ll be using the ranger
package, which is a re-implementation of randomForest
that produces almost the exact same results, but is faster, more
stable, and uses less memory. I highly recommend it as a starting point
for random forest modeling in R.
churn_x and churn_y are loaded in your workspace.
Use myControl as the trainControl object, as you've done before, and specify the "ranger" method.
# Fit random forest: model_rf
model_rf <- train(
  x = churn_x,
  y = churn_y,
  metric = "ROC",
  method = "ranger",
  trControl = myControl
)
## + Fold1: mtry= 2, min.node.size=1, splitrule=gini
## - Fold1: mtry= 2, min.node.size=1, splitrule=gini
## + Fold1: mtry=36, min.node.size=1, splitrule=gini
## - Fold1: mtry=36, min.node.size=1, splitrule=gini
## + Fold1: mtry=70, min.node.size=1, splitrule=gini
## - Fold1: mtry=70, min.node.size=1, splitrule=gini
## + Fold1: mtry= 2, min.node.size=1, splitrule=extratrees
## - Fold1: mtry= 2, min.node.size=1, splitrule=extratrees
## + Fold1: mtry=36, min.node.size=1, splitrule=extratrees
## - Fold1: mtry=36, min.node.size=1, splitrule=extratrees
## + Fold1: mtry=70, min.node.size=1, splitrule=extratrees
## - Fold1: mtry=70, min.node.size=1, splitrule=extratrees
## + Fold2: mtry= 2, min.node.size=1, splitrule=gini
## - Fold2: mtry= 2, min.node.size=1, splitrule=gini
## + Fold2: mtry=36, min.node.size=1, splitrule=gini
## - Fold2: mtry=36, min.node.size=1, splitrule=gini
## + Fold2: mtry=70, min.node.size=1, splitrule=gini
## - Fold2: mtry=70, min.node.size=1, splitrule=gini
## + Fold2: mtry= 2, min.node.size=1, splitrule=extratrees
## - Fold2: mtry= 2, min.node.size=1, splitrule=extratrees
## + Fold2: mtry=36, min.node.size=1, splitrule=extratrees
## - Fold2: mtry=36, min.node.size=1, splitrule=extratrees
## + Fold2: mtry=70, min.node.size=1, splitrule=extratrees
## - Fold2: mtry=70, min.node.size=1, splitrule=extratrees
## + Fold3: mtry= 2, min.node.size=1, splitrule=gini
## - Fold3: mtry= 2, min.node.size=1, splitrule=gini
## + Fold3: mtry=36, min.node.size=1, splitrule=gini
## - Fold3: mtry=36, min.node.size=1, splitrule=gini
## + Fold3: mtry=70, min.node.size=1, splitrule=gini
## - Fold3: mtry=70, min.node.size=1, splitrule=gini
## + Fold3: mtry= 2, min.node.size=1, splitrule=extratrees
## - Fold3: mtry= 2, min.node.size=1, splitrule=extratrees
## + Fold3: mtry=36, min.node.size=1, splitrule=extratrees
## - Fold3: mtry=36, min.node.size=1, splitrule=extratrees
## + Fold3: mtry=70, min.node.size=1, splitrule=extratrees
## - Fold3: mtry=70, min.node.size=1, splitrule=extratrees
## + Fold4: mtry= 2, min.node.size=1, splitrule=gini
## - Fold4: mtry= 2, min.node.size=1, splitrule=gini
## + Fold4: mtry=36, min.node.size=1, splitrule=gini
## - Fold4: mtry=36, min.node.size=1, splitrule=gini
## + Fold4: mtry=70, min.node.size=1, splitrule=gini
## - Fold4: mtry=70, min.node.size=1, splitrule=gini
## + Fold4: mtry= 2, min.node.size=1, splitrule=extratrees
## - Fold4: mtry= 2, min.node.size=1, splitrule=extratrees
## + Fold4: mtry=36, min.node.size=1, splitrule=extratrees
## - Fold4: mtry=36, min.node.size=1, splitrule=extratrees
## + Fold4: mtry=70, min.node.size=1, splitrule=extratrees
## - Fold4: mtry=70, min.node.size=1, splitrule=extratrees
## + Fold5: mtry= 2, min.node.size=1, splitrule=gini
## - Fold5: mtry= 2, min.node.size=1, splitrule=gini
## + Fold5: mtry=36, min.node.size=1, splitrule=gini
## - Fold5: mtry=36, min.node.size=1, splitrule=gini
## + Fold5: mtry=70, min.node.size=1, splitrule=gini
## - Fold5: mtry=70, min.node.size=1, splitrule=gini
## + Fold5: mtry= 2, min.node.size=1, splitrule=extratrees
## - Fold5: mtry= 2, min.node.size=1, splitrule=extratrees
## + Fold5: mtry=36, min.node.size=1, splitrule=extratrees
## - Fold5: mtry=36, min.node.size=1, splitrule=extratrees
## + Fold5: mtry=70, min.node.size=1, splitrule=extratrees
## - Fold5: mtry=70, min.node.size=1, splitrule=extratrees
## Aggregating results
## Selecting tuning parameters
## Fitting mtry = 36, splitrule = extratrees, min.node.size = 1 on full training set
Great work! This random forest uses the custom CV folds, so we can easily compare it to the baseline model.
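The training log above shows the three tuning parameters caret exposes for ranger: mtry, splitrule, and min.node.size. If the default grid isn't enough, you could pass your own; the grid below is only a sketch with illustrative values, again reusing myControl.
# Illustrative grid over ranger's tuning parameters (not the course's values)
rf_grid <- expand.grid(
  mtry = c(2, 10, 35, 70),
  splitrule = c("gini", "extratrees"),
  min.node.size = 1
)

model_rf_tuned <- train(
  x = churn_x,
  y = churn_y,
  metric = "ROC",
  method = "ranger",
  tuneGrid = rf_grid,
  trControl = myControl
)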
5.4 Comparing models
5.4.1 Matching train/test indices
What’s the primary reason that train/test indices need to match when comparing two models?
-
You can save a lot of time when fitting your models because you don’t have to remake the datasets.
-
There’s no real reason; it just makes your plots look better.
-
Because otherwise you wouldn’t be doing a fair comparison of your models and your results could be due to chance.
5.4.2 Create a resamples object
Now that you have fit two models to the churn dataset, it’s time to compare their out-of-sample predictions and choose which one is the best model for your dataset.
You can compare models in caret
using the resamples()
function, provided they have the same training data and use the same trainControl
object with preset cross-validation folds. resamples()
takes as input a list of models and can be used to compare dozens of
models at once (though in this case you are only comparing two models).
model_glmnet and model_rf are loaded in your workspace.
Create a list() containing the glmnet model as item1 and the ranger model as item2.
# Create model_list
model_list <- list(item1 = model_glmnet, item2 = model_rf)
Pass model_list to the resamples() function and save the resulting object as resamples.
# Pass model_list to resamples(): resamples
resamples <- resamples(model_list)
Call summary() on resamples.
# Summarize the results
summary(resamples)
##
## Call:
## summary.resamples(object = resamples)
##
## Models: item1, item2
## Number of resamples: 5
##
## ROC
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## item1 0.4489390 0.4832007 0.5296198 0.5440319 0.6178286 0.6405714 0
## item2 0.6621353 0.7017020 0.7075429 0.7000982 0.7100571 0.7190539 0
##
## Sens
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## item1 0.9195402 0.9482759 0.9542857 0.9552644 0.9657143 0.9885057 0
## item2 0.9712644 0.9885714 0.9942857 0.9908243 1.0000000 1.0000000 0
##
## Spec
## Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
## item1 0.03846154 0.03846154 0.07692308 0.09476923 0.08 0.24 0
## item2 0.00000000 0.00000000 0.00000000 0.03200000 0.00 0.16 0
Amazing! The resamples function gives us a bunch of options for comparing models, which we'll explore further in the next exercises.
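If you prefer to work with the numbers directly rather than the printed summary, the per-fold metrics are stored in a data frame inside the resamples object. The column names below assume the item1/item2 labels used above.
# Per-fold metrics as a data frame (columns follow the "model~metric" pattern)
head(resamples$values)

# Mean out-of-sample ROC for each model
colMeans(resamples$values[, c("item1~ROC", "item2~ROC")])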
5.5 More on resamples
5.5.1 Create a box-and-whisker plot
caret
provides a variety of methods to use for comparing models. All of these methods are based on the resamples()
function. My favorite is the box-and-whisker plot, which allows you to
compare the distribution of predictive accuracy (in this case AUC) for
the two models.
In general, you want the model with the higher median AUC, as well as a smaller range between min and max AUC.
You can make this plot using the bwplot() function, which makes a box-and-whisker plot of the models' out-of-sample scores. Box-and-whisker plots show the median of each distribution as a line and the interquartile range of each distribution as a box around the median line. You can pass the metric = "ROC" argument to the bwplot() function to plot the models' out-of-sample ROC scores and choose the model with the highest median ROC.
If you do not specify a metric to plot, bwplot() will automatically plot three of them.
Pass the resamples
object to the bwplot()
function to make a box-and-whisker plot. Look at the resulting plot and
note which model has the higher median ROC statistic. Be sure to specify
which metric you want to plot.
# Create bwplot
bwplot(resamples, metric = "ROC")
Great work! I’m a big fan of box and whisker plots for comparing models.
5.5.2 Create a scatterplot
Another useful plot for comparing models is the scatterplot, also known as the xy-plot. This plot shows you how similar the two models’ performances are on different folds.
It’s particularly useful for identifying if one model is consistently better than the other across all folds, or if there are situations when the inferior model produces better predictions on a particular subset of the data.
Pass the resamples
object to the xyplot()
function. Look at the resulting plot and note how similar the two
models’ predictions are (or are not) on the different folds. Be sure to
specify which metric you want to plot.
# Create xyplot
xyplot(resamples, metric = "ROC")
Nice one! These scatterplots let you see if one model is always better than the other.
5.5.3 Ensembling models
That concludes the course! As a teaser for a future course on making ensembles of caret
models, I’ll show you how to fit a stacked ensemble of models using the caretEnsemble
package.
caretEnsemble
provides the caretList()
function for creating multiple caret
models at once on the same dataset, using the same resampling folds. You can also create your own lists of caret
models.
In this exercise, I’ve made a caretList
for you, containing the glmnet
and ranger
models you fit on the churn dataset. Use the caretStack()
function to make a stack of caret
models, with the two sub-models (glmnet
and ranger
) feeding into another (hopefully more accurate!) caret
model.
Call the caretStack() function with two arguments, model_list and method = "glm", to ensemble the two models using a logistic regression. Store the result as stack.
library(caretEnsemble)
model_list <- c(item1 = model_glmnet, item2 = model_rf)

# Create ensemble model: stack
stack <- caretStack(model_list, method = "glm")
Inspect the resulting ensemble with the summary() function.
# Look at summary
summary(stack)
##
## Call:
## NULL
##
## Deviance Residuals:
## Min 1Q Median 3Q Max
## -1.4886 -0.4979 -0.4298 -0.4028 2.3499
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 2.0415 0.6178 3.304 0.000952 ***
## item1.glmnet1 3.3744 0.8527 3.957 7.58e-05 ***
## item2.ranger2 -7.9001 1.2252 -6.448 1.13e-10 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 765.13 on 999 degrees of freedom
## Residual deviance: 719.59 on 997 degrees of freedom
## AIC: 725.59
##
## Number of Fisher Scoring iterations: 5
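Once the stack is fit, it can be used for prediction like any other caret model. The call below is only a sketch: it reuses churn_x as stand-in data (in practice you would predict on a held-out test set), and the exact return format of type = "prob" depends on your caretEnsemble version.
# Predict churn probabilities with the stacked ensemble (illustrative only)
stack_preds <- predict(stack, newdata = churn_x, type = "prob")
head(stack_preds)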
Great work! The caretEnsemble
package gives you an easy way to combine many caret models. Now for a brief farewell message from Max…