Machine learning (ML) develops algorithms to identify patterns in data (unsupervised ML) or make predictions and inferences (supervised ML).
Supervised ML trains the machine to learn from prior examples to predict either a categorical outcome (classification) or a numeric outcome (regression), or to infer the relationships between the outcome and its explanatory variables.
Two early forms of supervised ML are linear regression (OLS) and generalized linear models (GLM) such as Poisson regression and logistic regression. These methods have been extended with more advanced linear methods, including stepwise selection, regularization (ridge, lasso, elastic net), principal components regression, and partial least squares. With greater computing capacity, non-linear models are now in use, including polynomial regression, step functions, splines, and generalized additive models (GAM). Decision trees (bagging, random forests, and boosting) are additional options for regression and classification, and support vector machines are an additional option for classification.
These notes cover decision trees, the tree-based methods for supervised ML. Simple decision trees, including classification trees and regression trees, are easy to use and interpret, but are not competitive with the best linear and non-linear machine learning methods. However, they form the foundation of bagged trees, random forests, and boosted tree models, all of which are very accurate (although less interpretable).
Decision tree algorithms use the training data to segment the predictor space into non-overlapping regions, the nodes of the tree. Each node is described by a set of rules which are then used to predict new responses. The predicted value for each node is the most common response in the node (classification), or mean response in the node (regression).
The algorithm splits by recursive partitioning, starting with all the observations in a single node. It splits this node at the best predictor variable and best cutpoint so that the responses within each subtree are as homogeneous as possible, then repeats the splitting process for each of the child nodes until a stopping criterion is satisfied. The split cutoff minimizes the residual sum of squares (RSS) within the subpartition (regression trees), or maximizes the "purity" of the subpartition (classification trees). There are two common measures of purity: the Gini index, $G = \sum_k \hat{p}_k (1 - \hat{p}_k)$, and entropy, $D = -\sum_k \hat{p}_k \log \hat{p}_k$, where $\hat{p}_k$ is the proportion of observations within the subpartition belonging to class $k$.
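As an illustration (these helpers are not part of rpart), both purity measures are easy to compute from the class labels in a node:

gini <- function(y) { p <- prop.table(table(y)); sum(p * (1 - p)) }
entropy <- function(y) { p <- prop.table(table(y)); -sum(p * log(p)) }
gini(c("CH", "CH", "CH", "MM"))    # 0.375
entropy(c("CH", "CH", "CH", "MM")) # 0.5623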
This process results in a large tree that provides a good fit to the training data, but it likely overfits the data. The solution is to "prune" leaves from the tree. The most common pruning method is cost-complexity pruning. Cost-complexity pruning minimizes the cost complexity

$$R_\alpha(T) = R(T) + \alpha |T|$$

where $|T|$ is the tree size (complexity, measured as the number of terminal nodes), $R(T)$ is the misclassification rate (classification trees) or RSS (regression trees), and $\alpha$ is the complexity parameter. It is expensive to evaluate the error on all possible subtrees, so instead the algorithm defines a sequence of nested trees by successively pruning leaves from the tree, repeating until only the root node remains. The complexity parameter yielding the lowest cost complexity identifies the optimal tree size.
Pruning is performed either with a validation dataset, or by k-fold cross-validation on the training set (if the original dataset is too small to carve out a validation dataset). In the first case, the subtree with the lowest validation error is the final tree. With k-fold cross-validation, the training dataset is partitioned into k folds; all but one fold is used to fit trees over a range of complexity parameters, and the resulting error rate is measured on the held-out fold. This repeats k times, holding out a different fold each time. The complexity parameter with the lowest average error rate across the k folds identifies the optimal tree.
Decision trees have limitations. They only provide coarse-grained predictions (one value per leaf) versus the continuous predictions of a linear model, and they do not express truly linear relationships well.
A simple classification tree is rarely used on its own; the bagged, random forest, and gradient boosting methods build on this logic. However, it is a good place to start to build understanding.
Using the OJ data set from the ISLR package, I will predict which brand of orange juice customers Purchase (CH = Citrus Hill, MM = Minute Maid) using the 17 feature variables.
Start by loading the relevant libraries.
library(ISLR) # For OJ, and Carseats datasets
library(caret) # for workflow
library(rpart.plot) # for better formatted plots than the ones in rpart
library(tidyverse)
library(skimr) # neat alternative to glance + summary
Dataset OJ contains n = 1070 observations.
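The oj_dat data frame is simply the raw OJ data; a minimal sketch of the setup that produced the summary below (assuming skimr's skim() generated it):

oj_dat <- ISLR::OJ
skim(oj_dat)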
## # A tibble: 18 x 16
## type variable missing complete n n_unique top_counts ordered mean
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 fact~ Purchase 0 1070 1070 2 CH: 653, ~ FALSE <NA>
## 2 fact~ Store7 0 1070 1070 2 No: 714, ~ FALSE <NA>
## 3 nume~ DiscCH 0 1070 1070 <NA> <NA> <NA> " 0~
## 4 nume~ DiscMM 0 1070 1070 <NA> <NA> <NA> " 0~
## 5 nume~ ListPri~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 6 nume~ LoyalCH 0 1070 1070 <NA> <NA> <NA> " 0~
## 7 nume~ PctDisc~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 8 nume~ PctDisc~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 9 nume~ PriceCH 0 1070 1070 <NA> <NA> <NA> " 1~
## 10 nume~ PriceDi~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 11 nume~ PriceMM 0 1070 1070 <NA> <NA> <NA> " 2~
## 12 nume~ SalePri~ 0 1070 1070 <NA> <NA> <NA> " 1~
## 13 nume~ SalePri~ 0 1070 1070 <NA> <NA> <NA> " 1~
## 14 nume~ Special~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 15 nume~ Special~ 0 1070 1070 <NA> <NA> <NA> " 0~
## 16 nume~ STORE 0 1070 1070 <NA> <NA> <NA> " 1~
## 17 nume~ StoreID 0 1070 1070 <NA> <NA> <NA> " 3~
## 18 nume~ WeekofP~ 0 1070 1070 <NA> <NA> <NA> "254~
## # ... with 7 more variables: sd <chr>, p0 <chr>, p25 <chr>, p50 <chr>,
## # p75 <chr>, p100 <chr>, hist <chr>
The standard way to build models is to fit multiple candidate models with a "training" data set, then compare their performance using a "test" data set. The reason for the test data set is that models can "overfit" the training data and fail to generalize to new data. In this case, my candidate models will be this simple classification tree and the more advanced bagging, random forest, and gradient boosting models.
It turns out there are lots of ways to partition a data set, including sample() and runif(). The caret function createDataPartition() is better because it preserves the proportion of the categories in the response variable.
set.seed(12345)
partition <- createDataPartition(y = oj_dat$Purchase, p = 0.8, list = FALSE)
oj.train <- oj_dat[partition, ]
oj.test <- oj_dat[-partition, ]
rm(partition)
The first step is to build a full tree, stopping only when the nodes reach some minimum size, or when no improvement can be made. Then, using k-fold cross-validation, evaluate various values of the complexity parameter $\alpha$ to minimize the cost complexity $R_\alpha(T)$. Function rpart() does both. It returns a model object with a full tree, and it returns a table of error rates produced by various settings of the complexity parameter. Use the printcp() function to view the model details.
set.seed(123)
oj.full_class <- rpart(formula = Purchase ~ .,
data = oj.train,
method = "class", # classification (not regression)
xval = 10 # 10-fold cross-validation
)
rpart.plot(oj.full_class, yesno = TRUE)
##
## Classification tree:
## rpart(formula = Purchase ~ ., data = oj.train, method = "class",
## xval = 10)
##
## Variables actually used in tree construction:
## [1] LoyalCH PriceDiff SalePriceMM
##
## Root node error: 334/857 = 0.38973
##
## n= 857
##
## CP nsplit rel error xerror xstd
## 1 0.479042 0 1.00000 1.00000 0.042745
## 2 0.032934 1 0.52096 0.54192 0.035775
## 3 0.013473 3 0.45509 0.47006 0.033905
## 4 0.010000 5 0.42814 0.46407 0.033736
The algorithm used three variables in the full tree: LoyalCH, PriceDiff, and SalePriceMM. The root node contains 334 errors out of 857 observations (39%).
The second step is to prune the tree to the optimal size (to avoid overfitting). The CP table in the model summary shows the relevant statistics for choosing the pruning parameter. The rel error column is the error rate relative to the root node error when pruning the tree with complexity parameter CP to nsplit splits. The xerror column shows the cross-validated error rate. A plot of xerror vs cp shows the relationship.
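rpart's plotcp() function draws exactly this plot, including the dashed one-standard-error line discussed next:

plotcp(oj.full_class, upper = "splits")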
The dashed line is set at the minimum xerror + xstd. Any value below the line is considered statistically equivalent to the minimum. A good choice for CP is often the largest value for which the error is within one standard deviation of the minimum error. In this case, the cp with the smallest cross-validated error is 0.01. A good way to detect and capture that cp is with the which.min() function, but if you want to choose the smallest statistically equivalent tree, specify it manually. Use the prune() function to prune the tree by specifying the associated cost-complexity cp.
oj.class <- prune(oj.full_class,
cp = oj.full_class$cptable[which.min(oj.full_class$cptable[, "xerror"]), "CP"])
rm(oj.full_class)
rpart.plot(oj.class, yesno = TRUE)
The pruned tree uses three variables. The most important indicator of Purchase is LoyalCH.
The third and last step is to make predictions on the validation data set and record the accuracy (correct classification percentage) for comparison to other models. For a classification tree, set argument type = "class".
oj.class.pred <- predict(oj.class, oj.test, type = "class")
plot(oj.test$Purchase, oj.class.pred,
main = "Simple Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(oj.class.conf <- confusionMatrix(oj.class.pred, oj.test$Purchase)) # prints the summary below
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 117 18
## MM 13 65
##
## Accuracy : 0.8545
## 95% CI : (0.7998, 0.8989)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 4.83e-15
##
## Kappa : 0.6907
##
## Mcnemar's Test P-Value : 0.4725
##
## Sensitivity : 0.9000
## Specificity : 0.7831
## Pos Pred Value : 0.8667
## Neg Pred Value : 0.8333
## Prevalence : 0.6103
## Detection Rate : 0.5493
## Detection Prevalence : 0.6338
## Balanced Accuracy : 0.8416
##
## 'Positive' Class : CH
##
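Store the accuracy for the model comparison at the end of these notes, mirroring the pattern used after the later models (oj.class.conf is the confusion matrix object created above):

oj.class.acc <- as.numeric(oj.class.conf$overall[1])
rm(oj.class.pred)
rm(oj.class.conf)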
The pruning process leads to correct predictions for 85.4% of the observations in the test data set.
All of this can happen more or less automatically with the caret::train() function, specifying method = "rpart". There are two ways to tune hyperparameters in train(): tuneLength or tuneGrid. I'll do this with tuneLength first.
oj.class2 = train(Purchase ~ .,
data = oj.train,
method = "rpart", # for classification tree
tuneLength = 5, # choose up to 5 combinations of tuning parameters (cp)
metric = "ROC", # evaluate hyperparameter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final", # save predictions for the optimal tuning parameter
classProbs = TRUE, # return class probabilities in addition to predicted values
summaryFunction = twoClassSummary # for binary response variable
)
)
oj.class2
## CART
##
## 857 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 771, 772, 772, 772, 770, 770, ...
## Resampling results across tuning parameters:
##
## cp ROC Sens Spec
## 0.005988024 0.8497527 0.8545718 0.7427807
## 0.008982036 0.8495648 0.8431060 0.7309269
## 0.013473054 0.8385888 0.8279390 0.7426916
## 0.032934132 0.7895605 0.8543904 0.6948307
## 0.479041916 0.6213113 0.9042090 0.3384135
##
## ROC was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.005988024.
oj.class.pred <- predict(oj.class2, oj.test, type = "raw")
plot(oj.test$Purchase, oj.class.pred,
main = "Simple Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(oj.class.conf <- confusionMatrix(oj.class.pred, oj.test$Purchase)) # prints the summary below
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 115 18
## MM 15 65
##
## Accuracy : 0.8451
## 95% CI : (0.7894, 0.8909)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 6.311e-14
##
## Kappa : 0.6721
##
## Mcnemar's Test P-Value : 0.7277
##
## Sensitivity : 0.8846
## Specificity : 0.7831
## Pos Pred Value : 0.8647
## Neg Pred Value : 0.8125
## Prevalence : 0.6103
## Detection Rate : 0.5399
## Detection Prevalence : 0.6244
## Balanced Accuracy : 0.8339
##
## 'Positive' Class : CH
##
oj.class.acc2 <- as.numeric(oj.class.conf$overall[1])
rm(oj.class.pred)
rm(oj.class.conf)
rpart.plot(oj.class2$finalModel)
And now with tuneGrid.
myGrid <- expand.grid(cp = (0:1)/10)
oj.class3 = train(Purchase ~ .,
data = oj.train,
method = "rpart", # for classification tree
tuneGrid = myGrid, # tuning parameter grid (cp values)
metric = "ROC", # evaluate hyperparameter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final", # save predictions for the optimal tuning parameter
classProbs = TRUE, # return class probabilities in addition to predicted values
summaryFunction = twoClassSummary # for binary response variable
)
)
rm(myGrid)
oj.class3
## CART
##
## 857 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 770, 772, 772, 772, 771, 771, ...
## Resampling results across tuning parameters:
##
## cp ROC Sens Spec
## 0.0 0.8509904 0.820029 0.7331551
## 0.1 0.7747727 0.826016 0.7235294
##
## ROC was used to select the optimal model using the largest value.
## The final value used for the model was cp = 0.
oj.class.pred <- predict(oj.class3, oj.test, type = "raw")
plot(oj.test$Purchase, oj.class.pred,
main = "Simple Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(oj.class.conf <- confusionMatrix(oj.class.pred, oj.test$Purchase)) # prints the summary below
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 116 18
## MM 14 65
##
## Accuracy : 0.8498
## 95% CI : (0.7946, 0.8949)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 1.778e-14
##
## Kappa : 0.6814
##
## Mcnemar's Test P-Value : 0.5959
##
## Sensitivity : 0.8923
## Specificity : 0.7831
## Pos Pred Value : 0.8657
## Neg Pred Value : 0.8228
## Prevalence : 0.6103
## Detection Rate : 0.5446
## Detection Prevalence : 0.6291
## Balanced Accuracy : 0.8377
##
## 'Positive' Class : CH
##
oj.class.acc3 <- as.numeric(oj.class.conf$overall[1])
rm(oj.class.pred)
rm(oj.class.conf)
rpart.plot(oj.class3$finalModel)
Looks like the manual effort fared best. Here is a summary of the accuracy rates of the three models.
rbind(data.frame(model = "Manual Class", Acc = round(oj.class.acc, 5)),
data.frame(model = "Caret w/tuneLength", Acc = round(oj.class.acc2, 5)),
data.frame(model = "Caret w.tuneGrid", Acc = round(oj.class.acc3, 5))
)
## model Acc
## 1 Manual Class 0.85446
## 2 Caret w/tuneLength 0.84507
## 3 Caret w.tuneGrid 0.84977
A simple regression tree is built the same way as a simple classification tree. Like the simple classification tree, it is rarely used on its own; the bagged, random forest, and gradient boosting methods build on this logic.
Using the Carseats data set from ISLR, I'll predict Sales using the available feature variables.
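As before, carseats_dat is simply the raw Carseats data; a sketch of the setup that produced the summary below:

carseats_dat <- ISLR::Carseats
skim(carseats_dat)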
## # A tibble: 11 x 16
## type variable missing complete n n_unique top_counts ordered mean
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 fact~ ShelveL~ 0 400 400 3 Med: 219,~ FALSE <NA>
## 2 fact~ Urban 0 400 400 2 Yes: 282,~ FALSE <NA>
## 3 fact~ US 0 400 400 2 Yes: 258,~ FALSE <NA>
## 4 nume~ Adverti~ 0 400 400 <NA> <NA> <NA> " 6~
## 5 nume~ Age 0 400 400 <NA> <NA> <NA> " 53~
## 6 nume~ CompPri~ 0 400 400 <NA> <NA> <NA> 124.~
## 7 nume~ Educati~ 0 400 400 <NA> <NA> <NA> " 13~
## 8 nume~ Income 0 400 400 <NA> <NA> <NA> " 68~
## 9 nume~ Populat~ 0 400 400 <NA> <NA> <NA> 264.~
## 10 nume~ Price 0 400 400 <NA> <NA> <NA> "115~
## 11 nume~ Sales 0 400 400 <NA> <NA> <NA> " 7~
## # ... with 7 more variables: sd <chr>, p0 <chr>, p25 <chr>, p50 <chr>,
## # p75 <chr>, p100 <chr>, hist <chr>
partition <- createDataPartition(y = carseats_dat$Sales, p = 0.8, list = FALSE)
carseats.train <- carseats_dat[partition, ]
carseats.test <- carseats_dat[-partition, ]
rm(partition)
The first step is to build a full tree, then perform k-fold cross-validation to help select the optimal cost complexity $\alpha$. The only difference here is the rpart() parameter method = "anova" to produce a regression tree.
set.seed(1234)
# Specify model = TRUE to handle plotting splits with factor variables.
carseats.full_anova <- rpart(formula = Sales ~ .,
data = carseats.train,
method = "anova",
xval = 10,
model = TRUE)
rpart.plot(carseats.full_anova, yesno = TRUE)
##
## Regression tree:
## rpart(formula = Sales ~ ., data = carseats.train, method = "anova",
## model = TRUE, xval = 10)
##
## Variables actually used in tree construction:
## [1] Advertising Age CompPrice Income Price ShelveLoc
## [7] US
##
## Root node error: 2629.2/321 = 8.1908
##
## n= 321
##
## CP nsplit rel error xerror xstd
## 1 0.236459 0 1.00000 1.00756 0.078470
## 2 0.120038 1 0.76354 0.77101 0.058731
## 3 0.065175 2 0.64350 0.67435 0.051761
## 4 0.038307 3 0.57833 0.62466 0.047870
## 5 0.034221 4 0.54002 0.62371 0.051616
## 6 0.028611 5 0.50580 0.63037 0.050867
## 7 0.019303 6 0.47719 0.61965 0.048823
## 8 0.019272 7 0.45789 0.61149 0.047367
## 9 0.015376 8 0.43862 0.60406 0.046656
## 10 0.015054 9 0.42324 0.60152 0.049192
## 11 0.014997 11 0.39313 0.59860 0.048116
## 12 0.013372 12 0.37813 0.59617 0.048076
## 13 0.012945 13 0.36476 0.59869 0.047714
## 14 0.012233 14 0.35182 0.59823 0.047987
## 15 0.011233 15 0.33958 0.59515 0.047725
## 16 0.010722 16 0.32835 0.58347 0.047526
## 17 0.010404 17 0.31763 0.58113 0.045906
## 18 0.010096 18 0.30723 0.58383 0.045385
## 19 0.010000 19 0.29713 0.58739 0.045432
The algorithm used seven variables in the full tree.
The second step is to prune the tree to avoid overfitting. The CP table in the model summary shows the relevant statistics for choosing the pruning parameter. The rel error column is the error rate relative to the root node error when pruning the tree with complexity parameter CP to nsplit splits. The xerror column shows the cross-validated error rate. A plot of xerror vs cp shows the relationship.
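As before, plotcp() draws this plot:

plotcp(carseats.full_anova, upper = "splits")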
In this case, the smallest cross-validated error is at cp = 0.0104, but the largest CP below the dashed line (one standard deviation above the minimum error) is at cp ~ .039. Use the prune() function to prune the tree by specifying the associated cost-complexity cp.
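A sketch of that call, assuming the minimum-xerror cp is chosen as in the classification example (consistent with the seven-variable tree described below):

carseats.anova <- prune(carseats.full_anova,
cp = carseats.full_anova$cptable[which.min(carseats.full_anova$cptable[, "xerror"]), "CP"])
rpart.plot(carseats.anova, yesno = TRUE)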
The pruned tree uses seven variables. The most important indicator of Sales is shelving location.
The third and last step is to make predictions on the validation data set and record the root mean squared error (RMSE) for comparison to other models. The root mean squared error, $RMSE = \sqrt{\frac{1}{n}\sum_i (y_i - \hat{y}_i)^2}$, and the mean absolute error, $MAE = \frac{1}{n}\sum_i |y_i - \hat{y}_i|$, are the two most common measures of predictive accuracy. The key difference is that RMSE punishes large errors more harshly. For a regression tree, set argument type = "vector" (or do not specify it at all).
carseats.anova.pred <- predict(carseats.anova, carseats.test, type = "vector")
plot(carseats.test$Sales, carseats.anova.pred,
main = "Simple Regression: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
abline(0, 1)
(carseats.anova.rmse <- RMSE(carseats.anova.pred, carseats.test$Sales)) # prints the value below
## [1] 2.174825
The pruning process leads to an average prediction error of 2.175 in the test data set. Not too bad considering the standard deviation of Sales is 2.661.
All of this can happen more or less automatically with the caret::train() function, specifying method = "rpart" and either tuneLength or tuneGrid. I'll do this with tuneLength first.
carseats.anova2 = train(Sales ~ .,
data = carseats.train,
method = "rpart", # for classification tree
tuneLength = 5, # choose up to 5 combinations of tuning parameters (cp)
metric = "RMSE", # evaluate hyperparamter combinations with RMSE
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final" # save predictions for the optimal tuning parameter
)
)
carseats.anova2
## Warning in nominalTrainWorkflow(x = x, y = y, wts = weights, info =
## trainInfo, : There were missing values in resampled performance measures.
## CART
##
## 321 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 289, 289, 289, 288, 289, 289, ...
## Resampling results across tuning parameters:
##
## cp RMSE Rsquared MAE
## 0.03422119 2.293747 0.3761522 1.797402
## 0.03830654 2.319269 0.3630645 1.815010
## 0.06517488 2.349911 0.3277357 1.885201
## 0.12003756 2.469709 0.2643980 1.978326
## 0.23645904 2.715428 0.1612847 2.180476
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was cp = 0.03422119.
carseats.anova.pred <- predict(carseats.anova2, carseats.test, type = "raw")
plot(carseats.test$Sales, carseats.anova.pred,
main = "Simple Regression: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(carseats.anova.rmse2 <- RMSE(carseats.anova.pred, carseats.test$Sales)) # prints the value below
## [1] 2.041899
Now with tuneGrid.
myGrid <- expand.grid(cp = (0:2)/10)
carseats.anova3 = train(Sales ~ .,
data = carseats.train,
method = "rpart", # for classification tree
tuneGrid = myGrid, # choose up to 5 combinations of tuning parameters (cp)
metric = "RMSE", # evaluate hyperparamter combinations with RMSE
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final" # save predictions for the optimal tuning parameter
)
)
carseats.anova3
## CART
##
## 321 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 289, 289, 288, 289, 289, 289, ...
## Resampling results across tuning parameters:
##
## cp RMSE Rsquared MAE
## 0.0 2.256378 0.4265076 1.832518
## 0.1 2.387187 0.3123398 1.916377
## 0.2 2.499859 0.2491193 2.041793
##
## RMSE was used to select the optimal model using the smallest value.
## The final value used for the model was cp = 0.
carseats.anova.pred <- predict(carseats.anova3, carseats.test, type = "raw")
plot(carseats.test$Sales, carseats.anova.pred,
main = "Simple Regression: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(carseats.anova.rmse3 <- RMSE(carseats.anova.pred, carseats.test$Sales)) # prints the value below
## [1] 2.1482
This time the caret model tuned with tuneLength fared best (lowest RMSE). Here is a summary of the RMSE values of the three models.
rbind(data.frame(model = "Manual ANOVA",
RMSE = round(carseats.anova.rmse, 5)),
data.frame(model = "Caret w/tuneLength",
RMSE = round(carseats.anova.rmse2, 5)),
data.frame(model = "Caret w.tuneGrid",
RMSE = round(carseats.anova.rmse3, 5))
)
## model RMSE
## 1 Manual ANOVA 2.17482
## 2 Caret w/tuneLength 2.04190
## 3 Caret w.tuneGrid 2.14820
Bootstrap aggregation, or bagging, is a general-purpose procedure for reducing the variance of a statistical learning method. The algorithm constructs B regression trees using B bootstrapped training sets, and averages the resulting predictions. These trees are grown deep, and are not pruned. Hence each individual tree has high variance, but low bias. Averaging these B trees reduces the variance. For classification trees, bagging takes the “majority vote” for the prediction. Use a value of B sufficiently large that the error has settled down.
To test the model accuracy, the out-of-bag (OOB) observations are predicted from the models that did not use them in training. Each observation is out-of-bag for roughly one-third of the B trees, so there are about B/3 OOB predictions per observation. These predictions are averaged for the test prediction. Again, for classification trees, a majority vote is taken.
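To make the procedure concrete, here is a minimal hand-rolled sketch of bagging with rpart; bag_predict is a hypothetical helper for illustration only (the caret methods used below handle all of this, plus the OOB bookkeeping):

library(rpart)

# Hypothetical hand-rolled bagger for a factor response (illustration only)
bag_predict <- function(formula, train, test, B = 100) {
  votes <- replicate(B, {
    boot <- train[sample(nrow(train), replace = TRUE), ] # bootstrap sample
    fit <- rpart(formula, data = boot,
                 control = rpart.control(cp = 0, minsplit = 2)) # deep, unpruned tree
    as.character(predict(fit, test, type = "class"))
  })
  # majority vote across the B trees for each test observation
  factor(apply(votes, 1, function(v) names(which.max(table(v)))))
}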
The downside to bagging is that it improves accuracy at the expense of interpretability. There is no longer a single tree to interpret, so it is no longer clear which variables are more important than others.
Bagged trees are a special case of random forests, so see the next section for an example.
Random forests improve bagged trees by way of a small tweak that de-correlates the trees. As in bagging, the algorithm builds a number of decision trees on bootstrapped training samples. But when building these decision trees, each time a split in a tree is considered, a random sample of mtry predictors is chosen as split candidates from the full set of p predictors. A fresh sample of mtry predictors is taken at each split. Typically $mtry \approx \sqrt{p}$ for classification and $p/3$ for regression. Bagged trees are thus a special case of random forests where mtry = p.
Again using the OJ data set to predict Purchase, this time I'll use the bagging method by specifying method = "treebag". I'll use tuneLength = 5 and not worry about tuneGrid anymore. Caret has no hyperparameters to tune with this model.
oj.bag = train(Purchase ~ .,
data = oj.train,
method = "treebag", # for bagging
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "ROC", # evaluate hyperparamter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # k=10 folds
savePredictions = "final", # save predictions for the optimal tuning parameters
classProbs = TRUE, # return class probabilities in addition to predicted values
summaryFunction = twoClassSummary # for binary response variable
)
)
oj.bag
## Bagged CART
##
## 857 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 771, 772, 771, 772, 770, 770, ...
## Resampling results:
##
## ROC Sens Spec
## 0.8531872 0.8241292 0.7364528
oj.pred <- predict(oj.bag, oj.test, type = "raw")
plot(oj.test$Purchase, oj.pred,
main = "Bagging Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(oj.conf <- confusionMatrix(oj.pred, oj.test$Purchase)) # prints the summary below
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 108 18
## MM 22 65
##
## Accuracy : 0.8122
## 95% CI : (0.7532, 0.8623)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 1.758e-10
##
## Kappa : 0.6086
##
## Mcnemar's Test P-Value : 0.6353
##
## Sensitivity : 0.8308
## Specificity : 0.7831
## Pos Pred Value : 0.8571
## Neg Pred Value : 0.7471
## Prevalence : 0.6103
## Detection Rate : 0.5070
## Detection Prevalence : 0.5915
## Balanced Accuracy : 0.8070
##
## 'Positive' Class : CH
##
oj.bag.acc <- as.numeric(oj.conf$overall[1])
rm(oj.pred)
rm(oj.conf)
plot(varImp(oj.bag), main = "Variable Importance with Bagging Classification")
Now I'll try it with the random forest method by specifying method = "ranger". I'll stick with tuneLength = 5. Caret tunes three hyperparameters:

mtry: number of randomly selected predictors. Default is sqrt(p).
splitrule: splitting rule. For classification, options are "gini" (default) and "extratrees".
min.node.size: minimal node size. Default is 1 for classification.

oj.frst = train(Purchase ~ .,
data = oj.train,
method = "ranger", # for random forest
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "ROC", # evaluate hyperparamter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final", # save predictions for the optimal tuning parameter1
classProbs = TRUE, # return class probabilities in addition to predicted values
summaryFunction = twoClassSummary # for binary response variable
)
)
oj.frst
## Random Forest
##
## 857 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 770, 772, 772, 771, 771, 771, ...
## Resampling results across tuning parameters:
##
## mtry splitrule ROC Sens Spec
## 2 gini 0.8670943 0.8720610 0.7004456
## 2 extratrees 0.8586046 0.8854499 0.6378788
## 5 gini 0.8693983 0.8548621 0.7213904
## 5 extratrees 0.8683829 0.8567126 0.6915330
## 9 gini 0.8685985 0.8433237 0.7364528
## 9 extratrees 0.8684591 0.8395138 0.6976827
## 13 gini 0.8667643 0.8414369 0.7422460
## 13 extratrees 0.8660050 0.8318578 0.7035651
## 17 gini 0.8608692 0.8357039 0.7272727
## 17 extratrees 0.8652581 0.8298621 0.7036542
##
## Tuning parameter 'min.node.size' was held constant at a value of 1
## ROC was used to select the optimal model using the largest value.
## The final values used for the model were mtry = 5, splitrule = gini
## and min.node.size = 1.
oj.pred <- predict(oj.frst, oj.test, type = "raw")
plot(oj.test$Purchase, oj.pred,
main = "Random Forest Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
(oj.conf <- confusionMatrix(oj.pred, oj.test$Purchase)) # prints the summary below
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 110 17
## MM 20 66
##
## Accuracy : 0.8263
## 95% CI : (0.7686, 0.8746)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 7.121e-12
##
## Kappa : 0.6372
##
## Mcnemar's Test P-Value : 0.7423
##
## Sensitivity : 0.8462
## Specificity : 0.7952
## Pos Pred Value : 0.8661
## Neg Pred Value : 0.7674
## Prevalence : 0.6103
## Detection Rate : 0.5164
## Detection Prevalence : 0.5962
## Balanced Accuracy : 0.8207
##
## 'Positive' Class : CH
##
oj.frst.acc <- as.numeric(oj.conf$overall[1])
rm(oj.pred)
rm(oj.conf)
plot(oj.frst) # ROC vs mtry, grouped by splitting rule
The model summary explains "ROC was used to select the optimal model using the largest value. The final values used for the model were mtry = 5, splitrule = gini and min.node.size = 1." You can see the results of the tuning grid combinations in the associated plot of ROC AUC vs mtry, grouped by splitting rule.
The bagging (accuracy = 0.81221) and random forest (accuracy = 0.82629) models fared pretty well, but the manual classification tree is still in first place. There's still gradient boosting to investigate!
rbind(data.frame(model = "Manual Class", Accuracy = round(oj.class.acc, 5)),
data.frame(model = "Class w/tuneLength", Accuracy = round(oj.class.acc2, 5)),
data.frame(model = "Class w.tuneGrid", Accuracy = round(oj.class.acc3, 5)),
data.frame(model = "Bagging", Accuracy = round(oj.bag.acc, 5)),
data.frame(model = "Random Forest", Accuracy = round(oj.frst.acc, 5))
) %>% arrange(desc(Accuracy))
## model Accuracy
## 1 Manual Class 0.85446
## 2 Class w.tuneGrid 0.84977
## 3 Class w/tuneLength 0.84507
## 4 Random Forest 0.82629
## 5 Bagging 0.81221
Again using the Carseats data set to predict Sales, this time I'll use the bagging method by specifying method = "treebag". I'll use tuneLength = 5 and not worry about tuneGrid anymore. Caret has no hyperparameters to tune with this model.
carseats.bag = train(Sales ~ .,
data = carseats.train,
method = "treebag", # for bagging
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "RMSE", # evaluate hyperparamter combinations with RMSE
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final" # save predictions for the optimal tuning parameter1
)
)
carseats.bag
## Bagged CART
##
## 321 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 289, 289, 289, 289, 288, 289, ...
## Resampling results:
##
## RMSE Rsquared MAE
## 1.806068 0.6267001 1.461412
#plot(carseats.bag$finalModel)
carseats.pred <- predict(carseats.bag, carseats.test, type = "raw")
plot(carseats.test$Sales, carseats.pred,
main = "Bagging Regression: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
abline(0, 1)
(carseats.bag.rmse <- RMSE(carseats.pred, carseats.test$Sales)) # prints the value below
## [1] 1.583801
Now I'll try it with the random forest method by specifying method = "ranger". I'll stick with tuneLength = 5. Caret tunes three hyperparameters:

mtry: number of randomly selected predictors
splitrule: splitting rule. For regression, options are "variance" (default), "extratrees", and "maxstat".
min.node.size: minimal node size

carseats.frst = train(Sales ~ .,
data = carseats.train,
method = "ranger", # for random forest
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "RMSE", # evaluate hyperparamter combinations with RMSE
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final" # save predictions for the optimal tuning parameter1
)
)
carseats.frst
## Random Forest
##
## 321 samples
## 10 predictor
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 289, 289, 289, 289, 289, 288, ...
## Resampling results across tuning parameters:
##
## mtry splitrule RMSE Rsquared MAE
## 2 variance 1.913948 0.6658539 1.526403
## 2 extratrees 1.999992 0.6222039 1.602629
## 4 variance 1.701275 0.6991667 1.347109
## 4 extratrees 1.804496 0.6544198 1.442011
## 6 variance 1.639042 0.7064980 1.298627
## 6 extratrees 1.721709 0.6744264 1.381047
## 8 variance 1.640807 0.6935796 1.306517
## 8 extratrees 1.702956 0.6724439 1.366531
## 11 variance 1.638256 0.6898718 1.314992
## 11 extratrees 1.680607 0.6761447 1.343343
##
## Tuning parameter 'min.node.size' was held constant at a value of 5
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were mtry = 11, splitrule =
## variance and min.node.size = 5.
carseats.pred <- predict(carseats.frst, carseats.test, type = "raw")
plot(carseats.test$Sales, carseats.pred,
main = "Random Forest Regression: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
abline(0, 1)
(carseats.frst.rmse <- RMSE(carseats.pred, carseats.test$Sales)) # prints the value below
## [1] 1.333577
rm(carseats.pred)
#plot(varImp(carseats.frst), main="Variable Importance with Regression Random Forest")
The model summary explains "RMSE was used to select the optimal model using the smallest value. The final values used for the model were mtry = 11, splitrule = variance and min.node.size = 5." You can see the results of the tuning grid combinations in the associated plot of RMSE vs mtry, grouped by splitting rule.
The bagging and random forest models fared very well, taking over first and second place!
rbind(data.frame(model = "Manual ANOVA", RMSE = round(carseats.anova.rmse, 5)),
data.frame(model = "ANOVA w/tuneLength", RMSE = round(carseats.anova.rmse2, 5)),
data.frame(model = "ANOVA w.tuneGrid", RMSE = round(carseats.anova.rmse3, 5)),
data.frame(model = "Bagging", RMSE = round(carseats.bag.rmse, 5)),
data.frame(model = "Random Forest", RMSE = round(carseats.frst.rmse, 5))
) %>% arrange(RMSE)
## model RMSE
## 1 Random Forest 1.33358
## 2 Bagging 1.58380
## 3 ANOVA w/tuneLength 2.04190
## 4 ANOVA w.tuneGrid 2.14820
## 5 Manual ANOVA 2.17482
Boosting is a method to improve (boost) weak learners sequentially and increase the model accuracy with a combined model. There are several boosting algorithms. One of the earliest was AdaBoost (adaptive boosting). A more recent innovation is gradient boosting.
AdaBoost creates a single-split tree (a decision stump), then weights the observations by how well the initial tree performed, putting more weight on the difficult observations. It then creates a second tree using the weights so that it focuses on the difficult observations. Observations that are difficult to classify receive increasingly larger weights until the algorithm identifies a model that correctly classifies them. The final model returns predictions that are a majority vote. (AdaBoost was originally designed for classification problems, not regression.)
Gradient boosting generalizes the AdaBoost method, so that the objective is to minimize a loss function. For classification problems, the loss function is the log-loss; for regression problems, the loss function is the mean squared error. The regression trees are additive, so that successive models can be added together to correct the residuals of the earlier models. Gradient boosting constructs its trees in a "greedy" manner, meaning it chooses the best splits based on purity scores like Gini, or by minimizing the loss. It is common to constrain the weak learners by setting maximum tree size parameters. Gradient boosting continues until it reaches the maximum number of trees or an acceptable error level. This can result in overfitting, so it is common to employ regularization methods that penalize aspects of the model.
Tree Constraints. In general, the more constrained the trees, the more trees need to be grown. Parameters to optimize include the number of trees, tree depth, number of nodes, minimum observations per split, and minimum improvement to the loss.
Learning Rate. Each successive tree can be weighted to slow down the learning rate. Decreasing the learning rate increases the number of required trees. Common learning rates are 0.1 to 0.3.
The gradient boosting algorithm fits a shallow tree to the data, $f_1(x)$. Then it fits a second tree, $h_1(x)$, to the residuals and adds it to the first model: $f_2(x) = f_1(x) + h_1(x)$. For regularized boosting, include a learning rate factor $\lambda$: $f_2(x) = f_1(x) + \lambda h_1(x)$. A larger $\lambda$ produces faster learning, but risks overfitting. The process repeats until the residuals are small enough, or until it reaches the maximum number of iterations. Because overfitting is a risk, use cross-validation to select the appropriate number of trees (the number of trees producing the lowest RMSE).
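To make the additive structure concrete, here is a minimal hand-rolled sketch for a numeric response under squared-error loss; boost_fit and boost_predict are hypothetical helpers, not the gbm implementation used below:

library(rpart)

# Hypothetical hand-rolled gradient booster for regression (illustration only).
# x: data frame of predictors; y: numeric response.
boost_fit <- function(x, y, B = 100, lambda = 0.1, depth = 2) {
  pred <- rep(mean(y), length(y)) # f_0: constant prediction
  trees <- vector("list", B)
  dat <- data.frame(x)
  for (b in seq_len(B)) {
    dat$res <- y - pred # residuals of the current model
    trees[[b]] <- rpart(res ~ ., data = dat, # shallow tree fit to the residuals
                        control = rpart.control(maxdepth = depth, cp = 0))
    pred <- pred + lambda * predict(trees[[b]], dat) # shrunken additive update
  }
  list(init = mean(y), trees = trees, lambda = lambda)
}

boost_predict <- function(model, newdata) {
  preds <- sapply(model$trees, predict, newdata = data.frame(newdata))
  model$init + model$lambda * rowSums(preds)
}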
Again using the OJ data set to predict Purchase, this time I'll use the gradient boosting method by specifying method = "gbm". I'll use tuneLength = 5 and not worry about tuneGrid anymore. Caret tunes the following hyperparameters:

n.trees: number of boosting iterations
interaction.depth: maximum tree depth
shrinkage: shrinkage (the learning rate)
n.minobsinnode: minimum terminal node size

oj.gbm <- train(Purchase ~ .,
data = oj.train,
method = "gbm", # for bagged tree
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "ROC", # evaluate hyperparamter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final", # save predictions for the optimal tuning parameter1
classProbs = TRUE, # return class probabilities in addition to predicted values
summaryFunction = twoClassSummary # for binary response variable
),
verbose = FALSE # suppress gbm's per-iteration training log (passed through to gbm)
)
## 2 1.1819 nan 0.1000 0.0323
## 3 1.1177 nan 0.1000 0.0285
## 4 1.0698 nan 0.1000 0.0214
## 5 1.0292 nan 0.1000 0.0170
## 6 0.9856 nan 0.1000 0.0174
## 7 0.9564 nan 0.1000 0.0094
## 8 0.9268 nan 0.1000 0.0116
## 9 0.9037 nan 0.1000 0.0094
## 10 0.8817 nan 0.1000 0.0078
## 20 0.7600 nan 0.1000 0.0009
## 40 0.6794 nan 0.1000 -0.0004
## 60 0.6356 nan 0.1000 -0.0012
## 80 0.6033 nan 0.1000 -0.0026
## 100 0.5751 nan 0.1000 -0.0016
## 120 0.5513 nan 0.1000 -0.0020
## 140 0.5340 nan 0.1000 -0.0024
## 160 0.5136 nan 0.1000 -0.0030
## 180 0.4960 nan 0.1000 -0.0027
## 200 0.4813 nan 0.1000 -0.0008
## 220 0.4666 nan 0.1000 -0.0014
## 240 0.4498 nan 0.1000 -0.0017
## 250 0.4447 nan 0.1000 -0.0027
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2793 nan 0.1000 0.0294
## 2 1.2337 nan 0.1000 0.0211
## 3 1.1880 nan 0.1000 0.0205
## 4 1.1507 nan 0.1000 0.0165
## 5 1.1256 nan 0.1000 0.0134
## 6 1.1010 nan 0.1000 0.0096
## 7 1.0756 nan 0.1000 0.0122
## 8 1.0580 nan 0.1000 0.0076
## 9 1.0419 nan 0.1000 0.0070
## 10 1.0228 nan 0.1000 0.0089
## 20 0.9162 nan 0.1000 0.0014
## 40 0.8210 nan 0.1000 0.0003
## 60 0.7833 nan 0.1000 0.0000
## 80 0.7630 nan 0.1000 -0.0009
## 100 0.7516 nan 0.1000 -0.0003
## 120 0.7441 nan 0.1000 -0.0009
## 140 0.7391 nan 0.1000 -0.0009
## 160 0.7323 nan 0.1000 -0.0008
## 180 0.7283 nan 0.1000 -0.0007
## 200 0.7256 nan 0.1000 -0.0005
## 220 0.7232 nan 0.1000 -0.0007
## 240 0.7192 nan 0.1000 -0.0007
## 250 0.7177 nan 0.1000 -0.0009
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2655 nan 0.1000 0.0348
## 2 1.2066 nan 0.1000 0.0255
## 3 1.1640 nan 0.1000 0.0209
## 4 1.1241 nan 0.1000 0.0185
## 5 1.0881 nan 0.1000 0.0149
## 6 1.0570 nan 0.1000 0.0129
## 7 1.0289 nan 0.1000 0.0103
## 8 1.0014 nan 0.1000 0.0117
## 9 0.9814 nan 0.1000 0.0077
## 10 0.9604 nan 0.1000 0.0074
## 20 0.8341 nan 0.1000 0.0015
## 40 0.7549 nan 0.1000 -0.0006
## 60 0.7298 nan 0.1000 -0.0006
## 80 0.7110 nan 0.1000 -0.0007
## 100 0.6956 nan 0.1000 -0.0009
## 120 0.6830 nan 0.1000 -0.0020
## 140 0.6686 nan 0.1000 -0.0010
## 160 0.6591 nan 0.1000 -0.0014
## 180 0.6469 nan 0.1000 -0.0014
## 200 0.6361 nan 0.1000 -0.0014
## 220 0.6265 nan 0.1000 -0.0017
## 240 0.6194 nan 0.1000 -0.0012
## 250 0.6130 nan 0.1000 -0.0014
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2599 nan 0.1000 0.0355
## 2 1.1914 nan 0.1000 0.0314
## 3 1.1333 nan 0.1000 0.0256
## 4 1.0873 nan 0.1000 0.0204
## 5 1.0480 nan 0.1000 0.0183
## 6 1.0163 nan 0.1000 0.0148
## 7 0.9827 nan 0.1000 0.0147
## 8 0.9572 nan 0.1000 0.0098
## 9 0.9313 nan 0.1000 0.0108
## 10 0.9086 nan 0.1000 0.0107
## 20 0.7947 nan 0.1000 0.0002
## 40 0.7194 nan 0.1000 -0.0010
## 60 0.6873 nan 0.1000 -0.0010
## 80 0.6681 nan 0.1000 -0.0007
## 100 0.6463 nan 0.1000 -0.0019
## 120 0.6261 nan 0.1000 -0.0014
## 140 0.6082 nan 0.1000 -0.0019
## 160 0.5918 nan 0.1000 -0.0021
## 180 0.5794 nan 0.1000 -0.0015
## 200 0.5672 nan 0.1000 -0.0019
## 220 0.5546 nan 0.1000 -0.0020
## 240 0.5432 nan 0.1000 -0.0016
## 250 0.5382 nan 0.1000 -0.0019
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2497 nan 0.1000 0.0392
## 2 1.1782 nan 0.1000 0.0312
## 3 1.1203 nan 0.1000 0.0259
## 4 1.0800 nan 0.1000 0.0189
## 5 1.0352 nan 0.1000 0.0156
## 6 0.9983 nan 0.1000 0.0166
## 7 0.9615 nan 0.1000 0.0149
## 8 0.9335 nan 0.1000 0.0099
## 9 0.9102 nan 0.1000 0.0099
## 10 0.8922 nan 0.1000 0.0082
## 20 0.7734 nan 0.1000 0.0012
## 40 0.7027 nan 0.1000 -0.0004
## 60 0.6518 nan 0.1000 -0.0013
## 80 0.6182 nan 0.1000 -0.0003
## 100 0.5960 nan 0.1000 -0.0020
## 120 0.5768 nan 0.1000 -0.0016
## 140 0.5563 nan 0.1000 -0.0029
## 160 0.5373 nan 0.1000 -0.0028
## 180 0.5187 nan 0.1000 -0.0023
## 200 0.5029 nan 0.1000 -0.0024
## 220 0.4921 nan 0.1000 -0.0032
## 240 0.4793 nan 0.1000 -0.0021
## 250 0.4704 nan 0.1000 -0.0030
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2478 nan 0.1000 0.0410
## 2 1.1723 nan 0.1000 0.0328
## 3 1.1143 nan 0.1000 0.0231
## 4 1.0612 nan 0.1000 0.0252
## 5 1.0175 nan 0.1000 0.0204
## 6 0.9820 nan 0.1000 0.0142
## 7 0.9474 nan 0.1000 0.0140
## 8 0.9195 nan 0.1000 0.0109
## 9 0.8920 nan 0.1000 0.0080
## 10 0.8715 nan 0.1000 0.0072
## 20 0.7519 nan 0.1000 0.0006
## 40 0.6685 nan 0.1000 -0.0015
## 60 0.6283 nan 0.1000 -0.0036
## 80 0.5935 nan 0.1000 -0.0012
## 100 0.5617 nan 0.1000 -0.0017
## 120 0.5326 nan 0.1000 -0.0019
## 140 0.5034 nan 0.1000 -0.0018
## 160 0.4834 nan 0.1000 -0.0010
## 180 0.4657 nan 0.1000 -0.0024
## 200 0.4493 nan 0.1000 -0.0019
## 220 0.4330 nan 0.1000 -0.0021
## 240 0.4199 nan 0.1000 -0.0020
## 250 0.4146 nan 0.1000 -0.0022
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2695 nan 0.1000 0.0326
## 2 1.2152 nan 0.1000 0.0247
## 3 1.1754 nan 0.1000 0.0204
## 4 1.1362 nan 0.1000 0.0172
## 5 1.1024 nan 0.1000 0.0129
## 6 1.0755 nan 0.1000 0.0127
## 7 1.0508 nan 0.1000 0.0099
## 8 1.0303 nan 0.1000 0.0082
## 9 1.0170 nan 0.1000 0.0035
## 10 0.9986 nan 0.1000 0.0091
## 20 0.8966 nan 0.1000 0.0029
## 40 0.8035 nan 0.1000 0.0009
## 60 0.7677 nan 0.1000 -0.0002
## 80 0.7470 nan 0.1000 -0.0005
## 100 0.7358 nan 0.1000 -0.0006
## 120 0.7296 nan 0.1000 -0.0006
## 140 0.7232 nan 0.1000 -0.0004
## 160 0.7186 nan 0.1000 -0.0004
## 180 0.7154 nan 0.1000 -0.0005
## 200 0.7112 nan 0.1000 -0.0008
## 220 0.7086 nan 0.1000 -0.0008
## 240 0.7056 nan 0.1000 -0.0008
## 250 0.7035 nan 0.1000 -0.0002
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2616 nan 0.1000 0.0341
## 2 1.1989 nan 0.1000 0.0297
## 3 1.1530 nan 0.1000 0.0228
## 4 1.1080 nan 0.1000 0.0207
## 5 1.0681 nan 0.1000 0.0167
## 6 1.0352 nan 0.1000 0.0143
## 7 1.0070 nan 0.1000 0.0108
## 8 0.9791 nan 0.1000 0.0094
## 9 0.9556 nan 0.1000 0.0107
## 10 0.9401 nan 0.1000 0.0070
## 20 0.8221 nan 0.1000 0.0021
## 40 0.7419 nan 0.1000 0.0000
## 60 0.7153 nan 0.1000 -0.0014
## 80 0.6964 nan 0.1000 -0.0007
## 100 0.6807 nan 0.1000 -0.0008
## 120 0.6636 nan 0.1000 -0.0016
## 140 0.6503 nan 0.1000 -0.0008
## 160 0.6420 nan 0.1000 -0.0009
## 180 0.6313 nan 0.1000 -0.0011
## 200 0.6217 nan 0.1000 -0.0008
## 220 0.6128 nan 0.1000 -0.0016
## 240 0.6023 nan 0.1000 -0.0017
## 250 0.5993 nan 0.1000 -0.0004
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2513 nan 0.1000 0.0399
## 2 1.1900 nan 0.1000 0.0319
## 3 1.1299 nan 0.1000 0.0287
## 4 1.0744 nan 0.1000 0.0245
## 5 1.0358 nan 0.1000 0.0188
## 6 1.0012 nan 0.1000 0.0154
## 7 0.9720 nan 0.1000 0.0121
## 8 0.9448 nan 0.1000 0.0120
## 9 0.9199 nan 0.1000 0.0081
## 10 0.9006 nan 0.1000 0.0092
## 20 0.7763 nan 0.1000 0.0011
## 40 0.7048 nan 0.1000 -0.0014
## 60 0.6685 nan 0.1000 -0.0009
## 80 0.6493 nan 0.1000 -0.0026
## 100 0.6298 nan 0.1000 -0.0021
## 120 0.6104 nan 0.1000 -0.0020
## 140 0.5975 nan 0.1000 -0.0013
## 160 0.5797 nan 0.1000 -0.0018
## 180 0.5674 nan 0.1000 -0.0009
## 200 0.5548 nan 0.1000 -0.0023
## 220 0.5433 nan 0.1000 -0.0017
## 240 0.5321 nan 0.1000 -0.0015
## 250 0.5228 nan 0.1000 -0.0015
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2548 nan 0.1000 0.0370
## 2 1.1816 nan 0.1000 0.0331
## 3 1.1221 nan 0.1000 0.0277
## 4 1.0701 nan 0.1000 0.0237
## 5 1.0253 nan 0.1000 0.0222
## 6 0.9885 nan 0.1000 0.0166
## 7 0.9537 nan 0.1000 0.0164
## 8 0.9270 nan 0.1000 0.0106
## 9 0.9049 nan 0.1000 0.0081
## 10 0.8822 nan 0.1000 0.0083
## 20 0.7643 nan 0.1000 0.0013
## 40 0.6871 nan 0.1000 -0.0021
## 60 0.6451 nan 0.1000 -0.0000
## 80 0.6161 nan 0.1000 -0.0024
## 100 0.5833 nan 0.1000 -0.0019
## 120 0.5584 nan 0.1000 -0.0004
## 140 0.5383 nan 0.1000 -0.0023
## 160 0.5202 nan 0.1000 -0.0010
## 180 0.5013 nan 0.1000 -0.0026
## 200 0.4858 nan 0.1000 -0.0020
## 220 0.4713 nan 0.1000 -0.0015
## 240 0.4589 nan 0.1000 -0.0014
## 250 0.4518 nan 0.1000 -0.0019
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2435 nan 0.1000 0.0382
## 2 1.1704 nan 0.1000 0.0350
## 3 1.1045 nan 0.1000 0.0304
## 4 1.0572 nan 0.1000 0.0225
## 5 1.0121 nan 0.1000 0.0197
## 6 0.9712 nan 0.1000 0.0164
## 7 0.9385 nan 0.1000 0.0138
## 8 0.9064 nan 0.1000 0.0133
## 9 0.8792 nan 0.1000 0.0109
## 10 0.8592 nan 0.1000 0.0049
## 20 0.7289 nan 0.1000 0.0017
## 40 0.6539 nan 0.1000 -0.0015
## 60 0.6135 nan 0.1000 -0.0024
## 80 0.5764 nan 0.1000 -0.0019
## 100 0.5433 nan 0.1000 -0.0035
## 120 0.5145 nan 0.1000 -0.0015
## 140 0.4949 nan 0.1000 -0.0006
## 160 0.4762 nan 0.1000 -0.0013
## 180 0.4553 nan 0.1000 -0.0018
## 200 0.4397 nan 0.1000 -0.0024
## 220 0.4259 nan 0.1000 -0.0018
## 240 0.4106 nan 0.1000 -0.0021
## 250 0.4031 nan 0.1000 -0.0019
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2668 nan 0.1000 0.0307
## 2 1.2112 nan 0.1000 0.0233
## 3 1.1693 nan 0.1000 0.0194
## 4 1.1331 nan 0.1000 0.0168
## 5 1.1037 nan 0.1000 0.0129
## 6 1.0755 nan 0.1000 0.0110
## 7 1.0556 nan 0.1000 0.0095
## 8 1.0362 nan 0.1000 0.0082
## 9 1.0212 nan 0.1000 0.0066
## 10 1.0053 nan 0.1000 0.0072
## 20 0.9106 nan 0.1000 0.0034
## 40 0.8137 nan 0.1000 0.0005
## 60 0.7743 nan 0.1000 -0.0001
## 80 0.7563 nan 0.1000 -0.0007
## 100 0.7466 nan 0.1000 -0.0011
## 120 0.7378 nan 0.1000 0.0000
## 140 0.7310 nan 0.1000 -0.0007
## 160 0.7265 nan 0.1000 -0.0009
## 180 0.7214 nan 0.1000 -0.0002
## 200 0.7177 nan 0.1000 -0.0008
## 220 0.7146 nan 0.1000 -0.0013
## 240 0.7110 nan 0.1000 -0.0014
## 250 0.7085 nan 0.1000 -0.0009
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2621 nan 0.1000 0.0350
## 2 1.2045 nan 0.1000 0.0289
## 3 1.1569 nan 0.1000 0.0218
## 4 1.1136 nan 0.1000 0.0215
## 5 1.0750 nan 0.1000 0.0185
## 6 1.0451 nan 0.1000 0.0117
## 7 1.0186 nan 0.1000 0.0118
## 8 0.9931 nan 0.1000 0.0097
## 9 0.9720 nan 0.1000 0.0071
## 10 0.9541 nan 0.1000 0.0082
## 20 0.8360 nan 0.1000 0.0027
## 40 0.7541 nan 0.1000 0.0003
## 60 0.7264 nan 0.1000 -0.0024
## 80 0.7067 nan 0.1000 -0.0012
## 100 0.6899 nan 0.1000 -0.0006
## 120 0.6787 nan 0.1000 -0.0004
## 140 0.6631 nan 0.1000 -0.0002
## 160 0.6507 nan 0.1000 -0.0008
## 180 0.6405 nan 0.1000 -0.0010
## 200 0.6339 nan 0.1000 -0.0007
## 220 0.6242 nan 0.1000 -0.0007
## 240 0.6154 nan 0.1000 -0.0011
## 250 0.6095 nan 0.1000 -0.0006
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2632 nan 0.1000 0.0342
## 2 1.1959 nan 0.1000 0.0287
## 3 1.1429 nan 0.1000 0.0267
## 4 1.0972 nan 0.1000 0.0205
## 5 1.0593 nan 0.1000 0.0168
## 6 1.0207 nan 0.1000 0.0173
## 7 0.9888 nan 0.1000 0.0136
## 8 0.9621 nan 0.1000 0.0112
## 9 0.9388 nan 0.1000 0.0092
## 10 0.9150 nan 0.1000 0.0087
## 20 0.7902 nan 0.1000 0.0007
## 40 0.7133 nan 0.1000 -0.0014
## 60 0.6798 nan 0.1000 -0.0010
## 80 0.6580 nan 0.1000 -0.0017
## 100 0.6396 nan 0.1000 -0.0010
## 120 0.6209 nan 0.1000 -0.0007
## 140 0.6030 nan 0.1000 -0.0016
## 160 0.5849 nan 0.1000 -0.0017
## 180 0.5733 nan 0.1000 -0.0009
## 200 0.5586 nan 0.1000 -0.0016
## 220 0.5420 nan 0.1000 -0.0007
## 240 0.5324 nan 0.1000 -0.0009
## 250 0.5271 nan 0.1000 -0.0013
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2484 nan 0.1000 0.0427
## 2 1.1835 nan 0.1000 0.0321
## 3 1.1229 nan 0.1000 0.0279
## 4 1.0706 nan 0.1000 0.0217
## 5 1.0298 nan 0.1000 0.0184
## 6 0.9923 nan 0.1000 0.0169
## 7 0.9583 nan 0.1000 0.0143
## 8 0.9299 nan 0.1000 0.0095
## 9 0.9028 nan 0.1000 0.0102
## 10 0.8808 nan 0.1000 0.0061
## 20 0.7622 nan 0.1000 0.0014
## 40 0.6812 nan 0.1000 -0.0014
## 60 0.6394 nan 0.1000 0.0001
## 80 0.6043 nan 0.1000 -0.0018
## 100 0.5808 nan 0.1000 -0.0028
## 120 0.5575 nan 0.1000 -0.0015
## 140 0.5348 nan 0.1000 -0.0015
## 160 0.5172 nan 0.1000 -0.0015
## 180 0.5038 nan 0.1000 -0.0018
## 200 0.4892 nan 0.1000 -0.0015
## 220 0.4736 nan 0.1000 -0.0013
## 240 0.4591 nan 0.1000 -0.0021
## 250 0.4535 nan 0.1000 -0.0033
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2496 nan 0.1000 0.0437
## 2 1.1761 nan 0.1000 0.0321
## 3 1.1154 nan 0.1000 0.0290
## 4 1.0631 nan 0.1000 0.0229
## 5 1.0186 nan 0.1000 0.0182
## 6 0.9765 nan 0.1000 0.0176
## 7 0.9429 nan 0.1000 0.0132
## 8 0.9157 nan 0.1000 0.0120
## 9 0.8892 nan 0.1000 0.0096
## 10 0.8652 nan 0.1000 0.0079
## 20 0.7400 nan 0.1000 0.0021
## 40 0.6617 nan 0.1000 -0.0015
## 60 0.6131 nan 0.1000 -0.0025
## 80 0.5793 nan 0.1000 -0.0019
## 100 0.5491 nan 0.1000 -0.0019
## 120 0.5207 nan 0.1000 -0.0028
## 140 0.5000 nan 0.1000 -0.0022
## 160 0.4821 nan 0.1000 -0.0021
## 180 0.4616 nan 0.1000 -0.0017
## 200 0.4435 nan 0.1000 -0.0013
## 220 0.4325 nan 0.1000 -0.0019
## 240 0.4151 nan 0.1000 -0.0013
## 250 0.4090 nan 0.1000 -0.0017
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2777 nan 0.1000 0.0280
## 2 1.2263 nan 0.1000 0.0242
## 3 1.1908 nan 0.1000 0.0176
## 4 1.1537 nan 0.1000 0.0180
## 5 1.1227 nan 0.1000 0.0144
## 6 1.0952 nan 0.1000 0.0108
## 7 1.0723 nan 0.1000 0.0103
## 8 1.0549 nan 0.1000 0.0080
## 9 1.0377 nan 0.1000 0.0086
## 10 1.0240 nan 0.1000 0.0067
## 20 0.9253 nan 0.1000 0.0032
## 40 0.8406 nan 0.1000 -0.0002
## 60 0.8031 nan 0.1000 0.0004
## 80 0.7833 nan 0.1000 -0.0001
## 100 0.7738 nan 0.1000 -0.0002
## 120 0.7681 nan 0.1000 -0.0007
## 140 0.7630 nan 0.1000 -0.0005
## 160 0.7596 nan 0.1000 -0.0009
## 180 0.7568 nan 0.1000 -0.0007
## 200 0.7531 nan 0.1000 -0.0018
## 220 0.7490 nan 0.1000 -0.0002
## 240 0.7460 nan 0.1000 -0.0012
## 250 0.7454 nan 0.1000 -0.0004
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2638 nan 0.1000 0.0338
## 2 1.2117 nan 0.1000 0.0254
## 3 1.1685 nan 0.1000 0.0218
## 4 1.1283 nan 0.1000 0.0182
## 5 1.0896 nan 0.1000 0.0164
## 6 1.0572 nan 0.1000 0.0126
## 7 1.0313 nan 0.1000 0.0109
## 8 1.0077 nan 0.1000 0.0106
## 9 0.9873 nan 0.1000 0.0085
## 10 0.9705 nan 0.1000 0.0073
## 20 0.8526 nan 0.1000 0.0032
## 40 0.7763 nan 0.1000 -0.0002
## 60 0.7483 nan 0.1000 -0.0013
## 80 0.7323 nan 0.1000 -0.0002
## 100 0.7187 nan 0.1000 -0.0007
## 120 0.7092 nan 0.1000 -0.0013
## 140 0.6976 nan 0.1000 -0.0017
## 160 0.6877 nan 0.1000 -0.0006
## 180 0.6731 nan 0.1000 -0.0019
## 200 0.6646 nan 0.1000 -0.0016
## 220 0.6556 nan 0.1000 -0.0013
## 240 0.6494 nan 0.1000 -0.0018
## 250 0.6453 nan 0.1000 -0.0015
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2641 nan 0.1000 0.0402
## 2 1.2046 nan 0.1000 0.0273
## 3 1.1480 nan 0.1000 0.0276
## 4 1.1017 nan 0.1000 0.0216
## 5 1.0650 nan 0.1000 0.0186
## 6 1.0284 nan 0.1000 0.0161
## 7 0.9991 nan 0.1000 0.0112
## 8 0.9733 nan 0.1000 0.0087
## 9 0.9507 nan 0.1000 0.0102
## 10 0.9321 nan 0.1000 0.0077
## 20 0.8071 nan 0.1000 0.0020
## 40 0.7351 nan 0.1000 -0.0005
## 60 0.7083 nan 0.1000 -0.0019
## 80 0.6865 nan 0.1000 -0.0016
## 100 0.6688 nan 0.1000 -0.0016
## 120 0.6519 nan 0.1000 -0.0019
## 140 0.6363 nan 0.1000 -0.0021
## 160 0.6254 nan 0.1000 -0.0022
## 180 0.6116 nan 0.1000 -0.0010
## 200 0.5989 nan 0.1000 -0.0021
## 220 0.5885 nan 0.1000 -0.0016
## 240 0.5770 nan 0.1000 -0.0010
## 250 0.5739 nan 0.1000 -0.0011
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2532 nan 0.1000 0.0400
## 2 1.1877 nan 0.1000 0.0276
## 3 1.1326 nan 0.1000 0.0233
## 4 1.0851 nan 0.1000 0.0189
## 5 1.0445 nan 0.1000 0.0170
## 6 1.0116 nan 0.1000 0.0129
## 7 0.9751 nan 0.1000 0.0153
## 8 0.9487 nan 0.1000 0.0125
## 9 0.9245 nan 0.1000 0.0108
## 10 0.9063 nan 0.1000 0.0073
## 20 0.7880 nan 0.1000 0.0018
## 40 0.7174 nan 0.1000 0.0003
## 60 0.6804 nan 0.1000 -0.0009
## 80 0.6517 nan 0.1000 -0.0010
## 100 0.6292 nan 0.1000 -0.0023
## 120 0.6035 nan 0.1000 -0.0021
## 140 0.5854 nan 0.1000 -0.0014
## 160 0.5662 nan 0.1000 -0.0015
## 180 0.5521 nan 0.1000 -0.0027
## 200 0.5364 nan 0.1000 -0.0006
## 220 0.5267 nan 0.1000 -0.0016
## 240 0.5163 nan 0.1000 -0.0023
## 250 0.5060 nan 0.1000 -0.0022
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2483 nan 0.1000 0.0408
## 2 1.1785 nan 0.1000 0.0316
## 3 1.1275 nan 0.1000 0.0260
## 4 1.0783 nan 0.1000 0.0229
## 5 1.0389 nan 0.1000 0.0168
## 6 0.9976 nan 0.1000 0.0163
## 7 0.9681 nan 0.1000 0.0120
## 8 0.9428 nan 0.1000 0.0113
## 9 0.9165 nan 0.1000 0.0118
## 10 0.8922 nan 0.1000 0.0095
## 20 0.7726 nan 0.1000 0.0008
## 40 0.6919 nan 0.1000 -0.0021
## 60 0.6454 nan 0.1000 -0.0016
## 80 0.6132 nan 0.1000 -0.0018
## 100 0.5819 nan 0.1000 -0.0031
## 120 0.5608 nan 0.1000 -0.0029
## 140 0.5386 nan 0.1000 -0.0016
## 160 0.5133 nan 0.1000 -0.0024
## 180 0.4984 nan 0.1000 -0.0014
## 200 0.4773 nan 0.1000 -0.0009
## 220 0.4570 nan 0.1000 -0.0013
## 240 0.4444 nan 0.1000 -0.0017
## 250 0.4393 nan 0.1000 -0.0017
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 1.2499 nan 0.1000 0.0408
## 2 1.1698 nan 0.1000 0.0351
## 3 1.1118 nan 0.1000 0.0293
## 4 1.0622 nan 0.1000 0.0224
## 5 1.0194 nan 0.1000 0.0194
## 6 0.9791 nan 0.1000 0.0172
## 7 0.9469 nan 0.1000 0.0143
## 8 0.9232 nan 0.1000 0.0100
## 9 0.8984 nan 0.1000 0.0105
## 10 0.8774 nan 0.1000 0.0081
## 20 0.7633 nan 0.1000 0.0001
## 40 0.6875 nan 0.1000 -0.0011
## 50 0.6696 nan 0.1000 -0.0015
## Stochastic Gradient Boosting
##
## 857 samples
## 17 predictor
## 2 classes: 'CH', 'MM'
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 771, 771, 771, 770, 772, 772, ...
## Resampling results across tuning parameters:
##
## interaction.depth n.trees ROC Sens Spec
## 1 50 0.8892582 0.8738389 0.7368093
## 1 100 0.8915027 0.8739115 0.7486631
## 1 150 0.8903909 0.8757620 0.7546346
## 1 200 0.8899279 0.8757983 0.7547237
## 1 250 0.8900719 0.8719884 0.7636364
## 2 50 0.8899216 0.8814949 0.7546346
## 2 100 0.8884642 0.8777213 0.7666667
## 2 150 0.8859272 0.8700290 0.7545455
## 2 200 0.8830202 0.8700290 0.7544563
## 2 250 0.8838563 0.8644049 0.7543672
## 3 50 0.8922453 0.8643324 0.7516934
## 3 100 0.8865670 0.8604862 0.7573084
## 3 150 0.8870774 0.8681422 0.7543672
## 3 200 0.8851723 0.8623730 0.7454545
## 3 250 0.8829434 0.8586720 0.7512478
## 4 50 0.8883161 0.8681060 0.7573975
## 4 100 0.8845444 0.8643687 0.7483957
## 4 150 0.8835516 0.8548621 0.7423351
## 4 200 0.8812255 0.8549347 0.7483066
## 4 250 0.8814305 0.8567489 0.7483957
## 5 50 0.8922660 0.8662192 0.7573084
## 5 100 0.8858724 0.8567126 0.7693405
## 5 150 0.8808982 0.8566763 0.7483066
## 5 200 0.8794903 0.8528302 0.7541889
## 5 250 0.8757586 0.8624093 0.7451872
##
## Tuning parameter 'shrinkage' was held constant at a value of 0.1
##
## Tuning parameter 'n.minobsinnode' was held constant at a value of 10
## ROC was used to select the optimal model using the largest value.
## The final values used for the model were n.trees = 50, interaction.depth
## = 5, shrinkage = 0.1 and n.minobsinnode = 10.
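The resampling table above is dense, so it can help to view the tuning profile graphically. A minimal sketch, assuming the fitted oj.gbm object from the call above; caret's plot() method for train objects charts the ROC metric across the tuning grid.
plot(oj.gbm) # ROC vs. n.trees, one curve per interaction.depth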
oj.pred <- predict(oj.gbm, oj.test, type = "raw")
plot(oj.test$Purchase, oj.pred,
main = "Gradient Boosing Classification: Predicted vs. Actual",
xlab = "Actual",
ylab = "Predicted")
## Confusion Matrix and Statistics
##
## Reference
## Prediction CH MM
## CH 116 18
## MM 14 65
##
## Accuracy : 0.8498
## 95% CI : (0.7946, 0.8949)
## No Information Rate : 0.6103
## P-Value [Acc > NIR] : 1.778e-14
##
## Kappa : 0.6814
##
## Mcnemar's Test P-Value : 0.5959
##
## Sensitivity : 0.8923
## Specificity : 0.7831
## Pos Pred Value : 0.8657
## Neg Pred Value : 0.8228
## Prevalence : 0.6103
## Detection Rate : 0.5446
## Detection Prevalence : 0.6291
## Balanced Accuracy : 0.8377
##
## 'Positive' Class : CH
##
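The boosted classifier predicts roughly 85% of the held-out observations correctly, comfortably above the 61% no-information rate, with sensitivity 0.89 and specificity 0.78 for the positive class CH.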
Again using the Carseats data set to predict Sales, this time I'll use the gradient boosting method by specifying method = "gbm". I'll use tuneLength = 5 and not worry about tuneGrid anymore. Caret tunes the following hyperparameters.

- n.trees: the number of boosting iterations (increasing n.trees reduces the error on the training set, but may lead to over-fitting).
- interaction.depth: the maximum tree depth (the default six-node tree appears to do an excellent job).
- shrinkage: the learning rate. Shrinkage reduces the impact of each additional fitted base-learner (tree) by shrinking the size of the incremental steps, and thus penalizes the importance of each consecutive iteration. The intuition is that it is better to improve a model by taking many small steps than by taking fewer large steps; if one of the boosting iterations turns out to be erroneous, its negative impact can be corrected in subsequent steps.
- n.minobsinnode: the minimum terminal node size.

carseats.gbm <- train(Sales ~ .,
data = carseats.train,
method = "gbm", # for bagged tree
tuneLength = 5, # choose up to 5 combinations of tuning parameters
metric = "RMSE", # evaluate hyperparamter combinations with ROC
trControl = trainControl(
method = "cv", # k-fold cross validation
number = 10, # 10 folds
savePredictions = "final", # save predictions for the optimal tuning parameter1
verboseIter = FALSE,
returnData = FALSE
)
)
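Note that gbm itself prints a per-iteration training log for every cross-validation fold and tuning combination, which is what fills the output below; verboseIter only controls caret's own progress messages. Because train() forwards extra arguments to the underlying fitting function, passing verbose = FALSE should suppress the log. A minimal sketch under that assumption:
carseats.gbm <- train(Sales ~ .,
data = carseats.train,
method = "gbm",
tuneLength = 5,
metric = "RMSE",
trControl = trainControl(method = "cv", number = 10),
verbose = FALSE # forwarded to gbm() to silence the iteration log
)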
## (gbm iteration log truncated: per-iteration training deviance over 250 boosting iterations, printed once for each cross-validation fold and tuning parameter combination)
## 180 0.8038 nan 0.1000 -0.0073
## 200 0.7653 nan 0.1000 -0.0079
## 220 0.7353 nan 0.1000 -0.0046
## 240 0.7030 nan 0.1000 -0.0077
## 250 0.6903 nan 0.1000 -0.0032
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 7.4724 nan 0.1000 0.6452
## 2 6.9174 nan 0.1000 0.5396
## 3 6.6064 nan 0.1000 0.2289
## 4 6.2127 nan 0.1000 0.3609
## 5 5.9232 nan 0.1000 0.1823
## 6 5.5622 nan 0.1000 0.2514
## 7 5.2728 nan 0.1000 0.1167
## 8 5.0187 nan 0.1000 0.1736
## 9 4.7133 nan 0.1000 0.2278
## 10 4.4169 nan 0.1000 0.2513
## 20 2.9521 nan 0.1000 0.0446
## 40 1.7525 nan 0.1000 0.0055
## 60 1.2426 nan 0.1000 0.0122
## 80 1.0131 nan 0.1000 -0.0103
## 100 0.8673 nan 0.1000 -0.0073
## 120 0.7783 nan 0.1000 -0.0049
## 140 0.7051 nan 0.1000 -0.0059
## 160 0.6472 nan 0.1000 -0.0078
## 180 0.5961 nan 0.1000 -0.0030
## 200 0.5531 nan 0.1000 -0.0079
## 220 0.5177 nan 0.1000 -0.0028
## 240 0.4776 nan 0.1000 -0.0061
## 250 0.4629 nan 0.1000 -0.0041
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 7.4649 nan 0.1000 0.5539
## 2 6.8456 nan 0.1000 0.5161
## 3 6.2459 nan 0.1000 0.5490
## 4 5.7583 nan 0.1000 0.4540
## 5 5.4292 nan 0.1000 0.2813
## 6 5.0847 nan 0.1000 0.2896
## 7 4.8740 nan 0.1000 0.1733
## 8 4.6023 nan 0.1000 0.2496
## 9 4.3386 nan 0.1000 0.2157
## 10 4.1326 nan 0.1000 0.1087
## 20 2.6291 nan 0.1000 0.0502
## 40 1.4206 nan 0.1000 0.0193
## 60 1.0405 nan 0.1000 -0.0048
## 80 0.8639 nan 0.1000 -0.0263
## 100 0.7541 nan 0.1000 -0.0121
## 120 0.6540 nan 0.1000 -0.0058
## 140 0.5935 nan 0.1000 -0.0029
## 160 0.5262 nan 0.1000 -0.0116
## 180 0.4719 nan 0.1000 -0.0029
## 200 0.4238 nan 0.1000 -0.0070
## 220 0.3894 nan 0.1000 -0.0028
## 240 0.3538 nan 0.1000 -0.0037
## 250 0.3381 nan 0.1000 -0.0042
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 7.3663 nan 0.1000 0.7182
## 2 6.7011 nan 0.1000 0.4305
## 3 6.0939 nan 0.1000 0.5933
## 4 5.7279 nan 0.1000 0.2276
## 5 5.2495 nan 0.1000 0.3566
## 6 4.8837 nan 0.1000 0.2907
## 7 4.6276 nan 0.1000 0.2212
## 8 4.2939 nan 0.1000 0.2855
## 9 3.9860 nan 0.1000 0.2139
## 10 3.7545 nan 0.1000 0.1446
## 20 2.3353 nan 0.1000 0.0329
## 40 1.2135 nan 0.1000 0.0093
## 60 0.8674 nan 0.1000 0.0052
## 80 0.6873 nan 0.1000 -0.0066
## 100 0.5749 nan 0.1000 -0.0095
## 120 0.5023 nan 0.1000 -0.0064
## 140 0.4352 nan 0.1000 -0.0081
## 160 0.3828 nan 0.1000 -0.0126
## 180 0.3352 nan 0.1000 -0.0061
## 200 0.2987 nan 0.1000 -0.0034
## 220 0.2677 nan 0.1000 -0.0051
## 240 0.2389 nan 0.1000 -0.0056
## 250 0.2292 nan 0.1000 -0.0031
##
## Iter TrainDeviance ValidDeviance StepSize Improve
## 1 7.8143 nan 0.1000 0.3899
## 2 7.5097 nan 0.1000 0.2101
## 3 7.2889 nan 0.1000 0.1266
## 4 6.9722 nan 0.1000 0.3186
## 5 6.7904 nan 0.1000 0.0990
## 6 6.6546 nan 0.1000 0.0829
## 7 6.4212 nan 0.1000 0.2253
## 8 6.1904 nan 0.1000 0.1856
## 9 6.0251 nan 0.1000 0.1618
## 10 5.9247 nan 0.1000 0.0434
## 20 4.8547 nan 0.1000 0.0374
## 40 3.6031 nan 0.1000 -0.0120
## 60 2.8346 nan 0.1000 0.0097
## 80 2.3414 nan 0.1000 0.0079
## 100 1.9481 nan 0.1000 0.0059
## 120 1.6919 nan 0.1000 0.0021
## 140 1.5060 nan 0.1000 0.0051
## 160 1.3715 nan 0.1000 -0.0000
## 180 1.2743 nan 0.1000 -0.0042
## 200 1.1960 nan 0.1000 -0.0029
## 220 1.1427 nan 0.1000 0.0040
## 240 1.1005 nan 0.1000 -0.0029
## 250 1.0807 nan 0.1000 -0.0022
## Stochastic Gradient Boosting
##
## No pre-processing
## Resampling: Cross-Validated (10 fold)
## Summary of sample sizes: 289, 289, 289, 288, 289, 289, ...
## Resampling results across tuning parameters:
##
## interaction.depth n.trees RMSE Rsquared MAE
## 1 50 1.875330 0.6534275 1.504550
## 1 100 1.564268 0.7494993 1.258620
## 1 150 1.388068 0.7911320 1.129668
## 1 200 1.302701 0.8053296 1.073083
## 1 250 1.270075 0.8105605 1.042449
## 2 50 1.598238 0.7388639 1.281893
## 2 100 1.349894 0.7879159 1.099576
## 2 150 1.297344 0.7996286 1.066260
## 2 200 1.292489 0.7998757 1.067389
## 2 250 1.297159 0.7988566 1.080604
## 3 50 1.470025 0.7675199 1.207686
## 3 100 1.337747 0.7860576 1.091706
## 3 150 1.319601 0.7906845 1.085343
## 3 200 1.330737 0.7874339 1.093669
## 3 250 1.344091 0.7838223 1.102184
## 4 50 1.412937 0.7782271 1.150156
## 4 100 1.311217 0.7961732 1.080303
## 4 150 1.304849 0.7958353 1.076757
## 4 200 1.315548 0.7906123 1.082972
## 4 250 1.316437 0.7905470 1.084422
## 5 50 1.428159 0.7658767 1.146223
## 5 100 1.357071 0.7799058 1.107430
## 5 150 1.353989 0.7791909 1.108936
## 5 200 1.363834 0.7756967 1.121407
## 5 250 1.379816 0.7704501 1.135681
##
## Tuning parameter 'shrinkage' was held constant at a value of 0.1
##
## Tuning parameter 'n.minobsinnode' was held constant at a value of 10
## RMSE was used to select the optimal model using the smallest value.
## The final values used for the model were n.trees = 250,
## interaction.depth = 1, shrinkage = 0.1 and n.minobsinnode = 10.
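The summary above can be reproduced quietly with a call of roughly this shape. Treat it as a sketch rather than the original call: the grid values are read off the results table, carseats.train is an assumed name for the training split, and verbose = FALSE is added to silence the per-fold deviance traces.
# Hedged reconstruction of the tuning call implied by the summary above.
# Grid values come from the results table; carseats.train is an assumed
# object name. verbose = FALSE is passed through to gbm and silences
# the iteration log.
carseats.gbm <- train(
  Sales ~ .,
  data = carseats.train,
  method = "gbm",
  trControl = trainControl(method = "cv", number = 10),
  tuneGrid = expand.grid(n.trees = seq(50, 250, by = 50),
                         interaction.depth = 1:5,
                         shrinkage = 0.1,
                         n.minobsinnode = 10),
  verbose = FALSE
)
carseats.gbm$bestTune  # the winning grid row, as a one-row data frame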
carseats.pred <- predict(carseats.gbm, carseats.test, type = "raw")
plot(carseats.test$Sales, carseats.pred,
     main = "Gradient Boosting Regression: Predicted vs. Actual",
     xlab = "Actual",
     ylab = "Predicted")
abline(0, 1)
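The value printed below is the model's test-set RMSE. A minimal sketch of the computation, assuming caret's RMSE() helper and the carseats.gbm.rmse name used in the tally further down:
# Test-set RMSE; the outer parentheses print the assigned value
(carseats.gbm.rmse <- RMSE(carseats.pred, carseats.test$Sales))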
## [1] 0.9813016
Okay, I’m going to tally up the results! For the classification division, the winner is the manual classification tree! Gradient boosting made a valiant run at it, but came up just a little short.
rbind(data.frame(model = "Manual Class", Acc = round(oj.class.acc, 5)),
data.frame(model = "Class w/tuneLength", Acc = round(oj.class.acc2, 5)),
data.frame(model = "Class w.tuneGrid", Acc = round(oj.class.acc3, 5)),
data.frame(model = "Bagging", Acc = round(oj.bag.acc, 5)),
data.frame(model = "Random Forest", Acc = round(oj.frst.acc, 5)),
data.frame(model = "Gradient Boosting", Acc = round(oj.gbm.acc, 5))
) %>% arrange(desc(Acc))
## model Acc
## 1 Manual Class 0.85446
## 2 Class w.tuneGrid 0.84977
## 3 Gradient Boosting 0.84977
## 4 Class w/tuneLength 0.84507
## 5 Random Forest 0.82629
## 6 Bagging 0.81221
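Each of these accuracies was computed earlier in these notes; the pattern is the same for every model: the share of test-set observations classified correctly. A hypothetical recomputation for the GBM, for example:
# Proportion of correct class predictions on the test set
oj.gbm.acc <- mean(predict(oj.gbm, oj.test, type = "raw") == oj.test$Purchase)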
And now for the regression division, the winner is… gradient boosting!
rbind(data.frame(model = "Manual ANOVA", RMSE = round(carseats.anova.rmse, 5)),
data.frame(model = "ANOVA w/tuneLength", RMSE = round(carseats.anova.rmse2, 5)),
data.frame(model = "ANOVA w.tuneGrid", RMSE = round(carseats.anova.rmse3, 5)),
data.frame(model = "Bagging", RMSE = round(carseats.bag.rmse, 5)),
data.frame(model = "Random Forest", RMSE = round(carseats.frst.rmse, 5)),
data.frame(model = "Gradient Boosting", RMSE = round(carseats.gbm.rmse, 5))
) %>% arrange(RMSE)
## model RMSE
## 1 Gradient Boosting 0.98130
## 2 Random Forest 1.33358
## 3 Bagging 1.58380
## 4 ANOVA w/tuneLength 2.04190
## 5 ANOVA w.tuneGrid 2.14820
## 6 Manual ANOVA 2.17482
Here are the ROC curves for all four classification models (one from each chapter) plotted on the same graph. The ROCR package provides the prediction() and performance() functions, which generate the data needed to plot an ROC curve from a set of predicted probabilities and the actual (true) class values. The more “up and to the left” a model’s ROC curve sits, the better the model. The AUC metric is literally the “Area Under the ROC Curve”, so a better-performing model encloses more area under its curve and scores a higher AUC.
library(ROCR)
# List of predicted probabilities of the positive class, one per model
oj.class.pred <- predict(oj.class, oj.test, type = "prob")[, 2]
oj.bag.pred   <- predict(oj.bag,   oj.test, type = "prob")[, 2]
oj.frst.pred  <- predict(oj.frst,  oj.test, type = "prob")[, 2]
oj.gbm.pred   <- predict(oj.gbm,   oj.test, type = "prob")[, 2]
preds_list <- list(oj.class.pred, oj.bag.pred, oj.frst.pred, oj.gbm.pred)
# List of actual values (the same for every model)
m <- length(preds_list)
actuals_list <- rep(list(oj.test$Purchase), m)
# Plot the ROC curves
pred <- prediction(preds_list, actuals_list)
rocs <- performance(pred, "tpr", "fpr")
plot(rocs, col = as.list(1:m), main = "Test Set ROC Curves")
legend(x = "bottomright",
       legend = c("Decision Tree", "Bagged Trees", "Random Forest", "GBM"),
       fill = 1:m)
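Since pred already bundles the predictions from all four models, ROCR can also return the AUC values directly, putting a number on the visual comparison. A short sketch using the objects above:
# AUC for each model, in the same order as preds_list
aucs <- performance(pred, measure = "auc")
round(unlist(aucs@y.values), 4)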
PSU STAT508, Applied Data Mining and Statistical Learning: “Tree-Based Methods”
DataCamp: “Machine Learning with Tree-Based Models in R”
Gareth James et al., An Introduction to Statistical Learning
StatMethods: “Tree-Based Models”
Listen Data: “GBM (Boosted Models) Tuning Parameters”
Harry Southworth’s gbm repository on GitHub
Machine Learning Mastery: “A Gentle Introduction to the Gradient Boosting Algorithm for Machine Learning”
DataTechNotes: “Gradient Boosting Classification with GBM in R”