Parallel processing in caret does not work on R 2.13.0
I am using the R package caret, and parallel processing doesn't work. When I run the example from the documentation for the train function:
    library(caret)
    library(mlbench)
    data(BostonHousing)

    library(doMC)
    registerDoMC(2)

    ## NOTE: don't run models from RWeka when using
    ### multicore. The session will crash.

    ## The code for train() does not change:
    set.seed(1)
    usingMC <- train(medv ~ ., data = BostonHousing, "glmboost")
I get the following error:
    Error in names(resamples) <- gsub("^\\.", "", names(resamples)) :
      attempt to set an attribute on NULL

I am using a MacBook Pro (early 2011 model) with a 2.3 GHz Intel Core i5 and Mac OS X 10.6.8.
R Session Info:
    R version 2.13.0 (2011-04-13)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    attached base packages:
     stats  graphics  grDevices  utils  datasets  methods  base

    other attached packages:
     caret_5.13-20  cluster_1.14.2  reshape_0.8.4  plyr_1.7.1
     lattice_0.19-33  mlbench_2.1-0  doMC_1.2.3  multicore_0.1-7
     foreach_1.3.2  codetools_0.2-8  iterators_1.0.5

    loaded via a namespace (and not attached):
     compiler_2.13.0  grid_2.13.0  rpart_3.1-51  tools_2.13.0
Is there something I can do to fix this?
It may be difficult to find someone who can reproduce your error. I don't get it with:
    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-linux-gnu (64-bit)

    other attached packages:
     mboost_2.1-2  caret_5.15-023  cluster_1.14.2  reshape_0.8.4
     plyr_1.7.1  lattice_0.20-6  doMC_1.2.5  multicore_0.1-7
     iterators_1.0.6  foreach_1.4.0  mlbench_2.1-0

    loaded via a namespace (and not attached):
     codetools_0.2-8  compiler_2.15.0  grid_2.15.0  Matrix_1.0-6
     splines_2.15.0  survival_2.36-14  tools_2.15.0
That means you'll probably need to dig into the code yourself: traceback() and debug() should help.
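As a rough illustration of that kind of digging, here is a minimal sketch in base R. The function `fix_names` is a hypothetical stand-in (it is not part of caret) that applies the same assignment shown in the error message to a NULL object, so the failure can be studied in isolation. The commented lines at the end show how traceback() and debug() would typically be used in an interactive session.

```r
# Hypothetical stand-in for the failing step: when `resamples` is NULL,
# assigning names to it fails in the same way as the reported error.
fix_names <- function(resamples) {
  names(resamples) <- gsub("^\\.", "", names(resamples))
  resamples
}

# Capture the error instead of stopping, so the message can be inspected.
msg <- tryCatch(
  fix_names(NULL),
  error = function(e) conditionMessage(e)
)
print(msg)

# In an interactive session, right after the real error occurs:
#   traceback()      # print the call stack that led to the error
#   debug(train)     # step through train() line by line on the next call
#   undebug(train)   # switch stepping off again
```

If the object really is NULL at that point, the more interesting question is which earlier step was supposed to fill it; stepping through with debug() is one way to find out.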
I cannot reproduce the issue (see below), at least not on R 2.14.0.
The caret code does not have different versions for sequential and parallel processing, so I'm not sure where the problem is. Does the sequential version work? How about other models? Could you also try in a fresh session?
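One way to check the sequential case is sketched below, assuming the installed foreach version provides registerDoSEQ(), getDoParName() and getDoParWorkers(). It registers the sequential backend and runs a trivial loop to exercise foreach independently of caret. The commented train() call and the name `seqFit` are only illustrative.

```r
library(foreach)

# Register the sequential backend so %dopar% runs in the current process.
registerDoSEQ()

# Ask foreach which backend it believes is registered.
backend <- getDoParName()
workers <- getDoParWorkers()
print(backend)
print(workers)

# A trivial loop exercises the backend without involving caret at all.
res <- foreach(i = 1:3, .combine = c) %dopar% sqrt(i)
print(res)

# With the sequential backend registered, the original call could be retried
# (assumes caret and mlbench are installed; `seqFit` is an illustrative name):
#   library(caret); library(mlbench); data(BostonHousing)
#   set.seed(1)
#   seqFit <- train(medv ~ ., data = BostonHousing, "glmboost")
```

If the trivial loop works but train() still fails sequentially, the parallel backend is unlikely to be the cause.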
Also, you might want to email the package maintainer directly (unless you did and I missed it) to get better results.
    > library(mlbench)
    > data(BostonHousing)
    >
    > library(doMC)
    > registerDoMC(2)
    >
    > ## NOTE: don't run models from RWeka when using
    > ### multicore. The session will crash.
    >
    > ## The code for train() does not change:
    > set.seed(1)
    > usingMC <- train(medv ~ .,
    +                  data = BostonHousing,
    +                  "glmboost")
    Warning message:
    In glmboost.matrix(x = c(0.00632, 0.02731, 0.02729, 0.03237, 0.06905, :
      model with centered covariates does not contain intercept
    > usingMC
    506 samples
     13 predictors

    No pre-processing
    Resampling: Bootstrap (25 reps)

    Summary of sample sizes: 506, 506, 506, 506, 506, 506, ...

    Resampling results across tuning parameters:

      mstop  RMSE  Rsquared  RMSE SD  Rsquared SD
      50     5.44  0.663     0.484    0.0661
      100    5.33  0.675     0.518    0.0669
      150    5.27  0.681     0.526    0.0661

    Tuning parameter 'prune' was held constant at a value of 'no'
    RMSE was used to select the optimal model using the smallest value.
    The final values used for the model were mstop = 150 and prune = no.
    > sessionInfo()
    R version 2.14.0 (2011-10-31)
    Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

    locale:
     en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

    attached base packages:
     stats  graphics  grDevices  utils  datasets  methods  base

    other attached packages:
     mboost_2.1-1  doMC_1.2.5  multicore_0.1-7  iterators_1.0.5
     mlbench_2.1-0  caret_5.15-023  foreach_1.4.0  cluster_1.14.1
     reshape_0.8.4  plyr_1.7.1  lattice_0.20-0

    loaded via a namespace (and not attached):
     codetools_0.2-8  compiler_2.14.0  grid_2.14.0  Matrix_1.0-3
     splines_2.14.0  survival_2.36-10  tools_2.14.0