Parallel processing in caret does not work on R 2.13.0

I am using the R package caret, and parallel processing doesn't work. If I try to run the example for the train function:

library(mlbench)
data(BostonHousing)

library(doMC)
registerDoMC(2)

## NOTE: don't run models from RWeka when using
### multicore. The session will crash.

## The code for train() does not change:
set.seed(1)
usingMC <-  train(medv ~ .,
                  data = BostonHousing, 
                  "glmboost")

I get the following error:

Error in names(resamples) <- gsub("^\\.", "", names(resamples)) : 
  attempt to set an attribute on NULL

I am using a MacBook Pro (early 2011 model) with a 2.3GHz Intel Core i5 and Mac OS X 10.6.8.

R Session Info:

R version 2.13.0 (2011-04-13)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] caret_5.13-20 cluster_1.14.2 reshape_0.8.4 plyr_1.7.1 lattice_0.19-33 mlbench_2.1-0 doMC_1.2.3 multicore_0.1-7
[9] foreach_1.3.2 codetools_0.2-8 iterators_1.0.5

loaded via a namespace (and not attached):
[1] compiler_2.13.0 grid_2.13.0 rpart_3.1-51 tools_2.13.0

Is there something I can do to fix this?

Answers


  1. It may be difficult to find someone who can reproduce your error. With the following setup:

    > sessionInfo()
    R version 2.15.0 (2012-03-30)
    Platform: x86_64-pc-linux-gnu (64-bit)
    

    [...snip...]

    other attached packages:
     [1] mboost_2.1-2    caret_5.15-023  cluster_1.14.2  reshape_0.8.4  
     [5] plyr_1.7.1      lattice_0.20-6  doMC_1.2.5      multicore_0.1-7
     [9] iterators_1.0.6 foreach_1.4.0   mlbench_2.1-0          
    
    loaded via a namespace (and not attached):
    [1] codetools_0.2-8  compiler_2.15.0  grid_2.15.0      Matrix_1.0-6    
    [5] splines_2.15.0   survival_2.36-14 tools_2.15.0    
    

    the example runs without error.

  2. That means you'll probably need to dig into the code yourself: traceback() and debug() should help.
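As a rough illustration of the traceback() suggestion, here is a minimal base R sketch. The functions fake_train and fake_resample_step are hypothetical stand-ins, not caret code; only the assignment line is copied from the error message in the question. The sketch records the call stack at the moment the error is raised, which is roughly what traceback() reports interactively.

```r
# Hypothetical stand-ins for the failing step: the function names are made up,
# only the assignment line is taken from the error message in the question.
fake_resample_step <- function(resamples) {
  names(resamples) <- gsub("^\\.", "", names(resamples))
  resamples
}
fake_train <- function() fake_resample_step(NULL)

# Record the call stack when the error is signalled. This is a
# non-interactive analogue of typing traceback() right after the failure.
captured <- NULL
msg <- tryCatch(
  withCallingHandlers(
    fake_train(),
    error = function(e) captured <<- sys.calls()
  ),
  error = function(e) conditionMessage(e)
)

# One line per frame, outermost call first.
stack <- vapply(as.list(captured), function(cl) deparse(cl)[1], character(1))
print(msg)
print(stack)
```

In an interactive session the shorter route is to run the failing train() call, type traceback() straight away, and then call debug() on whichever function appears nearest the error (undebug() switches it off again). Which caret function that turns out to be is not known from the question.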


I cannot reproduce the issue, at least on R 2.14.0 (see below).

The caret code does not have different versions for sequential and parallel processing, so I'm not sure where the problem is. Does the sequential version work? How about other models? Could you also try in a fresh session?
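One possible way to check the sequential case is to register the sequential backend from foreach before calling train(). This is only a sketch: it assumes the installed caret version runs its resampling loop through foreach, which may not hold for older releases. The train() call itself is left commented out so the snippet needs nothing beyond foreach.

```r
library(foreach)

# Register the sequential backend so that %dopar% loops run in one process.
registerDoSEQ()
print(getDoParName())
print(getDoParWorkers())

# Sanity check that %dopar% now runs without a parallel backend.
squares <- foreach(i = 1:3, .combine = c) %dopar% i^2
print(squares)

# With the sequential backend registered, the call from the question would be
# re-run unchanged (assumes caret, mlbench and mboost are installed):
# library(caret); library(mlbench); data(BostonHousing)
# set.seed(1)
# seqFit <- train(medv ~ ., data = BostonHousing, "glmboost")
```

If the sequential run succeeds and only the doMC run fails, that points at the parallel backend or its package versions rather than at the model code.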

Also, you might want to email the package maintainer directly (unless you did and I missed it) to get better results.

> library(caret)

<-snip->

> library(mlbench)
> data(BostonHousing)
> 
> library(doMC)

<-snip->

> registerDoMC(2)
> 
> ## NOTE: don't run models from RWeka when using
> ### multicore. The session will crash.
> 
> ## The code for train() does not change:
> set.seed(1)
> usingMC <-  train(medv ~ .,
+                   data = BostonHousing, 
+                   "glmboost")
Warning message:
In glmboost.matrix(x = c(0.00632, 0.02731, 0.02729, 0.03237, 0.06905,  :
  model with centered covariates does not contain intercept
> usingMC
506 samples
 13 predictors

No pre-processing
Resampling: Bootstrap (25 reps) 

Summary of sample sizes: 506, 506, 506, 506, 506, 506, ... 

Resampling results across tuning parameters:

  mstop  RMSE  Rsquared  RMSE SD  Rsquared SD
  50     5.44  0.663     0.484    0.0661     
  100    5.33  0.675     0.518    0.0669     
  150    5.27  0.681     0.526    0.0661     

Tuning parameter 'prune' was held constant at a value
 of 'no'
RMSE was used to select the optimal model using 
 the smallest value.
The final values used for the model were mstop = 150
 and prune = no. 
> sessionInfo()
R version 2.14.0 (2011-10-31)
Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets 
[6] methods   base     

other attached packages:
 [1] mboost_2.1-1    doMC_1.2.5      multicore_0.1-7
 [4] iterators_1.0.5 mlbench_2.1-0   caret_5.15-023 
 [7] foreach_1.4.0   cluster_1.14.1  reshape_0.8.4  
[10] plyr_1.7.1      lattice_0.20-0 

loaded via a namespace (and not attached):
[1] codetools_0.2-8  compiler_2.14.0  grid_2.14.0     
[4] Matrix_1.0-3     splines_2.14.0   survival_2.36-10
[7] tools_2.14.0  
