R function changing a data set - and not returning it

I am wondering whether it is possible in R to write a function that alters a data set without returning it as a value. My main concern is running time on very large data sets. To be concrete, I have a function of the form

f <- function(data, ...) {
  # add several columns to data
  return(data)
}

This means that I need to call

data <- f(data, ...)

to update a dataset. Instead, I would like to just call

f(data)

to update (add columns to) my data set.

I have two questions:

1) Is my assumption right that the method I am using now will take a long time for very large data sets? (Or will R somehow see that I have only added columns?)

2) Is there a way to modify the function to do what I have proposed?

Thanks in advance!

Answers


As far as I know, and as dickoa commented on your question, data.table already does this: check ?data.table and then search for the link "help page for :=". It explains how assignment by reference works with data.table.
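To illustrate the idea, here is a minimal sketch of assignment by reference with `:=`. It assumes the data.table package is installed; the function name `add_columns_dt` and the toy table are made up for this example and do not come from the question.

```r
library(data.table)

# Hypothetical helper: adds two columns to 'dt' by reference,
# so the caller's object changes without reassignment.
add_columns_dt <- function(dt) {
  dt[, X := rnorm(.N)]  # .N is the number of rows in dt
  dt[, y := rnorm(.N)]
  invisible(dt)         # returned invisibly for convenience; not needed for the update
}

dt <- data.table(a = 1:5)
add_columns_dt(dt)      # no 'dt <- ...' needed
names(dt)               # "a" "X" "y"
```

Note that this only works when `dt` is a data.table; calling the same function on a plain data.frame would fail, so you would convert first (for example with `as.data.table()` or `setDT()`).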


Concerning your first question: just test it. I'll give an example below, but test it with your own code and your own data sets.

library(MASS)

data <- as.data.frame(mvrnorm(1E6, mu=rep(0, 20), Sigma=diag(1:20)))
data2 <- data

add_columns <- function(data) {
  data$X <- rnorm(1E6)
  data$y <- rnorm(1E6)
  data
}

Test with function call:

> system.time({
+   data <- add_columns(data)
+ })
   user  system elapsed 
  0.567   0.000   0.568 
> system.time({
+   data <- add_columns(data)
+ })
   user  system elapsed 
  0.711   0.128   0.839 

Without function call:

> system.time({
+   data2$X <- rnorm(1E6)
+   data2$y <- rnorm(1E6)
+ })
   user  system elapsed 
  0.650   0.020   0.669 
> system.time({
+   data2$X <- rnorm(1E6)
+   data2$y <- rnorm(1E6)
+ })
   user  system elapsed 
  0.589   0.024   0.613 

The function call is slightly slower (perhaps), but the difference is so small that I would not bother to start messing around with global assignments or environments.


You can pass by reference in R, but it is quite hard. See http://www.stat.berkeley.edu/~paciorek/computingTips/Pointers_passing_reference_.html or maybe the R.oo package.
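One base-R way to get reference-like behaviour is to keep the data frame inside an environment, since environments are not copied when passed to a function. The sketch below is only an illustration under that assumption; `store` and `add_columns_env` are hypothetical names, not something taken from the linked page or from R.oo.

```r
# Hypothetical helper: 'store' is an environment that holds the
# data frame under the name 'data'. Environments are passed by
# reference, so changes made here are visible to the caller.
add_columns_env <- function(store) {
  n <- nrow(store$data)
  store$data$X <- rnorm(n)
  store$data$y <- rnorm(n)
  invisible(NULL)
}

store <- new.env()
store$data <- data.frame(a = 1:5)

add_columns_env(store)  # no reassignment in the caller
names(store$data)       # "a" "X" "y"
```

This changes the calling convention rather than guaranteeing a speed-up, so it is still worth timing it against the plain `data <- f(data)` version on your own data.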

