Exceeding memory limit in R (even with 24GB RAM)
I am trying to merge two dataframes: one has 908450 observations of 33 variables, and the other has 908450 observations of 2 variables.
```r
dataframe2 <- merge(dataframe1, dataframe2, by = "id")
```
I've cleared all other dataframes from working memory and raised my memory limit (this is a brand new desktop with 24 GB of RAM) using code along these lines:
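(The exact snippet was omitted from the post; on Windows, where `memory.limit()` applies, it would presumably look something like this, with the size given in MB:)

```r
# assumed reconstruction: raise the limit to ~24 GB
# (memory.limit() takes the limit in MB and is Windows-only)
memory.limit(size = 24 * 1024)
```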
But I'm still getting the error `Cannot allocate vector of size 173.Mb`.
Any thoughts on how to get around this problem?
To follow up on my comments, use data.table. I put together a quick example matching your data to illustrate:
```r
library(data.table)

dt1 <- data.table(id = 1:908450, matrix(rnorm(908450 * 32), ncol = 32))
dt2 <- data.table(id = 1:908450, rnorm(908450))

# set keys
setkey(dt1, id)
setkey(dt2, id)

# check dims
> dim(dt1)
[1] 908450     33
> dim(dt2)
[1] 908450      2

# merge together and check system time:
> system.time(dt3 <- dt1[dt2])
   user  system elapsed
   0.43    0.03    0.47
```
So it took less than half a second to merge. I took before and after screenshots while watching my memory usage: before the merge I was using 3.4 GB of RAM; when I merged, it jumped to 3.7 GB and leveled off. I think you'll be hard pressed to find anything more memory- or time-efficient than that.
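For what it's worth, newer versions of data.table (1.9.6 and later) also support an `on=` argument, so the same join can be written without calling `setkey()` first; a minimal equivalent sketch:

```r
# same keyed join as above, expressed with on= instead of setkey()
dt3 <- dt1[dt2, on = "id"]
```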
As far as I can think of, there are three solutions:

- Use data.table (as shown above)
- Use swap memory (adjustable on *nix machines)
- Use sampling (see the sketch below)
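If you go the sampling route, here's a rough sketch of prototyping the merge on a subset of ids (using the `dataframe1`/`dataframe2` names from the question; the 10% fraction is arbitrary):

```r
# merge only a random 10% of ids to keep the memory footprint bounded
set.seed(1)
sample_ids <- sample(unique(dataframe1$id), size = floor(0.1 * nrow(dataframe1)))

merged_sample <- merge(dataframe1[dataframe1$id %in% sample_ids, ],
                       dataframe2[dataframe2$id %in% sample_ids, ],
                       by = "id")
```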