VectorLinux



Author Topic: Error while merging large tables  (Read 284 times)

sanjaisahu

  • Member
  • *
  • Posts: 1
  • Sanjai
    • https://mindmajix.com/linux-training
Error while merging large tables
« on: October 31, 2018, 06:22:15 am »

Hi folks,

I need to merge several large tables (up to 10 GB each) into a single one. To do so I am using a computer cluster with 50+ cores and 10+ GB of RAM, running Linux.

I always end up with an error message like "Cannot allocate vector of size X Mb". Since commands like memory.limit(size=X) are Windows-specific and not accepted on Linux, I cannot find a way around it to merge my large tables.

Any suggestion welcome!

This is the code I use:

library(parallel)

# Use all but one core for the worker cluster
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)

# Match files ending in ".txt" (note: "*.txt$" is a glob, not a valid regex)
temp <- list.files(pattern = "\\.txt$")
gc()
Here the error occurs:

# Read every file on a separate worker
myfiles <- parLapply(cl, temp, function(x) read.csv(x,
                                          header = TRUE,
                                          sep = ";",
                                          stringsAsFactors = FALSE,
                                          encoding = "UTF-8",
                                          na.strings = c("NA", "99", "")))

# Stack the per-file data frames into one table
myfiles.final <- do.call(rbind, myfiles)

# Release the workers when done
stopCluster(cl)
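For reference, a less memory-hungry variant might look like this (a sketch only, assuming the data.table package is installed; file names and column settings are carried over from the code above). Reading the files sequentially with fread avoids each parallel worker holding a full copy in RAM at the same time, and rbindlist binds the pieces with fewer intermediate copies than do.call(rbind, ...):

```r
# Sketch of a lower-memory approach (assumes the data.table package is
# installed: install.packages("data.table")).
library(data.table)

temp <- list.files(pattern = "\\.txt$")

# Read sequentially: with ~10 GB files and ~10 GB of RAM, several
# workers each holding a file in memory exhausts the allocator.
myfiles <- lapply(temp, function(x)
  fread(x, sep = ";", header = TRUE,
        encoding = "UTF-8", na.strings = c("NA", "99", "")))

# rbindlist stacks the tables without the copying overhead of
# do.call(rbind, ...)
myfiles.final <- rbindlist(myfiles)
```

Even so, the combined result still has to fit in RAM, so with tables of this size an on-disk tool (a database, or chunked processing) may be unavoidable.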



Thank you
Logged

bigpaws

  • Administrator
  • Vectorian
  • *****
  • Posts: 1896
Re: Error while merging large tables
« Reply #1 on: November 01, 2018, 08:07:39 am »

You have not mentioned the database you are using, the type of table, or whether the merge is being done on the same OS and the same database that the original tables are on.

Bigpaws
Logged