Topic: Error while merging large tables

sanjaisahu

Error while merging large tables
« on: October 31, 2018, 06:22:15 am »
Hi folks,

I need to merge several large tables (up to 10 GB each) into a single one. To do so, I am using a Linux computer cluster with 50+ cores and 10+ GB of RAM.

I always end up with an error message like: "Cannot allocate vector of size X Mb". Since commands like memory.limit(size=X) are Windows-specific and are not accepted on Linux, I cannot find a workaround that lets me merge my large tables.

Any suggestions are welcome!

This is the code I use:

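library(parallel)

# Use all but one of the detected cores
no_cores <- detectCores() - 1
cl <- makeCluster(no_cores)

# list.files() expects a regular expression, not a shell glob
temp <- list.files(pattern = "\\.txt$")
gc()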
Here the error occurs:

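# Read every file in parallel, one data frame per file
myfiles <- parLapply(cl, temp, function(x) read.csv(x,
                                        header = TRUE,
                                        sep = ";",
                                        stringsAsFactors = FALSE,
                                        encoding = "UTF-8",
                                        na.strings = c("NA", "99", "")))

# Shut down the worker processes once reading is done
stopCluster(cl)

# Stack all tables into a single data frame
myfiles.final <- do.call(rbind, myfiles)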



Thank you

bigpaws

Re: Error while merging large tables
« Reply #1 on: November 01, 2018, 08:07:39 am »
You have not mentioned which database you are using or the type of table.
Is the merge being run on the same OS and the same database that the original tables are on?

Bigpaws

Edney

Re: Error while merging large tables
« Reply #2 on: December 21, 2018, 03:27:22 am »
Can we expect problems if any of those things don't match while trying to merge tables, Bigpaws?