So I want to merge some very large datasets in the way indicated below. I tried this:
library(readstata13)   # read.dta13() comes from this package
setwd(..)
myfiles <- list.files(pattern = "\\.dta$")   # note: "*.dta" is a glob, but list.files() expects a regex
dflist <- lapply(myfiles, read.dta13)
merged.data.frame <- Reduce(function(...) merge(..., all = TRUE), dflist)
However, I run out of memory... Is there another way to achieve this without exhausting memory?
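One memory-friendlier variant of the same approach is to merge the files one at a time instead of holding every data frame in memory at once, discarding each file as soon as it has been folded in. A sketch, assuming the readstata13 package and that the working directory already contains the .dta files:

```r
library(readstata13)

myfiles <- list.files(pattern = "\\.dta$")

# Merge incrementally: read one file, fold it into the accumulator, discard it
merged <- read.dta13(myfiles[1])
for (f in myfiles[-1]) {
  nxt <- read.dta13(f)
  merged <- merge(merged, nxt, all = TRUE)
  rm(nxt)   # drop the reference so the memory can be reclaimed
  gc()      # prompt R to release it back to the OS
}
```

This only helps if the intermediate merge results stay small; if each merge multiplies rows, the accumulator itself becomes the memory problem.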
What the data frames look like, and the merged product I want:
Dataframe 1
Country Year Sector ID Var1
Austria 2001 Construction lp 22
Austria 2001 Construction fp 23
Austria 2001 Manufact lp 12
Austria 2001 Manufact fp 43
Austria 2002 Construction lp 55
Austria 2002 Construction fp 34
Austria 2002 Manufact lp 16
Austria 2002 Manufact fp 76
Dataframe 2
Country Year Sector Type Var1 Var2
Austria 2001 Construction A 23 5
Austria 2001 Construction B 34 5
Austria 2001 Manufact A 98 4
Austria 2001 Manufact B 48 3
Austria 2002 Construction A 43 9
Austria 2002 Construction B 23 7
Austria 2002 Manufact A 65 6
Austria 2002 Manufact B 45 6
Dataframe 3
Country Year Sector Var3
Austria 2001 Construction 123
Austria 2001 Acco 345
Austria 2001 Manufact 234
Austria 2001 Prod 466
Austria 2002 Construction 785
Austria 2002 Acco 789
Austria 2002 Manufact 678
Austria 2002 Prod 899
Merged:
Country Year Sector ID Type Var1 Var2 Var3
Austria 2001 Construction lp NA 22 NA NA
Austria 2001 Construction fp NA 23 NA NA
Austria 2001 Construction NA A 23 5 NA
Austria 2001 Construction NA B 34 5 NA
Austria 2001 Construction NA NA NA NA 123
Austria 2001 Manufact lp NA 12 NA NA
Austria 2001 Manufact fp NA 43 NA NA
Austria 2001 Manufact NA A 98 4 NA
Austria 2001 Manufact NA B 48 3 NA
Austria 2001 Manufact NA NA NA NA 234
Austria 2001 Acco NA NA NA NA 345
Austria 2001 Prod NA NA NA NA 466
.... 2002 ....
and so on.
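Note that the desired table above keeps each source row intact and just fills the columns that file lacks with NA; that is row-stacking rather than a key join. If that is what you want, data.table::rbindlist() with fill = TRUE produces it far more cheaply than repeated merge() calls. A sketch, assuming the data.table and readstata13 packages:

```r
library(readstata13)
library(data.table)

myfiles <- list.files(pattern = "\\.dta$")

# Stack all files on top of each other, padding missing columns with NA
merged <- rbindlist(lapply(myfiles, read.dta13), fill = TRUE)

# Match the ordering shown above
setorder(merged, Country, Year, Sector)
```

Because rbindlist() only allocates one output table of (total rows) x (union of columns), it avoids the combinatorial blow-up that merge(..., all = TRUE) can produce on non-unique keys.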