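How to append multiple worksheets from multiple workbooks into a single data frame in R

Asked: 2018-05-23 09:30:11

Tags: r

I have 10 Excel (.xlsx) files, each containing 10 worksheets.

I need to read 3 worksheets from each workbook and append them all into a single data frame in R.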

Data:

Header:

 Country    Jan-14  Feb-14  Mar-14  Apr-14  May-14  Jun-14  Jul-14  Aug-14  Sep-14  Oct-14  Nov-14  Dec-14  FY

Actual data:

 Austria    43  52  64  82  60  61  57  36  110 96  66  64  791 
 Belgium    143 258 184 207 202 191 209 118 136 169 121 108 2,046   
 Bulgaria   0   0   0   0   0   0   0   0   0   0   0   0   0   

Code:

library(XLConnect)

files <- list.files("C:/Users/kushaa/Documents/Frost_casestudy/")

# Positions of the three worksheets to read from each workbook
sheet.index <- c(3, 6, 9)

colname <- c("Country", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec", "FY", "truck_type")

# One-row, all-NA data frame used as the starting point
data.1 <- data.frame(matrix(rep(NA, length(colname)), ncol = length(colname)))

for (i in 1:length(files)) {
  wb <- loadWorkbook(files[i])
  for (j in 1:length(sheet.index)) {
    ss <- readWorksheet(wb, sheet.index[j], startRow = 5, header = FALSE)
    truck_type <- rep(sheet.names[j], nrow(ss))
    df <- data.frame(ss, truck_type)
    names(df) <- colname
    data_merge <- rbind(data.1, df)
  }
}

However, I only get data from one worksheet (truck_type = CV) and nothing from the other worksheets (truck_type = LCV, HCV).

Output:

         Country  Jan  Feb  Mar  Apr  May  Jun  Jul  Aug  Sep  Oct  Nov  Dec    FY truck_type
1           <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>  <NA>       <NA>
2        Austria   43   54   67   90   68   97   65  108   83   75   87   90   927         CV
3        Belgium  275  232  306  235  330  339  279  239  261  211  155  122 2,984         CV
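
A likely cause, judging only from the code as posted: every pass of the loop assigns `data_merge <- rbind(data.1, df)`, and `data.1` never grows, so `data_merge` ends up holding the all-NA starter row plus the last worksheet that was read. Below is a minimal sketch of one way to accumulate instead. It rests on several assumptions that are not in the question: the helper names `read_sheets_xlconnect` and `append_sheets` are hypothetical, the truck type is taken from the worksheet name via `getSheets()` (the original `sheet.names` is never defined), and the file list should be built with `full.names = TRUE` so that `loadWorkbook()` receives complete paths.

```r
# Hypothetical helper: read the requested worksheets of one workbook with
# XLConnect and return them as a list named after the worksheets.
read_sheets_xlconnect <- function(path, sheet.index) {
  wb <- XLConnect::loadWorkbook(path)
  sheet.names <- XLConnect::getSheets(wb)[sheet.index]
  sheets <- lapply(sheet.index, function(k) {
    XLConnect::readWorksheet(wb, sheet = k, startRow = 5, header = FALSE)
  })
  setNames(sheets, sheet.names)
}

# Hypothetical helper: append every requested worksheet of every workbook
# into one data frame. `reader` can be swapped out, e.g. for testing.
append_sheets <- function(files, sheet.index, colname,
                          reader = read_sheets_xlconnect) {
  data_merge <- NULL  # start empty, so there is no all-NA first row
  for (f in files) {
    sheets <- reader(f, sheet.index)
    for (nm in names(sheets)) {
      ss <- sheets[[nm]]
      # Assumption: the worksheet name is the truck type (CV, LCV, HCV)
      df <- data.frame(ss, truck_type = rep(nm, nrow(ss)),
                       stringsAsFactors = FALSE)
      names(df) <- colname
      data_merge <- rbind(data_merge, df)  # grow the accumulator itself
    }
  }
  data_merge
}

# Possible usage with the question's own objects (not run here):
# files <- list.files("C:/Users/kushaa/Documents/Frost_casestudy/",
#                     full.names = TRUE)
# data_merge <- append_sheets(files, sheet.index, colname)
```

Calling `rbind()` inside a loop is fine for 30 worksheets. For much larger inputs, collecting the pieces in a list and calling `do.call(rbind, pieces)` once would avoid repeated copying.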

How can I extract the year from the file names:

  [1] "2014_by_country_and_type_Enlarged_Europe.xlsx"            
  [2] "20140211_02_2012_vo_By_Country_Enlarged_Europe.xls"       
  [3] "20150219_2013_vo_By_Country_Enlarged_Europe.xlsx"

What I tried:

  regmatches(files, regexpr("[0-9].*[0-9]", files))

But it gives:

 [1] "2014"            
 [2] "20140211_02_2012" 
 [3] "20150219_2013"

The output I need is:

 2014
 2012
 2013
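
The pattern `[0-9].*[0-9]` is greedy: it matches from the first digit to the last digit in each name, which is why the date stamps come along. One possible approach, assuming the year is always the first run of exactly four digits that is not part of a longer number, is to use Perl-style lookarounds in base R. The three file names are copied from the question; the variable names `year_pattern` and `years` are illustrative only.

```r
files <- c("2014_by_country_and_type_Enlarged_Europe.xlsx",
           "20140211_02_2012_vo_By_Country_Enlarged_Europe.xls",
           "20150219_2013_vo_By_Country_Enlarged_Europe.xlsx")

# Exactly four digits, with no digit immediately before or after them.
# Lookbehind/lookahead require perl = TRUE.
year_pattern <- "(?<![0-9])[0-9]{4}(?![0-9])"

# regexpr() finds the first match in each element;
# regmatches() extracts the matched text.
years <- regmatches(files, regexpr(year_pattern, files, perl = TRUE))
print(years)
# [1] "2014" "2012" "2013"
```

Note that `regmatches()` silently drops elements with no match, so the result can be shorter than `files` if some name contains no standalone four-digit number. Wrap the result in `as.integer()` if numeric years are needed.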

0 answers:

No answers yet.