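How can I use R to merge multiple xlsx files into one xlsx file based on the first part of each file name?

Time: 2018-03-29 14:56:22

Tags: r

I have several .xlsx files in a directory: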

Russia - GDP.xlsx
Russia - GNP.xlsx
USA - GDP.xlsx
USA - GNP.xlsx

I want to merge the files into new xlsx files based on the first part of each file name, so the output would look like this:

Russia.xlsx
USA.xlsx

Each resulting .xlsx file should contain two tabs: GDP and GNP.

Is there a way to do this in R? Thanks for your help.

2 Answers:

Answer 0: (score: 0)

Assuming the file names consistently follow the pattern <country> - GDP.xlsx and <country> - GNP.xlsx:

library(xlsx)

# Change the following as needed
path <- "C:/OneXLSX/"

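# Fetch only the input files from the folder. The pattern is a regular
# expression (not a glob); matching the " - GDP"/" - GNP" suffix also keeps
# previously written output such as Russia.xlsx out of the list.
files <- list.files(path, pattern = " - (GDP|GNP)\\.xlsx$")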

# Extract the unique country names (the part of each name before " -")
countries <- unique(unlist(strsplit(files, " -(.*)")))

# Loop through each country, read GDP/GNP and write them to one single file
for (each in countries){
  GDP <- read.xlsx(paste(path, each, " - GDP.xlsx", sep = ""), 1, header = TRUE)
  GNP <- read.xlsx(paste(path, each, " - GNP.xlsx", sep = ""), 1, header = TRUE)

  write.xlsx(GDP, paste(path, each, ".xlsx", sep = ""), 
             sheetName = "GDP", row.names = FALSE)
  write.xlsx(GNP, paste(path, each, ".xlsx", sep = ""), 
             sheetName = "GNP", append = TRUE, row.names = FALSE)
}

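The package I chose for working with Excel files is xlsx. Feel free to switch to your preferred package and update the syntax as needed.

Apologies if the code isn't pretty.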
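
To make the name-splitting step in the answer above easier to follow, here is a minimal base-R sketch that runs the same strsplit() call on the sample file names from the question. It only illustrates that one step; the last line is an added observation, not part of the answer: a name without " -" in it (for example a previously written Russia.xlsx, assumed here to sit in the same folder) passes through unchanged and would be treated as a country.

```r
# Sample file names taken from the question
files <- c("Russia - GDP.xlsx", "Russia - GNP.xlsx",
           "USA - GDP.xlsx", "USA - GNP.xlsx")

# The regular expression " -(.*)" matches from the " -" to the end of the
# name, so splitting on it leaves only the part before the separator
parts <- strsplit(files, " -(.*)")
countries <- unique(unlist(parts))
print(countries)

# A name that contains no " -" is returned as-is
leftover <- strsplit("Russia.xlsx", " -(.*)")[[1]]
print(leftover)
```

If the output files are written to the same folder as the inputs, restricting the file listing to the input naming pattern avoids picking them up on a second run.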

Answer 1: (score: 0)

You can try this:

# Read all .xlsx files in a folder
library(openxlsx)
path <- "C:\\Users\\Folder"
# pattern is a regular expression, not a glob, so the dot is escaped
files <- list.files(path, pattern = "\\.xlsx$",
                    full.names = TRUE)

# For illustration, assume these file names (this replaces the listing above)
files<-c("Russia - GDP.xlsx",
         "Russia - GNP.xlsx",
         "USA - GDP.xlsx",
         "USA - GNP.xlsx")

names(files)<-files

# Getting the unique countries
rexp <- "(.*)\\s+\\-.*"
uniqueCountries<- unique(sub(rexp, "\\1",names(files)))

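# Match on the literal "<country> -" prefix so that one country name cannot
# match inside another and no character is interpreted as a regex
toread <- lapply(uniqueCountries, function(x) which(startsWith(files, paste0(x, " -"))))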
names(toread)<-uniqueCountries

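# Iterate over the country names; iterating over toread itself would pass the
# index vectors, and the output files would be named 1.xlsx, 2.xlsx, ...
lapply(names(toread), function(country) {
  xy <- files[toread[[country]]]
  dat <- lapply(xy, read.xlsx)
  # Name each sheet after the indicator (GDP, GNP) rather than the file name
  names(dat) <- sub(".*-\\s*(.*)\\.xlsx$", "\\1", basename(xy))
  fileN <- sprintf("%s.xlsx", country)
  write.xlsx(dat, fileN)
})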
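
As a variation on the grouping step, the sketch below uses base R's split() to group the files by country in one call. The helper names group_by_country, sheet_name and merge_by_country are hypothetical and introduced only for illustration. It assumes the "<country> - <indicator>.xlsx" naming from the question and that the openxlsx package is installed; openxlsx is only touched when merge_by_country is actually called.

```r
# Hypothetical helper: group file paths by the country prefix of their names
group_by_country <- function(files) {
  country <- sub("\\s+-\\s+.*$", "", basename(files))
  split(files, country)
}

# Hypothetical helper: sheet name = the part between " - " and ".xlsx"
sheet_name <- function(files) {
  sub("^.*\\s+-\\s+(.*)\\.xlsx$", "\\1", basename(files))
}

# Hypothetical helper: write one workbook per country (requires openxlsx)
merge_by_country <- function(files, out_dir = ".") {
  groups <- group_by_country(files)
  for (country in names(groups)) {
    paths <- groups[[country]]
    # Read the first sheet of every input file for this country
    sheets <- lapply(paths, openxlsx::read.xlsx)
    names(sheets) <- sheet_name(paths)
    # A named list of data frames is written as one worksheet per element
    openxlsx::write.xlsx(sheets, file.path(out_dir, paste0(country, ".xlsx")))
  }
  invisible(names(groups))
}

# Grouping demonstrated on the file names from the question (no files needed)
files <- c("Russia - GDP.xlsx", "Russia - GNP.xlsx",
           "USA - GDP.xlsx", "USA - GNP.xlsx")
groups <- group_by_country(files)
print(names(groups))
print(sheet_name(groups[["USA"]]))
```

With real files, calling merge_by_country(list.files(path, pattern = " - .*\\.xlsx$", full.names = TRUE)) should write one workbook per country into the working directory. Requiring whitespace on both sides of the hyphen keeps hyphenated country names intact.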