I have several .xlsx files in a directory:
Russia - GDP.xlsx
Russia - GNP.xlsx
USA - GDP.xlsx
USA - GNP.xlsx
I want to merge the files into new .xlsx files based on the first part of the file name, so the output would look like this:
Russia.xlsx
USA.xlsx
Each of these .xlsx files contains two tabs: GDP and GNP.
Is there a way to do this in R? Thanks for your help.
Answer 0: (score: 0)
Assuming the file names consistently follow the patterns <country> - GDP.xlsx and <country> - GNP.xlsx:
library(xlsx)

# Change the following as needed
path <- "C:/OneXLSX"

# List only the input files; the pattern is a regular expression, not a glob,
# and it skips any "<country>.xlsx" output left over from an earlier run
files <- list.files(path, pattern = " - (GDP|GNP)\\.xlsx$")

# Keep only the country names (everything before " - GDP.xlsx" / " - GNP.xlsx")
countries <- unique(sub(" - (GDP|GNP)\\.xlsx$", "", files))

# Loop through each country, read GDP/GNP and write them to one single file
for (country in countries) {
  gdp <- read.xlsx(file.path(path, paste0(country, " - GDP.xlsx")), sheetIndex = 1, header = TRUE)
  gnp <- read.xlsx(file.path(path, paste0(country, " - GNP.xlsx")), sheetIndex = 1, header = TRUE)
  out <- file.path(path, paste0(country, ".xlsx"))
  # The first call creates the workbook; the second appends a sheet to it
  write.xlsx(gdp, out, sheetName = "GDP", row.names = FALSE)
  write.xlsx(gnp, out, sheetName = "GNP", append = TRUE, row.names = FALSE)
}
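The package I chose for working with Excel files is xlsx. Feel free to switch to your preferred package and update the syntax as needed.
Apologies if the code isn't pretty.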
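The loop above relies on every country having both a GDP and a GNP file. A minimal base-R sketch for checking that assumption before reading anything is shown here; `check_pairs` is a hypothetical helper name, not part of the xlsx package, and the file names are the ones from the question.

```r
# Hypothetical helper (not from either answer): report which countries have
# both a "<country> - GDP.xlsx" and a "<country> - GNP.xlsx" file.
check_pairs <- function(files) {
  pattern <- "^(.*) - (GDP|GNP)\\.xlsx$"
  # Ignore anything that does not match the expected naming scheme
  matched <- files[grepl(pattern, files)]
  country <- sub(pattern, "\\1", matched)
  indicator <- sub(pattern, "\\2", matched)
  # Group the indicators found for each country
  by_country <- split(indicator, country)
  ok <- vapply(by_country,
               function(x) setequal(x, c("GDP", "GNP")),
               logical(1))
  list(complete = names(by_country)[ok],
       incomplete = names(by_country)[!ok])
}

# File names from the question; in practice these would come from list.files(path)
files <- c("Russia - GDP.xlsx", "Russia - GNP.xlsx",
           "USA - GDP.xlsx", "USA - GNP.xlsx")
res <- check_pairs(files)
print(res$complete)
print(res$incomplete)
```

Looping only over `res$complete` would avoid a read error halfway through when one of the two files is missing for a country.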
Answer 1: (score: 0)
You can try this:
# Reading all files in a folder
library(openxlsx)

path <- "C:\\Users\\Folder"
files <- list.files(path, pattern = "\\.xlsx$", full.names = TRUE)

# Assuming the folder holds these files:
#   "Russia - GDP.xlsx", "Russia - GNP.xlsx", "USA - GDP.xlsx", "USA - GNP.xlsx"
# keep only names of the form "<country> - <sheet>.xlsx"
rexp <- "^(.*)\\s+-\\s+(.*)\\.xlsx$"
files <- files[grepl(rexp, basename(files))]

# Country and sheet name for each file
countries <- sub(rexp, "\\1", basename(files))
sheets <- sub(rexp, "\\2", basename(files))

# Write one workbook per country, with one sheet per input file
# (the output goes into the same folder as the inputs)
for (country in unique(countries)) {
  idx <- countries == country
  dat <- lapply(files[idx], read.xlsx)
  names(dat) <- sheets[idx]  # write.xlsx uses the list names as sheet names
  write.xlsx(dat, file.path(path, paste0(country, ".xlsx")))
}
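To see how the regular expression maps input files to output workbooks and sheet names without touching any Excel file, a dry run in base R can help. This is only a sketch: the `plan` data frame and its column names are made up for illustration, and the file names are the ones from the question.

```r
# File names from the question; in practice use basename(list.files(...))
files <- c("Russia - GDP.xlsx", "Russia - GNP.xlsx",
           "USA - GDP.xlsx", "USA - GNP.xlsx")

# "<country> - <sheet>.xlsx": group 1 is the country, group 2 the sheet name
rexp <- "^(.*)\\s+-\\s+(.*)\\.xlsx$"

# Hypothetical planning table: which input ends up in which workbook and sheet
plan <- data.frame(
  input = files,
  output = paste0(sub(rexp, "\\1", files), ".xlsx"),
  sheet = sub(rexp, "\\2", files),
  stringsAsFactors = FALSE
)
print(plan)

# Inputs grouped by the workbook they would be written to
groups <- split(plan$input, plan$output)
print(groups)
```

Inspecting `plan` first makes it easier to spot file names that do not follow the expected pattern, since `sub` leaves such names unchanged.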