将多个文件同时加载到R中(具有相似的文件名)

时间:2018-09-30 18:21:52

标签: r

我正在尝试将多个文件加载到R环境中,我已经尝试过类似以下内容;

files <- list.files(pattern = ".Rda", recursive = TRUE)

lapply(files,load,.GlobalEnv)

仅加载一个数据文件(不正确)。我发现的问题是,每年所有文件的名称都相同。例如"Year1/beer/beer.Rda"也有"Year2/beer/beer.Rda"

我正在尝试在导入时重命名数据文件,因此beer1beer2将对应于啤酒第1年和啤酒第2年,等等。

有人有更好的数据加载方法吗?我拥有超过2年的数据。

文件名:

 [1] "Year1/beer/beer.Rda"         "Year1/blades/blades.Rda"     "Year1/carbbev/carbbev.Rda"  
 [4] "Year1/cigets/cigets.Rda"     "Year1/coffee/coffee.Rda"     "Year1/coldcer/coldcer.Rda"  
 [7] "Year1/deod/deod.Rda"         "Year1/diapers/diapers.Rda"   "Year1/factiss/factiss.Rda"  
[10] "Year1/fzdinent/fzdinent.Rda" "Year1/fzpizza/fzpizza.Rda"   "Year1/hhclean/hhclean.Rda"  
[13] "Year1/hotdog/hotdog.Rda"     "Year1/laundet/laundet.Rda"   "Year1/margbutr/margbutr.Rda"
[16] "Year1/mayo/mayo.Rda"         "Year1/milk/milk.Rda"         "Year1/mustketc/mustketc.Rda"
[19] "Year1/paptowl/paptowl.Rda"   "Year1/peanbutr/peanbutr.Rda" "Year1/photo/photo.Rda"      
[22] "Year1/razors/razors.Rda"     "Year1/saltsnck/saltsnck.Rda" "Year1/shamp/shamp.Rda"      
[25] "Year1/soup/soup.Rda"         "Year1/spagsauc/spagsauc.Rda" "Year1/sugarsub/sugarsub.Rda"
[28] "Year1/toitisu/toitisu.Rda"   "Year1/toothbr/toothbr.Rda"   "Year1/toothpa/toothpa.Rda"  
[31] "Year1/yogurt/yogurt.Rda"     "Year2/beer/beer.Rda"         "Year2/blades/blades.Rda"    
[34] "Year2/carbbev/carbbev.Rda"   "Year2/cigets/cigets.Rda"     "Year2/coffee/coffee.Rda"    
[37] "Year2/coldcer/coldcer.Rda"   "Year2/deod/deod.Rda"         "Year2/diapers/diapers.Rda"  
[40] "Year2/factiss/factiss.Rda"   "Year2/fzdinent/fzdinent.Rda" "Year2/fzpizza/fzpizza.Rda"  
[43] "Year2/hhclean/hhclean.Rda"   "Year2/hotdog/hotdog.Rda"     "Year2/laundet/laundet.Rda"  
[46] "Year2/margbutr/margbutr.Rda" "Year2/mayo/mayo.Rda"         "Year2/milk/milk.Rda"        
[49] "Year2/mustketc/mustketc.Rda" "Year2/paptowl/paptowl.Rda"   "Year2/peanbutr/peanbutr.Rda"
[52] "Year2/photo/photo.Rda"       "Year2/razors/razors.Rda"     "Year2/saltsnck/saltsnck.Rda"
[55] "Year2/shamp/shamp.Rda"       "Year2/soup/soup.Rda"         "Year2/spagsauc/spagsauc.Rda"
[58] "Year2/sugarsub/sugarsub.Rda" "Year2/toitisu/toitisu.Rda"   "Year2/toothbr/toothbr.Rda"  
[61] "Year2/toothpa/toothpa.Rda"   "Year2/yogurt/yogurt.Rda"

2 个答案:

答案 0 :(得分:2)

一种解决方案是解析文件名,并将其作为名称分配给数据帧列表中的元素。我们将使用一些样本数据,这些数据具有两年啤酒品牌的月销售量,并以CSV文件的形式保存到两个子目录year1year2中。

我们将使用lapply()将文件读入数据帧列表,然后使用names()函数通过在文件名后附加year<x>.来命名每个元素({ {1}})。

.csv

...以及输出。

fileList <- c("year1/beer.csv","year2/beer.csv")

data <- lapply(fileList,function(x){
     read.csv(x)
})
# generate data set names to be assigned to elements in the list
fileNameTokens <- strsplit(fileList,"/|[.]")

theNames <- unlist(lapply(fileNameTokens,function(x){
     paste0(x[1],".",x[2])
}))
names(data) <- theNames
# print first six rows of file 1 based on named extract
data[["year1.beer"]][1:6,]

接下来,我们将打印第二个文件的前几行。

> data[["year1.beer"]][1:6,]
  Month      Item Sales
1     1 Budweiser 83047
2     2 Budweiser 38374
3     3 Budweiser 47287
4     4 Budweiser 18417
5     5 Budweiser 23981
6     6 Budweiser 55471
> 

如果需要直接访问文件而不依赖于> # print first six rows of file 1 based on named extract > data[["year2.beer"]][1:6,] Month Item Sales 1 1 Budweiser 23847 2 2 Budweiser 33847 3 3 Budweiser 44400 4 4 Budweiser 35333 5 5 Budweiser 18710 6 6 Budweiser 63108 > 的名称,可以通过list()函数在lapply()函数中将它们分配给父环境,如另一个答案。

assign()

...和输出。

# alternate form, assigning directly to parent environment

data <- lapply(fileList,function(x){
     # x is the filename, parse into strings to generate data set name
     fileNameTokens <- unlist(strsplit(x,"/|[.]"))
     assign(paste0(fileNameTokens[1],".",fileNameTokens[2]), read.csv(x),pos=1)
})
head(year1.beer)

该技术还可以与> head(year1.beer) Month Item Sales 1 1 Budweiser 83047 2 2 Budweiser 38374 3 3 Budweiser 47287 4 4 Budweiser 18417 5 5 Budweiser 23981 6 6 Budweiser 55471 > 文件一起使用,如下所示。

RDS

...和输出。

data <- lapply(fileList,function(x){
     # x is the filename, parse into strings to generate data set name
     fileNameTokens <- unlist(strsplit(x,"/|[.]"))
     assign(paste0(fileNameTokens[1],".",fileNameTokens[2]), readRDS(x),pos=1)
})
head(year1.beer)

答案 1 :(得分:1)

一种选择是将文件加载到新环境中,然后将其分配给父环境中的自定义命名对象。

这是从https://stackoverflow.com/a/5577647/6561924修改的

# first create custom names for objects (e.g. add folder names)
file_names <- gsub("/", "_", files)
file_names <- gsub("\\.Rda", "", file_names)

# function to load objects in new environ
load_obj <- function(f, f_name) {
  env <- new.env()
  nm <- load(f, env)[1]  # load into new environ and capture name
  assign(f_name, env[[nm]], pos = 1) # pos 1 is parent env
}

# load all
mapply(load_obj, files, file_names)