In R

Time: 2017-05-10 19:14:14

Tags: r csv

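I want to read multiple .csv files from different directories and then put them into a single data frame.

I have two kinds of directories to read from:

A:/LogIIS/FOLDER01/files.csv

In the other cases there is a folder containing several subfolders, each with its own files.csv, as in the example below: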

A:/LogIIS/FOLDER02/FOLDER_A/files.csv

A:/LogIIS/FOLDER02/FOLDER_B/files.csv

A:/LogIIS/FOLDER02/FOLDER_C/files.csv

A:/LogIIS/FOLDER03/FOLDER_A/files.csv

A:/LogIIS/FOLDER03/FOLDER_B/files.csv

A:/LogIIS/FOLDER03/FOLDER_C/files.csv

A:/LogIIS/FOLDER03/FOLDER_D/files.csv

Thanks in advance!

2 answers:

Answer 0: (score: 1)

If you need to explicitly define a file pattern (a file name or an extension), you can use the pattern argument of the list.files function.
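As a rough illustration of the pattern argument, here is a minimal sketch. It assumes a throwaway directory under tempdir(); the file names notes.txt and other.csv are made up for the demonstration. The pattern is a regular expression, so the dot is escaped and the name is anchored.

```r
# Build a throwaway directory so the sketch runs anywhere (hypothetical file names).
demo_dir <- file.path(tempdir(), "pattern_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("files.csv", "notes.txt", "other.csv")))

# Match by extension: every file whose name ends in ".csv".
csv_files <- list.files(demo_dir, pattern = "\\.csv$")

# Match one exact file name; the anchors stop e.g. "myfiles.csv" from matching.
exact_file <- list.files(demo_dir, pattern = "^files\\.csv$")
```

Adding full.names = TRUE to either call returns complete paths that can be passed straight to a reader function.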

library(data.table)

# make an explicit list of folders
folders = list(
  file.path('A:','LogIIS','FOLDER02','FOLDER_A'),
  file.path('A:','LogIIS','FOLDER02','FOLDER_B'),
  file.path('A:','LogIIS','FOLDER02','FOLDER_C'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_A'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_B'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_C'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_D')
)

# iterate through each folder in list and return all files
# unlist those lists of files into a single vector
files = unlist(sapply(folders, function(folder) {
  list.files(folder, full.names=TRUE)
}))

# read each file into a data.table
# return data.table results as a list
# combine list into a single data.table
rbindlist(use.names=TRUE, fill=FALSE,
  lapply(files, function(x) { 
    fread(x)  
  }) 
)
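If writing out every folder by hand becomes tedious, list.files can also walk the directory tree itself via recursive = TRUE. The following is only a sketch of that variation, not part of the answer above: it builds a small temporary tree whose folder names mirror the question, the columns id and value are invented, and it combines the tables with base R instead of data.table.

```r
# Hypothetical demo tree under tempdir(); folder names mirror the question.
root <- file.path(tempdir(), "LogIIS_demo")
dirs <- file.path(root, c("FOLDER01",
                          file.path("FOLDER02", "FOLDER_A"),
                          file.path("FOLDER02", "FOLDER_B")))
for (d in dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  # id and value are made-up columns for the demonstration
  write.csv(data.frame(id = 1:2, value = c(10, 20)),
            file.path(d, "files.csv"), row.names = FALSE)
}

# Find every files.csv at any depth below root.
files <- list.files(root, pattern = "^files\\.csv$",
                    recursive = TRUE, full.names = TRUE)

# Read each file, remember where its rows came from, then stack the rows.
tables <- lapply(files, function(f) {
  df <- read.csv(f)
  df$source <- f
  df
})
combined <- do.call(rbind, tables)
```

With data.table loaded, rbindlist(lapply(files, fread)) could replace the last two steps in the same way the answer does.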

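Answer 1: (score: 0)

I would also use the list.files() function, together with loops, to pull out all the information. First list all directories under the common top-level directory, in this case A:/LogIIS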

common_path = "A:/LogIIS/"
primary_dirs = list.files(common_path);
primary_dirs 
[1] "FOLDER01" "FOLDER02" "FOLDER03"
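One caveat worth keeping in mind: list.files() returns files as well as directories, so a stray file placed directly in the top-level directory would also end up in primary_dirs. A possible alternative, sketched here on a temporary directory with made-up names, is list.dirs(), which returns directories only.

```r
# Hypothetical demo directory: two folders and one stray file.
root <- file.path(tempdir(), "dirs_demo")
dir.create(file.path(root, "FOLDER01"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "FOLDER02"), showWarnings = FALSE)
file.create(file.path(root, "readme.txt"))

# list.files() reports the stray file alongside the folders.
all_entries <- list.files(root)

# list.dirs() reports only the immediate sub-directories.
only_dirs <- list.dirs(root, full.names = FALSE, recursive = FALSE)
```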

Now I would run a nested loop over all the primary_dirs. In your example every .csv file has the common name files.csv, which simplifies the problem. You have not said how you want the csv files appended, so I will assume they all have the same column headers and stack them row-wise with rbind(); if you instead want them placed side by side as extra columns, use cbind().

main_data = data.frame()  ## start empty; the rows of each file are appended below

Using the answer from here:

for (dir in primary_dirs) {
  dir_path = paste0(common_path, dir)
  sub_folders = list.files(dir_path)
  if ("files.csv" %in% sub_folders) {
    ## files.csv sits directly in this directory: read it and append its rows
    temp_data = read.csv(file = file.path(dir_path, "files.csv"))
    main_data = rbind(main_data, temp_data)
  } else {
    ## otherwise try one directory deeper
    for (sub_dir in sub_folders) {
      sub_path = file.path(dir_path, sub_dir)
      if ("files.csv" %in% list.files(sub_path)) {
        ## found files.csv: read it and append its rows
        temp_data = read.csv(file = file.path(sub_path, "files.csv"))
        main_data = rbind(main_data, temp_data)
      } else {
        warning("could not find the file 'files.csv' two directories deep")
      }
    }
  }
}
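Whichever function the loop uses, the difference between rbind() and cbind() is easy to check on two small data frames. This is just a sketch; the columns id and value are invented and stand in for whatever the real files.csv contain.

```r
# Two made-up tables with identical column headers.
a <- data.frame(id = 1:2, value = c(10, 20))
b <- data.frame(id = 3:4, value = c(30, 40))

# rbind() stacks rows: the result has the same columns and more rows.
stacked <- rbind(a, b)

# cbind() puts the tables side by side: same rows, duplicated column names.
side_by_side <- cbind(a, b)

dim(stacked)       # 4 rows, 2 columns
dim(side_by_side)  # 2 rows, 4 columns
```

For log files that share one layout, the row-wise result is usually the one wanted.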