I want to read multiple .csv files from different directories and then combine them into a single data frame.
I have two kinds of directories to read from:
A:/LogIIS/FOLDER01/files.csv
In the other locations there are folders that contain several subfolders, each with its own files.csv, as in the example below:
A:/LogIIS/FOLDER02/FOLDER_A/files.csv
A:/LogIIS/FOLDER02/FOLDER_B/files.csv
A:/LogIIS/FOLDER02/FOLDER_C/files.csv
A:/LogIIS/FOLDER03/FOLDER_A/files.csv
A:/LogIIS/FOLDER03/FOLDER_B/files.csv
A:/LogIIS/FOLDER03/FOLDER_C/files.csv
A:/LogIIS/FOLDER03/FOLDER_D/files.csv
Thanks in advance!
Answer 0 (score: 1)
If you need to define the file pattern explicitly (a file name or an extension), you can use the pattern argument of the list.files function.
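As a minimal sketch of how the pattern argument behaves, the snippet below builds a throwaway directory (the name pattern_demo and the three file names are made up for illustration) and filters it two ways. The point it tries to show is that pattern is a regular expression, not a shell wildcard, so the dot should be escaped and the end anchored.

```r
# Build a throwaway directory so the example runs anywhere (hypothetical names)
demo_dir <- file.path(tempdir(), "pattern_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("files.csv", "notes.txt", "files.csv.bak")))

# pattern is a regular expression: escape the dot and anchor the end,
# otherwise "files.csv.bak" would also match
csv_only <- list.files(demo_dir, pattern = "\\.csv$")

# glob2rx() converts a wildcard such as "*.csv" into an equivalent regex
csv_glob <- list.files(demo_dir, pattern = glob2rx("*.csv"))

print(csv_only)
print(csv_glob)
```

Both calls should return only files.csv; an unanchored pattern such as ".csv" would also pick up the .bak file.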
library(data.table)
# make an explicit list of folders
folders = list(
  file.path('A:','LogIIS','FOLDER02','FOLDER_A'),
  file.path('A:','LogIIS','FOLDER02','FOLDER_B'),
  file.path('A:','LogIIS','FOLDER02','FOLDER_C'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_A'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_B'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_C'),
  file.path('A:','LogIIS','FOLDER03','FOLDER_D')
)
# iterate through each folder in the list and return its .csv files (full paths)
# unlist those per-folder results into a single character vector
files = unlist(lapply(folders, function(folder) {
  list.files(folder, pattern = "\\.csv$", full.names = TRUE)
}))
# read each file into a data.table
# return the data.table results as a list
# combine the list into a single data.table
# (fill = FALSE means every file must have the same columns)
rbindlist(lapply(files, fread), use.names = TRUE, fill = FALSE)
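If the files do not all share exactly the same columns, or if you want to know which file each row came from, rbindlist has the fill and idcol arguments for that. The following is only a sketch under assumed data: the two small files, their column names (id, status, bytes) and the directory rbindlist_demo are invented for the example and are not the real IIS logs.

```r
library(data.table)

# Two small hypothetical files standing in for the real logs
demo_dir <- file.path(tempdir(), "rbindlist_demo")
dir.create(demo_dir, showWarnings = FALSE)
f1 <- file.path(demo_dir, "a.csv")
f2 <- file.path(demo_dir, "b.csv")
fwrite(data.table(id = 1:2, status = c(200L, 404L)), f1)
fwrite(data.table(id = 3L, status = 500L, bytes = 1024L), f2)

files <- c(f1, f2)
tables <- lapply(files, fread)
names(tables) <- files  # list names become the values of the id column

# fill = TRUE pads columns that are missing from a file with NA;
# idcol adds a column recording which list element (file) each row came from
combined <- rbindlist(tables, use.names = TRUE, fill = TRUE, idcol = "source")
print(combined)
```

With fill = FALSE, as in the answer above, the same call would stop with an error because the second file has an extra column, which can be the safer choice when the files are supposed to be identical in structure.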
Answer 1 (score: 0)
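I would also use the list.files() function, together with a loop, to collect everything. First list all the directories under the common top-level directory, which in this case is A:/LogIIS: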
common_path = "A:/LogIIS/"
primary_dirs = list.files(common_path)
primary_dirs
[1] "FOLDER01" "FOLDER02" "FOLDER03"
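One caveat: list.files() returns every entry in the directory, so a stray file sitting next to the folders would end up in primary_dirs as well. A possible alternative, sketched here on a temporary directory (listdirs_demo and readme.txt are made-up names), is list.dirs() with recursive = FALSE, which returns only the immediate subdirectories.

```r
# Temporary layout: two folders plus one stray file (hypothetical names)
root <- file.path(tempdir(), "listdirs_demo")
dir.create(file.path(root, "FOLDER01"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(root, "FOLDER02"), showWarnings = FALSE)
file.create(file.path(root, "readme.txt"))

# list.files() returns files and directories alike
everything <- list.files(root)

# list.dirs() with recursive = FALSE returns only the immediate subdirectories;
# full.names = FALSE gives bare names rather than full paths
dirs_only <- list.dirs(root, full.names = FALSE, recursive = FALSE)

print(everything)
print(dirs_only)
```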
Now I would run a nested loop over all of primary_dirs. In your example every .csv file has the same name, files.csv, which simplifies the problem. You haven't said how you want to combine the csv files, so I will assume they all have the same column headers and stack them row-wise with rbind(). If you want the files placed side by side as extra columns instead, use cbind().
main_data = data.frame()  ## start empty; optionally populate the header first, using the answer linked here
for (dir in primary_dirs) {
  dir_path = paste0(common_path, dir)
  sub_folders = list.files(dir_path)
  if ("files.csv" %in% sub_folders) {
    ## files.csv sits directly in this directory: read it and append its rows
    temp_data = read.csv(file = paste0(dir_path, "/files.csv"))
    main_data = rbind(main_data, temp_data)
  } else {
    ## otherwise try one directory deeper
    for (sub_dir in sub_folders) {
      sub_path = paste0(dir_path, "/", sub_dir)
      if ("files.csv" %in% list.files(sub_path)) {
        ## found files.csv: read it and append its rows
        temp_data = read.csv(file = paste0(sub_path, "/files.csv"))
        main_data = rbind(main_data, temp_data)
      } else {
        warning("could not find 'files.csv' in ", sub_path)
      }
    }
  }
}
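The nested loop can probably be collapsed into a single recursive listing, since list.files() accepts recursive = TRUE and then searches every level below the root. The sketch below recreates a small version of the layout from the question under a temporary directory (LogIIS_demo and the columns dir and hits are invented for the example) and assumes every files.csv has the same columns, so the rows can be stacked with rbind().

```r
# Recreate a small version of the layout from the question (hypothetical data)
root <- file.path(tempdir(), "LogIIS_demo")
sub_dirs <- file.path(root, c("FOLDER01",
                              file.path("FOLDER02", c("FOLDER_A", "FOLDER_B"))))
for (d in sub_dirs) {
  dir.create(d, recursive = TRUE, showWarnings = FALSE)
  write.csv(data.frame(dir = basename(d), hits = 1:2),
            file.path(d, "files.csv"), row.names = FALSE)
}

# recursive = TRUE walks all subdirectories; the anchored pattern keeps
# only files named exactly "files.csv"
csv_files <- list.files(root, pattern = "^files\\.csv$",
                        recursive = TRUE, full.names = TRUE)

# Read every file and stack the rows (assumes identical columns in each file)
main_data <- do.call(rbind, lapply(csv_files, read.csv))
print(main_data)
```

This finds files.csv at any depth, not just one or two levels down, which may or may not be what you want. Growing a data frame inside a loop also copies it on every iteration, so reading into a list and binding once tends to scale better with many files.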