Suppose we have the following directory layout:
~/data1/file1
~/data1/file2
~/data1/file3
~/data2/file1
~/data2/file2
~/data2/file3
~/data3/file1
~/data3/file2
~/data3/file3
How can the columns from each file be combined into the following:
~/data/file1 (containing all columns from file1 in each of the subdirectories)
~/data/file2 (containing all columns from file2 in each of the subdirectories)
~/data/file3 (containing all columns from file3 in each of the subdirectories)
Answer 0 (score: 1)
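Assuming I understand correctly, you could try the following.
Here, I created 9 `.txt` files in three directories, `data1`, `data2`, and `data3`, which sit inside my working directory `TestN`, thereby mimicking the situation.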
filelist <- list.files(recursive=TRUE)
filelist # As you mentioned, each directory has `file1`, `file2`, `file3`, here with a `.txt` extension.
#[1] "data1/file1.txt" "data1/file2.txt" "data1/file3.txt" "data2/file1.txt"
#[5] "data2/file2.txt" "data2/file3.txt" "data3/file1.txt" "data3/file2.txt"
#[9] "data3/file3.txt"
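To reproduce a listing like the one above, the test layout can be generated rather than created by hand. This is only a minimal sketch: the sandbox under `tempdir()`, the single random column per file, and the column names are all assumptions made for illustration, not details from the original setup.

```r
# Sandbox location is an assumption; the answer only says the working directory is TestN
root <- file.path(tempdir(), "TestN")

for (d in paste0("data", 1:3)) {
  dir.create(file.path(root, d), recursive = TRUE, showWarnings = FALSE)
  for (f in paste0("file", 1:3)) {
    # One made-up numeric column per file, with a header naming its origin
    df <- data.frame(rnorm(5))
    names(df) <- paste(d, f, sep = "_")
    write.table(df, file.path(root, d, paste0(f, ".txt")),
                row.names = FALSE, quote = FALSE)
  }
}

# Relative paths such as "data1/file1.txt", as in the listing above
filelist <- list.files(root, recursive = TRUE)
```

Passing `root` to `list.files()` keeps the paths relative to that folder, so the result has the same shape as the listing without calling `setwd()`.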
Then, I split `filelist` by basename, i.e. `file1.txt`, `file2.txt`, etc., read the individual files within each group using another `lapply`, and `cbind` the single-column files that share the same `basename`.
lst1 <- lapply(split(filelist, basename(filelist)), function(x) do.call(cbind, lapply(x,
                       function(y) read.table(y, header = TRUE, stringsAsFactors = FALSE, sep = ""))))
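The grouping and binding steps can be looked at in isolation. The sketch below uses a hand-typed path vector and two made-up data frames (`a`, `b`), so it needs no files on disk; only `split`, `basename`, `do.call`, and `cbind` come from the answer itself.

```r
# Hand-typed paths standing in for the output of list.files()
filelist <- c("data1/file1.txt", "data1/file2.txt",
              "data2/file1.txt", "data2/file2.txt")

# Group the paths by file name: one list element per basename
groups <- split(filelist, basename(filelist))
str(groups)

# Made-up single-column tables standing in for what read.table() would return
a <- data.frame(x = 1:3)
b <- data.frame(y = 4:6)

# do.call(cbind, ...) binds every data frame in the list side by side
combined <- do.call(cbind, list(a, b))
```

Each element of `groups` holds the paths of same-named files across directories, which is why the inner `lapply` plus `do.call(cbind, ...)` yields one wide table per file name. Note that `cbind` expects every file in a group to have the same number of rows.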
I then created another folder, `data`, and made it the working directory:
dir.create("data")
mainDir <- paste(getwd(), "data", sep="/")
setwd(mainDir) #change the working directory to mainDir from the previous code
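Now, I created one subfolder per file name within `mainDir`:
subDir <- gsub("\\..*", "", unique(basename(filelist)))
subDir
#[1] "file1" "file2" "file3"
lapply(subDir, dir.create)
Finally, write the files to the relevant subfolders.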
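One way writing the combined tables into those subfolders might look is sketched below. It is not necessarily how the answer does it: the stand-in `lst1` with made-up values, the `tempdir()` output location, and the choice to reuse each original file name inside its subfolder are all assumptions.

```r
# Stand-in for lst1: a named list of already combined data frames (made-up values)
lst1 <- list(file1.txt = data.frame(a = 1:2, b = 3:4),
             file2.txt = data.frame(a = 5:6, b = 7:8))

mainDir <- file.path(tempdir(), "data")   # assumed output location
subDir  <- sub("\\..*", "", names(lst1))  # strip the extension from each name

for (i in seq_along(lst1)) {
  outDir <- file.path(mainDir, subDir[i])
  dir.create(outDir, recursive = TRUE, showWarnings = FALSE)
  # Write each combined table into its own subfolder under its original file name
  write.table(lst1[[i]], file.path(outDir, names(lst1)[i]),
              row.names = FALSE, quote = FALSE)
}
```

Building paths with `file.path()` instead of changing directories with `setwd()` keeps the loop independent of the current working directory.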