My dataset looks like this:
Files Batch
filepath1.txt One
filepath2.txt One
filepath3.txt One
filepath4.txt One
filepath5.txt two
filepath6.txt two
filepath7.txt two
filepath8.txt two
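I want to loop over the whole dataset (there are a dozen or so "Batch" categories) by grouping the "Files" entries according to "Batch" and storing the groups in a new variable called "batch", i.e.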
batch[1]
filepath1.txt
filepath2.txt
filepath3.txt
filepath4.txt
batch[2]
filepath5.txt
filepath6.txt
filepath7.txt
filepath8.txt
How can I do this for all of the batch groups in my full dataset?
Answer 0: (score: 2)
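The split function seems to be what you are looking for. It returns a named list with one data frame per value of Batch.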
> dat <- data.frame(File = paste0("file", 1:10, ".txt"), Batch = rep(c("one", "two"), each = 5))
> dat
File Batch
1 file1.txt one
2 file2.txt one
3 file3.txt one
4 file4.txt one
5 file5.txt one
6 file6.txt two
7 file7.txt two
8 file8.txt two
9 file9.txt two
10 file10.txt two
> split(dat, dat$Batch)
$one
File Batch
1 file1.txt one
2 file2.txt one
3 file3.txt one
4 file4.txt one
5 file5.txt one
$two
File Batch
6 file6.txt two
7 file7.txt two
8 file8.txt two
9 file9.txt two
10 file10.txt two
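If only the file names are needed per batch, as in the batch[1] / batch[2] layout from the question, one option is to split just the file column rather than the whole data frame. The following is a minimal sketch, not the only way to do it; the data frame is a hypothetical stand-in built to mirror the question's Files and Batch columns.

```r
# Hypothetical data mirroring the layout shown in the question
dat <- data.frame(
  Files = paste0("filepath", 1:8, ".txt"),
  Batch = rep(c("One", "two"), each = 4),
  stringsAsFactors = FALSE
)

# Split only the Files column by Batch:
# the result is a named list of character vectors, one per batch
batch <- split(dat$Files, dat$Batch)

# Loop over every batch group, however many categories there are
for (name in names(batch)) {
  files <- batch[[name]]
  cat("Batch", name, "contains", length(files), "files\n")
}
```

Here batch[["One"]] gives the character vector of file paths for that batch. Note that batch[1] with single brackets returns a one-element list, while batch[[1]] returns the vector itself.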