Question

My dataset looks like this:

Files          Batch
filepath1.txt   One
filepath2.txt   One
filepath3.txt   One
filepath4.txt   One
filepath5.txt   two
filepath6.txt   two
filepath7.txt   two
filepath8.txt   two

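I want to loop through the whole dataset (there are a dozen or so "Batch" categories) by grouping the "Files" according to which "Batch" they belong to, and storing the groups in a new variable called "batch".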

That is:

batch[1] 
filepath1.txt
filepath2.txt
filepath3.txt
filepath4.txt

batch[2]
filepath5.txt
filepath6.txt
filepath7.txt
filepath8.txt

How can I do this for all of my batch groups in the full dataset?

Answer 1

The split function seems to be what you are looking for.

> dat <- data.frame(File = paste0("file", 1:10, ".txt"), Batch = rep(c("one", "two"), each = 5))
> dat
         File Batch
1   file1.txt   one
2   file2.txt   one
3   file3.txt   one
4   file4.txt   one
5   file5.txt   one
6   file6.txt   two
7   file7.txt   two
8   file8.txt   two
9   file9.txt   two
10 file10.txt   two
> split(dat, dat$Batch)
$one
       File Batch
1 file1.txt   one
2 file2.txt   one
3 file3.txt   one
4 file4.txt   one
5 file5.txt   one

$two
         File Batch
6   file6.txt   two
7   file7.txt   two
8   file8.txt   two
9   file9.txt   two
10 file10.txt   two
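
If only the file names are needed per group, as in the batch[1] / batch[2] layout from the question, one possible variation is to split just the File column instead of the whole data frame. The sketch below reuses the example data from this answer; the variable name batch comes from the question, and the loop body is only a placeholder for whatever per-batch processing is intended.

```r
# Rebuild the example data used in the answer
dat <- data.frame(
  File  = paste0("file", 1:10, ".txt"),
  Batch = rep(c("one", "two"), each = 5),
  stringsAsFactors = FALSE
)

# Split only the File column by Batch: the result is a named list
# with one character vector of file names per batch
batch <- split(dat$File, dat$Batch)

# A group can be pulled out by position or by name
batch[[1]]
batch[["two"]]

# Loop over every batch group (placeholder body: report the group size)
for (name in names(batch)) {
  cat("Batch", name, "has", length(batch[[name]]), "files\n")
}
```

Note that list elements are extracted with double brackets (batch[[1]]); single brackets (batch[1]) return a list of length one rather than the vector of file names.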

Create a list of subsets of one column based on another column