R - Select a sample from a dataset for each group value

Time: 2015-11-09 11:46:54

Tags: r sample

I have a dataset that looks like this:

string_1,score,group
"sdfsd",0.546,0.5
"sdfsd",0.53,0.5
"sdfsd",0.52,0.5
"dgfbx",0.43,0.4
"dsgfgsd",0.48,0.4
"dsgfgsd",0.42,0.4
"dsgfgsd",0.84,0.8
"dsgfgsd",0.83,0.8
"dsgfgsd",0.82,0.8

I want to sample from each group. That is, I want to randomly draw 2 rows for each value of the group field: 0.4, 0.5, 0.8.

What is the simplest way to do this?

Thanks

1 Answer:

Answer 0: (score: 2)

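You could consider something like this. It splits the data by group, samples 2 rows within each piece, and binds the sampled rows back into one data frame.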

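set.seed(1)  # make the random draw reproducible
# split dat by group, pick 2 random row indices in each piece, then rbind the pieces
res <- do.call(rbind, lapply(split(dat, dat$group), function(x) {
  x[sample(nrow(x), 2), ]
}))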
> res
      string_1 score group
0.4.4    dgfbx  0.43   0.4
0.4.6  dsgfgsd  0.42   0.4
0.5.2    sdfsd  0.53   0.5
0.5.3    sdfsd  0.52   0.5
0.8.7  dsgfgsd  0.84   0.8
0.8.8  dsgfgsd  0.83   0.8
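
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same split / sample / rbind idea. It assumes the data from the question is read into a data frame called dat (the name used above). The helper sample_rows is a hypothetical name introduced only for illustration, and the min() guard for groups with fewer than 2 rows is an addition that is not part of the answer above.

```r
# Rebuild the example data from the question as a data frame named `dat`.
dat <- read.csv(text = 'string_1,score,group
"sdfsd",0.546,0.5
"sdfsd",0.53,0.5
"sdfsd",0.52,0.5
"dgfbx",0.43,0.4
"dsgfgsd",0.48,0.4
"dsgfgsd",0.42,0.4
"dsgfgsd",0.84,0.8
"dsgfgsd",0.83,0.8
"dsgfgsd",0.82,0.8
', stringsAsFactors = FALSE)

# Hypothetical helper: draw up to n rows at random from one group's rows.
# min() keeps sample() from failing when a group has fewer than n rows.
sample_rows <- function(x, n = 2) {
  x[sample(nrow(x), min(n, nrow(x))), , drop = FALSE]
}

set.seed(1)                             # make the draw reproducible
groups  <- split(dat, dat$group)        # one data frame per group value
sampled <- lapply(groups, sample_rows)  # sample within each group
res     <- do.call(rbind, sampled)      # stack the pieces back together

table(res$group)                        # count of sampled rows per group
```

Each of the three group values should show up with 2 rows in res. Note that R 3.6.0 changed the default algorithm behind sample(), so with the same seed a newer R version may pick different rows than the output shown above.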