Randomly return a row number from a subset of a data frame

Time: 2016-10-18 13:11:08

Tags: r random

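I would like to randomly return a row number from a data set, where the candidate rows are a subset of the data. For example, using the data frame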

x.f<-data.frame(
     G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
     A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
     E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

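I would like to be given, at random, one row number where G == "M" and A == "3", so the answer would be either row 3 or row 6. The number returned must be the position in the original data frame. Although this example is neatly structured (each possible combination appears exactly once), the real data will not be: a combination such as (M, 2, W) will be scattered throughout the data frame and may occur more or less often than other combinations.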

4 answers:

Answer 0 (score: 1)

Try this: which(x.f$G == "M" & x.f$A == 3)

Answer 1 (score: 1)

Perhaps this:

row.names(subset(x.f, x.f$G == "M" & x.f$A == 3))
[1] "3" "6"
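
One caveat, since the question asks for positions in the original data frame: row.names returns labels as character strings, and those labels equal positions only while the data frame still has its default row names. The sketch below assumes a data frame that was reordered earlier (the shuffled object is hypothetical, not part of the question) and contrasts the labels with the integer positions that which returns.

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Hypothetical reordering: reverse the rows, so the row names run "12", "11", ..., "1"
shuffled <- x.f[rev(seq_len(nrow(x.f))), ]

# Labels of the matching rows (character), carried over from x.f
labels <- row.names(subset(shuffled, G == "M" & A == "3"))

# Integer positions of the matching rows within shuffled
positions <- which(shuffled$G == "M" & shuffled$A == "3")

print(labels)
print(positions)
```

If positions are what is needed, which is the safer choice; as.integer(row.names(...)) is only equivalent when the row names are still the default 1..n.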

Answer 2 (score: 1)

Building on Sourabh's answer and the sample function, you could try:

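# Pick one matching row position at random
foo <- function(G, A, data){
  idx <- which(data$G == G & data$A == A)
  # Index into idx so that a single match is returned as-is;
  # sample(idx, 1) would draw from 1:idx when idx has length 1
  idx[sample.int(length(idx), 1)]
}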

foo("M", 3, x.f)
3

To check that both rows are equally likely, run the function 1000 times in a loop:

res <- NULL
for(i in 1:1000){
  res[i] <- foo("M", 3, x.f)
}
hist(res)

The histogram suggests that the two row numbers are drawn about equally often.
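
The same check can be sketched without an explicit loop, using replicate and table. This is only an illustration: foo is restated here so the block runs on its own (in a form that indexes into the matches rather than passing them to sample directly), and the seed is an arbitrary choice added for reproducibility.

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# Pick one matching row position at random
foo <- function(G, A, data) {
  idx <- which(data$G == G & data$A == A)
  idx[sample.int(length(idx), 1)]
}

set.seed(1)  # arbitrary seed, only so the run can be repeated

# Draw 1000 times and count how often each row position comes up
res <- replicate(1000, foo("M", "3", x.f))
counts <- table(res)
print(counts)
```

With two matching rows, each count should land near 500; a chi-squared test (chisq.test(counts)) would be one way to make that comparison formal.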

Answer 3 (score: 1)

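Either of the other answers will give you the list of rows that meet the condition, but neither picks one of them at random. For a complete answer: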

sample(which(x.f$G == "M" & x.f$A == 3),1)

sample(row.names(subset(x.f, x.f$G == "M" & x.f$A == 3)),1)

sample(row.names(x.f[x.f$G=="M" & x.f$A==3,]),1)

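Any of these will work. Note that the two row.names variants return the row name as a character string rather than an integer position. There are probably two or three other ways to generate the list of row indices or names that meet a set of conditions.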
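
One pitfall with sample(which(...), 1) is worth guarding against when combinations occur an unpredictable number of times: if exactly one row matches, sample receives a single number n and draws from 1:n instead (this behaviour is described in ?sample). The sketch below assumes a small helper, pick_row, which is a hypothetical name and not part of any answer above; it also returns NA when nothing matches.

```r
x.f <- data.frame(
  G = c("M","M","M","M","M","M","F","F","F","F","F","F"),
  A = c("1","2","3","1","2","3","1","2","3","1","2","3"),
  E = c("W","W","W","B","B","B","W","W","W","B","B","B"))

# pick_row (hypothetical helper): return one position where cond is TRUE,
# chosen uniformly at random, or NA if no row matches
pick_row <- function(cond) {
  idx <- which(cond)
  if (length(idx) == 0) return(NA_integer_)
  # Sample an index into idx, so a single match is returned unchanged
  idx[sample.int(length(idx), 1)]
}

# Exactly one row has G == "F", A == "1" and E == "B"
only_match <- x.f$G == "F" & x.f$A == "1" & x.f$E == "B"
single <- pick_row(only_match)

# Two rows match here, so either may be returned
either <- pick_row(x.f$G == "M" & x.f$A == "3")

# No row matches here
none <- pick_row(x.f$G == "X")
```

In this example the single match is row 10; sample(10, 1) could have returned any position from 1 to 10, most of which do not satisfy the condition.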