Randomly assign an integer within each group of IDs in a data frame (R)

Asked: 2017-03-29 19:16:41

Tags: r for-loop integer sampling

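I am trying to assign a random integer within each group of existing IDs. The integers must satisfy the following conditions: they are unique and non-repeating within a group, and the highest integer in a group is no greater than the number of rows with that ID.

I tried doing this in a for loop. It works for the first group of IDs but does not repeat for the next group. I have looked at several existing Stack Overflow questions and other sites that address this to some degree, but I still cannot get it right. Links below: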

Randomly Assign Integers in R within groups without replacement

http://r.789695.n4.nabble.com/Random-numbers-for-a-group-td964301.html

random selection within groups

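I need this to be dynamic, because the number of IDs can grow or shrink from week to week. The actual data frame has several other columns, but they are not used here and have been left out to keep the example reproducible.

Example below: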

#Desired Output

Groups <- c("A","A","A","A","B","B","B","B","B","B","B","B","C","C","C","C","C","C")
Desired_Integer <- c(1,4,2,3,6,3,1,2,8,5,7,4,5,6,1,4,3,2)
Example <- data.frame(Groups,Desired_Integer)


#Attempted For Loop for Example (assuming Example is a DF with one column, Groups for the For Loop)

Groups <- c("A","A","A","A","B","B","B","B","B","B","B","B","C","C","C","C","C","C")
Example <- as.data.frame(Groups)

for (i in Example$Groups)
{
Example$Desired_Integer <- sample.int(length(which(Example$Groups == i)))
}

Thanks in advance for your help!

1 Answer:

Answer 0: (score: 2)

You can do this with the base R function ave:
dd <- data.frame(Groups = rep(c("A","B","C"), c(4,8,6)))
rand_seq_for <- function(x) sample.int(length(x))
dd$rand_int <- ave(1:nrow(dd), dd$Groups, FUN=rand_seq_for)
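For comparison, here is a minimal sketch of how the loop from the question could be made to work. The original loop iterates over every row's group value and assigns to the whole column on each pass, rather than only to the rows of the current group. The version below loops over the distinct groups and fills only the matching rows. The variable names `g` and `rows` are illustrative and not from the original post.

```r
# Rebuild the example data from the question.
Groups <- c(rep("A", 4), rep("B", 8), rep("C", 6))
Example <- data.frame(Groups)

# Pre-allocate the column so it can be filled group by group.
Example$Desired_Integer <- NA_integer_

# Loop over each distinct group once, not over every row.
for (g in unique(Example$Groups)) {
  rows <- which(Example$Groups == g)
  # Assign a random permutation of 1..n to this group's rows only.
  Example$Desired_Integer[rows] <- sample.int(length(rows))
}
```

This produces the same kind of result as ave; the ave version is shorter and avoids the explicit loop.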

Or, using dplyr, you can do:

library(dplyr)
dd %>% group_by(Groups) %>% mutate(rand_int=sample.int(n()))
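Whichever approach is used, the conditions from the question (unique within a group, and no value larger than the group's row count) amount to each group holding a permutation of 1..n. A rough way to check that, sketched here against the ave result, is shown below. The helper name `is_permutation` and the seed value are assumptions made for illustration only.

```r
set.seed(42)  # assumption: arbitrary seed, fixed only so runs are repeatable

dd <- data.frame(Groups = rep(c("A", "B", "C"), c(4, 8, 6)))
dd$rand_int <- ave(seq_len(nrow(dd)), dd$Groups,
                   FUN = function(x) sample.int(length(x)))

# Hypothetical helper: TRUE when x holds exactly 1..length(x) in some order.
is_permutation <- function(x) identical(sort(x), seq_along(x))

# One logical value per group; stopifnot() errors if any group fails.
group_ok <- tapply(dd$rand_int, dd$Groups, is_permutation)
stopifnot(all(group_ok))
```

Because the check depends only on the group column and the integer column, it should apply equally to the dplyr output.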