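I want to assign half of each subgroup to the treatment condition and half to the control condition. When a subgroup has an odd number of records, the last one can be assigned arbitrarily.
I'm trying to do this within a dplyr group_by and am struggling to account for odd/even group sizes. I tried: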
set.seed(1)
library(dplyr)
mtcars %>%
  group_by(cyl) %>%
  mutate(group = case_when(
    n() %% 2 == 0 ~ sample(rep(c("treatment", "control"), n() / 2)),
    TRUE ~ sample(rep(c("treatment", "control"), ceiling(n() / 2)))[-1]
  ))
But I get the error:
Error: `TRUE ~ sample(rep(c("treatment", "control"), ceiling(n()/2)))[-1]` must be length 10 or one, not 11
I'm also open to using purrr if that approach is simpler.
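One likely reading of the error, sketched in base R: `case_when()` evaluates every right-hand side for the whole group, and `rep()` truncates a fractional `times` argument, so for the 11-row `cyl == 4` group the two branches produce vectors of different lengths. The variable names below are illustrative only.

```r
n <- 11                              # rows in the cyl == 4 group of mtcars
labels <- c("treatment", "control")

# Even branch: times = 5.5 is truncated to 5, giving 10 labels
len_even_branch <- length(rep(labels, n / 2))

# Odd branch: 12 labels minus the dropped first element, giving 11 labels
len_odd_branch <- length(rep(labels, ceiling(n / 2))[-1])

c(even = len_even_branch, odd = len_odd_branch)
```

The lengths 10 and 11 match the numbers reported in the error message.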
Answer 0 (score: 2)
mtcars %>%
  group_by(cyl) %>%
  mutate(group = sample(rep(c("treatment", "control"), ceiling(n()/2)), n()))
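For n = 2k rows, this shuffles k "treatment" and k "control" values. For n = 2k + 1 rows, it samples 2k + 1 values from k + 1 "treatment" and k + 1 "control" values, so one arm ends up with k + 1 rows and the other with k. I believe that is what you need. This can of course be generalized to any number of groups: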
mtcars %>%
  group_by(cyl) %>%
  mutate(group = sample(rep(c("A", "B", "C"), ceiling(n()/3)), n())) %>%
  count(cyl, group)
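One caveat with three or more arms: sampling n() values from ceiling(n()/3) copies of each label does not always keep the arms within one row of each other. For a group of 7 rows it draws 7 of 9 labels, which can come out as 3, 3 and 1. The sketch below is a variant, not the answer's own method, that assumes `rep_len()` on a shuffled label vector is acceptable; the names `arms`, `balanced` and `arm_sizes` are hypothetical.

```r
library(dplyr)

set.seed(1)
arms <- c("A", "B", "C")  # hypothetical arm labels

balanced <- mtcars %>%
  group_by(cyl) %>%
  # rep_len() recycles a shuffled copy of the labels to exactly n() values,
  # so arm sizes within a group differ by at most one; the outer sample()
  # then shuffles which rows receive which label.
  mutate(group = sample(rep_len(sample(arms), n()))) %>%
  ungroup()

# Per-group arm sizes, for inspection
arm_sizes <- count(balanced, cyl, group)
arm_sizes
```

Shuffling `arms` before recycling means the arm that receives the extra row is chosen at random rather than always being the first label.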
Answer 1 (score: 1)
I believe this is what the question asks for.
mtcars %>%
  group_by(cyl) %>%
  mutate(i = row_number() %in% sample(row_number(), n() %/% 2),
         group = ifelse(i, "treatment", "control")) %>%
  select(-i)
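To see what the snippet above does inside one group, here is a minimal base R sketch for a single hypothetical group of 7 rows. It draws floor(n / 2) row positions for treatment and labels the rest as control, so an odd group always leaves the extra row in control. The names `idx`, `chosen` and `group` are illustrative.

```r
set.seed(1)
n <- 7                          # hypothetical odd-sized group
idx <- seq_len(n)               # plays the role of row_number()

# Pick floor(n / 2) = 3 row positions at random for treatment
chosen <- sample(idx, n %/% 2)

# Rows that were picked get "treatment", all others "control"
group <- ifelse(idx %in% chosen, "treatment", "control")
table(group)
```

Whatever the seed, this yields 3 treatment rows and 4 control rows; only which rows are picked changes.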
Check the result by `count`ing the values of `group`.
library(dplyr)
set.seed(1)
mtcars %>%
  group_by(cyl) %>%
  mutate(i = row_number() %in% sample(row_number(), n() %/% 2),
         group = ifelse(i, "treatment", "control")) %>%
  select(-i) %>%
  count(cyl, group)
## A tibble: 6 x 3
## Groups:   cyl [3]
#    cyl group         n
#  <dbl> <chr>     <int>
#1     4 control       6
#2     4 treatment     5
#3     6 control       4
#4     6 treatment     3
#5     8 control       7
#6     8 treatment     7