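Suppose a multi-site RCT has two groups, experimental and control. I need to compute weights for each time period and site. I fix the weight of every experimental subject at 1 and then compute the control-group weight so that, within each time period and site, the control weights sum to the same total as the experimental weights. Here is the code that generates a hypothetical dataset: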
set.seed(13458) # Set random seed
ID <- c(1:20) # Generate 20 unique subject IDs
timePeriod <- c(rep(1, 3), rep(2, 6), rep(3, 11)) # Generate time periods
site <- c(rep("A", 3), rep("B", 6), rep("C", 11)) # Generate sites
group <- sample(c("exp", "cont"), 20, replace = TRUE) # Random assignment
outcome <- sample(c(rep(0,75), 0:100), 20, replace = TRUE) # Generate outcomes
DF <- data.frame(ID, timePeriod, site, group, outcome) # Create a data frame
Output:
head(DF, 10)
   ID timePeriod site group outcome
1   1          1    A  cont      12
2   2          1    A  cont      37
3   3          1    A   exp      59
4   4          2    B   exp       0
5   5          2    B  cont       0
6   6          2    B   exp       0
7   7          2    B  cont       0
8   8          2    B   exp      22
9   9          2    B   exp      34
10 10          3    C  cont      26
Here is a rather clumsy strategy for computing the weights of experimental and control subjects for each time period and site.
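library(plyr)
# Count the subjects in each timePeriod/site/group cell
a <- ddply(DF, c("timePeriod", "site", "group"), function(x){
  countSubjects <- length(x$group)
  data.frame(N = countSubjects)
})
a$weight <- rep(NA, nrow(a))
# Assumes every time period/site has a "cont" row immediately followed by its "exp" row
for(i in 1:nrow(a))
{
  n <- a$N[i+1]   # experimental count when row i is a control row
  d <- a$N[i]
  weight <- n/d
  a$weight[i] <- ifelse(a$group[i] == "cont", weight,
                        ifelse(a$group[i] == "exp", 1, NA))
}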
print(a)
  timePeriod site group N weight
1          1    A  cont 2    0.5
2          1    A   exp 1    1.0
3          2    B  cont 2    2.0
4          2    B   exp 4    1.0
5          3    C  cont 5    1.2
6          3    C   exp 6    1.0
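The printed weights are consistent with the rule below. The symbols $N_{\text{exp}}$ and $N_{\text{cont}}$ are notation introduced here for illustration; they stand for the number of experimental and control subjects in one time period and site.

```latex
w_{\text{exp}} = 1, \qquad
w_{\text{cont}} = \frac{N_{\text{exp}}}{N_{\text{cont}}}
\quad\Longrightarrow\quad
N_{\text{cont}} \, w_{\text{cont}} = N_{\text{exp}} \, w_{\text{exp}}
```

For site C, for example, the control weight is 6 / 5 = 1.2, and 5 × 1.2 = 6 × 1, so both arms carry the same total weight.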
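How can I use these weights to compute adjusted outcomes (that is, multiply each subject's outcome by the weight for its time period, site, and group)? I would like to add both the computed weight and the adjusted outcome to the original data frame. (Below is an example of the adjusted outcomes.)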
ID 1: 12 * .5 = 6
ID 2: 37 * .5 = 18.5
ID 3: 59 * 1 = 59
ID 4: 0 * 1 = 0
…
ID 8: 22 * 1 = 22
…
ID 10: 26 * 1.2 = 31.2
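The arithmetic above amounts to looking up each subject's weight by time period, site, and group, then multiplying. One way to sketch that lookup in base R is with `merge()`. The data frames `toy` and `w` are illustrative names, not part of the original code: `toy` copies the first nine rows of the printed data (sites A and B only), and `w` copies the matching rows of the printed weight table.

```r
# Toy data: the first nine rows of DF as printed above (sites A and B only)
toy <- data.frame(
  ID = 1:9,
  timePeriod = c(1, 1, 1, 2, 2, 2, 2, 2, 2),
  site = c("A", "A", "A", "B", "B", "B", "B", "B", "B"),
  group = c("cont", "cont", "exp", "exp", "cont", "exp", "cont", "exp", "exp"),
  outcome = c(12, 37, 59, 0, 0, 0, 0, 22, 34),
  stringsAsFactors = FALSE
)

# Weight table: one row per timePeriod/site/group, values taken from print(a)
w <- data.frame(
  timePeriod = c(1, 1, 2, 2),
  site = c("A", "A", "B", "B"),
  group = c("cont", "exp", "cont", "exp"),
  weight = c(0.5, 1, 2, 1),
  stringsAsFactors = FALSE
)

# Attach each subject's weight by the three keys, then scale the outcome
merged <- merge(toy, w, by = c("timePeriod", "site", "group"))
merged$adjusted <- merged$outcome * merged$weight

# merge() reorders rows by the keys, so restore the original ID order
merged <- merged[order(merged$ID), ]
print(merged[, c("ID", "outcome", "weight", "adjusted")])
```

With the full data, the same call would presumably take `DF` and the weight table `a` in place of `toy` and `w`.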
Answer 0 (score: 1)
With dplyr, you can do it like this:
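library(dplyr)
# Count the subjects in each timePeriod/site/group cell
tmp <- DF %>% group_by(timePeriod, site, group) %>% mutate(N = n())
# Within each time period and site, sort "cont" before "exp" so that
# first(N) is the control count and last(N) is the experimental count
result <- tmp %>% group_by(timePeriod, site) %>%
  arrange(group, .by_group = TRUE) %>%
  mutate(weight = ifelse(group == "cont", last(N) / first(N), 1),
         adjusted = outcome * weight) %>%
  ungroup() %>% arrange(ID)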
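For comparison, the same weights can probably be computed without any packages. The following is a minimal base R sketch using `ave()`, not the answerer's method. It assumes every time period and site contains at least one control subject and at least one experimental subject, and the data frame `toy` is an illustrative copy of the first nine rows printed in the question.

```r
# Toy data: the first nine rows of DF as printed in the question
toy <- data.frame(
  ID = 1:9,
  timePeriod = c(1, 1, 1, 2, 2, 2, 2, 2, 2),
  site = c("A", "A", "A", "B", "B", "B", "B", "B", "B"),
  group = c("cont", "cont", "exp", "exp", "cont", "exp", "cont", "exp", "exp"),
  outcome = c(12, 37, 59, 0, 0, 0, 0, 22, 34),
  stringsAsFactors = FALSE
)

# One stratum per time period and site
key <- interaction(toy$timePeriod, toy$site, drop = TRUE)

# Per-stratum counts of each arm, repeated on every row of the stratum
n_exp  <- ave(as.numeric(toy$group == "exp"),  key, FUN = sum)
n_cont <- ave(as.numeric(toy$group == "cont"), key, FUN = sum)

# Experimental subjects get weight 1; controls get N_exp / N_cont
toy$weight   <- ifelse(toy$group == "cont", n_exp / n_cont, 1)
toy$adjusted <- toy$outcome * toy$weight

# Check: within each stratum the two arms should carry equal total weight
sums <- tapply(toy$weight, list(key, toy$group), sum)
print(toy)
print(sums)
```

Because the columns are added in place, the row order of the data frame is left unchanged. A stratum with no experimental subjects would give its controls a weight of 0 under this rule, so that case would need separate handling.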