在所附的数据框中,我有10个主题(5个男性和5个女性)。每个受试者有三个分析物(A,B,C),每个分析物具有三个访问值(访问次数= 1,2,3)。现在,我想按每种分析物和每次访问对男性和女性进行两组比较。我在分析物中使用i并在访问中使用j的嵌套循环。附加了所需的输出格式(9行7列)。我希望有9行,但这里只有3行。我认为i循环的输出未正确存储,但是我不确定如何正确包含i。有什么建议么?非常感谢你!
df1 = data.frame(id = c(1:10), gender = c(rep(c("F","M"),5)))
df2 = data.frame(id = c(1:10), analyte = c(rep(c("A","B","C"), 10)))
df3 = data.frame(id = rep((1:10),each=3), visit = rep(c("day1","day2","day3"),10))
set.seed(123)
df4 = data.frame(id = rep((1:10),each=9), val=rnorm(n = 90, mean = 0, sd = 1))
df5 = Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "id", all.x = TRUE), list(df1,df2,df3))
df = cbind(df5,df4)[,-5]
mk1=unique(df$analyte)
mk2=unique(df$visit)
out=matrix(NA, ncol=7, nrow=9)
for(i in 1:length(mk1)){
for (j in 1:length(mk2)){
dd = df[as.character(df$analyte)==mk1[i]&as.character(df$visit)==mk2[j],]
x = as.vector(dd$val[dd$gender=="F"])
y = as.vector(dd$val[dd$gender=="M"])
med1=as.numeric(quantile(x, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
med2=as.numeric(quantile(y, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
ci=wilcox.test(x, y, conf.int = TRUE, exact=FALSE)$conf.int
out[j,] = c(mk1[i], mk2[j],length(x),length(y),
med1, med2, wilcox.test(x, y, conf.int = TRUE,
exact=FALSE)$p.value)
}
}
colnames(out)=c("Analyte", "VISIT", "Female (N)", "Male (N)",
"Median of Female", "Median of Male", "P_wilxon")
答案 0 :(得分:1)
您的直接问题是您要重新分配到同一输出矩阵行。由于 j 从未达到9,因此下面仅保留最后三行。
out[j,] <- ...
但是,不是使用嵌套的for
循环将输出迭代地分配给具有硬编码维的预定义矩阵,而是使用更具动态性的方法。考虑通过{em> visit 和 analyte 将by
设置为数据帧的子集,然后将子集传递到所需的操作中。最后,最后一个对象的数据帧的行绑定列表:
run_comparison <- function(dd) {
x <- as.vector(dd$val[dd$gender=="F"])
y <- as.vector(dd$val[dd$gender=="M"])
med1 <- as.numeric(quantile(x, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
med2 <- as.numeric(quantile(y, probs=seq(0,1, by=0.25), na.rm=TRUE, type=2)[3])
wx <- wilcox.test(x, y, conf.int = TRUE, exact=FALSE)
data.frame(ANALYTE = dd$analyte[[1]], Visit = dd$visit[[1]],
Female_N = length(x), Male_N = length(y),
Female_Median = med1, Male_Median= med2,
P_Wilcox = wx$p.value)
}
df_list <- by(df, df[c("analyte", "visit")], run_comparison)
final_df <- do.call(rbind, df_list)