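I need to write a loop (something I have not done before). Given an observation (column 1), I need to (i) work out which combinations of the variables (s1-s5) are significant (P < 0.05), and (ii) keep only the significant ones together with their p-values. I think this is a good way to learn how to write loops in R. The original data is large, but it looks like this: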
ob <- c(120,100,85,56,87)
s1 <- c("ab","aa","ab","aa","bb")
s2 <- c("aa","aa","ab","bb","bb")
s3 <- c("bb","ab","aa","ab","ab")
s4 <- c("aa","ab","bb","ab","aa")
s5 <- c("bb","ab","aa","ab","bb")
dset <- data.frame(ob,s1,s2,s3,s4,s5)
dset
 ob s1 s2 s3 s4 s5
120 ab aa bb aa bb
100 aa aa ab ab ab
 85 ab ab aa bb aa
 56 aa bb ab ab ab
 87 bb bb ab aa bb
Any help would be greatly appreciated!
Baz
Answer 0 (score: 1)
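Maybe I'm missing something, but I don't see how it makes sense to add a column of p-values to the data.frame without transposing it. If the variables sit in different columns, how would you know which p-value belongs to which independent variable? Here is one approach that uses a for loop to run an ANOVA for each independent variable and stores the p-values in a new vector: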
# Use grep to find the columns whose names start with "s"; this returns their column indices.
# These indices are what we iterate over in the for loop.
vars <- grep("^s", names(dset))
# Create a numeric vector to hold the ANOVA p-values and name it after the columns.
# The "ob" slot is never filled in, so it keeps its initial value of 0.
dat <- vector("numeric", length = ncol(dset))
names(dat) <- colnames(dset)
# Run the loop, storing each variable's ANOVA p-value in its slot of the vector.
for (var in vars) {
  dat[var] <- anova(lm(ob ~ dset[, var], data = dset))$"Pr(>F)"[1]
}
All of the above yields:
> dat
       ob        s1        s2        s3        s4        s5
0.0000000 0.7219532 0.3108441 0.4668372 0.6908916 0.6908916
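If the `$"Pr(>F)"[1]` part of the loop looks cryptic: it takes the first entry of the p-value column of the ANOVA table. The first row belongs to the model term and the second to the residuals, which have no p-value. Here is a minimal sketch for `s1` alone, using the data from the question (`fit`, `tab` and `p_s1` are just illustrative names, not part of the code above):

```r
# Toy data from the question (only ob and s1 are needed here).
ob <- c(120, 100, 85, 56, 87)
s1 <- c("ab", "aa", "ab", "aa", "bb")

# Fit a one-way model and build its ANOVA table.
fit <- lm(ob ~ s1)
tab <- anova(fit)

# The table has one row for the term (s1) and one for the residuals.
print(tab)

# Row 1 of the "Pr(>F)" column is the p-value for s1;
# row 2 (Residuals) is NA.
p_s1 <- tab[["Pr(>F)"]][1]
print(p_s1)
```

The value printed for `p_s1` should match the `s1` entry of `dat`.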
I'll leave it to you to work out how to relate this back to the original data.frame.
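For part (ii) of the question, one possible way to keep only the significant variables together with their p-values is sketched below. It is only a sketch under a few assumptions: the predictor columns are the ones whose names start with "s", the cutoff is 0.05, and the names `alpha`, `pvals`, `result`, `significant` and `dset_sig` are made up for illustration. It builds the formula with `reformulate()` instead of indexing `dset[, var]`, which is a swap from the loop above, not the same code.

```r
# Toy data from the question.
ob <- c(120, 100, 85, 56, 87)
s1 <- c("ab", "aa", "ab", "aa", "bb")
s2 <- c("aa", "aa", "ab", "bb", "bb")
s3 <- c("bb", "ab", "aa", "ab", "ab")
s4 <- c("aa", "ab", "bb", "ab", "aa")
s5 <- c("bb", "ab", "aa", "ab", "bb")
dset <- data.frame(ob, s1, s2, s3, s4, s5)

alpha <- 0.05                                    # assumed significance cutoff
vars <- grep("^s", names(dset), value = TRUE)    # predictor column names

# One slot per predictor, NA until the loop fills it in.
pvals <- setNames(rep(NA_real_, length(vars)), vars)
for (v in vars) {
  fit <- lm(reformulate(v, response = "ob"), data = dset)  # e.g. ob ~ s1
  pvals[v] <- anova(fit)[["Pr(>F)"]][1]
}

# One row per variable, so each p-value stays next to its variable name.
result <- data.frame(variable = names(pvals),
                     p_value = unname(pvals),
                     stringsAsFactors = FALSE)

# Keep only the significant variables; which() drops any NA p-values.
significant <- result[which(result$p_value < alpha), ]

# Original data restricted to ob plus the significant predictors.
dset_sig <- dset[, c("ob", significant$variable), drop = FALSE]

print(result)
print(significant)
```

With this five-row toy data none of the p-values is below 0.05, so `significant` comes back empty and `dset_sig` contains only `ob`. On the real data the same steps would keep whichever columns pass the cutoff.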