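I am trying to iterate over ~60 columns, with the goal of running a t-test on each one that compares cases against controls, and capturing the output as a list. This is my attempt so far. Note that my data frame is called biomarkers, columns 3-59 hold the variables I am interested in, and the groups are defined by column 2 (called case):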
tests <- list()
column_biomarkers <- colnames(biomarkers[3:59])
for (i in column_biomarkers){
tests[[i]] <- t.test(biomarkers$i[case == 1],biomarkers$i[case == 0],pool.sd=FALSE,na.rm=TRUE)
}
sapply(tests, function(x) {
c(x$estimate[1],
x$estimate[2],
ci.lower = x$conf.int[1],
ci.upper = x$conf.int[2],
p.value = x$p.value)
})
However, this attempt results in the following error:
Error in var(x) : 'x' is NULL
Any suggestions would be much appreciated! I am new to R.
Example data:
structure(list(subject = 1:10, case = c(1L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 1L), biomarker_1 = c(308.29999, 2533.3, 2723.3, 3125.3,
853, 6442.2998, 1472.5, 170.5, 64.5, 2624.8), biomarker_2 = c(4930.7998,
2401, 5158.5, 6526, 3774.2, 5753, 1955.2, 1332.2, 1296.8, 5859.2998
), biomarker_3 = c(4810, 3279.5, 7929.5, 8353, 4074.2, 7940.5,
1545.7, 2189.2, 1488.7, 6352.5)), .Names = c("subject", "case",
"biomarker_1", "biomarker_2", "biomarker_3"), row.names = c(NA,
10L), class = "data.frame")
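Answer 0 (score: 2)
Consider splitting the data frame into two groups and using mapply() (the multivariate apply function, which runs an operation elementwise across objects) to run the t-test column by column.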
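# Rows with case == 1 are the cases; rows with case == 0 are the controls
casedf <- biomarkers[biomarkers$case == 1, 3:ncol(biomarkers)]
controldf <- biomarkers[biomarkers$case == 0, 3:ncol(biomarkers)]
tfct <- function(v1, v2){
  # var.equal = FALSE (the default) gives Welch's test; t.test() drops NAs itself
  t.test(v1, v2, var.equal = FALSE)
}
# mapply() pairs up the columns of the two data frames by position
ttests <- mapply(tfct, casedf, controldf)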
ttests
# biomarker_1 biomarker_2
# statistic -0.4310577 2.287416
# parameter 7.943542 7.987304
# p.value 0.677885 0.05152236
# conf.int Numeric,2 Numeric,2
# estimate Numeric,2 Numeric,2
# null.value 0 0
# alternative "two.sided" "two.sided"
# method "Welch Two Sample t-test" "Welch Two Sample t-test"
# data.name "v1 and v2" "v1 and v2"
#
# biomarker_3
# statistic 1.169058
# parameter 7.995322
# p.value 0.2760513
# conf.int Numeric,2
# estimate Numeric,2
# null.value 0
# alternative "two.sided"
# method "Welch Two Sample t-test"
# data.name "v1 and v2"
You can even move the results into a data frame:
# Transposed data frame output of results
testdf <- data.frame(t(ttests))
head(testdf)
# statistic parameter p.value conf.int
# biomarker_1 -0.4310577 7.943542 0.677885 -3219.767, 2206.667
# biomarker_2 2.287416 7.987304 0.05152236 -19.24659, 4598.82973
# biomarker_3 1.169058 7.995322 0.2760513 -1785.201, 5455.684
# estimate null.value alternative
# biomarker_1 1727.85, 2234.40 0 two.sided
# biomarker_2 5272.575, 2982.783 0 two.sided
# biomarker_3 5897.425, 4062.183 0 two.sided
# method data.name
# biomarker_1 Welch Two Sample t-test v1 and v2
# biomarker_2 Welch Two Sample t-test v1 and v2
# biomarker_3 Welch Two Sample t-test v1 and v2
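If only the group means, confidence interval, and p-value are needed (as in the summary the question builds with sapply()), one option is to keep each result as a full test object and extract those fields. The following is a minimal sketch, assuming the question's example data; the names cases, controls, tests, and summary_df are illustrative and not part of the original answer.

```r
# Example data from the question
biomarkers <- data.frame(
  subject = 1:10,
  case = c(1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L),
  biomarker_1 = c(308.29999, 2533.3, 2723.3, 3125.3, 853,
                  6442.2998, 1472.5, 170.5, 64.5, 2624.8),
  biomarker_2 = c(4930.7998, 2401, 5158.5, 6526, 3774.2,
                  5753, 1955.2, 1332.2, 1296.8, 5859.2998),
  biomarker_3 = c(4810, 3279.5, 7929.5, 8353, 4074.2,
                  7940.5, 1545.7, 2189.2, 1488.7, 6352.5)
)

# Split the biomarker columns by group (illustrative names)
cases    <- biomarkers[biomarkers$case == 1, 3:ncol(biomarkers)]
controls <- biomarkers[biomarkers$case == 0, 3:ncol(biomarkers)]

# SIMPLIFY = FALSE returns a named list of "htest" objects instead of a matrix
tests <- mapply(function(v1, v2) t.test(v1, v2),
                cases, controls, SIMPLIFY = FALSE)

# Pull the same fields as the question's sapply() call, one row per biomarker
summary_df <- as.data.frame(t(sapply(tests, function(x) {
  c(mean_case    = unname(x$estimate[1]),
    mean_control = unname(x$estimate[2]),
    ci.lower     = x$conf.int[1],
    ci.upper     = x$conf.int[2],
    p.value      = x$p.value)
})))
summary_df
```

Because every column of summary_df is a plain numeric vector, it can be sorted or filtered directly, for example with summary_df[order(summary_df$p.value), ].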
Answer 1 (score: 0)
Here is another possible solution.
tests <- list()
column_biomarkers <- colnames(biomarkers[3:5])
for (i in column_biomarkers){
tests[[i]] <- t.test(biomarkers[[i]][biomarkers$case == 1], biomarkers[[i]][biomarkers$case == 0], var.equal = FALSE)
}
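R does not like biomarkers$i[biomarkers$case == 1]. The $ operator does not evaluate i; it looks for a column literally named "i", finds none, and returns NULL, which is why var(x) complains that 'x' is NULL. The [[ ]] notation accepts a variable holding the column name, so it works. Note also that case has to be written as biomarkers$case, because a bare case is not found outside the data frame.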
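The difference between the two operators can be checked on a small example. This is a minimal sketch using a made-up four-row data frame (the values are hypothetical, chosen only for illustration):

```r
# Toy data frame (hypothetical values, for illustration only)
biomarkers <- data.frame(
  case = c(1L, 0L, 1L, 0L),
  biomarker_1 = c(10, 20, 30, 40)
)

i <- "biomarker_1"

# `$` takes the name literally: it looks for a column called "i"
by_dollar <- biomarkers$i
is.null(by_dollar)
# [1] TRUE

# `[[ ]]` evaluates i and uses its value as the column name
by_bracket <- biomarkers[[i]]
by_bracket[biomarkers$case == 1]
# [1] 10 30
```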