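Problem overview: I have a dataset containing the results of a pre-instruction and a post-instruction exam with 15 questions. I would like to run t-tests on the results to compare the overall means, but I am having trouble formatting the dataset correctly. Here is a sample of the dataset: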
1Pre 1Post 2Pre 2Post 3Pre 3Post 4Pre 4Post
Correct B B A A B B C C
1 B B C D C B C C
2 C B B D C B C A
3 <NA> <NA> <NA> <NA> <NA> <NA> <NA> <NA>
4 B B B A B B C C
5 B B B A B B C C
6 C B D A A D C B
7 C C D D E E C C
8 C A B B A A <NA> <NA>
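Goal: I want to match the values in the "Correct" row against the values in the rows below it (one row per test taker), so that a correct answer becomes 1 and an incorrect answer becomes 0. I did this with the following code: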
for(j in 1:ncol(qDat)){
  for(i in 1:nrow(qDat)){
    if(qDat[i,j] == correctAns[1]){
      qDat[i,j]=1
    }else{
      qDat[i,j]=0
    }
  }
}
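Then, in addition to comparing the difference between the pre and post scores for each question, I want to run a t-test comparing the pre and post means. However, I need to omit any data points that are NA. Currently my approach does not work with NA values, so I replaced them with zeros. Is there a way to run these tests and simply leave out the NA values? Thanks!
Desired output: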
1Pre 1Post 2Pre 2Post 3Pre 3Post
Correct B B A A B B
1 1 1 0 0 0 1
2 0 1 0 0 0 1
3 <NA> <NA> <NA> <NA> <NA> <NA>
4 1 1 0 0 1 1
5 1 1 0 0 1 1
6 0 1 0 1 0 0
7 0 0 0 0 0 0
8 0 0 0 0 0 0
Answer 0 (score: 2)
You could try passing the following argument to the t.test call:
na.action = na.omit
Something like:
with(qDat, t.test(`1Pre`, `1Post`, na.action = na.omit))
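As a hedged aside: when two vectors are passed directly, `t.test` uses its default method, which (as far as I can tell) already drops missing values on its own, so `na.action` mainly matters for the formula interface. The sketch below uses made-up 0/1 scores for one question (the vectors `pre` and `post` are illustrative, not the asker's data) to show that the unpaired and paired tests both run with `NA` present.

```r
# Hypothetical 0/1 scores for one question; NA marks a missing answer.
pre  <- c(1, 0, NA, 1, 1, 0, 0, NA)
post <- c(1, 1, NA, 1, 1, 1, 0, 0)

# Unpaired (Welch) test: NAs are removed from each vector separately.
res_unpaired <- t.test(pre, post)

# Paired test: only students with both scores present are used.
res_paired <- t.test(pre, post, paired = TRUE)

# The same paired test after dropping incomplete pairs by hand.
ok <- complete.cases(pre, post)
res_manual <- t.test(pre[ok], post[ok], paired = TRUE)

# Degrees of freedom = number of complete pairs minus one.
res_paired$parameter
```

Since the same students take both exams, the paired version is probably the closer match to the pre/post design described in the question.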
Answer 1 (score: 1)
What about re-writing your loop? Since you treat NA as 0, we do not need to worry about NA: we can simply test the result and set any NA to FALSE:
# note: the vector is recycled down the columns, so correctAns[i] is compared with row i
test <- qDat == correctAns # or correctAns[1] depending on your needs
test[is.na(test)] <- FALSE
storage.mode(test) <- "integer"
test
# X1 X2 X3 X4 X5 X6 X7 X8
# [1,] 0 1 0 0 1 0 1 0
# [2,] 0 0 1 0 0 0 0 0
# [3,] 0 1 0 0 1 0 0 0
# [4,] 0 0 1 0 0 0 0 0
# [5,] 1 0 0 0 0 0 1 0
# [6,] 0 0 1 1 1 1 1 0
# [7,] 0 0 0 1 0 0 1 0
# [8,] 0 0 0 0 0 0 0 1
using this data:
set.seed(123)
correctAns <- sample(LETTERS[1:3], 8, replace = TRUE)
correctAns
# [1] "A" "C" "B" "C" "C" "A" "B" "C"
qDat <- sample(c(LETTERS[1:3], NA_character_), 8*2*4, replace = TRUE)
qDat <- data.frame(matrix(qDat, 8, 4*2), stringsAsFactors = FALSE)
qDat
# X1 X2 X3 X4 X5 X6 X7 X8
# 1 C A C C A B A <NA>
# 2 B A C <NA> B <NA> <NA> B
# 3 <NA> B C A B A <NA> <NA>
# 4 B <NA> C B B B B <NA>
# 5 C <NA> B <NA> A <NA> C <NA>
# 6 C C A A A A A B
# 7 A C <NA> B A C B <NA>
# 8 <NA> <NA> <NA> A B A B C
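If the answer key is meant to line up with the columns (one key entry per question) and missing answers should stay `NA` rather than become 0, one possible variant is `sweep()` over the columns. This is a sketch, not the answerer's method; `key` and `answers` are illustrative names, and the rows are students 1, 2, 3 and 6 from the sample in the question, restricted to the first four columns.

```r
# Answer key, one entry per column (taken from the "Correct" row).
key <- c("B", "B", "A", "A")

# Responses of students 1, 2, 3 and 6; student 3 did not answer.
answers <- data.frame(
  `1Pre`  = c("B", "C", NA, "C"),
  `1Post` = c("B", "B", NA, "B"),
  `2Pre`  = c("C", "B", NA, "D"),
  `2Post` = c("D", "D", NA, "A"),
  check.names = FALSE, stringsAsFactors = FALSE
)

# Compare every column j with key[j]; NA answers stay NA.
scored <- sweep(as.matrix(answers), 2, key, FUN = "==")
storage.mode(scored) <- "integer"
scored

# Proportion correct per question, ignoring missing answers.
colMeans(scored, na.rm = TRUE)
```

Keeping the `NA` values means that later summaries and tests can exclude unanswered items instead of counting them as wrong.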
Edit
set.seed(123)
# correctAns is a vector of length 30
correctAns <- sample(LETTERS[1:3], 30, replace = TRUE)
length(correctAns)
# [1] 30
# qDat is a dataframe of dimensions 106x30
qDat <- sample(c(LETTERS[1:3], NA_character_), 106*30, replace = TRUE)
qDat <- data.frame(matrix(qDat, 106, 30), stringsAsFactors = FALSE)
dim(qDat)
# [1] 106 30
# still runs (correctAns is recycled down the columns, so check that it lines up with your key)
test <- qDat == correctAns
test[is.na(test)] <- FALSE
storage.mode(test) <- "integer"
str(test)
# int [1:106, 1:30] 0 0 0 0 0 0 0 0 1 0 ...
# - attr(*, "dimnames")=List of 2
# ..$ : NULL
# ..$ : chr [1:30] "X1" "X2" "X3" "X4" ...
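To connect this back to the original question, here is a hedged sketch of one way to compare overall pre and post performance while leaving out unanswered items. It assumes the columns alternate Pre, Post, Pre, Post as in the question's layout (odd columns pre, even columns post); the data are simulated the same way as above, and the names `key`, `resp`, `pre_score` and `post_score` are illustrative.

```r
set.seed(123)
# Simulated key (30 columns = 15 questions x pre/post) and responses.
key  <- sample(LETTERS[1:3], 30, replace = TRUE)
resp <- matrix(sample(c(LETTERS[1:3], NA_character_), 106 * 30, replace = TRUE),
               nrow = 106, ncol = 30)

# Score column j against key[j]; unanswered items remain NA.
scored <- sweep(resp, 2, key, FUN = "==")

# Assumption: odd columns hold pre-test answers, even columns post-test answers.
pre_cols  <- seq(1, 30, by = 2)
post_cols <- seq(2, 30, by = 2)

# Per-student proportion correct over the items that were actually answered.
pre_score  <- rowMeans(scored[, pre_cols],  na.rm = TRUE)
post_score <- rowMeans(scored[, post_cols], na.rm = TRUE)

# Paired t-test on the per-student scores; incomplete pairs are dropped.
res <- t.test(pre_score, post_score, paired = TRUE)
res
```

Using `rowMeans(..., na.rm = TRUE)` scores each student only on the items they answered; whether that or counting missing answers as wrong is more appropriate depends on how the exam was administered.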