Perform a t-test on each row of a matrix - handling NA

Time: 2016-07-20 12:47:06

Tags: r hypothesis-test rowwise

I want to perform a t-test on each row of a matrix. The matrix looks like this:

data <- 
structure(c(NA, NA, 216750, 440450, NA, NA, 597510, 1839055, 
            851820, 1210200, NA, NA, NA, NA, 486720, 602970, 333150, 346532, 
            NA, NA, 421290, 425660, NA, 375440), .Dim = c(6L, 4L), .Dimnames = list(
              c("Gregg", "Mark", "Donnie", 
                "Fred", "Tim", "Gracie"
              ), c("AUC_Rep1", "AUC_Rep2", "AUC_Rep3", "AUC_Rep4")))

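As you can see, there are two problems with the data. First, it contains NAs. Second, some rows do not have enough data: only one value in the whole row.

Do you know of a way around this? I would like to write a function that first ignores the NAs and, if a row has only one value, returns NA as the t-test output.

I usually use the matrix.t.test function from the pi0 package.

1 answer:

Answer 0: (score: 0)

Adapting @count's comment to return the p-value: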

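tpval <- function(x) {
  # With fewer than two non-missing values a t-test is undefined, so return NA
  if (sum(!is.na(x)) < 2) {
    NA_real_
  } else {
    # t.test() drops NA values itself, so no na.rm argument is needed
    t.test(x)$p.value
  }
}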

> apply(data, 1, tpval)
     Gregg       Mark     Donnie       Fred        Tim     Gracie
        NA         NA 0.03350020 0.03600664         NA 0.02547686
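For large matrices, the same p-values can be computed without calling t.test() once per row. The following is only a sketch under stated assumptions: row_t_pvals and its mu argument are hypothetical names that do not appear in the original answer, and the example matrix reuses three rows from the question.

```r
# Hypothetical helper (not from the original answer): two-sided one-sample
# t-test p-values for all rows at once, testing H0: row mean == mu.
row_t_pvals <- function(m, mu = 0) {
  n   <- rowSums(!is.na(m))                  # non-missing values per row
  avg <- rowMeans(m, na.rm = TRUE)           # row means, ignoring NA
  ss  <- rowSums((m - avg)^2, na.rm = TRUE)  # sum of squared deviations
  ok  <- n >= 2                              # rows with enough data

  p <- rep(NA_real_, nrow(m))                # rows with < 2 values stay NA
  names(p) <- rownames(m)

  se    <- sqrt(ss[ok] / (n[ok] - 1) / n[ok])  # standard error of the mean
  tval  <- (avg[ok] - mu) / se
  p[ok] <- 2 * pt(-abs(tval), df = n[ok] - 1)
  p
}

# Three rows taken from the question's matrix
m <- rbind(Gregg  = c(NA, 597510, NA, NA),
           Donnie = c(216750, 851820, 486720, 421290),
           Gracie = c(NA, NA, 346532, 375440))
row_t_pvals(m)
```

Constant rows are not special-cased in this sketch, whereas t.test() stops with an error on them, so that case would need separate handling.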

I run into this same problem often, so I recently created a package, matrixTests, that does what you want:

library(matrixTests)
row_t_onesample(data)

The result is:

> row_t_onesample(data)
       obs    mean          var   stderr df statistic     pvalue  conf.low conf.high alternative mean.null conf.level
Gregg    1  597510          NaN      NaN  0        NA         NA        NA        NA   two.sided         0       0.95
Mark     1 1839055          NaN      NaN  0        NA         NA        NA        NA   two.sided         0       0.95
Donnie   4  494145  70080791100 132363.9  3  3.733231 0.03350020  72904.05  915386.0   two.sided         0       0.95
Fred     4  669820 136234723133 184549.9  3  3.629478 0.03600664  82499.72 1257140.3   two.sided         0       0.95
Tim      1  333150          NaN      NaN  0        NA         NA        NA        NA   two.sided         0       0.95
Gracie   2  360986    417836232  14454.0  1 24.974817 0.02547686 177330.52  544641.5   two.sided         0       0.95

Warning message:
row_t_onesample: 3 of the rows had less than 2 "x" observations.
First occurrence at row 1
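
To help read the table, the columns of one row can be recomputed by hand in base R. This is a minimal sketch for the Gracie row, assuming mean.null = 0 and conf.level = 0.95 as shown in the output. It is not code from matrixTests itself, and the variable names are illustrative only.

```r
x <- c(346532, 375440)   # Gracie's two non-missing values
n <- length(x)           # obs
df <- n - 1              # df

se <- sqrt(var(x) / n)   # stderr column
statistic <- (mean(x) - 0) / se                 # mean.null is 0
pvalue <- 2 * pt(-abs(statistic), df = df)      # two-sided alternative

# 95% confidence interval for the mean: conf.low and conf.high
conf <- mean(x) + c(-1, 1) * qt(0.975, df = df) * se

round(c(stderr = se, statistic = statistic, pvalue = pvalue), 6)
conf
```

With a single observation n - 1 is 0, so the variance is undefined, which is why the Gregg, Mark and Tim rows show NaN and NA.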