Retrieving the index of a newly added row in R (for loop)

Time: 2015-06-19 20:00:41

Tags: r for-loop indexing

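I am trying to retrieve the index of a newly added row, where the row is added inside a for loop.

To start from the beginning: I have a list of p-value matrices, each with a variable number of rows and columns. This is because not all groups had enough treated individuals to run a t-test. Here is what prints to the console when I access this example list: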

$Group1
                              Normal  Treatment 1  Treatment 2  
Treatment 1                        1           NA           NA
Treatment 2                        1            1           NA
Treatment 3                        1            1            1

$Group2
                              Normal  Treatment 2   
Treatment 2                        1           NA      
Treatment 4                        1            1     

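I want every group to have the same number of rows and columns, in the correct order, with the missing values simply filled in with NA. Here is a sample of what I want: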

$Group1
                              Normal  Treatment 1  Treatment 2  Treatment 3 
Treatment 1                        1           NA           NA           NA
Treatment 2                        1            1           NA           NA
Treatment 3                        1            1            1           NA
Treatment 4                       NA           NA           NA           NA

$Group2
                              Normal  Treatment 1  Treatment 2  Treatment 3  
Treatment 1                       NA           NA           NA           NA
Treatment 2                        1           NA           NA           NA
Treatment 3                       NA           NA           NA           NA
Treatment 4                        1            1           NA           NA

Here is my code so far:

fix.results.row <- function(x, factors) {
  results.matrix <- x
  num <- 1
  for (i in factors){
    if (!i %in% rownames(results.matrix)) {
      results.matrix <- rbind(results.matrix, NA)
      rownames(results.matrix)[num] <- i
     } 
    num <- num + 1
  }
  rownames(results.matrix) <- results.matrix[rownames(factors),,drop=FALSE]
  return(results.matrix)
}

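In the function above, x would be my list of matrices, and factors would list all the factors in the order I want them. I have a similar function for adding columns.

My problem, as I see it, is in Group 2. If the function finds that Treatment 1 is missing, it replaces the row name Treatment 2 with the row name Treatment 1, so the data for Treatment 2 is now mislabeled as Treatment 1. It then reorders the variables the way I want, but the data has already been mislabeled!

If I could access the index of the newly added row, which changes from group to group, then I could change only that specific row name. Any suggestions? Let me know if I need to provide more information. I tried to cover everything, but I'm not sure whether there is anything else you need.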
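A minimal sketch of that idea, assuming the new row is always appended with rbind() and therefore always ends up last, so its index is nrow() of the matrix at that moment. The helper name add_missing_rows is hypothetical and not part of the original code; it handles a single matrix rather than the whole list.

```r
# Hypothetical helper (not from the original post): pad one matrix with NA rows.
add_missing_rows <- function(m, factors) {
  for (i in factors) {
    if (!i %in% rownames(m)) {
      m <- rbind(m, NA)            # the new row is appended at the bottom
      rownames(m)[nrow(m)] <- i    # so its index is nrow(m) right after rbind()
    }
  }
  m[factors, , drop = FALSE]       # reorder the rows to match `factors`
}

# Group 2 from the example above
g2 <- matrix(c(1, 1, NA, 1), 2, 2,
             dimnames = list(c("Treatment 2", "Treatment 4"),
                             c("Normal", "Treatment 2")))
res <- add_missing_rows(g2, paste("Treatment", 1:4))
res
```

Because only the last row is renamed, the existing rows keep their labels. The same function could be applied to every group with lapply().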

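2 Answers:

Answer 0: (score: 2)

This isn't very elegant, but it is probably better than using two functions to fill in the rows and the columns separately.

Here, x is the list of all matrices; factors is an optional list of the desired row and column names.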

fix_rc <- function(x, factors) {
  ## sorted unique names across all matrices
  ## (lapply rather than sapply, so equal-sized matrices are not simplified to a matrix)
  f <- function(x) sort(unique(unlist(x)))
  if (missing(factors))
    factors <- list(f(lapply(x, rownames)),
                    f(lapply(x, colnames)))

  ## all-NA matrix with the full set of row and column names
  template <- matrix(NA, length(factors[[1]]), length(factors[[2]]),
                     dimnames = factors)

  lapply(x, function(xx) {
    ## original
    # xx <- rbind(xx, template[, colnames(xx)])
    # xx <- cbind(xx, template[rownames(xx), ])
    # xx[rownames(template), colnames(template)]
    ## better  http://stackoverflow.com/questions/31050787/r-how-to-match-join-2-matrices-of-different-dimensions-nrow-ncol/31051218#31051218
    ## long format: Var1 = row name, Var2 = column name, Freq = value
    xx <- as.data.frame.table(xx)
    ## fill the matching cells of a local copy of the template
    template[as.matrix(xx[, 1:2])] <- xx$Freq
    template
  })
}
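The key step in the function above is indexing a matrix with a two-column character matrix, where each row names one (row, column) cell. A small illustration on toy data follows; the objects template, small, long and idx here are made up for the example and are separate from the function's internals.

```r
# Toy data; the names are made up for illustration.
template <- matrix(NA, 3, 2, dimnames = list(c("a", "b", "c"), c("x", "y")))
small    <- matrix(c(10, 20), 1, 2, dimnames = list("b", c("x", "y")))

# Long format: Var1 = row name, Var2 = column name, Freq = cell value
long <- as.data.frame.table(small)

# Two-column character matrix of (row name, column name) pairs
idx <- as.matrix(long[, 1:2])

# Each value goes to the cell named by the corresponding row of idx
template[idx] <- long$Freq
template
```

Cells that are not named in idx stay NA, which is what pads the smaller matrices out to the full set of names.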

Here is the data I am using:

l <- list(Group1 = matrix(c(1,1,1,NA,1,1,NA,NA,1), 3, 3,
                          dimnames = list(paste('Treatment', 1:3),
                                          c('Normal', paste('Treatment', 1:2)))),
          Group2 = matrix(c(1,1,NA,1), 2, 2,
                          dimnames = list(paste('Treatment', c(2,4)),
                                          c('Normal','Treatment 2'))))

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# 
# $Group2
#             Normal Treatment 2
# Treatment 2      1          NA
# Treatment 4      1           1

You can use it like this. Note that when you do not supply factors, the function takes all the row and column names from your list of matrices.

fix_rc(l)

# $Group1
#             Normal Treatment 1 Treatment 2
# Treatment 1      1          NA          NA
# Treatment 2      1           1          NA
# Treatment 3      1           1           1
# Treatment 4     NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2
# Treatment 1     NA          NA          NA
# Treatment 2      1          NA          NA
# Treatment 3     NA          NA          NA
# Treatment 4      1          NA           1

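I'm not sure where Treatment 3 in the columns of your desired output comes from, but if you want it, you can get it by passing factors explicitly: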

fix_rc(l, factors = list(paste('Treatment', 1:6),
                         c('Normal', paste('Treatment', 1:3))))

# $Group1
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1      1          NA          NA          NA
# Treatment 2      1           1          NA          NA
# Treatment 3      1           1           1          NA
# Treatment 4     NA          NA          NA          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA
# 
# $Group2
#             Normal Treatment 1 Treatment 2 Treatment 3
# Treatment 1     NA          NA          NA          NA
# Treatment 2      1          NA          NA          NA
# Treatment 3     NA          NA          NA          NA
# Treatment 4      1          NA           1          NA
# Treatment 5     NA          NA          NA          NA
# Treatment 6     NA          NA          NA          NA

Answer 1: (score: 0)

Not a complete solution, but if you used data frames, wouldn't it be easier to get there?

df1 <- data.frame(normal = c(1, 1, 1),
                  treatment1 = c(NA, 1, 1),
                  treatment2 = c(NA, NA, 1),
                  row.names = c("Treatment1", "Treatment2", "Treatment3"))

df2 <- data.frame(normal = c(1, 1),
                  treatment2 = c(NA, 1),
                  row.names = c("Treatment2", "Treatment4"))

df1$names <- rownames(df1)
df2$names <- rownames(df2)

df3 <- merge(df1,df2, by="names", all=TRUE)

df3

       names normal.x treatment1 treatment2.x normal.y treatment2.y
1 Treatment1        1         NA           NA       NA           NA
2 Treatment2        1          1           NA        1           NA
3 Treatment3        1          1            1       NA           NA
4 Treatment4       NA         NA           NA        1            1

Now all you have to do is combine the columns by name.
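One possible reading of "combine the columns by name" is sketched below: strip the .x / .y suffixes that merge() adds and, for each base name, take the first non-NA value. This is an assumption about what was meant, and the names base and combined are hypothetical. Note that it pools the two groups into one table; to keep one table per group, the .x and .y columns could instead be selected separately.

```r
# Rebuild the merged data frame so the example is self-contained
df1 <- data.frame(normal = c(1, 1, 1), treatment1 = c(NA, 1, 1),
                  treatment2 = c(NA, NA, 1),
                  names = c("Treatment1", "Treatment2", "Treatment3"))
df2 <- data.frame(normal = c(1, 1), treatment2 = c(NA, 1),
                  names = c("Treatment2", "Treatment4"))
df3 <- merge(df1, df2, by = "names", all = TRUE)

# Column names with the merge suffixes .x / .y removed
stripped <- sub("\\.[xy]$", "", names(df3))
base <- unique(setdiff(stripped, "names"))

# For each base name, take the first non-NA value across its columns
combined <- sapply(base, function(b) {
  cols <- df3[, stripped == b, drop = FALSE]
  Reduce(function(a, nxt) ifelse(is.na(a), nxt, a), cols)
})
rownames(combined) <- df3$names
combined
```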