Creating a vector of the temporary values produced by unlist during vapply

Time: 2014-11-06 13:14:13

Tags: r

This question builds on my former question about subsets of a matrix.

My df looks like this:

structure(list(HQ673618_1 = c(NA, 90.8, 89.8, 89.6, 89.8, 88.9, 
87.8, 88.2, 88.3), HQ674317_1 = c(90.8, NA, 98.6, 97.7, 98.4, 
97.4, 94.9, 96.2, 95.1), EU686630_1 = c(89.8, 98.6, NA, 98.4, 
98.9, 97.7, 95.4, 96.4, 95.8), EU686593_2 = c(89.6, 97.7, 98.4, 
NA, 98.1, 96.8, 94.4, 95.6, 94.8), JN166322_2 = c(89.8, 98.4, 
98.9, 98.1, NA, 97.5, 95.3, 96.5, 95.9), EU491340_2 = c(88.9, 
97.4, 97.7, 96.8, 97.5, NA, 96.5, 97.7, 96), AB694259_3 = c(87.8, 
94.9, 95.4, 94.4, 95.3, 96.5, NA, 98.3, 95.9), AB694258_3 = c(88.2, 
96.2, 96.4, 95.6, 96.5, 97.7, 98.3, NA, 95.8), AB694462_3 = c(88.3, 
95.1, 95.8, 94.8, 95.9, 96, 95.9, 95.8, NA)), .Names = c("HQ673618_1", 
"HQ674317_1", "EU686630_1", "EU686593_2", "JN166322_2", "EU491340_2", 
"AB694259_3", "AB694258_3", "AB694462_3"), class = "data.frame", row.names = c("HQ673618_1", 
"HQ674317_1", "EU686630_1", "EU686593_2", "JN166322_2", "EU491340_2", 
"AB694259_3", "AB694258_3", "AB694462_3"))

I wanted a way to compute block-wise means of the values, with the blocks defined by the name suffix "_n". The solution was:

indx <- gsub(".*_", "", names(df))
vapply(unique(indx), function(x) {
  temp <- which(indx %in% x)
  mean(unlist(df[temp, temp]), na.rm = TRUE)
}, FUN.VALUE = double(1))
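To make the grouping concrete, here is a minimal sketch on a small toy table. The names A_1 to E_2 and all the values are hypothetical and are not taken from the data above; the point is only to show what indx contains and which cells each block mean covers.

```r
# Toy symmetric similarity table; names and values are hypothetical.
nms <- c("A_1", "B_1", "C_1", "D_2", "E_2")
m <- matrix(NA_real_, 5, 5, dimnames = list(nms, nms))
m[upper.tri(m)] <- c(90, 92, 94, 50, 52, 54, 51, 53, 55, 80)
m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror the upper triangle
toy <- as.data.frame(m)

# ".*_" is greedy, so everything up to the last underscore is removed
# and only the group suffix remains: "1" "1" "1" "2" "2".
indx <- gsub(".*_", "", names(toy))

# For each suffix, take the square block of rows/columns in that group
# and average it, ignoring the NA diagonal.
block_means <- vapply(unique(indx), function(x) {
  temp <- which(indx %in% x)
  mean(unlist(toy[temp, temp]), na.rm = TRUE)
}, FUN.VALUE = double(1))

print(block_means)
```

Because the table is symmetric, every off-diagonal value enters each block mean twice, which leaves the mean unchanged but matters once the individual values are needed.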

Could I add a line that, for each unique value of indx, creates a vector "temp_current_indx_value" containing all the values given by

unlist(df[temp, temp])

(with the NAs dropped), but only for the lower (or upper) triangle?

Thank you very much. I want to plot all of these values.

2 answers:

Answer 0: (score: 2)

Since my previous answer is related to this question, I'll just add how you could do it with my code:

indx <- gsub(".*_", "", names(df))
sub.matrices <- lapply(unique(indx), function(x) {
  temp <- which(indx %in% x) 
  df[temp, temp]
})
unique_values <- lapply(sub.matrices, function(x) unique(na.omit(unlist(x))))

Or

unique_values <- lapply(sub.matrices, function(x) x[upper.tri(x)])
mean_values <- lapply(unique_values, mean)

Or

mean_values <- vapply(unique_values, mean, FUN.VALUE = double(1))
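The two ways of collecting the values are not quite interchangeable: unique() also collapses values that happen to repeat between different pairs, while the triangle keeps one entry per pair. The sketch below illustrates this on a hypothetical toy table (names and values are invented for the example, and naming the list by group is an addition, not part of the code above).

```r
# Toy symmetric table in which two different pairs share the value 90.
nms <- c("A_1", "B_1", "C_1", "D_2", "E_2")
m <- matrix(NA_real_, 5, 5, dimnames = list(nms, nms))
m[upper.tri(m)] <- c(90, 92, 90, 50, 52, 54, 51, 53, 55, 80)
m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror the upper triangle
toy <- as.data.frame(m)

indx <- gsub(".*_", "", names(toy))
groups <- unique(indx)
sub.matrices <- lapply(groups, function(x) {
  temp <- which(indx %in% x)
  toy[temp, temp]
})
names(sub.matrices) <- groups  # label each block with its suffix

# Variant 1: unique non-NA values (drops the repeated 90 in group "1").
via_unique <- lapply(sub.matrices, function(x) unique(na.omit(unlist(x))))

# Variant 2: one value per pair, taken from the upper triangle.
# Swap upper.tri() for lower.tri() to read the lower triangle instead.
via_triangle <- lapply(sub.matrices, function(x) {
  mat <- as.matrix(x)
  mat[upper.tri(mat)]
})

print(lengths(via_unique))
print(lengths(via_triangle))
```

If the values are meant to be plotted or averaged per pair, the triangle variant is probably the safer choice, since it does not depend on the values being distinct.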

Answer 1: (score: 1)

You could do it like this:

group.list    <- split(names(df), gsub(".*_", "", names(df)))
sub.matrices  <- Map(`[`, list(data.matrix(df)), group.list, group.list)
sub.triangles <- Map(function(x) x[upper.tri(x)], sub.matrices)
sub.means     <- Map(mean, sub.matrices, na.rm = TRUE)

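Here sub.means is the answer to your previous question, and sub.triangles is the answer to this new one.

You may also want to replace Map with mapply, which simplifies the output from a list to a matrix or vector (where that makes sense).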
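As a rough illustration of that simplification, the sketch below uses mapply for the means and then stacks the triangles into a long table that can be handed to a plotting function. The toy data, the object name long, and the use of lapply over the named group.list (so that results keep the group labels) are assumptions made for the example.

```r
# Toy symmetric similarity table; names and values are hypothetical.
nms <- c("A_1", "B_1", "C_1", "D_2", "E_2")
m <- matrix(NA_real_, 5, 5, dimnames = list(nms, nms))
m[upper.tri(m)] <- c(90, 92, 94, 50, 52, 54, 51, 53, 55, 80)
m[lower.tri(m)] <- t(m)[lower.tri(m)]  # mirror the upper triangle
toy <- as.data.frame(m)

M <- data.matrix(toy)
group.list <- split(names(toy), gsub(".*_", "", names(toy)))

# mapply simplifies to a named numeric vector, one mean per group.
sub.means <- mapply(function(g) mean(M[g, g], na.rm = TRUE), group.list)

# Upper-triangle values per group, kept as a named list.
sub.triangles <- lapply(group.list, function(g) {
  s <- M[g, g, drop = FALSE]
  s[upper.tri(s)]
})

# Long format: one row per value, with its group in column 'ind'.
long <- stack(sub.triangles)
print(sub.means)
print(long)

# One possible plot, left commented out in this sketch:
# boxplot(values ~ ind, data = long)
```

The long table has a values column and an ind factor, which is the shape that formula-based plotting functions such as boxplot expect.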