Is there a more efficient way to apply a function across successive rows of a matrix?

Time: 2016-07-21 16:36:30

Tags: r matrix distance

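I want to compute the variation of information between each row of a matrix and every other row of the same matrix. This distance metric is not included in dist, so I have to iterate manually. Each row is a clustering and each column is a sample. The matrix values are in {1, 0} and indicate whether the sample is a member of the cluster. Below is an example matrix and what I have so far. It can take a while; is there a more efficient way to do this calculation?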

# example membership matrix: m clusterings (rows) by n samples (columns)
m <- 100
n <- 70
membership <- matrix(sample(0:1, m * n, replace = TRUE), m, n)

# create distance matrix, set diagonal to 0
dist.matrix <- matrix(NA_real_, nrow = m, ncol = m)
diag(dist.matrix) <- 0

# iterate over each pair of rows (i < j) and compute the distance once,
# then fill both symmetric entries of the distance matrix
for (i in seq_len(m - 1)) {
    for (j in (i + 1):m) {
        vi <- igraph::compare(membership[i, ], membership[j, ], method = "vi")
        dist.matrix[i, j] <- vi
        dist.matrix[j, i] <- vi
    }
}
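For reference, here is a minimal base R sketch of what the metric computes for a single pair of rows. The helper name vi_pair is hypothetical (it is not part of igraph), and it assumes natural logarithms, which I believe matches igraph::compare with method = "vi" but is worth checking on a few pairs.

```r
# Hypothetical helper (not an igraph function): variation of information
# between two label vectors, computed from their joint frequency table.
# Assumes natural logarithms.
vi_pair <- function(x, y) {
  p_xy <- table(x, y) / length(x)   # joint distribution of the two labelings
  p_x  <- rowSums(p_xy)             # marginal distribution of x
  p_y  <- colSums(p_xy)             # marginal distribution of y
  entropy <- function(p) {
    p <- p[p > 0]                   # drop empty cells so 0 * log(0) is skipped
    -sum(p * log(p))
  }
  # VI(X, Y) = 2 * H(X, Y) - H(X) - H(Y)
  2 * entropy(p_xy) - entropy(p_x) - entropy(p_y)
}

a <- c(0, 0, 1, 1)
vi_same  <- vi_pair(a, a)               # identical clusterings
vi_indep <- vi_pair(a, c(0, 1, 0, 1))   # statistically independent clusterings
```

Identical clusterings give a distance of zero, and the distance grows as the two labelings share less information.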

1 Answer:

Answer 0: (score: 0)

You can define the combinations with expand.grid, compute the values with apply, and reshape the result to produce the final matrix:

df_combs <- expand.grid(1:nrow(membership), 1:nrow(membership))
df_combs$compare <- apply(df_combs, 1, function(x) igraph::compare(membership[x[1],], membership[x[2],], method = "vi"))
df_wide <- reshape(df_combs, direction = "wide", timevar = "Var1", idvar = "Var2")
df_wide$Var2 <- NULL

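df_wide holds the same values as dist.matrix, as a data frame rather than a matrix (as.matrix(df_wide) converts it). Note that this evaluates all m × m ordered pairs, including the diagonal, so it calls compare roughly twice as often as the loop in the question.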
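Because the membership values are only 0 and 1, the per-pair function calls can probably be avoided altogether. The following is a sketch under stated assumptions, not a drop-in replacement: vi_binary_matrix is a hypothetical name, it works only for 0/1 matrices, and it assumes natural logarithms, so compare a few entries against igraph::compare before relying on it. It gets the joint label counts for every pair of rows from four matrix cross-products.

```r
# Sketch: all pairwise VI values for a 0/1 membership matrix via cross-products.
# vi_binary_matrix is a hypothetical name, not an igraph function.
vi_binary_matrix <- function(membership) {
  n  <- ncol(membership)
  M1 <- membership          # indicator of label 1
  M0 <- 1 - membership      # indicator of label 0
  plogp <- function(p) ifelse(p > 0, p * log(p), 0)  # treats 0 * log(0) as 0

  # joint probabilities of the four label combinations for every row pair
  p11 <- tcrossprod(M1, M1) / n
  p10 <- tcrossprod(M1, M0) / n
  p01 <- tcrossprod(M0, M1) / n
  p00 <- tcrossprod(M0, M0) / n
  h_joint <- -(plogp(p11) + plogp(p10) + plogp(p01) + plogp(p00))

  # marginal entropy of each row
  p1 <- rowMeans(M1)
  h  <- -(plogp(p1) + plogp(1 - p1))

  # VI[i, j] = 2 * H(i, j) - H(i) - H(j)
  vi <- 2 * h_joint - outer(h, h, "+")
  vi[vi < 0] <- 0   # clamp tiny negative values caused by rounding
  diag(vi) <- 0     # a clustering is at distance 0 from itself
  vi
}

set.seed(1)
membership <- matrix(sample(0:1, 100 * 70, replace = TRUE), 100, 70)
dist.matrix <- vi_binary_matrix(membership)
```

The result is an m × m symmetric matrix with a zero diagonal, laid out like dist.matrix in the question. The cost is dominated by four m × n by n × m matrix products instead of m(m - 1)/2 calls to an R function.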