Question

Suppose I have two or more matrices. They all have the same number of rows and columns, but they are not necessarily square.

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     a        b        c
1    0.024    0.070    0.906
2    0.020    0.090    0.891
3    0.025    0.930    0.045
4    0.024    0.058    0.918

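Rows always sum to 1. Columns may change position from one matrix to the next, so the column names carry no meaning. In the example above, column 'a' in mat1 corresponds to column 'c' in mat2. The values will not be identical, but they will be similar.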

What method/algorithm can I use to align the columns across many such matrices?

The desired result looks like this:

Matrix1
     a        b        c
1    0.911    0.067    0.023
2    0.891    0.089    0.019
3    0.044    0.931    0.025
4    0.919    0.058    0.023

Matrix2
     c        b        a
1    0.906    0.070    0.024
2    0.891    0.090    0.020
3    0.045    0.930    0.025
4    0.918    0.058    0.024

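The columns are now aligned: 'a' in mat1 corresponds to 'c' in mat2, and so on. In this one possible result, mat1 is the reference and mat2 is aligned to it.

In case anyone wants to try, I am using R.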

mat1 <-
 matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,0.023,0.019,0.025,0.023),nrow=4)
mat2 <-
 matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,0.906,0.891,0.045,0.918),nrow=4)

Answer 1

You could do something like this. The function returns the column indices of mat in the order that best matches (by Euclidean distance) the columns of m.base.

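col.order <- function(m.base, mat){
  no.cols <- ncol(mat)
  col.ord <- rep(NA, no.cols)     # col.ord[i]: column of mat matched to m.base[, i]
  used <- rep(FALSE, no.cols)     # columns of mat that are already taken
  for(i in 1:no.cols){
    vec <- m.base[, i]
    # squared Euclidean distance from every column of mat to the base column
    col.dists <- apply(mat, 2, function(x) sum((x-vec)^2))
    col.dists[used] <- Inf        # never pick the same column twice
    best.col <- which.min(col.dists)
    # store the mat column at position i; writing i at position best.col
    # would give the inverse permutation, which is only right for simple swaps
    col.ord[i] <- best.col
    used[best.col] <- TRUE
  }
  return(col.ord)
}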

mat2[, col.order(mat1,mat2)]

      [,1]  [,2]  [,3]
[1,] 0.906 0.070 0.024
[2,] 0.891 0.090 0.020
[3,] 0.045 0.930 0.025
[4,] 0.918 0.058 0.024
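
Since the question asks about many matrices, the same greedy idea can be applied to a whole list, aligning each matrix to one reference. The following is only a sketch under stated assumptions: the first matrix in the list is taken as the reference, `greedy_order` and `align_all` are hypothetical helper names (not from the answer above), and `mat3` is a made-up third matrix used for illustration.

```r
# Hypothetical helper: for each column of `ref`, pick the closest unused
# column of `mat` (squared Euclidean distance), greedily from left to right.
greedy_order <- function(ref, mat) {
  used <- rep(FALSE, ncol(mat))
  ord <- integer(ncol(ref))
  for (i in seq_len(ncol(ref))) {
    d <- colSums((mat - ref[, i])^2)  # ref[, i] is recycled down each column
    d[used] <- Inf                    # never reuse an already matched column
    ord[i] <- which.min(d)
    used[ord[i]] <- TRUE
  }
  ord
}

# Hypothetical wrapper: align every matrix in a list to the first one.
align_all <- function(mats) {
  ref <- mats[[1]]
  lapply(mats, function(m) m[, greedy_order(ref, m), drop = FALSE])
}

mat1 <- matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,
                 0.023,0.019,0.025,0.023), nrow = 4)
mat2 <- matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,
                 0.906,0.891,0.045,0.918), nrow = 4)
mat3 <- mat1[, c(2, 3, 1)]  # made-up example: mat1 with its columns cycled

aligned <- align_all(list(mat1, mat2, mat3))
```

The greedy rule depends on the order in which reference columns are processed, so it is not guaranteed to find the best overall assignment when several columns are similar to each other.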

Answer 2

Assuming that each column will always have a pretty good match, this should work.

Matrix2[, sapply(1:ncol(Matrix1), 
     function(i) which.min(colSums(abs(Matrix2 - Matrix1[,i]))))]
      c     b     a
1 0.906 0.070 0.024
2 0.891 0.090 0.020
3 0.045 0.930 0.025
4 0.918 0.058 0.024
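
Because each column is matched independently here, two reference columns could in principle select the same column of the other matrix. A small check along these lines may help catch that case. It is a sketch that assumes the `mat1` and `mat2` definitions from the question, and the name `idx` is only for illustration.

```r
mat1 <- matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,
                 0.023,0.019,0.025,0.023), nrow = 4)
mat2 <- matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,
                 0.906,0.891,0.045,0.918), nrow = 4)

# Closest column of mat2 (sum of absolute differences) for each column of mat1.
idx <- sapply(seq_len(ncol(mat1)),
              function(i) which.min(colSums(abs(mat2 - mat1[, i]))))

# anyDuplicated() returns 0 when every index is distinct, i.e. when idx
# is a valid permutation of mat2's columns.
if (anyDuplicated(idx) > 0) {
  stop("some column of mat2 was matched more than once")
}

mat2[, idx]
```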
