Suppose I have two or more matrices. They all have the same number of rows and columns, but they are not necessarily square.
Matrix1
a b c
1 0.911 0.067 0.023
2 0.891 0.089 0.019
3 0.044 0.931 0.025
4 0.919 0.058 0.023
Matrix2
a b c
1 0.024 0.070 0.906
2 0.020 0.090 0.891
3 0.025 0.930 0.045
4 0.024 0.058 0.918
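Rows always sum to 1. Columns may change position from one matrix to the next, so the column names carry no meaning. In the example above, column 'a' in mat1 is column 'c' in mat2. The values will not be identical, but they will be similar.
What method/algorithm can I use to align the columns across many such matrices?
The desired result looks like this: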
Matrix1
a b c
1 0.911 0.067 0.023
2 0.891 0.089 0.019
3 0.044 0.931 0.025
4 0.919 0.058 0.023
Matrix2
c b a
1 0.906 0.070 0.024
2 0.891 0.090 0.020
3 0.045 0.930 0.025
4 0.918 0.058 0.024
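The columns are now aligned: 'a' in mat1 corresponds to 'c' in mat2, and so on. In this one possible result, mat1 is the reference and mat2 is aligned to it.
I'm using R, in case anyone wants to try it.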
mat1 <-
matrix(c(0.911,0.891,0.044,0.919,0.067,0.089,0.931,0.058,0.023,0.019,0.025,0.023),nrow=4)
mat2 <-
matrix(c(0.024,0.020,0.025,0.024,0.070,0.090,0.930,0.058,0.906,0.891,0.045,0.918),nrow=4)
Answer 0 (score: 1)
You could do something like this. The function returns the column indices of mat in the order that best matches (by Euclidean distance) the columns of m.base.
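col.order <- function(m.base, mat){
  no.cols <- ncol(mat)
  col.ord <- rep(NA_integer_, no.cols)
  for(i in 1:no.cols){
    vec <- m.base[, i]
    # squared Euclidean distance from base column i to every column of mat
    col.dists <- apply(mat, 2, function(x) sum((x-vec)^2))
    # rule out columns of mat that have already been matched
    col.dists[col.ord[!is.na(col.ord)]] <- Inf
    # col.ord[i] is the column of mat that best matches column i of m.base
    col.ord[i] <- which.min(col.dists)
  }
  return(col.ord)
}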
mat2[, col.order(mat1,mat2)]
[,1] [,2] [,3]
[1,] 0.906 0.070 0.024
[2,] 0.891 0.090 0.020
[3,] 0.045 0.930 0.025
[4,] 0.918 0.058 0.024
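Since the question asks about many matrices, here is a minimal sketch of how the same greedy idea could be applied to a whole list, with the first matrix as the reference. The helper name match_cols, the list mats, and the third matrix mat3 (simply mat1 with its columns rotated) are made up for illustration and are not part of the original post.

```r
# Hypothetical helper: greedy nearest-column matching against a reference.
# Returns idx such that mat[, idx[i]] is the column matched to ref[, i].
match_cols <- function(ref, mat) {
  idx <- integer(ncol(ref))
  for (i in seq_len(ncol(ref))) {
    # squared Euclidean distance from ref[, i] to every column of mat
    # (the vector ref[, i] is recycled down each column of mat)
    d <- colSums((mat - ref[, i])^2)
    # columns of mat that were already assigned are no longer available
    d[idx[seq_len(i - 1)]] <- Inf
    idx[i] <- which.min(d)
  }
  idx
}

mat1 <- matrix(c(0.911, 0.891, 0.044, 0.919,
                 0.067, 0.089, 0.931, 0.058,
                 0.023, 0.019, 0.025, 0.023), nrow = 4)
mat2 <- matrix(c(0.024, 0.020, 0.025, 0.024,
                 0.070, 0.090, 0.930, 0.058,
                 0.906, 0.891, 0.045, 0.918), nrow = 4)
# Made-up third matrix: the columns of mat1 in the order b, c, a
mat3 <- mat1[, c(2, 3, 1)]

mats <- list(mat1, mat2, mat3)
# Align every matrix to the first one
aligned <- lapply(mats, function(m) m[, match_cols(mats[[1]], m), drop = FALSE])
```

Each element of aligned then has its columns in the order of mat1. The matching is greedy, so it relies on every reference column having one clearly closest counterpart; with ambiguous columns an optimal assignment method would be the safer choice.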
Answer 1 (score: 0)
Assuming that each column will always have a pretty good match, this should work.
Matrix2[, sapply(1:ncol(Matrix1),
function(i) which.min(colSums(abs(Matrix2 - Matrix1[,i]))))]
c b a
1 0.906 0.070 0.024
2 0.891 0.090 0.020
3 0.045 0.930 0.025
4 0.918 0.058 0.024
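Because each column is matched independently here, two columns of Matrix1 could in principle pick the same column of Matrix2 when the matches are not clear-cut. A small sketch of how one might guard against that is below; the dimnames are an assumption chosen to reproduce the printed output above, and the variable idx is introduced only for illustration.

```r
# Data from the question, with row/column names assumed for illustration
Matrix1 <- matrix(c(0.911, 0.891, 0.044, 0.919,
                    0.067, 0.089, 0.931, 0.058,
                    0.023, 0.019, 0.025, 0.023), nrow = 4,
                  dimnames = list(as.character(1:4), c("a", "b", "c")))
Matrix2 <- matrix(c(0.024, 0.020, 0.025, 0.024,
                    0.070, 0.090, 0.930, 0.058,
                    0.906, 0.891, 0.045, 0.918), nrow = 4,
                  dimnames = list(as.character(1:4), c("a", "b", "c")))

# For each column of Matrix1, the closest column of Matrix2 (L1 distance)
idx <- sapply(seq_len(ncol(Matrix1)),
              function(i) which.min(colSums(abs(Matrix2 - Matrix1[, i]))))

# Every column of Matrix2 should be chosen exactly once
if (anyDuplicated(idx) > 0) {
  stop("some column of Matrix2 was matched more than once")
}

result <- Matrix2[, idx]
```

If the check fails, the independent matching is not enough for that pair of matrices and a method that assigns each column only once is needed.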