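I have a 10x10 matrix of 0s and 1s in which the 1s tend to cluster together. I am trying to extract each cluster of 1s into its own matrix, collected in a list. To illustrate, this is my starting matrix: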
field <- matrix(0,10,10)
field[3:4,3:4]<-1
field[6:7,7]<-1
field[7:8,8]<-1
field[8,6]<-1
field
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 0 0 0 0 0 0 0 0 0 0
[2,] 0 0 0 0 0 0 0 0 0 0
[3,] 0 0 1 1 0 0 0 0 0 0
[4,] 0 0 1 1 0 0 0 0 0 0
[5,] 0 0 0 0 0 0 0 0 0 0
[6,] 0 0 0 0 0 0 1 0 0 0
[7,] 0 0 0 0 0 0 1 1 0 0
[8,] 0 0 0 0 0 1 0 1 0 0
[9,] 0 0 0 0 0 0 0 0 0 0
[10,] 0 0 0 0 0 0 0 0 0 0
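I would like to obtain a list of matrices like the one produced by the following call (each cluster padded with a border of 0s so that it forms a rectangle):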
list(
field[2:5,2:5],
field[5:9,5:9]
)
[[1]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 1 1 0
[3,] 0 1 1 0
[4,] 0 0 0 0
[[2]]
[,1] [,2] [,3] [,4] [,5]
[1,] 0 0 0 0 0
[2,] 0 0 1 0 0
[3,] 0 0 1 1 0
[4,] 0 1 0 1 0
[5,] 0 0 0 0 0
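I don't even know how to approach this conceptually. Is there a package for this kind of task, or can anyone offer help or an explanation of how to do it? Or is it simply not possible? Thanks for your help!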
Answer 0 (score: 1)
First, you need to define a similarity measure. There is a discussion on Math SE about this problem.
A simple approach is to compute the cosine similarity.
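For reference, the cosine similarity of two equally sized matrices $A$ and $B$, flattened into vectors $a$ and $b$, can be written as follows (the symbols are introduced here only for illustration):

```latex
\[
\operatorname{sim}(A, B)
  = \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
  = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2}\,\sqrt{\sum_i b_i^2}}
\]
```

For 0/1 matrices the numerator counts the cells where both matrices contain a 1, and each sum under a square root counts the 1s in one matrix. The value therefore ranges from 0 (no overlapping 1s) to 1 (identical patterns), and it is undefined when either matrix is all zeros.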
So, for example:
# Generate the matrix
field <- matrix(0,10,10)
field[3:4,3:4]<-1
field[6:7,7]<-1
field[7:8,8]<-1
field[8,6]<-1
# Our similarity function, adapt as needed
simil <- function(m1, m2)
{
  # Check dimensions are the same
  if (any(dim(m1) != dim(m2)))
    stop(paste("ERROR: matrices are not the same size: ",
               nrow(m1), "x", ncol(m1), "vs",
               nrow(m2), "x", ncol(m2)))
  # Linearize the matrices
  m1 <- as.vector(m1)
  m2 <- as.vector(m2)
  # Cosine similarity (%*% returns a 1x1 matrix)
  similarity <- (m1%*%m2)/sqrt((m1%*%m1) * (m2%*%m2))
  return(similarity)
}
Testing it on three candidate subfields, it seems to work well:
m1 <- field[2:5, 2:5]
m2 <- field[6:9, 6:9]
m3 <- field[4:7, 7:10]
> simil(m1, m2)
[,1]
[1,] 0.6708204
> simil(m1, m3)
[,1]
[1,] 0
> simil(m2, m3)
[,1]
[1,] 0.2581989
And, as expected:
> simil(m1,m1)
[,1]
[1,] 1
> simil(m1,!m1)
[,1]
[1,] 0
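We now generate all possible submatrices. I do this with two nested for
loops, which is generally inefficient in R but does not matter for small matrices.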
field.len <- 4
subfields <- list()
i <- 1
for (col in (1:(ncol(field)-field.len+1)))
{
  for (row in (1:(nrow(field)-field.len+1)))
  {
    submatrix <- field[row:(row+field.len-1),col:(col+field.len-1)]
    # Discard zero matrices
    if (sum(submatrix) > 0)
    {
      subfields[[i]] <- submatrix
      i <- i+1
    }
  }
}
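The same enumeration can be sketched without explicit loop counters. This is only an illustrative alternative, not a required change; the names `corners`, `windows`, and `subfields.alt` are made up for this example, and the window order is chosen to match the nested loops (row index varying fastest).

```r
# Rebuild the example matrix so the block runs on its own
field <- matrix(0, 10, 10)
field[3:4, 3:4] <- 1
field[6:7, 7] <- 1
field[7:8, 8] <- 1
field[8, 6] <- 1

field.len <- 4

# All top-left corners; expand.grid varies its first column fastest,
# which reproduces the order of the nested loops (col outer, row inner)
corners <- expand.grid(row = 1:(nrow(field) - field.len + 1),
                       col = 1:(ncol(field) - field.len + 1))

# Cut one field.len x field.len window per corner
windows <- Map(function(r, k) field[r:(r + field.len - 1), k:(k + field.len - 1)],
               corners$row, corners$col)

# Discard all-zero windows
subfields.alt <- Filter(function(m) sum(m) > 0, windows)
length(subfields.alt)
```

For a 10x10 field and 4x4 windows there are 49 candidate windows before filtering, so either version is cheap at this size.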
Finally, compute the similarity matrix:
simil.matrix <- sapply(subfields, function(sf1)
{
  res <- sapply(subfields, function(sf2)
  {
    res <- simil(sf1, sf2)
    res
  })
  res
})
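As a cross-check, all pairwise cosine similarities can also be computed in one step with matrix algebra. The sketch below is an assumption-labelled alternative, not the method used above: it reuses the three windows m1, m2, and m3 from earlier, and the names `windows`, `X`, `G`, `norms`, and `simil.alt` are introduced only for this example. It assumes every window has the same size and at least one nonzero cell.

```r
# Rebuild the example matrix so the block runs on its own
field <- matrix(0, 10, 10)
field[3:4, 3:4] <- 1
field[6:7, 7] <- 1
field[7:8, 8] <- 1
field[8, 6] <- 1

# The three equally sized windows tested earlier (m1, m2, m3)
windows <- list(field[2:5, 2:5], field[6:9, 6:9], field[4:7, 7:10])

# One column per flattened window
X <- sapply(windows, as.vector)

# Dot products between every pair of columns
G <- crossprod(X)

# Vector lengths are the square roots of the diagonal entries
norms <- sqrt(diag(G))

# Divide each dot product by the product of the two lengths
simil.alt <- G / outer(norms, norms)
round(simil.alt, 7)
```

The off-diagonal entries should agree with the `simil()` results shown earlier (about 0.6708, 0, and 0.2582), and the diagonal is 1.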
For example:
> simil.matrix[1,24]
[1] 0.8660254
> subfields[[1]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 0 1 1
[4,] 0 0 1 1
> subfields[[24]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 0 1 0
[4,] 0 0 1 1
Two matrices that are not very similar:
> simil.matrix[10,5]
[1] 0.25
> subfields[[10]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 1 1 0 0
[3,] 1 1 0 0
[4,] 0 0 0 0
> subfields[[5]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 1 1 0
[4,] 0 1 1 0
Two completely dissimilar matrices:
> simil.matrix[4,5]
[1] 0
> subfields[[4]]
[,1] [,2] [,3] [,4]
[1,] 0 0 1 1
[2,] 0 0 0 0
[3,] 0 0 0 0
[4,] 0 0 0 0
> subfields[[5]]
[,1] [,2] [,3] [,4]
[1,] 0 0 0 0
[2,] 0 0 0 0
[3,] 0 1 1 0
[4,] 0 1 1 0
There may be better approaches, but this seems like a good start.
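One such alternative, closer to the output requested in the question, is to label connected groups of 1s directly instead of comparing fixed-size windows. The following is a minimal base-R sketch under stated assumptions: diagonal neighbours count as connected (8-connectivity), which the second expected cluster in the question suggests, and `label_clusters` and `extract_clusters` are hypothetical helper names, not functions from any package.

```r
field <- matrix(0, 10, 10)
field[3:4, 3:4] <- 1
field[6:7, 7] <- 1
field[7:8, 8] <- 1
field[8, 6] <- 1

# Assign an integer label to every 8-connected cluster of 1s (flood fill)
label_clusters <- function(m) {
  labels <- matrix(0L, nrow(m), ncol(m))
  current <- 0L
  for (start in which(m == 1)) {
    if (labels[start] != 0L) next        # already part of a labelled cluster
    current <- current + 1L
    labels[start] <- current
    stack <- start
    while (length(stack) > 0) {
      idx <- stack[length(stack)]
      stack <- stack[-length(stack)]
      ri <- (idx - 1) %% nrow(m) + 1     # row of the linear index
      ci <- (idx - 1) %/% nrow(m) + 1    # column of the linear index
      for (dr in -1:1) for (dc in -1:1) {
        rr <- ri + dr
        cc <- ci + dc
        if (rr >= 1 && rr <= nrow(m) && cc >= 1 && cc <= ncol(m) &&
            m[rr, cc] == 1 && labels[rr, cc] == 0L) {
          labels[rr, cc] <- current
          stack <- c(stack, (cc - 1) * nrow(m) + rr)
        }
      }
    }
  }
  labels
}

# Cut out each cluster with a one-cell border, clipped at the matrix edges
extract_clusters <- function(m) {
  labels <- label_clusters(m)
  lapply(seq_len(max(labels)), function(k) {
    pos <- which(labels == k, arr.ind = TRUE)
    rows <- max(1, min(pos[, "row"]) - 1):min(nrow(m), max(pos[, "row"]) + 1)
    cols <- max(1, min(pos[, "col"]) - 1):min(ncol(m), max(pos[, "col"]) + 1)
    # Keep only this cluster's cells inside the bounding box
    (m * (labels == k))[rows, cols, drop = FALSE]
  })
}

clusters <- extract_clusters(field)
```

On the example matrix this should return two matrices equal to `field[2:5,2:5]` and `field[5:9,5:9]`. The repeated vector growth in the flood fill is slow for large matrices, so this is meant as a conceptual starting point rather than a tuned implementation.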