Cross-correlation matrix and hierarchical clustering in Python

Time: 2018-10-17 21:46:53

Tags: python hierarchical-clustering

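I am trying to do the following: use 'scipy.cluster.hierarchy' to place dendrograms along the top and the left side of a cross-correlation matrix. My cross-correlation matrix is 10x10 and consists of values from -1 to +1, with ones along its diagonal. It looks as follows (shown here after the diagonal has been set to zero and the values rounded):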

array([[ 0.   ,  0.94 ,  0.909,  0.857,  0.736,  0.485,  0.163, -0.079,
    -0.142, -0.265],
   [ 0.94 ,  0.   ,  0.977,  0.943,  0.832,  0.591,  0.251, -0.018,
    -0.074, -0.233],
   [ 0.909,  0.977,  0.   ,  0.97 ,  0.887,  0.665,  0.331,  0.053,
    -0.013, -0.18 ],
   [ 0.857,  0.943,  0.97 ,  0.   ,  0.956,  0.777,  0.461,  0.178,
     0.103, -0.08 ],
   [ 0.736,  0.832,  0.887,  0.956,  0.   ,  0.909,  0.659,  0.392,
     0.306,  0.119],
   [ 0.485,  0.591,  0.665,  0.777,  0.909,  0.   ,  0.896,  0.7  ,
     0.62 ,  0.448],
   [ 0.163,  0.251,  0.331,  0.461,  0.659,  0.896,  0.   ,  0.933,
     0.869,  0.766],
   [-0.079, -0.018,  0.053,  0.178,  0.392,  0.7  ,  0.933,  0.   ,
     0.959,  0.935],
   [-0.142, -0.074, -0.013,  0.103,  0.306,  0.62 ,  0.869,  0.959,
     0.   ,  0.925],
   [-0.265, -0.233, -0.18 , -0.08 ,  0.119,  0.448,  0.766,  0.935,
     0.925,  0.   ]])

I am using the following code, where the variable Av_Matrix is the matrix above:

import matplotlib.pyplot as plt
import numpy as np
import scipy
import pylab
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import squareform

#Replace the ones on the diagonal with zeros
for b in range(10):
    Av_Matrix[b,b]=0

#Rounded to 3 decimal places to better see if the matrix is symmetric
Av_Matrix = np.round(Av_Matrix,3)

condensedD = squareform(Av_Matrix)

# Compute and plot first dendrogram.
fig = pylab.figure(figsize=(8,8))
ax1 = fig.add_axes([0.09,0.1,0.2,0.6])
Y = sch.linkage(condensedD, method='single')
Z1 = sch.dendrogram(Y, orientation='left')
ax1.set_xticks([])
ax1.set_yticks([])

# Compute and plot second dendrogram.
ax2 = fig.add_axes([0.3,0.71,0.6,0.2])
Y = sch.linkage(condensedD, method='single')
Z2 = sch.dendrogram(Y)
ax2.set_xticks([])
ax2.set_yticks([])

# Plot distance matrix.
axmatrix = fig.add_axes([0.3,0.1,0.6,0.6])
idx1 = Z1['leaves']
idx2 = Z2['leaves']
# Reorder rows, then columns of the row-reordered result, and plot the reordered matrix
D = Av_Matrix[idx1,:]
D = D[:,idx2]
im = axmatrix.matshow(D, aspect='auto', origin='lower', cmap=pylab.cm.YlGnBu)
axmatrix.set_xticks([])
axmatrix.set_yticks([])

# Plot colorbar.
axcolor = fig.add_axes([0.91,0.1,0.02,0.6])
pylab.colorbar(im, cax=axcolor)
fig.show()
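
As a side note on the reordering step in the plotting code above, here is a minimal sketch of how a matrix can be permuted by dendrogram leaf order. The 4x4 matrix M and the leaf orders are hypothetical stand-ins for Av_Matrix, Z1['leaves'] and Z2['leaves']; they are not taken from the actual data.

```python
import numpy as np

# Hypothetical 4x4 symmetric matrix standing in for Av_Matrix
M = np.array([[0.0, 0.9, 0.2, 0.1],
              [0.9, 0.0, 0.3, 0.4],
              [0.2, 0.3, 0.0, 0.8],
              [0.1, 0.4, 0.8, 0.0]])

# Hypothetical leaf orders standing in for Z1['leaves'] and Z2['leaves']
row_order = [2, 3, 0, 1]
col_order = [2, 3, 0, 1]

# np.ix_ builds an open mesh, so rows and columns are permuted in one step
reordered = M[np.ix_(row_order, col_order)]

# Equivalent two-step form: permute the rows, then permute the columns
# of the row-permuted result (not of the original matrix)
two_step = M[row_order, :][:, col_order]
```

Whichever form is used, it is the reordered array that would need to be passed to matshow for the heat map to line up with the dendrogram leaves.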

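I know that the validity of the distance matrix and of the condensed distance matrix is checked in distance.py, and that is where my errors occur. At first, my distance matrix was reported as not symmetric. I checked it with np.allclose, but even though that returned True, the asymmetry was still reported. I then replaced the ones on the diagonal with zeros and rounded to three decimal places. That seemed to work, but the next error is as follows: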

File "C:/Users/Dymphie/PycharmProjects/iCSD\Matrix_dendogram.py", line 31, in Matrix_dendogram
    Z1 = sch.dendrogram(Y, orientation='left')
File "C:\Users\Dymphie\AppData\Local\Enthought\Canopy32\User\lib\site-packages\scipy\cluster\hierarchy.py", line 2118, in dendrogram
    is_valid_linkage(Z, throw=True, name='Z')
File "C:\Users\Dymphie\AppData\Local\Enthought\Canopy32\User\lib\site-packages\scipy\cluster\hierarchy.py", line 1324, in is_valid_linkage
    'distances.') % name)
ValueError: Linkage 'Z' contains negative distances.

The linkage array 'Z' looks like this:

array([[  0.   ,   9.   ,  -0.265,   2.   ],
   [  1.   ,  10.   ,  -0.233,   3.   ],
   [  2.   ,  11.   ,  -0.18 ,   4.   ],
   [  8.   ,  12.   ,  -0.142,   5.   ],
   [  3.   ,  13.   ,  -0.08 ,   6.   ],
   [  7.   ,  14.   ,  -0.079,   7.   ],
   [  4.   ,  15.   ,   0.119,   8.   ],
   [  6.   ,  16.   ,   0.163,   9.   ],
   [  5.   ,  17.   ,   0.448,  10.   ]])
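
For reference, the third column of a linkage array holds the distance at which each pair of clusters was merged. The sketch below copies the array shown above and checks that column. It uses numpy and scipy.cluster.hierarchy.is_valid_linkage, and is meant only as a way to inspect the problem, not as a fix.

```python
import numpy as np
import scipy.cluster.hierarchy as sch

# Linkage array copied from the output shown above
Z = np.array([[0.,  9., -0.265,  2.],
              [1., 10., -0.233,  3.],
              [2., 11., -0.180,  4.],
              [8., 12., -0.142,  5.],
              [3., 13., -0.080,  6.],
              [7., 14., -0.079,  7.],
              [4., 15.,  0.119,  8.],
              [6., 16.,  0.163,  9.],
              [5., 17.,  0.448, 10.]])

# Column 2 is the merge distance; find the rows where it is negative
negative_rows = np.flatnonzero(Z[:, 2] < 0)

# With the default throw=False this returns a bool instead of raising
is_valid = sch.is_valid_linkage(Z)

print(negative_rows, is_valid)
```

The first merge joins observations 0 and 9 at -0.265: because the correlations were passed in as if they were distances, single linkage treated the most negative correlation as the closest pair.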

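My thinking is that this is probably caused by the negative correlations in the original Av_Matrix, and indeed, running the code on the absolute values of the matrix produces no more errors. So my question is: how can I construct these dendrograms from a cross-correlation matrix that contains negative values?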
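
One possible approach, sketched here under stated assumptions rather than offered as a definitive answer, is to convert the correlations into a non-negative dissimilarity before calling linkage. The sketch assumes that 1 - r is an acceptable dissimilarity (0 for perfectly correlated signals, 2 for perfectly anti-correlated ones) and uses a hypothetical 4x4 matrix `corr` in place of Av_Matrix. It also symmetrizes the matrix explicitly, since squareform's default validity check compares the matrix with its transpose exactly, which may be why np.allclose returning True was not sufficient.

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import squareform

# Hypothetical correlation matrix (ones on the diagonal, some negative entries)
corr = np.array([[ 1.0,  0.9, -0.2, -0.3],
                 [ 0.9,  1.0, -0.1, -0.2],
                 [-0.2, -0.1,  1.0,  0.8],
                 [-0.3, -0.2,  0.8,  1.0]])

# Assumption: use 1 - r as the dissimilarity, which lies in [0, 2]
dist = 1.0 - corr

# Enforce exact symmetry and an exactly zero diagonal before squareform
dist = (dist + dist.T) / 2.0
np.fill_diagonal(dist, 0.0)

condensed = squareform(dist)                 # condensed (upper-triangle) form
Y = sch.linkage(condensed, method='single')  # merge distances are now >= 0
leaves = sch.dendrogram(Y, no_plot=True)['leaves']

print(Y[:, 2], leaves)
```

If strongly anti-correlated signals should instead be grouped together, 1 - np.abs(corr) would be an alternative dissimilarity. Which of the two is appropriate depends on what "similar" is meant to mean for the data.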

(Really sorry for such a long post!)

Any help would be greatly appreciated.

0 answers:

No answers yet