Seaborn clustermap: supplying your own linkage matrix

Date: 2018-09-17 14:36:09

Tags: seaborn

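I have a dataset with roughly 400 samples and roughly 20 regions. In this dataset, "1" marks a broken region (a break in the DNA in that region) and "0" marks an intact region. I want to cluster this data with seaborn.clustermap. Because broken regions carry more information than intact ones, my first choice was the Jaccard distance. However, I have many empty rows (no breaks at all), and these cause serious failures (0/0 -> nan). To work around this I tried to supply my own linkage matrix, but the documentation is very sparse and I cannot figure it out. Any ideas?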
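The 0/0 case mentioned above can be reproduced by hand. This is a minimal sketch, assuming the textbook definition of the Jaccard distance for binary vectors (number of positions that differ, divided by the number of positions where at least one vector is nonzero). The helper name `jaccard_distance` is made up for illustration, and SciPy's own treatment of two all-zero vectors may differ between versions.

```python
import numpy as np


def jaccard_distance(a, b):
    """Textbook Jaccard distance between two binary vectors (illustrative helper)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    differing = np.logical_xor(a, b).sum()  # positions where exactly one vector is 1
    union = np.logical_or(a, b).sum()       # positions where at least one vector is 1
    # Suppress the NumPy warning so the 0/0 case quietly yields nan.
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.float64(differing) / np.float64(union)


empty = [0, 0, 0, 0]   # a sample with no breaks at all
broken = [1, 0, 1, 0]  # a sample with two broken regions

d_empty = jaccard_distance(empty, empty)   # 0 / 0, undefined
d_mixed = jaccard_distance(empty, broken)  # 2 / 2
```

Two samples without any breaks share an empty union, so the ratio is undefined. That is the value the question later replaces with 0 via `np.nan_to_num`.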

import pandas as pd
import seaborn as sns; sns.set(color_codes=True)
import matplotlib.pyplot as plt
import numpy as np
import scipy.cluster.hierarchy
import scipy.spatial.distance


# my dataset is called 'df'
print(df.shape)
## = (464, 23) ##
Y = scipy.spatial.distance.pdist(df, metric='jaccard')
Y = np.nan_to_num(Y)  # condensed distance matrix, with nan replaced by 0
linkage = scipy.cluster.hierarchy.linkage(Y, method='average')  # linkage matrix
print(len(Y))
## 107416 ##
print(len(linkage))
## 463 ##

cmap = sns.cubehelix_palette(as_cmap=True, rot=-.3, light=1)
sns.clustermap(df, cmap=cmap, row_linkage=linkage, col_linkage=linkage)
plt.show()

This results in the following error message:

Traceback (most recent call last):
  File "/Users/nienke/Documents/stage/scripts/structuralvariants/realcluster.py", line 32, in <module>
    sns.clustermap(df, cmap=cmap, row_linkage=linkage, col_linkage=linkage)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/seaborn/matrix.py", line 1301, in clustermap
    **kwargs)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/seaborn/matrix.py", line 1142, in plot
    self.plot_matrix(colorbar_kws, xind, yind, **kws)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/seaborn/matrix.py", line 1100, in plot_matrix
    self.data2d = self.data2d.iloc[yind, xind]
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1367, in __getitem__
    return self._getitem_tuple(key)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1737, in _getitem_tuple
    self._has_valid_tuple(tup)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 204, in _has_valid_tuple
    if not self._has_valid_type(k, i):
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1674, in _has_valid_type
    return self._is_valid_list_like(key, axis)
  File "/Users/nienke/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py", line 1731, in _is_valid_list_like
    raise IndexError("positional indexers are out-of-bounds")
IndexError: positional indexers are out-of-bounds

Thanks a lot
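A likely cause of the IndexError, judging from the shapes printed in the question: the linkage matrix has 463 rows, so it describes a tree over the 464 samples, yet the same matrix is also passed as col_linkage, where only 23 columns exist. The reordered column indices then run past the last column, which is where `iloc[yind, xind]` fails. Below is a minimal sketch that builds one linkage per axis. The helper names `jaccard_linkage` and `plot_clustermap` and the random stand-in data are assumptions for illustration, not part of the original post.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import pdist


def jaccard_linkage(matrix, method="average"):
    """Linkage over the rows of a binary matrix (illustrative helper)."""
    distances = pdist(np.asarray(matrix, dtype=bool), metric="jaccard")
    # As in the question, treat undefined (nan) distances as 0.
    distances = np.nan_to_num(distances)
    return linkage(distances, method=method)


def plot_clustermap(data, row_linkage, col_linkage):
    """Draw the clustermap with precomputed linkages; seaborn is imported lazily."""
    import seaborn as sns

    cmap = sns.cubehelix_palette(as_cmap=True, rot=-.3, light=1)
    return sns.clustermap(data, cmap=cmap,
                          row_linkage=row_linkage, col_linkage=col_linkage)


# Random stand-in for the 464 x 23 dataset, including some rows without breaks.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(464, 23))
data[:50] = 0

row_linkage = jaccard_linkage(data)    # tree over the 464 samples
col_linkage = jaccard_linkage(data.T)  # tree over the 23 regions

# Leaf orders now stay within the bounds of each axis.
row_order = leaves_list(row_linkage)
col_order = leaves_list(col_linkage)
```

With the real data, `plot_clustermap(df, row_linkage, col_linkage)` followed by `plt.show()` would replace the original call. If only the samples need clustering, passing `col_cluster=False` instead of a column linkage is another option.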

0 answers:

No answers yet