Show only the positive correlations out of all correlations

Time: 2017-05-20 17:35:04

Tags: pandas numpy

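I have a dataset whose correlation matrix looks like this.

How can I show only the 3 best positive correlations instead of all of my correlations? (The positive correlations should be displayed as numbers.)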

2 answers:

Answer 0 (score: 0)

You can mask all the values you are not interested in, as shown below.

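import numpy as np

# corr is assumed to be a square float NumPy array, e.g. df.corr().values
# Set the diagonal to -np.inf so self-correlations are never selected
corr[np.diag_indices_from(corr)] = -np.inf
# Find the value of the k-th largest correlation. The matrix is symmetric,
# so look only at the upper triangle to count each pair once.
k = 3
upper = corr[np.triu_indices_from(corr, k=1)]
threshold = np.sort(upper)[-k]
# Mask all values that are below the threshold (this includes the diagonal)
corr[corr < threshold] = np.nan
# Do your plotting as before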
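
The masking idea can also be sketched as a small self-contained function. This is only an illustration under stated assumptions: the name `keep_top_k_positive` is hypothetical, the input is assumed to be a symmetric matrix without NaN values, and the sample numbers are the correlation matrix printed in the demo of the other answer. It works on a copy and additionally drops non-positive values, since the question asks for positive correlations only.

```python
import numpy as np

def keep_top_k_positive(corr, k=3):
    """Return a copy of a symmetric correlation matrix in which everything
    except the k largest positive off-diagonal correlations is NaN.
    (Hypothetical helper, not part of NumPy or pandas.)"""
    out = np.array(corr, dtype=float)            # copy, the input stays untouched
    np.fill_diagonal(out, np.nan)                # drop self-correlations
    upper = out[np.triu_indices_from(out, k=1)]  # each pair counted once
    threshold = np.sort(upper)[-k]               # k-th largest correlation
    out[out < threshold] = np.nan                # NaN < x is False, so NaNs stay NaN
    out[out <= 0] = np.nan                       # keep positive correlations only
    return out

# Sample data: the 5x5 correlation matrix shown in the demo of the other answer
corr = np.array([
    [ 1.000000,  0.000000,  0.060193, -0.722222, -0.218218],
    [ 0.000000,  1.000000, -0.233126, -0.215166,  0.845154],
    [ 0.060193, -0.233126,  1.000000,  0.541736,  0.118217],
    [-0.722222, -0.215166,  0.541736,  1.000000,  0.036370],
    [-0.218218,  0.845154,  0.118217,  0.036370,  1.000000],
])
masked = keep_top_k_positive(corr, k=3)
print(masked)
```

Because the matrix is symmetric, each surviving correlation appears twice in `masked`, once on each side of the diagonal. If fewer than `k` correlations are positive, fewer values survive.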

Answer 1 (score: 0)

Demo:

In [156]: df = pd.DataFrame(np.random.randint(1, 6, size=(5, 5))).add_prefix('col').corr()

In [157]: df
Out[157]:
          col0      col1      col2      col3      col4
col0  1.000000  0.000000  0.060193 -0.722222 -0.218218
col1  0.000000  1.000000 -0.233126 -0.215166  0.845154
col2  0.060193 -0.233126  1.000000  0.541736  0.118217
col3 -0.722222 -0.215166  0.541736  1.000000  0.036370
col4 -0.218218  0.845154  0.118217  0.036370  1.000000

In [158]: corr = df.values

In [159]: corr[np.tril_indices_from(corr)] = np.nan

In [160]: x = pd.DataFrame(corr, columns=df.columns, index=df.index)

In [161]: x.stack(dropna=False).nlargest(3).unstack()
Out[161]:
          col3      col4
col1       NaN  0.845154
col2  0.541736  0.118217

In [162]: sns.heatmap(x.stack(dropna=False).nlargest(3).unstack())
Out[162]: <matplotlib.axes._subplots.AxesSubplot at 0xcacf7b8>
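
The session above can be condensed into a reusable sketch. Some assumptions: the function name `top_positive_pairs` is hypothetical, the sample frame re-enters the matrix printed in `Out[157]` by hand (the original data came from `np.random.randint`, so it cannot be regenerated), and the upper-triangle pairs are collected with NumPy indexing rather than `stack(dropna=False)`, which is a substitution meant to avoid depending on how a given pandas version treats NaN in `stack`. Unlike the session, it also filters out non-positive values and leaves the input frame unmodified.

```python
import numpy as np
import pandas as pd

def top_positive_pairs(corr_df, k=3):
    """Return the k largest positive correlations as a Series indexed by
    (row label, column label). Hypothetical helper for illustration."""
    values = corr_df.to_numpy()
    i, j = np.triu_indices_from(values, k=1)     # strictly above the diagonal
    index = pd.MultiIndex.from_arrays([corr_df.index[i], corr_df.columns[j]])
    pairs = pd.Series(values[i, j], index=index)
    pairs = pairs[pairs > 0]                     # positive correlations only
    return pairs.nlargest(k)

# Sample data: the matrix printed in Out[157], entered by hand
cols = ['col0', 'col1', 'col2', 'col3', 'col4']
df = pd.DataFrame([
    [ 1.000000,  0.000000,  0.060193, -0.722222, -0.218218],
    [ 0.000000,  1.000000, -0.233126, -0.215166,  0.845154],
    [ 0.060193, -0.233126,  1.000000,  0.541736,  0.118217],
    [-0.722222, -0.215166,  0.541736,  1.000000,  0.036370],
    [-0.218218,  0.845154,  0.118217,  0.036370,  1.000000],
], index=cols, columns=cols)

top = top_positive_pairs(df, k=3)
grid = top.unstack()   # small 2-D frame, suitable for a heatmap
print(top)
print(grid)
```

To show the values as numbers on the plot, as the question asks, `grid` could be passed to `sns.heatmap(grid, annot=True)`, assuming seaborn is imported as `sns` as in the session.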
