Plotting a dataframe in multiple colors with seaborn.pairplot()?

Asked: 2019-01-22 22:07:25

Tags: python pandas seaborn

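I would like to create a plot similar to this image in order to compare several dimensions of a dataset. The dataset is not one of the predefined ones. I managed to display the data correctly in a single color, but I would like one color for y = 0 and another for y = 1 so that the points can be compared, as in the image of the iris dataset. As soon as I add hue='y' to the sns.pairplot call, the code no longer runs to the end.

I also don't understand the console output. What is the problem?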

[image]

import seaborn as sns; sns.set(style="ticks", color_codes=True)

import pandas as pd

dataframe = pd.DataFrame(dict(F1=X[:, 0], F2=X[:, 1], F3=X[:, 2], F4=X[:, 3], y=y))

print(dataframe)

g = sns.pairplot(dataframe, hue='y')

This is the output of print(dataframe). It looks fine to me:

            F1        F2        F3        F4    y
0     3.173182  2.849991  2.497907  2.851715  0.0
1     2.468625 -0.216985  0.275206  1.232518  1.0
2     2.398419  2.258931  2.255533  4.895872  0.0
3     1.379937  1.041677  1.165911  1.992650  1.0
4     2.489665  2.269068  4.129961  2.218203  0.0
5     4.140160  2.809088  2.973027  3.553128  0.0
6     2.997969  1.701299  2.978875  1.946793  0.0
7     3.864436  3.554276  3.568455  2.839489  0.0
8    -0.000605  1.376971  1.128350  1.293777  1.0
9     2.398057  1.180861  2.400801  2.264726  1.0
10    0.997385 -0.560205  0.954628  2.788858  1.0

...        ...       ...       ...       ...  ...

3990  3.334553  4.576306  2.470476  3.032781  0.0
3991  1.465784  2.304793  1.267303 -0.030802  1.0
3992  0.505905 -0.280769 -1.223464  1.077305  1.0
3993  2.581596  3.924394  3.878303  2.579366  0.0
3994  4.362067  2.247818  2.948595  1.906314  0.0
3995  2.310546  0.006672  2.382227  1.940343  1.0
3996 -0.944635  1.387136  0.604135  2.421478  1.0
3997  1.290999  1.485965  0.262792  0.899340  1.0
3998  0.864532  1.759607  1.118346  1.038935  1.0
3999  1.819110  2.218838  3.927945  2.593009  0.0

[4000 rows x 5 columns]

But in the end I get this error:

Traceback (most recent call last):
  File "/Users//PycharmProjects//V3_multiTops/vergleich.py", line 131, in <module>
    g = sns.pairplot(dataframe, hue='y')
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/axisgrid.py", line 2111, in pairplot
    grid.map_diag(kdeplot, **diag_kws)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/axisgrid.py", line 1399, in map_diag
    func(data_k, label=label_k, color=color, **kwargs)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 691, in kdeplot
    cumulative=cumulative, **kwargs)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 294, in _univariate_kdeplot
    x, y = _scipy_univariate_kde(data, bw, gridsize, cut, clip)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/seaborn/distributions.py", line 366, in _scipy_univariate_kde
    kde = stats.gaussian_kde(data, bw_method=bw)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 172, in __init__
    self.set_bandwidth(bw_method=bw_method)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 499, in set_bandwidth
    self._compute_covariance()
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/stats/kde.py", line 510, in _compute_covariance
    self._data_inv_cov = linalg.inv(self._data_covariance)
  File "/Users//PycharmProjects//venv/lib/python3.7/site-packages/scipy/linalg/basic.py", line 975, in inv
    raise LinAlgError("singular matrix")
numpy.linalg.linalg.LinAlgError: singular matrix

I think I am doing something wrong with sns.pairplot() that I don't understand yet. Can you explain it to me?

1 answer:

Answer 0 (score: 1)

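The problem seems to be that the "y" column is itself numeric, so it is included as a row/column of the pair grid. That is undesired in any case. Within each hue group "y" is constant, so the KDE on its diagonal panel is computed on data with zero variance, which is where the "singular matrix" error in the traceback comes from. To select the variables that take part in the grid, use the vars keyword of pairplot.

sns.pairplot(df, vars=df.columns[:-1], hue="y")

The reason this is not necessary for the iris dataset is that its hue column is not numeric. Non-numeric columns are not included in the grid.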
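To see why a numeric hue column on the grid's diagonal can fail, here is a minimal sketch using only pandas and NumPy. The small dataframe is made up for illustration (a handful of rounded values in the style of the question's output), and the 1x1 inversion only mimics the linalg.inv(self._data_covariance) step from the traceback; it is not the actual seaborn or scipy code path.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of the question's dataframe: one feature plus a numeric label.
df = pd.DataFrame({
    "F1": [3.17, 2.47, 2.40, 1.38, 2.49, 4.14],
    "y":  [0.0, 1.0, 0.0, 1.0, 0.0, 0.0],
})

# With hue="y", the diagonal panel for "y" is drawn once per hue level,
# so each KDE only sees the y values of a single group.
group_var = df.groupby("y")["y"].var()
print(group_var)  # the variance is 0.0 in both groups

# For 1-D data the covariance matrix is the 1x1 matrix [[variance]].
# Inverting [[0.0]] fails, which mirrors the error in the traceback.
try:
    np.linalg.inv(np.array([[group_var.loc[0.0]]]))
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print("LinAlgError:", err)
```

The feature columns do not have this problem, because their values still vary inside each hue group.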

Complete example:
