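I have a dataset from which I use only a single column to apply k-means clustering. However, when I plot the result, I get a TypeError mentioning 'numpy.ndarray'. I tried converting the data to float, but the problem persists.
DataFrame: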
Brim
1234.5
345
675.7
120
110
Code:
from sklearn.cluster import KMeans
import numpy as np
km = KMeans(n_clusters=4, init='k-means++',n_init=10)
km.fit(df1)
x = km.fit_predict(df1)
x
array([0, 0, 0, ..., 3, 3, 3])
np.shape(x)
(1097,)
import matplotlib.pyplot as plt
%matplotlib inline
plt.scatter(df1[x ==1,0], df1[x == 0,1], s=100, c='red')
plt.scatter(df1[x ==1,0], df1[x == 1,1], s=100, c='black')
plt.scatter(df1[x ==2,0], df1[x == 2,1], s=100, c='blue')
plt.scatter(df1[x ==3,0], df1[x == 3,1], s=100, c='cyan')
Error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-62-5f0966ccc828> in <module>()
1 import matplotlib.pyplot as plt
2 get_ipython().run_line_magic('matplotlib', 'inline')
----> 3 plt.scatter(df1[x ==1,0], df1[x == 0,1], s=100, c='red')
4 plt.scatter(df1[x ==1,0], df1[x == 1,1], s=100, c='black')
5 plt.scatter(df1[x ==2,0], df1[x == 2,1], s=100, c='blue')
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2137 return self._getitem_multilevel(key)
2138 else:
-> 2139 return self._getitem_column(key)
2140
2141 def _getitem_column(self, key):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\frame.py in _getitem_column(self, key)
2144 # get column
2145 if self.columns.is_unique:
-> 2146 return self._get_item_cache(key)
2147
2148 # duplicate columns & possible reduce dimensionality
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in _get_item_cache(self, item)
1838 """Return the cached item, item represents a label indexer."""
1839 cache = self._item_cache
-> 1840 res = cache.get(item)
1841 if res is None:
1842 values = self._data.get(item)
TypeError: unhashable type: 'numpy.ndarray'
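Answer 0 (score: 0)
If I understand your code correctly, you are trying to slice the DataFrame by the values of x in order to plot it.
To do that, use df1.loc[x==1,0] instead of df1[x==1,0] (and likewise for every other slice). Plain df1[...] treats the whole (mask, column) tuple as a single column key, which is why the unhashable ndarray error appears. Note that .loc selects columns by label, so the second element must be an existing column label (for example 'Brim'); to select a column by integer position, use .iloc instead.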
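A minimal sketch of the difference, assuming the single column is labelled 'Brim' as in the question. The label array x here is hand-written and only stands in for the output of km.fit_predict(df1); it is not the real clustering result.

```python
import numpy as np
import pandas as pd

# Single-column frame built from the sample values in the question.
df1 = pd.DataFrame({"Brim": [1234.5, 345.0, 675.7, 120.0, 110.0]})

# Hypothetical cluster labels standing in for km.fit_predict(df1).
x = np.array([1, 0, 2, 0, 0])

# df1[mask, col] hands a (ndarray, label) tuple to DataFrame.__getitem__,
# which tries to look it up as one column key and fails. The exact
# exception type may differ between pandas versions.
try:
    df1[x == 0, "Brim"]
    failed = False
except Exception:
    failed = True

# Label-based selection: boolean mask for the rows, column label for the column.
by_label = df1.loc[x == 0, "Brim"]

# Position-based selection: boolean mask for the rows, integer position for the column.
by_position = df1.iloc[x == 0, 0]

print(failed)
print(by_label.tolist())
print(by_position.tolist())
```

Each selection is a one-dimensional Series, so it can be passed to plt.scatter as one coordinate. Because the frame has only one column, there is no second column to use for the other axis; something else (for instance the row index) would have to supply it.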