I ran k-means clustering, and when I try to visualize the result with an elbow curve I get a memory error on the following line:
tss = np.sum(pdist(patches)**2)/patches.shape[0]
How can I solve this problem? Here is the code:
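import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import pdist

K = range(50, 100, 10)
# Within-cluster sum of squares for each k (dist is computed earlier, not shown)
wcss = [sum(d**2) for d in dist]
# Total sum of squares (this is the line that raises the MemoryError)
tss = np.sum(pdist(patches)**2)/patches.shape[0]
bss = tss - wcss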
fig = plt.figure()
ax = fig.add_subplot(111)
ax.plot(K, bss/tss*100, 'b*-')
plt.grid(True)
plt.xlabel('Number of clusters')
plt.ylabel('Percentage of variance explained')
plt.title('Elbow for KMeans clustering')
plt.show()
This is the memory error I get:
MemoryError Traceback (most recent call last)
<ipython-input-13-73c7d61897f4> in <module>()
1 # Total with-in sum of square
2 wcss = [sum(d**2) for d in dist]
----> 3 tss = np.sum(pdist(patches)**2)/patches.shape[0]
4 bss = tss-wcss
5 fig = plt.figure()
/root/anaconda2/lib/python2.7/site-packages/scipy/spatial/distance.pyc in pdist(X, metric, p, w, V, VI)
1218
1219 m, n = s
-> 1220 dm = np.zeros((m * (m - 1)) // 2, dtype=np.double)
1221
1222 wmink_names = ['wminkowski', 'wmi', 'wm', 'wpnorm']
MemoryError:
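
The traceback shows that pdist allocates a vector of m * (m - 1) // 2 doubles, which grows quadratically with the number of rows in patches. One possible way around this is to rely on the identity that, for Euclidean distances, the sum of squared pairwise distances divided by m equals the sum of squared distances of every row to the overall mean. The sketch below assumes patches is a 2-D array of shape (m, n) and that the default Euclidean metric is intended; the helper names total_sum_of_squares and pdist_buffer_bytes are hypothetical and only serve the illustration.

```python
import numpy as np
from scipy.spatial.distance import pdist


def total_sum_of_squares(X):
    """Return the same value as np.sum(pdist(X)**2) / X.shape[0].

    Uses the sum of squared deviations from the column means, so it
    needs only one extra (m, n) array instead of m*(m-1)/2 distances.
    Assumes Euclidean distance (the pdist default).
    """
    X = np.asarray(X, dtype=np.float64)
    centered = X - X.mean(axis=0)
    return np.sum(centered ** 2)


def pdist_buffer_bytes(m):
    """Bytes pdist needs for its condensed vector of m*(m-1)//2 doubles."""
    return (m * (m - 1) // 2) * 8


# Check the identity on a small random sample where pdist still fits in memory.
rng = np.random.RandomState(0)
small = rng.rand(200, 5)
tss_pairwise = np.sum(pdist(small) ** 2) / small.shape[0]
tss_centered = total_sum_of_squares(small)
print(np.isclose(tss_pairwise, tss_centered))
```

If this assumption holds for your data, tss = total_sum_of_squares(patches) could replace the failing line, and pdist_buffer_bytes(patches.shape[0]) gives a rough idea of how much memory the original call was asking for.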