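Here is my code. I am trying to run the DBSCAN algorithm on my list of points, which are stored in the coordinate matrix below. The matrix looks like this: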
[[43.285569, 5.350558],
[48.728766, 2.369763],
[48.82206, 2.325197],
[48.82206, 2.325197],
....................
[48.822879, 2.325046],
[48.822943, 2.325099],
[48.830726, 2.331268]]
However, when I run the code, I get the following error:
"UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
if self._markerfacecolor != fc:"
Can anyone explain why this happens? Thanks!
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
import numpy as np
import numpy
import csv

def plot_cluster(cluster, sample_matrix):
    import matplotlib.pyplot as plt
    import numpy as np
    f = lambda row: [float(x) for x in row]
    sample_matrix = map(f,sample_matrix)
    print sample_matrix
    sample_matrix = StandardScaler().fit_transform(sample_matrix)
    core_samples_mask = np.zeros_like(cluster.labels_, dtype=bool)
    core_samples_mask[cluster.core_sample_indices_] = True
    labels = cluster.labels_
    # Black removed and is used for noise instead.
    unique_labels = set(labels)
    colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
    for k, col in zip(unique_labels, colors):
        if k == -1:
            # Black used for noise.
            col = 'k'
        class_member_mask = (labels == k) # generator comprehension
        # X is your data matrix
        X = np.array(sample_matrix)
        xy = X[class_member_mask & core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=14)
        xy = X[class_member_mask & ~core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=6)
    plt.ylim([0,10])
    plt.xlim([0,10])
    # plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.savefig('cluster.png')

dbscan_object = DBSCAN(3.0,4)
input = np.genfromtxt(open("dataset_import_noaddress.csv","rb"),delimiter=",", skip_header=1)
coordinates = np.delete(input, [0,1], 1)
result = dbscan_object.fit(coordinates)
print result.labels_
print 'plotting '
plot_cluster(result, coordinates)
Answer 0 (score: 0)
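Don't blame DBSCAN when it is your plotting logic that fails.
Learn how to read an error trace, and include it in your question. Where the error occurs is the most important piece of information!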
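One way to find out where a warning comes from is to promote warnings to exceptions so that Python produces a full traceback. The following is a minimal sketch using only the standard `warnings` and `traceback` modules. The names `run_with_warnings_as_errors` and `noisy_plot_step` are hypothetical; `noisy_plot_step` merely stands in for the plotting call, and the sketch is written for Python 3 rather than the Python 2 of the question.

```python
import traceback
import warnings


def run_with_warnings_as_errors(func, *args, **kwargs):
    """Call func with every warning raised as an exception.

    Returns (result, None) on success or (None, traceback_text) if a
    warning was raised.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # any warning now raises
        try:
            return func(*args, **kwargs), None
        except Warning:
            return None, traceback.format_exc()


def noisy_plot_step():
    # Hypothetical stand-in for the plotting call that emits the warning.
    warnings.warn("comparison failed", UnicodeWarning)
    return "done"


result, trace = run_with_warnings_as_errors(noisy_plot_step)
print(trace)
```

The printed trace names the function and line that issued the warning, which is the information worth pasting into a question.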
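So what is the "wrong" marker face color?
Maybe all you need to do is replace every occurrence of 'k' with a black color object (rather than a string), and the warning will go away.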
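A sketch of that idea: keep every color in the same numeric form, so black is an RGBA tuple rather than the string 'k'. The helper `colors_for_labels` is hypothetical, and it builds its palette with the standard-library `colorsys` module instead of `plt.cm.Spectral` only so that the example runs without matplotlib. Whether this actually silences the warning in the asker's setup is untested.

```python
import colorsys

BLACK_RGBA = (0.0, 0.0, 0.0, 1.0)  # black as numbers, not the string 'k'


def colors_for_labels(labels):
    """Map each cluster label to an RGBA tuple; noise (-1) gets black."""
    clusters = sorted(set(labels) - {-1})
    palette = {}
    for i, label in enumerate(clusters):
        # Spread hues evenly over the real clusters.
        hue = i / max(len(clusters), 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 0.9)
        palette[label] = (r, g, b, 1.0)
    if -1 in labels:
        palette[-1] = BLACK_RGBA
    return palette


palette = colors_for_labels([0, 0, 1, -1, 2, -1])
for label, rgba in sorted(palette.items()):
    print(label, rgba)
```

Matplotlib accepts RGBA tuples for `markerfacecolor` and `markeredgecolor`, so `palette[k]` could be passed where the question's code passes `col`.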
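Since your data are coordinates, you should also use Haversine distance, not Euclidean distance. Do not scale your data set! You are actually distorting your data. Stop recycling code you found on the internet that you do not understand. If you do not understand what StandardScaler does, do not use it. Using it here is incorrect.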
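To make the distance point concrete, here is a minimal sketch of the Haversine (great-circle) distance in plain Python, applied to rows from the matrix in the question. It assumes each row is (latitude, longitude) in degrees and uses a mean Earth radius of 6371 km; the names `haversine_km` and `km_to_radians` are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = (math.radians(v) for v in p)
    lat2, lon2 = (math.radians(v) for v in q)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def km_to_radians(km):
    """Convert a radius in km to an angle in radians on the sphere."""
    return km / EARTH_RADIUS_KM


first_point = (43.285569, 5.350558)   # first row of the question's matrix
second_point = (48.728766, 2.369763)  # second row
print(haversine_km(first_point, second_point))
print(km_to_radians(3.0))
```

scikit-learn's `DBSCAN` can use this metric through `metric="haversine"` with `algorithm="ball_tree"`. In that case the input must be in radians (for example via `np.radians`) and `eps` must also be an angle in radians, which is what `km_to_radians` computes. The unscaled coordinates go in directly, with no `StandardScaler`.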