"Unicode unequal comparison failed to convert both arguments to Unicode" error in matplotlib

Time: 2016-03-16 12:46:49

Tags: python matplotlib unicode scikit-learn

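Here is my code. I am trying to run the DBSCAN algorithm on my list of points, which is stored in the coordinate matrix below. The matrix looks like this: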

[[43.285569, 5.350558], 
[48.728766, 2.369763], 
[48.82206, 2.325197], 
[48.82206, 2.325197], 
....................
[48.822879, 2.325046], 
[48.822943, 2.325099], 
[48.830726, 2.331268]]

However, when I run the code, I get the following error:

"UnicodeWarning: Unicode unequal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
  if self._markerfacecolor != fc:"

Can anyone explain why this happens? Thanks!

from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler
import numpy as np
import numpy
import csv


def plot_cluster(cluster, sample_matrix):
    import matplotlib.pyplot as plt
    import numpy as np

    f = lambda row: [float(x) for x in row]

    sample_matrix = map(f,sample_matrix)
    print sample_matrix
    sample_matrix = StandardScaler().fit_transform(sample_matrix)

    core_samples_mask = np.zeros_like(cluster.labels_, dtype=bool)
    core_samples_mask[cluster.core_sample_indices_] = True
    labels = cluster.labels_

    # Black removed and is used for noise instead.
    unique_labels = set(labels)
    colors = plt.cm.Spectral(np.linspace(0, 1, len(unique_labels)))
    for k, col in zip(unique_labels, colors):
        if k == -1:
            # Black used for noise.
            col = 'k'

        class_member_mask = (labels == k)  # boolean mask of the points in cluster k
        # X is your data matrix
        X = np.array(sample_matrix)

        xy = X[class_member_mask & core_samples_mask]

        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=14)

        xy = X[class_member_mask & ~core_samples_mask]
        plt.plot(xy[:, 0], xy[:, 1], 'o', markerfacecolor=col,
                 markeredgecolor='k', markersize=6)

    plt.ylim([0,10])
    plt.xlim([0,10])
#        plt.title('Estimated number of clusters: %d' % n_clusters_)
    plt.savefig('cluster.png')

dbscan_object = DBSCAN(3.0,4)

input = np.genfromtxt(open("dataset_import_noaddress.csv","rb"),delimiter=",", skip_header=1)
coordinates = np.delete(input, [0,1], 1)

result = dbscan_object.fit(coordinates)
print result.labels_

print 'plotting '
plot_cluster(result, coordinates)

1 answer:

Answer 0 (score: 0)

Don't blame DBSCAN when it is your plotting logic that fails.

Learn how to read an error traceback, and include it in your question. Where the error occurs is the most important piece of information!
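One way to find out where a warning comes from is to escalate it to an exception, so that Python prints a full traceback. The following is a minimal sketch using only the standard `warnings` and `traceback` modules. The helper `run_with_warnings_as_errors` and the stand-in function `noisy` are hypothetical names for illustration, and `noisy` only imitates the plotting call from the question.

```python
import traceback
import warnings


def run_with_warnings_as_errors(func):
    """Call func() with UnicodeWarning escalated to an exception.

    Returns the formatted traceback as a string, or None if no
    UnicodeWarning was raised. (Hypothetical helper for illustration.)
    """
    with warnings.catch_warnings():
        # Inside this block, any UnicodeWarning is raised as an exception.
        warnings.simplefilter("error", UnicodeWarning)
        try:
            func()
        except UnicodeWarning:
            return traceback.format_exc()
    return None


def noisy():
    # Stand-in for the real plotting call; it only emits the same
    # category of warning that the question reports.
    warnings.warn("Unicode unequal comparison failed", UnicodeWarning)


report = run_with_warnings_as_errors(noisy)
print(report)
```

In the real script, the call to `plot_cluster(result, coordinates)` would take the place of `noisy`, and the printed traceback would show which `plt.plot` call leads to the comparison.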

So what is the marker face color that triggers the "error"?

Maybe all you need to do is replace every occurrence of 'k' with a black color object (rather than a string), and the warning will go away.
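A minimal sketch of that suggestion, assuming the goal is to make every face color the same kind of object, a plain RGBA tuple, instead of mixing colormap arrays with the string 'k'. The helper `colors_for_labels` is a hypothetical name, and the palette is built with the standard `colorsys` module rather than `plt.cm.Spectral`, so the hues differ from the original plot. Whether this silences the warning depends on the matplotlib version, so treat it as something to try rather than a confirmed fix.

```python
import colorsys

# Explicit RGBA tuple for black, used instead of the string 'k'.
BLACK_RGBA = (0.0, 0.0, 0.0, 1.0)


def colors_for_labels(labels, noise_label=-1):
    """Map each cluster label to a plain RGBA tuple (hypothetical helper)."""
    unique = sorted(set(labels))
    clusters = [k for k in unique if k != noise_label]
    mapping = {}
    for i, k in enumerate(clusters):
        # Spread the hues evenly over the color wheel, one per cluster.
        hue = float(i) / max(len(clusters), 1)
        r, g, b = colorsys.hsv_to_rgb(hue, 0.8, 0.9)
        mapping[k] = (r, g, b, 1.0)
    if noise_label in unique:
        # Noise points are drawn in black.
        mapping[noise_label] = BLACK_RGBA
    return mapping


palette = colors_for_labels([0, 0, 1, -1, 2, 1])
```

Inside the plotting loop, `markerfacecolor=palette[k]` would then replace `markerfacecolor=col`.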

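Since your data are geographic coordinates, you should also be using haversine (great-circle) distance, not Euclidean distance. And do not scale the dataset! Scaling actually distorts your data: standardizing latitude and longitude independently changes their relative scale, so distances between points no longer correspond to distances on the ground. Stop recycling code you found on the internet that you do not understand. If you do not understand what StandardScaler does, do not use it. Using it here is incorrect.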
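For reference, a great-circle distance between two latitude/longitude points can be sketched with the haversine formula as follows. This uses only the standard `math` module. The function name `haversine_km` and the Earth radius constant are assumptions for illustration, and the two sample points are taken from the matrix in the question.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumed constant


def haversine_km(p, q):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


point_a = (43.285569, 5.350558)  # first row of the question's matrix
point_b = (48.82206, 2.325197)   # third row of the question's matrix
distance = haversine_km(point_a, point_b)
print(distance)
```

With scikit-learn, a similar effect should be achievable by passing `metric='haversine'` and `algorithm='ball_tree'` to `DBSCAN` and fitting on the unscaled coordinates converted to radians, with `eps` also expressed in radians (a distance in km divided by the Earth radius). Check the documentation of the installed version before relying on this.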