Question

I am trying to color the points for each class in my dataframe with plotly. Here is my code:

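import pandas as pd
import plotly.graph_objs as go
import plotly.offline as pyo
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)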

knn = KNeighborsClassifier(n_neighbors=7)

# fitting the model
knn.fit(X_train, y_train)

# predict the response
pred = knn.predict(X_test)

dfp=pd.DataFrame(X_test)
dfp.columns = ['SepalLengthCm','SepalWidthCm', 'PetalLengthCm', 'PetalWidthCm']
dfp["PClass"]=pred

pyo.init_notebook_mode()
data=[go.Scatter(x=dfp['SepalLengthCm'], y=dfp['SepalWidthCm'], 
                    text=dfp['PClass'],
                    mode='markers',
                    marker=dict(
                    color=dfp['PClass']))]

layout = go.Layout(title='Chart', hovermode='closest')
fig = go.Figure(data=data, layout=layout)

# plot the figure (not the bare data) so the layout is applied
pyo.iplot(fig)

This is what my df looks like:

SepalLengthCm  SepalWidthCm  PetalLengthCm  PetalWidthCm  PClass
          6.1           2.8            4.7           1.2  Iris-versicolor
          5.7           3.8            1.7           0.3  Iris-setosa
          7.7           2.6            6.9           2.3  Iris-virginica

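So the problem is that it does not assign colors based on the dfp['PClass'] column, and every point on the plot has the same color: black. Yet when I hover over a point, it is correctly labeled with its class. Any ideas why this is not working properly?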

Answer 1

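In your code sample, you are trying to assign colors to categorical groups using color=dfp['PClass']. This is the logic applied by, for example, ggplot with ggplot(mtcars, aes(x=wt, y=mpg, shape=cyl, color=cyl, size=cyl)), where cyl is a categorical variable. You will see an example a bit further down the page here.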

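But for plotly, this won't work. color in go.Scatter does not map category labels to colors; it only accepts numerical values (or valid color specifications), as in this example with color = np.random.randn(500).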
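As a rough illustration of the numerical-values point, the sketch below encodes the class labels as integers before passing them to marker.color. It uses only the three rows shown in the question. The helper names encode_labels and show are hypothetical (they are not part of plotly), and the 'Viridis' colorscale and the plotly.graph_objects import (plotly 4 or later) are assumptions.

```python
# Sketch: encode class labels as integers so marker.color receives numbers.
# The sample values are the three rows shown in the question.
sepal_length = [6.1, 5.7, 7.7]
sepal_width = [2.8, 3.8, 2.6]
pclass = ["Iris-versicolor", "Iris-setosa", "Iris-virginica"]


def encode_labels(labels):
    """Hypothetical helper: map each distinct label to an integer code.

    Codes follow the sorted order of the distinct labels.
    """
    lookup = {label: code for code, label in enumerate(sorted(set(labels)))}
    return [lookup[label] for label in labels], lookup


codes, lookup = encode_labels(pclass)

# A scatter trace written as a plain dict; go.Figure accepts dicts like this.
trace = dict(
    type="scatter",
    x=sepal_length,
    y=sepal_width,
    text=pclass,          # hover text still shows the class name
    mode="markers",
    marker=dict(color=codes, colorscale="Viridis"),  # assumed colorscale
)


def show(trace):
    """Hypothetical helper: render the trace (assumes plotly >= 4)."""
    import plotly.graph_objects as go

    go.Figure(data=[trace]).show()
```

Calling show(trace) in a notebook should draw the points with one colorscale value per class. The colors then come from a continuous scale and there is no per-class legend entry, which is one reason to prefer separate traces.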

To get the result you want, you will have to build your plot from multiple traces, as in this example.
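A minimal sketch of the multiple-traces approach, assuming the three rows from the question as sample data: the points are grouped by class and each group becomes its own scatter trace, so plotly gives each trace the next color from its default color sequence and a legend entry. The helper names traces_by_class and show are hypothetical, and the plotly.graph_objects import assumes plotly 4 or later.

```python
from collections import defaultdict

# (SepalLengthCm, SepalWidthCm, PClass) for the three rows in the question.
rows = [
    (6.1, 2.8, "Iris-versicolor"),
    (5.7, 3.8, "Iris-setosa"),
    (7.7, 2.6, "Iris-virginica"),
]


def traces_by_class(rows):
    """Hypothetical helper: build one scatter trace (as a dict) per class."""
    grouped = defaultdict(lambda: {"x": [], "y": []})
    for x, y, label in rows:
        grouped[label]["x"].append(x)
        grouped[label]["y"].append(y)
    # Sort by label so the trace order (and legend order) is deterministic.
    return [
        dict(type="scatter", x=pts["x"], y=pts["y"], mode="markers", name=label)
        for label, pts in sorted(grouped.items())
    ]


traces = traces_by_class(rows)


def show(traces):
    """Hypothetical helper: render the traces (assumes plotly >= 4)."""
    import plotly.graph_objects as go

    fig = go.Figure(data=traces)
    fig.update_layout(title="Chart", hovermode="closest")
    fig.show()
```

With the dataframe from the question, the same grouping could presumably be done with dfp.groupby('PClass'), creating one go.Scatter per group with name set to the class.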

Plotly: How to assign colors to a scatter plot by group?
