Customizing labels with Plotly

Time: 2019-01-29 12:44:21

Tags: python pandas cluster-analysis plotly k-means

I am trying to customize the data labels that appear on hover: [image]. This is the code that produces the output above:

import numpy as np
import pandas as pd
import plotly.graph_objs as go
from plotly.offline import iplot
from sklearn.cluster import KMeans

# Create random data
labels = ['A', 'B', 'C']
N = 20
df = pd.DataFrame(index = range(N))
standardized_cols = []

for col in labels:
    df[col] = np.random.randn(N)
    standardized_colname  =  col + "_standardized"
    standardized_cols.append(standardized_colname)
    df[standardized_colname] = (df[col]-df[col].mean())/df[col].std()

# Cluster
c = KMeans(n_clusters=3, random_state=1).fit(df[standardized_cols]).labels_

# Plot
trace = go.Scatter3d(
    x=df.A_standardized,
    y=df.B_standardized,
    z=df.C_standardized,

    mode='markers',
    marker=dict(
        size=5,
        color=c,              
        colorscale='Viridis',   
    ),
    name= 'test',
    text= c
)

data = [trace]
layout = go.Layout(title='TEST')

fig = go.Figure(data=data, layout=layout)
iplot(fig)

My data: [image]

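The plot shows the clustering of the standardized columns. However, when hovering over a data point, I would like the label to show the unstandardized values, for example: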

A: 0,999
B: 0,565
C: 0,765
Cluster: 2

I have experimented but can't figure out how to achieve this. Is it possible?

1 answer:

Answer 0: (score: 2)

You can use a list comprehension and add any columns you want to text; see the example below (note that I am plotting offline):

# data
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

np.random.seed(1)
labels = ['A', 'B', 'C']
N = 20
df = pd.DataFrame(index = range(N))
standardized_cols = []

for col in labels:
    df[col] = np.random.randn(N)
    standardized_colname  =  col + "_standardized"
    standardized_cols.append(standardized_colname)
    df[standardized_colname] = (df[col]-df[col].mean())/df[col].std()

c = KMeans(n_clusters=3, random_state=1).fit(df[standardized_cols]).labels_

Plot:

import plotly as py
import plotly.graph_objs as go


trace = go.Scatter3d(
    x=df.A_standardized,
    y=df.B_standardized,
    z=df.C_standardized,

    mode='markers',
    marker=dict(
        size=5,
        color=c,              
        colorscale='Viridis',   
    ),
    name= 'test',

    # list comprehension to add text on hover
    text= [f"A: {a}<br>B: {b}<br>C: {c}" for a,b,c in list(zip(df['A'], df['B'], df['C']))],
    # if you do not want to display x,y,z
    # hoverinfo='text'


)


layout = dict(title = 'TEST',)

data = [trace]
fig = dict(data=data, layout=layout)

py.offline.plot(fig, filename = 'stackTest.html')


You can modify the list comprehension to display whatever you want.
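As one way to adapt the comprehension to the label format asked for in the question, the sketch below also includes the cluster label and rounds the values to three decimals. The helper name `build_hover_text` and the sample numbers are made up for illustration; with the DataFrame above you would pass `df['A']`, `df['B']`, `df['C']` and the KMeans labels `c` instead.

```python
def build_hover_text(a_vals, b_vals, c_vals, clusters):
    """Return one hover string per point; <br> is the line break Plotly uses in hover text."""
    return [
        f"A: {a:.3f}<br>B: {b:.3f}<br>C: {c_val:.3f}<br>Cluster: {k}"
        for a, b, c_val, k in zip(a_vals, b_vals, c_vals, clusters)
    ]


# Made-up sample values standing in for df['A'], df['B'], df['C'] and the labels c
hover_text = build_hover_text([0.999, -1.2], [0.565, 0.3], [0.765, 2.0], [2, 0])
print(hover_text[0])
```

The resulting list can be passed as `text=` to `go.Scatter3d` in place of the comprehension in the answer.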

If you do not want to display x, y, z, add hoverinfo='text'.
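For reference, here is a minimal sketch of where `hoverinfo='text'` goes, written as a plain dict trace so it runs without Plotly installed. The coordinates and hover strings are placeholder values, not the data from above; the same keyword can be passed to `go.Scatter3d` directly.

```python
# Placeholder coordinates and hover strings, for illustration only
trace = dict(
    type='scatter3d',
    x=[0.1, -0.4],
    y=[1.2, 0.3],
    z=[-0.7, 0.9],
    mode='markers',
    text=["A: 0.999<br>Cluster: 2", "A: -1.200<br>Cluster: 0"],
    hoverinfo='text',  # show only the custom text, hide x, y, z
)

# Same dict-style figure as in the answer; it can be handed to py.offline.plot
fig = dict(data=[trace], layout=dict(title='TEST'))
```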