Plotting a pandas DataFrame with string labels

Asked: 2016-05-31 14:48:52

Tags: python pandas matplotlib

I have a pandas DataFrame with several fields. The important ones are:

In[191]: tasks[['start','end','appId','index']]
Out[189]: 
             start               end                           appId  index
2576 1464262540102.000 1464262541204.000  application_1464258584784_0012      1
2577 1464262540098.000 1464262541208.000  application_1464258584784_0012      0
2579 1464262540104.000 1464262541194.000  application_1464258584784_0012      3
2583 1464262540107.000 1464262541287.000  application_1464258584784_0012      6
2599 1464262540125.000 1464262541214.000  application_1464258584784_0012     26
2600 1464262541191.000 1464262541655.000  application_1464258584784_0012     28
.
.
.
2701 1464262562172.000 1464262591147.000  application_1464258584784_0013     14
2718 1464262578901.000 1464262588156.000  application_1464258584784_0013     28
2727 1464262591145.000 1464262602085.000  application_1464258584784_0013     40

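For each row, I want to draw a line from (x1 = start, y1 = index) to (x2 = end, y2 = index). Each line should get a different color depending on the value of appId, which is a string. All of this is done in a subplot of a time-series plot. I am posting the full code here, but the important part is the tasks.iterrows() loop; you can ignore the rest.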

def plot_stage_in_host(dfm,dfg,appId,stageId,parameters,host):
    [s,e] = time_interval_for_app(dfm, appId,stageId, host)
    time_series = create_time_series_host(dfg, host, parameters, s,e)
    fig,p1 = plt.subplots()
    p2 = p1.twinx()
    for para in parameters:          
        p1.plot(time_series.loc[time_series['parameter']==para].time,time_series.loc[time_series['parameter']==para].value,label=para)
    p1.legend()
    p1.set_xlabel("Time")
    p1.set_ylabel(ylabel='%')
    p1.set(ylim=(-1,1))
    p2.set_ylabel("TASK INDEX")
    tasks = dfm.loc[(dfm["hostname"]==host) & (dfm["start"]>s) & (dfm["end"]<e) & (dfm["end"]!=0)] #& (dfm["appId"]==appId) & (dfm["stageId"]==stageId)]
    apps = tasks.appId.unique()
    norm = colors.Normalize(0,len(apps))
    scalar_map = cm.ScalarMappable(norm=norm, cmap='hsv')
    for _,row in tasks.iterrows():
        color = scalar_map.to_rgba(np.where(apps == row['appId'])[0][0])
        p2.plot([row['start'],row['end']],[row['index'],row['index']],lw=4 ,c=color)
    p2.legend(apps,loc='lower right')
    plt.show()

This is the result I get:

[Figure: the resulting plot]

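Clearly the labels are not taken into account: the legend shows the same color for every entry. How can I label the lines correctly and show the legend?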

1 Answer:

Answer 0: (score: 1)

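The problem is that you assign a label with the label= argument every time you plot inside the for loop. Try removing it and passing p2.legend() a list of strings for the labels you want to show.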

p2.legend(['label1', 'label2'])
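
One thing to keep in mind, and a likely reason the legend in the question shows a single color: when legend() receives only a list of labels, matplotlib matches them to the lines already on the axes in the order they were drawn. A minimal sketch with made-up data (the app names "A" and "B" and the colors are illustrative assumptions, not taken from the question):

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Three segments for a hypothetical app "A" (red), then one for app "B" (blue).
for y in (0, 1, 2):
    ax.plot([0, 1], [y, y], c="red")
ax.plot([0, 1], [3, 3], c="blue")

# The two labels are matched to the first two lines drawn, which are both red.
legend = ax.legend(["A", "B"])
legend_labels = [text.get_text() for text in legend.get_texts()]
legend_colors = [line.get_color() for line in legend.get_lines()]
print(legend_labels, legend_colors)
```

So a plain list of labels only works as expected when each legend entry corresponds to exactly one plotted line, in the same order.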

If you want to give each line a different color, try the following:

import matplotlib.pyplot as plt
import numpy as np
xdata = [1, 2, 3, 4, 5]
ydata = [[np.random.randint(0, 6) for i in range(5)],
        [np.random.randint(0, 6) for i in range(5)],
        [np.random.randint(0, 6) for i in range(5)]]
colors = ['r', 'g', 'b']  # can be hex colors as well
legend_names = ['a', 'b', 'c']
for c, y in zip(colors, ydata):
    plt.plot(xdata, y, c=c)
plt.legend(legend_names)
plt.show()

It gives the following result:

[Figure: the resulting plot]
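
For a DataFrame like the one in the question, where many rows share the same appId, one possible way to get exactly one legend entry per application is to pass explicit proxy handles to legend(). The following is only a sketch under assumptions: the small tasks frame is made-up sample data with the same column names, and the helper name plot_tasks_by_app and the tab10 colormap are arbitrary choices, not part of the original code.

```python
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.lines import Line2D


def plot_tasks_by_app(ax, tasks):
    """Draw one horizontal segment per task, colored by appId (hypothetical helper)."""
    apps = tasks["appId"].unique()
    cmap = plt.get_cmap("tab10")
    # Map each appId string to its own color (tab10 has 10 distinct colors).
    color_of = {app: cmap(i % 10) for i, app in enumerate(apps)}

    for _, row in tasks.iterrows():
        ax.plot([row["start"], row["end"]], [row["index"], row["index"]],
                lw=4, c=color_of[row["appId"]])

    # One proxy line per appId, so the legend has one entry per application.
    handles = [Line2D([0], [0], color=color_of[app], lw=4) for app in apps]
    return ax.legend(handles, list(apps), loc="lower right")


# Made-up sample data shaped like the columns in the question.
tasks = pd.DataFrame({
    "start": [0.0, 0.5, 2.0, 2.5],
    "end": [1.0, 1.5, 3.0, 4.0],
    "appId": ["app_0012", "app_0012", "app_0013", "app_0013"],
    "index": [0, 1, 14, 28],
})

fig, ax = plt.subplots()
legend = plot_tasks_by_app(ax, tasks)
```

Call plt.show() afterwards to display the figure. The proxy lines are never added to the axes; they only tell the legend which color goes with which label.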

Hope this helps!