Plotting a histogram/curve on a time axis

Date: 2018-03-09 05:26:45

Tags: python pandas matplotlib histogram

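I have a feeling there is a really simple way to do this. I'm trying to plot a timeline of tasks running in an environment, with two plots on the same chart:

  1. Task run times as broken_barh
  2. An overall load curve (or histogram) based on the set of tasks running at each point in time, say with lower opacity or as a line.

In the example there are 6 tasks running (A-F), with different lengths and different start times. They are plotted exactly as I need for (1), in a Gantt-like chart with time on the X axis.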

    import numpy as np
    import pandas as pd
    # %matplotlib inline  (Jupyter/IPython magic; uncomment when running in a notebook)
    import matplotlib as mpl
    from matplotlib import pyplot as plt
    
    cols=['ID','From','To']
    
    df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
                       ['C', 736758.994, 736758.996], ['D', 736758.996, 736758.997],
                       ['E', 736758.996, 736758.997], ['F', 736758.995, 736758.996]],
                       columns=cols)
    
    df['Diff'] = df['To']-df['From']
    
    fig,ax=plt.subplots()
    for i, row in df.iterrows():
        # broken_barh expects (start, width) pairs, so pass From and the duration
        values = [(row['From'], row['Diff'])]
        ax.broken_barh(values, (i-0.4, 0.8), color=np.random.rand(3))
    
    ax.xaxis_date()
    

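To this I would like to add (2), a curve showing the number of active tasks at each moment (1 for 23:51-23:52, 2 for 23:52-23:53, etc., with a peak around 23:54).

The problem is that I can't just plot a histogram of the start times, because the tasks overlap in time. Do you know a good way to create this kind of histogram?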

1 answer:

Answer 0 (score: 1)

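I'm pretty sure there are cleaner ways to solve this. Floating-point math in particular is very annoying when trying to create the histogram. The first part, though, is a simple one-liner: just use hlines as suggested and increase linewidth to create the bars.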

    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    import matplotlib.cm as cm

    df = pd.DataFrame([['A', 736758.993, 736758.995], ['B', 736758.995, 736758.998],
                       ['C', 736758.994, 736758.996], ['D', 736758.994, 736758.997],
                       ['E', 736758.997, 736758.998], ['F', 736758.995, 736758.999]],
                       columns = ['ID','From','To'])

    # create two subplots with shared x axis
    fig, (ax1, ax2) = plt.subplots(2, 1, sharex = True)
    # plot 1 - Gantt chart for individual IDs
    ax1.hlines(df.ID, df.From, df.To, colors = cm.inferno(df.index/len(df)), linewidth = 20)

    # plot 2 - make table of time series for each ID - multiply by 1000 and round to avoid float problems
    # (rounding first keeps astype(int) below from truncating values such as 736758992.9999999)
    hist_count = df.apply(lambda row: pd.Series(np.arange(np.round(1000 * row["From"]), np.round(1000 * row["To"]))), axis = 1)
    hist_count = pd.melt(hist_count)["value"].dropna().astype(int)
    # find borders for bins
    min_time = hist_count.min(axis = 0)
    max_time = hist_count.max(axis = 0)
    # plot 2 histogram - add 0.0001 to prevent arbitrary binning due to float problems
    ax2.hist(hist_count / 1000 + 0.0001, range = (min_time / 1000, (max_time + 1) / 1000), bins = max_time - min_time + 1)
    ax2.xaxis_date()

    plt.show()

Output for the sample dataset: [figure]
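
As an alternative to expanding every task into millisecond ticks, the active-task count can also be computed by treating each start as a +1 event and each end as a -1 event, then taking a cumulative sum over the sorted event times. The following is a minimal sketch of that idea, not the method used in the answer above; the helper name `active_task_counts` is made up for illustration, and the data is the sample from the answer.

```python
import numpy as np

def active_task_counts(starts, ends):
    """Return (times, counts); counts[i] tasks are active from times[i] until times[i + 1].

    Hypothetical helper for illustration, not part of pandas or matplotlib.
    """
    starts = np.asarray(starts, dtype=float)
    ends = np.asarray(ends, dtype=float)
    # Every start is a +1 event, every end a -1 event.
    event_times = np.concatenate([starts, ends])
    deltas = np.concatenate([np.ones(len(starts), dtype=int),
                             -np.ones(len(ends), dtype=int)])
    # Merge events that share a timestamp, then accumulate the net change.
    times, inverse = np.unique(event_times, return_inverse=True)
    net = np.zeros(len(times), dtype=int)
    np.add.at(net, inverse, deltas)
    return times, np.cumsum(net)

# Sample data from the answer (tasks A-F)
starts = [736758.993, 736758.995, 736758.994, 736758.994, 736758.997, 736758.995]
ends = [736758.995, 736758.998, 736758.996, 736758.997, 736758.998, 736758.999]

times, counts = active_task_counts(starts, ends)
print(counts)  # [1 3 4 3 3 1 0]
```

The resulting arrays could then be drawn as a step curve, for example with `ax2.step(times, counts, where='post')`, or as a translucent area with `ax2.fill_between(times, counts, step='post', alpha=0.3)`. Because only the original start and end values are compared, no scaling by 1000 or bin-edge adjustment is needed.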