Question

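I'm struggling to find the best settings for plotting time series that can differ drastically in length. For context, the input data is generated by FIO, so it comes in microseconds on the "time" axis and in all kinds of other measures on the "value" axis (e.g. IOPS, bandwidth, completion latency, submission latency, etc.).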

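The problem: the test that generates the data may run for an hour, ten minutes, or ten hours. On top of that, IOPS data points are much sparser than submission latency data points (you can't really measure a per-second quantity more often than once per second, whereas submission latencies are typically within 1-2 microseconds).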

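When I don't change xticks, the scale on the X (time) axis is usually worthless (i.e. it is either too sparse or too dense). For each individual case I can find a "good enough" value for xticks, but I can't think of a way to do this automatically.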

Here is some of what I've tried. Sorry, it's a bit out of context, but it should at least illustrate the direction of my attempts.

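import logging
import os
from io import BytesIO

import pandas as pd


def plot_one(self, args):
    file_name, file, plot_histogram = args
    logging.info('plotting: {}'.format(file_name))
    # Raw bytes are wrapped so that pd.read_csv can read them like a file.
    if isinstance(file, bytes):
        file = BytesIO(file)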
    df = pd.read_csv(file, header=None, usecols=(0, 1))
    df.columns = ['offset', 'nsec']
    df = df.groupby(['offset']).mean()
    df['offset'] = df.index
    if plot_histogram:
        ax = df['nsec'].plot.hist(title=os.path.basename(file_name))
        figure = ax.get_figure()
        figure.savefig(file_name + '_hist.png', bbox_inches='tight')
    max_offset = df['offset'].iloc[-1]
    ratio = max(1, min(max_offset // 4000, 40))
    height = 5
    width = int(height * ratio)
    xticks, step = None, None
    if max_offset > 10000:
        xticks = pd.Series(data=range(0, max_offset // 1000)) * 1000
        step = 1000
    else:
        xticks = pd.Series(data=range(0, max_offset))
        step = 1
    ax = df.plot(
        title=os.path.basename(file_name),
        x='offset',
        y='nsec',
        grid=True,
        figsize=(width, height),
        xticks=xticks,
        rot=90,
    )
    ax.set_xticklabels(xticks // step)
    ax.set_xlim(0, df['offset'].iloc[-1])
    figure = ax.get_figure()
    figure.savefig(file_name + '.png', bbox_inches='tight')
    logging.info('saving figure: {}.png'.format(file_name))
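
One possible direction for choosing ticks automatically, as a minimal sketch rather than a tested solution: derive the tick spacing from the total duration by snapping it to a "round" time interval, so the axis always ends up with roughly the same number of ticks. The names `choose_time_ticks`, `format_tick`, `NICE_STEPS_S` and the parameter `target_ticks` are hypothetical, and the sketch assumes the time values have already been converted to seconds.

```python
import bisect

# Candidate tick spacings in seconds (assumed "human friendly" values).
NICE_STEPS_S = [
    1, 2, 5, 10, 15, 30,              # seconds
    60, 120, 300, 600, 900, 1800,     # minutes
    3600, 7200, 10800, 21600, 43200,  # hours
    86400,                            # one day
]


def choose_time_ticks(duration_s, target_ticks=10):
    """Return (step_s, tick_positions_s) for an axis spanning 0..duration_s."""
    if duration_s <= 0:
        return 1, [0]
    raw_step = duration_s / target_ticks
    # Smallest candidate that is >= the raw step; fall back to the largest one.
    i = bisect.bisect_left(NICE_STEPS_S, raw_step)
    step = NICE_STEPS_S[min(i, len(NICE_STEPS_S) - 1)]
    ticks = list(range(0, int(duration_s) + 1, step))
    return step, ticks


def format_tick(seconds, step_s):
    """Label a tick in the unit that matches the chosen spacing."""
    if step_s >= 3600:
        return f"{seconds / 3600:g}h"
    if step_s >= 60:
        return f"{seconds / 60:g}m"
    return f"{seconds:g}s"


if __name__ == "__main__":
    for duration in (600, 3600, 36000):  # ten minutes, one hour, ten hours
        step, ticks = choose_time_ticks(duration)
        print(duration, step, [format_tick(t, step) for t in ticks])
```

The returned positions could then be passed to `ax.set_xticks(ticks)` and the labels to `ax.set_xticklabels(...)`, in place of the hard-coded thresholds above. Because the step is never smaller than `duration_s / target_ticks` (for runs up to `target_ticks` days), the number of ticks stays at or below `target_ticks + 1` regardless of how long the test ran.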


0 answers: