Plotting a time series with non-continuous intervals

Date: 2018-12-17 15:31:17

Tags: python numpy matplotlib

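I am trying to plot some data. The data is a list of (date, value) tuples, sorted by date. The list is built from a list of dictionaries that is not sorted by date, and the interval it covers is not continuous: not every date between the first and the last one has an entry. I chose a list of tuples to make sure each value stays paired with the correct date. From this post I saw that numpy can be used to create a time series (over a continuous interval). I then tried to create an array from the dates in my list of tuples: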

import datetime
import matplotlib.pyplot as plt
import numpy as np

def plot_by_date(delay_list_tech):
    """
    plot data from delay_list_tech
    input: delay_list_tech - list of dictionaries
    """   

    # create list of tuples
    answer_row = []
    answer_list = []
    for row in delay_list_tech:
        y_val = row['delay_days']

        dummy_date = row['effective_date']
        x_val = dummy_date.split('-')
        x_val_year = int(x_val[0])
        x_val_mont = int(x_val[1])
        x_val_day = int(x_val[2])
        x_date = datetime.date(x_val_year, x_val_mont, x_val_day)

        answer_row.append(x_date)
        answer_row.append(y_val)
        dummy_row = answer_row.copy()
        answer_list.append(tuple(dummy_row))
        answer_row.clear()

    # sorting
    answer_list.sort(key=lambda pair: pair[0], reverse=False)

    # error on generating array for x axis
    x = np.array(answer_list[idx][0] for idx in range(len(answer_list)))

Is it possible to create a time series from a non-continuous data source?

Thanks a lot,

Tiago

1 answer:

Answer 0: (score: 0)

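Since no sample data was posted, I am not sure exactly what trouble you are running into. However, your code does a lot of unnecessary work. Here is a cleaned-up, working version: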

import datetime
import matplotlib.pyplot as plt
import numpy as np

def plot_by_date(delay_list_tech):
    """plot data from delay_list_tech
    input: delay_list_tech - list of dictionaries
    """   
    # create list of tuples
    answer_list = []
    for row in delay_list_tech:
        x_val = row['effective_date'].split('-')
        x_val_year = int(x_val[0])
        x_val_mont = int(x_val[1])
        x_val_day = int(x_val[2])
        x_date = datetime.date(x_val_year, x_val_mont, x_val_day)
        answer_list.append((x_date, row['delay_days']))

    # sorting
    answer_list.sort()

    # build arrays for the x and y axes
    x = np.array([row[0] for row in answer_list])
    y = np.array([row[1] for row in answer_list])

    plt.plot(x, y)
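
As for the error itself, my best guess is the `x = np.array(... for idx in range(...))` line in the question: it passes a generator expression to `np.array`, and NumPy does not iterate over a generator, it wraps the generator object itself. The sketch below shows the difference on a small made-up `answer_list` (the sample values are an assumption, not the OP's data):

```python
import datetime
import numpy as np

# Hypothetical sample in the same shape as answer_list: sorted (date, value) tuples
answer_list = [
    (datetime.date(1987, 1, 12), 99),
    (datetime.date(1991, 1, 15), 47),
    (datetime.date(1995, 4, 14), 10),
]

# Generator expression: NumPy stores the generator object itself in a 0-d array
bad = np.array(pair[0] for pair in answer_list)

# List comprehension: NumPy receives a real sequence and builds a 1-D array
good = np.array([pair[0] for pair in answer_list])

print(bad.shape, bad.dtype)    # 0-d object array, useless as an x axis
print(good.shape, good.dtype)  # one entry per date
```

Wrapping the expression in square brackets, as in the function above, is enough to get a proper array of dates.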

Testing it:

# some test data
d = [
    ('1991-01-15', 47),
    ('1995-04-14', 10),
    ('1987-01-12', 99),
    ('2001-03-19', 41),
    ('1999-11-03', 9),
]

# convert to list of dictionaries, as per OP's question
d = [dict(zip(('effective_date', 'delay_days'), row)) for row in d]

# plot
plot_by_date(d)

Output:

[Figure: the resulting plot; image not preserved]
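
On the actual question: matplotlib does not need a continuous series, it places each point at its date and simply draws a line across any missing days. If you would rather have the gaps show up as breaks in the line, one option is to put the data on a continuous daily grid and leave the missing days as NaN. This is only a sketch under assumptions: the sample dates and values are made up, and `full_dates`, `full_values` and `idx` are names I chose for illustration.

```python
import numpy as np

# Hypothetical sample: observed days (ISO strings) and their values; Dec 3-4 are missing
dates = np.array(["2018-12-01", "2018-12-02", "2018-12-05", "2018-12-06"],
                 dtype="datetime64[D]")
values = np.array([47.0, 10.0, 99.0, 41.0])

# Continuous daily grid from the first to the last observed date
# (np.arange excludes the stop value, hence the extra day)
full_dates = np.arange(dates[0], dates[-1] + np.timedelta64(1, "D"))

# Start with NaN everywhere, then fill in the days that were actually observed
full_values = np.full(full_dates.shape, np.nan)
idx = (dates - dates[0]).astype(int)  # day offset of each observation
full_values[idx] = values

print(full_dates)
print(full_values)
```

Passing these to `plt.plot(full_dates, full_values, marker="o")` should leave the line broken over the missing days. The marker matters when the data is sparse, because a point with NaN on both sides has no line segment to draw.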