Creating a months x years dataframe from a time series in pandas

Time: 2020-06-29 14:48:25

Tags: python pandas dataframe time-series

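I have time series data containing a number of days for each month over several years, and I am trying to create a new dataframe with months as rows and years as columns.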

I have this:

    DateTime    Days    Month   Year
        
    2004-11-30  3   November    2004
    2004-12-31  16  December    2004
    2005-01-31  12  January     2005
    2005-02-28  11  February    2005
    2005-03-31  11  March       2005
    ... ... ... ...
    2019-06-30  0   June        2019
    2019-07-31  2   July        2019
    2019-08-31  5   August      2019
    2019-09-30  5   September   2019
    2019-10-31  3   October     2019

I am trying to get this:

Month     2004  2005 ... 2019

January   nan   12       7
February  nan   11       9
...
November  17    17       nan
December  14    15       nan

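I created a new dataframe whose first column holds the months, and tried iterating over the first dataframe to add new columns (the years) and fill in the cells. However, the condition that checks whether the month in the first dataframe (days) matches the month in the new dataframe (output) is never True, so the new dataframe is never updated. I guess this is because the month in days is never the same as the month in output within the same iteration.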

for index, row in days.iterrows():
    print(days.loc[index, 'Days'])    # this prints out as expected
    for month in output.items():
        print(index.month_name())     # this prints out as expected
        if index.month_name() == month:
            output.at[month, index.year] = days.loc[index, 'Days']    # I wanted to use this to fill up the cells, is this right?
            print(days.loc[index, 'Days'])      # this never gets printed out

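Could you tell me how to fix this? Or is there a better way to get this result than iterating? This is my first time using libraries in Python, so any help would be appreciated.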
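
One likely reason the comparison above never succeeds: `DataFrame.items()` iterates over columns and yields `(column_name, Series)` tuples, so `month` is a tuple rather than a month name string. The sketch below illustrates this with a hypothetical stand-in for `output` (the real frame is not shown in the question, so its layout here is an assumption).

```python
import pandas as pd

# Hypothetical stand-in for the `output` frame: a single column of month names.
output = pd.DataFrame({"Month": ["January", "February", "March"]})

# DataFrame.items() iterates over columns, yielding (column_name, Series) pairs.
pairs = list(output.items())
first_name, first_column = pairs[0]

# A (name, Series) tuple never compares equal to a plain string such as "January".
tuple_matches = any(pair == "January" for pair in pairs)

# Comparing against the values stored in the column does find the month.
value_matches = any(m == "January" for m in output["Month"])

print(first_name, tuple_matches, value_matches)
```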

1 Answer:

Answer 0: (score: 0)

If your input dataframe has exactly one value for each month and year combination, use pivot:

df.pivot(index='Month', columns='Year', values='Days')

Output:

Year      2004 2005 2019
Month                   
August     NaN  NaN    5
December    16  NaN  NaN
February   NaN   11  NaN
January    NaN   12  NaN
July       NaN  NaN    2
June       NaN  NaN    0
March      NaN   11  NaN
November     3  NaN  NaN
October    NaN  NaN    3
September  NaN  NaN    5
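
The pivoted rows above come out in alphabetical order, while the question asks for January through December. A minimal sketch of one way to get calendar order, assuming the dates are the index of a frame named `days` as in the question's loop (the sample rows are copied from the question; `wide` and `month_order` are illustrative names), is to reindex the result with the standard library's month names:

```python
import calendar
import pandas as pd

# Sample rows copied from the question's data, with the dates as the index.
days = pd.DataFrame(
    {"Days": [3, 16, 12, 11, 11]},
    index=pd.to_datetime(
        ["2004-11-30", "2004-12-31", "2005-01-31", "2005-02-28", "2005-03-31"]
    ),
)

# Derive the Month and Year columns from the DatetimeIndex.
df = days.assign(Month=days.index.month_name(), Year=days.index.year)

# Keyword arguments are used because recent pandas versions require them for pivot.
wide = df.pivot(index="Month", columns="Year", values="Days")

# Reorder rows from alphabetical to calendar order; months with no data become NaN rows.
month_order = list(calendar.month_name)[1:]
wide = wide.reindex(month_order)

print(wide)
```

If a month and year pair could appear more than once, `pivot` would raise an error; `pivot_table` with an aggregation function would be one alternative in that case.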