Converting yearly and quarterly data to monthly in Python

Time: 2019-12-11 04:55:16

Tags: python python-3.x pandas dataframe datetime

I am trying to convert the following dataframe to monthly data:

       date         v1         v2
0   1996-12  1,789.20   1,001.19 
1   1997-12  2,077.09   1,218.06 
2   1998-12  2,377.18   1,458.75 
3   1999-03    406.50     237.90 
4   1999-06    954.70     541.20 
5   1999-09  1,531.00         NaN
6   1999-12  2,678.82   1,693.13 
7   2000-03    449.10     263.40 
8   2000-06  1,044.10     599.90 
9   2000-09  1,693.40     970.00 
10  2000-12  3,161.66   2,049.12 

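How can I do this with pandas? I am considering bfill, or filling the missing months' values with 0. If you have a better solution, please share it.

2 answers:

Answer 0: (score: 2)

You can do the following. You can also change fillna(0) to fillna(method='ffill') to carry the last observed value forward instead of filling with zeros.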

# Parse the 'YYYY-MM' strings, then resample to month-end ('M'; spelled 'ME' in pandas 2.2+).
df['date'] = pd.to_datetime(df['date'])
df.set_index('date').resample('M').last().fillna(0).reset_index()

Output

          date        v1        v2
0   1996-12-31  1,789.20  1,001.19
1   1997-01-31         0         0
2   1997-02-28         0         0
3   1997-03-31         0         0
4   1997-04-30         0         0
5   1997-05-31         0         0
6   1997-06-30         0         0
7   1997-07-31         0         0
8   1997-08-31         0         0
9   1997-09-30         0         0
10  1997-10-31         0         0
11  1997-11-30         0         0
12  1997-12-31  2,077.09  1,218.06
13  1998-01-31         0         0
14  1998-02-28         0         0
15  1998-03-31         0         0
16  1998-04-30         0         0
17  1998-05-31         0         0
18  1998-06-30         0         0
19  1998-07-31         0         0
20  1998-08-31         0         0
21  1998-09-30         0         0
22  1998-10-31         0         0
23  1998-11-30         0         0
24  1998-12-31  2,377.18  1,458.75
25  1999-01-31         0         0
26  1999-02-28         0         0
27  1999-03-31    406.50    237.90
28  1999-04-30         0         0
29  1999-05-31         0         0
30  1999-06-30    954.70    541.20
31  1999-07-31         0         0
32  1999-08-31         0         0
33  1999-09-30  1,531.00         0
34  1999-10-31         0         0
35  1999-11-30         0         0
36  1999-12-31  2,678.82  1,693.13
37  2000-01-31         0         0
38  2000-02-29         0         0
39  2000-03-31    449.10    263.40
40  2000-04-30         0         0
41  2000-05-31         0         0
42  2000-06-30  1,044.10    599.90
43  2000-07-31         0         0
44  2000-08-31         0         0
45  2000-09-30  1,693.40    970.00
46  2000-10-31         0         0
47  2000-11-30         0         0
48  2000-12-31  3,161.66  2,049.12
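
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rests on some assumptions that are not in the original answer: the sample frame is a hypothetical four-row subset of the question's data, the value columns are assumed to be strings with thousands separators (so they are converted to floats first), and the month-end frequency is passed as a `pd.offsets.MonthEnd()` object rather than an alias string so the spelling does not depend on the pandas version.

```python
import pandas as pd

# Hypothetical sample mirroring a few rows of the question's data.
df = pd.DataFrame({
    "date": ["1999-03", "1999-06", "1999-09", "1999-12"],
    "v1": ["406.50", "954.70", "1,531.00", "2,678.82"],
    "v2": ["237.90", "541.20", None, "1,693.13"],
})

# Parse the year-month strings into timestamps (first day of each month).
df["date"] = pd.to_datetime(df["date"], format="%Y-%m")

# Assumption: values are strings like "1,531.00"; strip the commas so
# they become floats (missing entries stay NaN).
for col in ["v1", "v2"]:
    cleaned = df[col].str.replace(",", "", regex=False)
    df[col] = pd.to_numeric(cleaned, errors="coerce")

# Upsample to one row per month end; months without data become NaN.
monthly = df.set_index("date").resample(pd.offsets.MonthEnd()).last()

# Two ways to fill the gaps, matching the options mentioned in the answer.
zero_filled = monthly.fillna(0).reset_index()
forward_filled = monthly.ffill().reset_index()

print(zero_filled)
print(forward_filled)
```

Note that both fills also overwrite the value that was already missing in the source (v2 for 1999-09), so it may be worth deciding separately how a genuinely missing observation should be treated.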

Answer 1: (score: 2)

IIUC, try this:

# Convert the dates to monthly periods, then back-fill each gap from the next observation.
s = pd.to_datetime(df.date).dt.to_period('M')
df.drop(columns='date').set_index(s).resample('M').bfill().reset_index()

Out[266]:
       date        v1        v2
0   1996-12  1,789.20  1,001.19
1   1997-01  2,077.09  1,218.06
2   1997-02  2,077.09  1,218.06
3   1997-03  2,077.09  1,218.06
4   1997-04  2,077.09  1,218.06
5   1997-05  2,077.09  1,218.06
6   1997-06  2,077.09  1,218.06
7   1997-07  2,077.09  1,218.06
8   1997-08  2,077.09  1,218.06
9   1997-09  2,077.09  1,218.06
10  1997-10  2,077.09  1,218.06
11  1997-11  2,077.09  1,218.06
12  1997-12  2,077.09  1,218.06
13  1998-01  2,377.18  1,458.75
14  1998-02  2,377.18  1,458.75
15  1998-03  2,377.18  1,458.75
16  1998-04  2,377.18  1,458.75
17  1998-05  2,377.18  1,458.75
18  1998-06  2,377.18  1,458.75
19  1998-07  2,377.18  1,458.75
20  1998-08  2,377.18  1,458.75
21  1998-09  2,377.18  1,458.75
22  1998-10  2,377.18  1,458.75
23  1998-11  2,377.18  1,458.75
24  1998-12  2,377.18  1,458.75
25  1999-01    406.50    237.90
26  1999-02    406.50    237.90
27  1999-03    406.50    237.90
28  1999-04    954.70    541.20
29  1999-05    954.70    541.20
30  1999-06    954.70    541.20
31  1999-07  1,531.00       NaN
32  1999-08  1,531.00       NaN
33  1999-09  1,531.00       NaN
34  1999-10  2,678.82  1,693.13
35  1999-11  2,678.82  1,693.13
36  1999-12  2,678.82  1,693.13
37  2000-01    449.10    263.40
38  2000-02    449.10    263.40
39  2000-03    449.10    263.40
40  2000-04  1,044.10    599.90
41  2000-05  1,044.10    599.90
42  2000-06  1,044.10    599.90
43  2000-07  1,693.40    970.00
44  2000-08  1,693.40    970.00
45  2000-09  1,693.40    970.00
46  2000-10  3,161.66  2,049.12
47  2000-11  3,161.66  2,049.12
48  2000-12  3,161.66  2,049.12
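
In the output above, v2 stays NaN for 1999-07 through 1999-09: the back-fill copies the row of the next observed period, and that row's v2 is itself missing. The sketch below shows one possible way to reproduce this and, optionally, to fill those remaining gaps as well. It is an illustration under assumptions, not the answer's own code: the sample frame is a hypothetical numeric subset of the question's data, and it uses `reindex(..., method="bfill")` over an explicit `pd.period_range` in place of `resample`.

```python
import pandas as pd

# Hypothetical numeric subset of the question's data.
df = pd.DataFrame({
    "date": ["1999-03", "1999-06", "1999-09", "1999-12"],
    "v1": [406.50, 954.70, 1531.00, 2678.82],
    "v2": [237.90, 541.20, float("nan"), 1693.13],
})

# Monthly periods for the observed rows, and the full monthly range.
periods = pd.PeriodIndex(df["date"].tolist(), freq="M")
full_range = pd.period_range(periods.min(), periods.max(), freq="M")

indexed = df.drop(columns="date").set_index(periods)

# Index-based back-fill: each missing month takes the row of the next
# observed period, so a NaN stored in that row is copied as-is.
by_index = indexed.reindex(full_range, method="bfill")

# Value-based back-fill on top: remaining NaNs take the next non-missing value.
by_value = by_index.bfill()

print(by_index)
print(by_value)
```

Whether the second step is appropriate depends on what the missing 1999-09 value means in the data; leaving it as NaN keeps the gap visible.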