这是我的数据集
RECEIPT_MONTH_YEAR RECEIPT_MONTH
2014-01-01 1
2014-02-01 2
2014-03-01 3
2014-04-01 4
2014-05-01 5
2014-06-01 6
2014-07-01 7
2014-08-01 8
2014-09-01 9
2014-10-01 10
2014-11-01 11
2015-08-01 8
2015-09-01 9
2015-10-01 10
2015-11-01 11
2015-12-01 12
我已将'RECEIPT_MONTH_YEAR'设置为我需要填写缺失日期的索引 并在剩余变量中填写na值
需要像
一样RECEIPT_MONTH_YEAR RECEIPT_MONTH
2014-01-01 1
2014-02-01 2
2014-03-01 3
2014-04-01 4
2014-05-01 5
2014-06-01 6
2014-07-01 7
2014-08-01 8
2014-09-01 9
2014-10-01 10
2014-11-01 11
2014-12-01 NA
2015-01-01 NA
2015-02-01 NA
2015-03-01 NA
2015-04-01 NA
2015-05-01 NA
2015-06-01 NA
2015-07-01 NA
2015-08-01 8
2015-09-01 9
2015-10-01 10
2015-11-01 11
2015-12-01 12
答案 0 :(得分:1)
在MS
asfreq
df.set_index('RECEIPT_MONTH_YEAR').asfreq('MS').reset_index()
Out[62]:
RECEIPT_MONTH_YEAR RECEIPT_MONTH
0 2014-01-01 1.0
1 2014-02-01 2.0
2 2014-03-01 3.0
3 2014-04-01 4.0
4 2014-05-01 5.0
5 2014-06-01 6.0
6 2014-07-01 7.0
7 2014-08-01 8.0
8 2014-09-01 9.0
9 2014-10-01 10.0
10 2014-11-01 11.0
11 2014-12-01 NaN
12 2015-01-01 NaN
13 2015-02-01 NaN
14 2015-03-01 NaN
15 2015-04-01 NaN
16 2015-05-01 NaN
17 2015-06-01 NaN
18 2015-07-01 NaN
19 2015-08-01 8.0
20 2015-09-01 9.0
21 2015-10-01 10.0
22 2015-11-01 11.0
23 2015-12-01 12.0
更新
s=df.set_index('RECEIPT_MONTH_YEAR').asfreq('MS').reset_index()
s['RECEIPT_MONTH']=s['RECEIPT_MONTH_YEAR'].dt.month