I have a data frame that looks like this:
Category Date Value
A Jan 1
A Feb 1
A Mar 1
B Jan 1
B Feb 1
C Jan 1
C Mar 1
I want to fill in the missing months for each category with 0, i.e.:
Category Date Value
A Jan 1
A Feb 1
A Mar 1
B Jan 1
B Feb 1
B Mar 0
C Jan 1
C Feb 0
C Mar 1
I'm not sure where to start. Thanks in advance!
Answer 0: (score: 2)
You can use unstack with fill_value=0, followed by stack, to get the result:
df.set_index(["Category","Date"]).unstack(fill_value=0).stack().reset_index()
Output:
Category Date Value
0 A Feb 1
1 A Jan 1
2 A Mar 1
3 B Feb 1
4 B Jan 1
5 B Mar 0
6 C Feb 0
7 C Jan 1
8 C Mar 1
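For anyone who wants to try the unstack/stack approach end to end, here is a minimal self-contained sketch. The data frame is rebuilt by hand from the table in the question, and the variable name result is only illustrative. Note that the months come out in alphabetical order (Feb, Jan, Mar) because they are plain strings.

```python
import pandas as pd

# Rebuild the example data from the question (B lacks Mar, C lacks Feb).
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "C", "C"],
    "Date": ["Jan", "Feb", "Mar", "Jan", "Feb", "Jan", "Mar"],
    "Value": [1, 1, 1, 1, 1, 1, 1],
})

# Pivot Date into columns, filling missing combinations with 0,
# then stack the Date level back into rows.
result = (
    df.set_index(["Category", "Date"])
      .unstack(fill_value=0)
      .stack()
      .reset_index()
)
print(result)
```

This assumes every (Category, Date) pair appears at most once; unstack raises an error if the index contains duplicates.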
Answer 1: (score: 1)
You can reindex on the full set of (Category, Date) pairs, then fill the resulting NaN values with 0:
multi = [(x,y) for x in df["Category"].unique() for y in df["Date"].unique()]
print(df.set_index(["Category","Date"]).reindex(multi).fillna(0).reset_index())
Category Date Value
0 A Jan 1.0
1 A Feb 1.0
2 A Mar 1.0
3 B Jan 1.0
4 B Feb 1.0
5 B Mar 0.0
6 C Jan 1.0
7 C Feb 0.0
8 C Mar 1.0
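The values above became floats because reindex first inserts NaN and fillna replaces it afterwards. As a possible variant (a sketch, not part of the original answer), the index can be built with pd.MultiIndex.from_product and the fill value passed straight to reindex, which should keep the column as integers. The names full_index and filled are illustrative.

```python
import pandas as pd

# Rebuild the example data from the question (B lacks Mar, C lacks Feb).
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "C", "C"],
    "Date": ["Jan", "Feb", "Mar", "Jan", "Feb", "Jan", "Mar"],
    "Value": [1, 1, 1, 1, 1, 1, 1],
})

# Every Category paired with every Date, in order of first appearance.
full_index = pd.MultiIndex.from_product(
    [df["Category"].unique(), df["Date"].unique()],
    names=["Category", "Date"],
)

# Reindex onto the full grid; missing pairs get 0 directly instead of NaN.
filled = (
    df.set_index(["Category", "Date"])
      .reindex(full_index, fill_value=0)
      .reset_index()
)
print(filled)
```

Because the months follow their order of first appearance in the data, the rows come out as Jan, Feb, Mar rather than alphabetically.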
Answer 2: (score: 0)
Another approach, using pivot_table:
df.pivot_table(index=['Category'], columns='Date', values='Value').fillna(0).stack().reset_index()
Category Date 0
0 A Feb 1.0
1 A Jan 1.0
2 A Mar 1.0
3 B Feb 1.0
4 B Jan 1.0
5 B Mar 0.0
6 C Feb 0.0
7 C Jan 1.0
8 C Mar 1.0
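In the output above the value column is labelled 0 because stack returns an unnamed Series. A small variation (a sketch under the same example data, with result as an illustrative name) passes fill_value=0 to pivot_table and names the column when resetting the index. Keep in mind that pivot_table aggregates with the mean by default, so duplicate (Category, Date) rows would be averaged rather than rejected.

```python
import pandas as pd

# Rebuild the example data from the question (B lacks Mar, C lacks Feb).
df = pd.DataFrame({
    "Category": ["A", "A", "A", "B", "B", "C", "C"],
    "Date": ["Jan", "Feb", "Mar", "Jan", "Feb", "Jan", "Mar"],
    "Value": [1, 1, 1, 1, 1, 1, 1],
})

# Pivot with missing cells filled by 0, stack back to long form,
# and give the stacked values a proper column name.
result = (
    df.pivot_table(index="Category", columns="Date", values="Value", fill_value=0)
      .stack()
      .reset_index(name="Value")
)
print(result)
```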