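How to fill in missing data in this DataFrame
Days on which nothing was sold are missing from the data. How can I add rows with Sales = 0 for every store, item, and date where no sale was recorded?
Input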
Dates Store Item Sales
2017-01-01 Chicago Apple 10
2017-01-02 NewYork Pear 10
2017-01-03 Chicago Apple 10
Output
Dates Store Item Sales
2017-01-01 Chicago Apple 10
2017-01-01 Chicago Pear 0
2017-01-02 Chicago Apple 0
2017-01-02 Chicago Pear 0
2017-01-03 Chicago Apple 10
2017-01-03 Chicago Pear 0
2017-01-01 NewYork Apple 0
2017-01-01 NewYork Pear 0
2017-01-02 NewYork Apple 0
2017-01-02 NewYork Pear 10
2017-01-03 NewYork Apple 0
2017-01-03 NewYork Pear 0
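Answer 0 (score: 4)
Use set_index to build a MultiIndex, then reindex by MultiIndex.from_product with fill_value=0 so the missing combinations are added with 0, and finally call sort_index on Store and reset_index: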
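import pandas as pd
# Move the three key columns into a MultiIndex
df = df.set_index(['Dates','Store','Item'])
# All (Dates, Store, Item) combinations of the values present in each level
mux = pd.MultiIndex.from_product(df.index.levels, names=df.index.names)
# Add the missing combinations with Sales = 0, order by Store, restore the columns
df = df.reindex(mux, fill_value=0).sort_index(level='Store').reset_index()
print(df)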
Dates Store Item Sales
0 2017-01-01 Chicago Apple 10
1 2017-01-01 Chicago Pear 0
2 2017-01-02 Chicago Apple 0
3 2017-01-02 Chicago Pear 0
4 2017-01-03 Chicago Apple 10
5 2017-01-03 Chicago Pear 0
6 2017-01-01 NewYork Apple 0
7 2017-01-01 NewYork Pear 0
8 2017-01-02 NewYork Apple 0
9 2017-01-02 NewYork Pear 10
10 2017-01-03 NewYork Apple 0
11 2017-01-03 NewYork Pear 0
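Note that df.index.levels only contains dates that occur at least once in the data, so a day on which no store sold anything would still be absent. The following is a minimal self-contained sketch of the same reindex approach, assuming the calendar should cover every day between the first and last date; the names all_dates and filled are illustrative only, and with the sample data from the question the result is the same as above.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Dates": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
    "Store": ["Chicago", "NewYork", "Chicago"],
    "Item": ["Apple", "Pear", "Apple"],
    "Sales": [10, 10, 10],
})

# Assumption: every calendar day between the first and last date is wanted,
# including days that do not appear in the data at all
all_dates = pd.date_range(df["Dates"].min(), df["Dates"].max(), freq="D")

# Full grid of (Dates, Store, Item) combinations
mux = pd.MultiIndex.from_product(
    [all_dates, df["Store"].unique(), df["Item"].unique()],
    names=["Dates", "Store", "Item"],
)

filled = (
    df.set_index(["Dates", "Store", "Item"])
      .reindex(mux, fill_value=0)   # missing combinations get Sales = 0
      .sort_index(level="Store")
      .reset_index()
)
print(filled)
```

Building the levels by hand also makes it possible to pass a fixed list of stores or items, if some of them never appear in the sales data.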
Answer 1 (score: 0)
Use set_index, stack and unstack:
df.set_index(['Dates','Store','Item']).unstack().stack(dropna=False).\
unstack(1).stack(dropna=False).fillna(0).reset_index()
Out[258]:
Dates Item Store Sales
0 2017-01-01 Apple Chicago 10.0
1 2017-01-01 Apple NewYork 0.0
2 2017-01-01 Pear Chicago 0.0
3 2017-01-01 Pear NewYork 0.0
4 2017-01-02 Apple Chicago 0.0
5 2017-01-02 Apple NewYork 0.0
6 2017-01-02 Pear Chicago 0.0
7 2017-01-02 Pear NewYork 10.0
8 2017-01-03 Apple Chicago 10.0
9 2017-01-03 Apple NewYork 0.0
10 2017-01-03 Pear Chicago 0.0
11 2017-01-03 Pear NewYork 0.0
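In the output above Sales becomes float (10.0, 0.0) because the intermediate frames hold NaN before fillna(0) is applied. As a hedged variant, not the answerer's code, the sketch below passes fill_value=0 to unstack so that no NaN is created in the first place; the names s and out are illustrative, and the final astype(int) is only a safeguard.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Dates": pd.to_datetime(["2017-01-01", "2017-01-02", "2017-01-03"]),
    "Store": ["Chicago", "NewYork", "Chicago"],
    "Item": ["Apple", "Pear", "Apple"],
    "Sales": [10, 10, 10],
})

# Work on the Sales column as a Series indexed by (Dates, Store, Item)
s = df.set_index(["Dates", "Store", "Item"])["Sales"]

out = (
    s.unstack("Item", fill_value=0)    # one column per item, gaps filled with 0
     .stack()                          # back to long form: every item per (Dates, Store)
     .unstack("Store", fill_value=0)   # one column per store, gaps filled with 0
     .stack()                          # back to long form: every store per (Dates, Item)
     .astype(int)                      # safeguard in case a float dtype slipped in
     .reset_index(name="Sales")        # the stacked Series is unnamed, so name it here
)
print(out)
```

The row order may differ between pandas versions, so sort explicitly (for example with sort_values) if a particular order is required.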