Backfill based on another column

Time: 2019-10-20 13:39:09

Tags: python python-3.x pandas

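import pandas as pd
from datetime import date

df = pd.DataFrame({"Date": [date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2)],
                   "CatID": [1, 1, 1, None, 2, 2, 2, 2],
                   "ShopID": [1, 1, 1, 1, 2, 2, 2, 2]
                   })

# Plain backfill: each missing value takes the next non-missing value below it
df['CatID'] = df['CatID'].bfill()
df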

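For the None value in CatID, I want to backfill it based on ShopID. Instead, the None is filled with 2, which is not the value I want, because that CatID value comes from a row with ShopID=2. It should stay None. How can I do this?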

1 answer:

Answer 0 (score: 0)

I'm not sure exactly what you are trying to achieve:

import pandas as pd
from datetime import date


df = pd.DataFrame({"Date": [date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2), date(2019,10,1), date(2019,10,2)],
                   "CatID": [1, 1, 1, None, 2, 2, 2, 2],
                   "ShopID": [1, 1, 1, 1, 2, 2, 2, 2]
                   })

# Keep a copy of the original column so the missing positions are remembered
df['CatID_copy'] = df['CatID']
df['CatID'] = df['CatID'].bfill()
# Put NaN back in every row that was missing before the backfill
df.loc[df['CatID_copy'].isna(), 'CatID'] = df['CatID_copy']
df.drop(columns='CatID_copy', inplace=True)

Output:

         Date  CatID  ShopID
0  2019-10-01    1.0       1
1  2019-10-02    1.0       1
2  2019-10-01    1.0       1
3  2019-10-02    NaN       1
4  2019-10-01    2.0       2
5  2019-10-02    2.0       2
6  2019-10-01    2.0       2
7  2019-10-02    2.0       2

If this is not what you want, please attach the expected output.
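
If the goal is to backfill only within each ShopID, a grouped backfill may be closer to what the question describes. The following is a minimal sketch, assuming ShopID is the grouping key and that a missing value with nothing after it in its own group should stay missing. It reuses the sample data from the question, and it may not match the asker's actual intent.

```python
import pandas as pd
from datetime import date

df = pd.DataFrame({
    "Date": [date(2019, 10, 1), date(2019, 10, 2)] * 4,
    "CatID": [1, 1, 1, None, 2, 2, 2, 2],
    "ShopID": [1, 1, 1, 1, 2, 2, 2, 2],
})

# Backfill inside each ShopID group only, so a value from
# ShopID=2 is never used to fill a gap in ShopID=1.
df["CatID"] = df.groupby("ShopID")["CatID"].bfill()

print(df)
```

With this data, row 3 is the last row of the ShopID=1 group, so there is nothing below it in the same group to fill from and it remains NaN. A gap in the middle of a group would still be filled from the next row of that same group.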