Impute missing values with the mean of the last two days - pandas

Time: 2020-10-26 09:22:14

Tags: python pandas

For each dimension, I want to impute missing values using the mean of the previous two days (ignoring nulls). See the example below:

Main_DF

date        portal_name      price_category  avg_price   total_sales
2020-01-01  british_airways  business_class  4310        6312
2020-01-01  british_airways  economy_class   1200        12432
2020-01-01  british_airways  first_class     8990        2313
2020-01-02  british_airways  business_class  4564        5423
2020-01-02  british_airways  economy_class   1145        14242
2020-01-02  british_airways  first_class     9533        2210
2020-01-03  british_airways  business_class              
2020-01-03  british_airways  economy_class   1145        14242
2020-01-03  british_airways  first_class       
2020-01-04  british_airways  business_class              
2020-01-04  british_airways  economy_class   1321        17334
2020-01-04  british_airways  first_class            

Output_DF


date        portal_name      price_category  avg_price   total_sales
2020-01-01  british_airways  business_class  4310        6312
2020-01-01  british_airways  economy_class   1200        12432
2020-01-01  british_airways  first_class     8990        2313
2020-01-02  british_airways  business_class  4564        5423
2020-01-02  british_airways  economy_class   1145        14242
2020-01-02  british_airways  first_class     9533        2210
2020-01-03  british_airways  business_class  4437        5868    
2020-01-03  british_airways  economy_class   1145        14242
2020-01-03  british_airways  first_class     9261        2262    
2020-01-04  british_airways  business_class  4437        5868   
2020-01-04  british_airways  economy_class   1321        17334
2020-01-04  british_airways  first_class     9261        2262
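The filled cells in Output_DF appear to be the plain mean of the 2020-01-01 and 2020-01-02 rows for the same price_category, displayed as whole numbers. A quick check of that arithmetic, with the inputs copied from Main_DF (the helper name `mean` is only for illustration):

```python
# Values from the two most recent days that have data (2020-01-01, 2020-01-02).
business_avg_price = [4310, 4564]
business_total_sales = [6312, 5423]
first_avg_price = [8990, 9533]
first_total_sales = [2313, 2210]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


print(mean(business_avg_price))    # 4437.0
print(mean(business_total_sales))  # 5867.5
print(mean(first_avg_price))       # 9261.5
print(mean(first_total_sales))     # 2261.5
```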

1 answer:

Answer 0: (score: 3)

IIUC, you can use groupby together with transform, with tail(2) to get the last two days:

    # Select the two numeric columns explicitly so the lambda is not applied to the string columns.
    df[["avg_price", "total_sales"]] = (df.groupby("price_category")[["avg_price", "total_sales"]]
                                          .transform(lambda x: x.fillna(x[x.notnull()].tail(2).mean())))

    print (df)

              date      portal_name  price_category  avg_price  total_sales
    0   2020-01-01  british_airways  business_class     4310.0       6312.0
    1   2020-01-01  british_airways   economy_class     1200.0      12432.0
    2   2020-01-01  british_airways     first_class     8990.0       2313.0
    3   2020-01-02  british_airways  business_class     4564.0       5423.0
    4   2020-01-02  british_airways   economy_class     1145.0      14242.0
    5   2020-01-02  british_airways     first_class     9533.0       2210.0
    6   2020-01-03  british_airways  business_class     4437.0       5867.5
    7   2020-01-03  british_airways   economy_class     1145.0      14242.0
    8   2020-01-03  british_airways     first_class     9261.5       2261.5
    9   2020-01-04  british_airways  business_class     4437.0       5867.5
    10  2020-01-04  british_airways   economy_class     1321.0      17334.0
    11  2020-01-04  british_airways     first_class     9261.5       2261.5
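For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It rebuilds Main_DF from CSV text and fills each column from the last two non-null values in its group. Two choices here are assumptions rather than part of the answer: grouping by both portal_name and price_category (the question says "each dimension"), and the helper name `fill_with_last_two_mean`. The helper is applied one column at a time so that it always receives a Series.

```python
import io

import pandas as pd

# Main_DF from the question as CSV text; empty fields are read as NaN.
CSV = """date,portal_name,price_category,avg_price,total_sales
2020-01-01,british_airways,business_class,4310,6312
2020-01-01,british_airways,economy_class,1200,12432
2020-01-01,british_airways,first_class,8990,2313
2020-01-02,british_airways,business_class,4564,5423
2020-01-02,british_airways,economy_class,1145,14242
2020-01-02,british_airways,first_class,9533,2210
2020-01-03,british_airways,business_class,,
2020-01-03,british_airways,economy_class,1145,14242
2020-01-03,british_airways,first_class,,
2020-01-04,british_airways,business_class,,
2020-01-04,british_airways,economy_class,1321,17334
2020-01-04,british_airways,first_class,,
"""

# tail(2) depends on row order, so keep the rows sorted by date.
df = pd.read_csv(io.StringIO(CSV), parse_dates=["date"]).sort_values("date", kind="stable")

value_cols = ["avg_price", "total_sales"]


def fill_with_last_two_mean(col: pd.Series) -> pd.Series:
    """Fill NaNs with the mean of the last two non-null values in this group."""
    return col.fillna(col.dropna().tail(2).mean())


# Assumption: one group per (portal_name, price_category) pair.
grouped = df.groupby(["portal_name", "price_category"])

filled = df.copy()
for col in value_cols:
    # Column-wise transform: the helper sees one group's Series at a time.
    filled[col] = grouped[col].transform(fill_with_last_two_mean)

print(filled)
```

Note that the fill value is taken from the last two non-null observations of the whole group, wherever the gap sits. That matches the example because the gaps are at the end of each series; if gaps can occur mid-series, this would also draw on later dates, and a strictly backward-looking fill would need a different approach.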
