Pandas: add missing rows

Date: 2018-10-02 09:10:45

Tags: python pandas

I have the following input:

Year    Brand   Model   Value
2018    A           a   1,00
2018    A           b   2,00
2018    B           a   3,00
2017    A           b   4,00
2016    C           b   5,00

I would like to add the missing combinations:

  • for each Year, I must have Brands A, B and C
  • for each Brand, I must have Models a and b

The expected output looks like this:

Year    Brand   Model   Value
2018    A          a    1
2018    A          b    2
2018    B          a    3,00
2018    B          b    
2018    C          a    
2018    C          b    
2017    A          a    
2017    A          b    4
2017    B          a    
2017    B          b    
2017    C          a    
2017    C          b    
2016    A          a    
2016    A          b    
2016    B          a    
2016    B          b    
2016    C          a    
2016    C          b    5

How can I do that?

1 answer:

Answer 0 (score: 2)

Use reindex with a MultiIndex created by MultiIndex.from_product:

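import pandas as pd

# Build every (Year, Brand, Model) combination from the values present in df
mux = pd.MultiIndex.from_product([df['Year'].unique(),
                                  df['Brand'].unique(),
                                  df['Model'].unique()], names=['Year','Brand','Model'])
# Reindex on the full product; missing combinations get NaN in Value
df = df.set_index(['Year','Brand','Model']).reindex(mux).reset_index()
print(df)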
    Year Brand Model Value
0   2018     A     a  1,00
1   2018     A     b  2,00
2   2018     B     a  3,00
3   2018     B     b   NaN
4   2018     C     a   NaN
5   2018     C     b   NaN
6   2017     A     a   NaN
7   2017     A     b  4,00
8   2017     B     a   NaN
9   2017     B     b   NaN
10  2017     C     a   NaN
11  2017     C     b   NaN
12  2016     A     a   NaN
13  2016     A     b   NaN
14  2016     B     a   NaN
15  2016     B     b   NaN
16  2016     C     a   NaN
17  2016     C     b  5,00
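
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt by hand from the question's input, and the names `keys` and `full` are only illustrative. It assumes Value is stored as strings (because of the decimal comma) and that each (Year, Brand, Model) combination occurs at most once in the input, since reindex does not accept duplicate index entries.

```python
import pandas as pd

# Sample data reconstructed from the question; Value is kept as strings
# here (an assumption, because of the decimal comma).
df = pd.DataFrame({
    "Year":  [2018, 2018, 2018, 2017, 2016],
    "Brand": ["A", "A", "B", "A", "C"],
    "Model": ["a", "b", "a", "b", "b"],
    "Value": ["1,00", "2,00", "3,00", "4,00", "5,00"],
})

keys = ["Year", "Brand", "Model"]

# Cartesian product of the distinct values observed in each key column.
mux = pd.MultiIndex.from_product([df[k].unique() for k in keys], names=keys)

# Reindexing on the full product adds a NaN row for every missing combination.
full = df.set_index(keys).reindex(mux).reset_index()

print(full.shape)                   # (18, 4): 3 years x 3 brands x 2 models
print(full["Value"].isna().sum())   # 13 combinations were missing
```

Rows come out in the order the values first appear in each column (2018, 2017, 2016; A, B, C; a, b), which matches the expected output in the question.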