Question

Suppose you have the following merged DataFrame. Note that the row at index 2005 is missing three columns:

#         HPI  Int_rate    US_GDP   LowHPI
#YEAR                                                  
#2001.0  80.0       2.0     50.0      50
#2002.0  85.0       3.0     55.0      51
#2003.0  88.0       2.0     65.0      52
#2004.0  85.0       2.0     55.0      50
#2005.0   NaN       NaN     NaN       53

If I have another DataFrame or Series that looks like

([100,3,70],['HPI', 'Int_rate','US_GDP'])

is there a way to use it to fill in the missing columns automatically? Thanks.

Answer 1

Assuming another_df has a matching index:

In [50]: df
Out[50]:
         HPI  Int_rate  US_GDP  LowHPI
2001.0  80.0       2.0    50.0      50
2002.0  85.0       3.0    55.0      51
2003.0  88.0       2.0    65.0      52
2004.0  85.0       2.0    55.0      50
2005.0   NaN       NaN     NaN      53

In [51]: another_df
Out[51]:
        HPI  Int_rate  US_GDP
2005.0  100         3      70

In [52]: df = df.combine_first(another_df)

In [53]: df
Out[53]:
          HPI  Int_rate  LowHPI  US_GDP
2001.0   80.0       2.0      50    50.0
2002.0   85.0       3.0      51    55.0
2003.0   88.0       2.0      52    65.0
2004.0   85.0       2.0      50    55.0
2005.0  100.0       3.0      53    70.0
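For anyone who wants to reproduce this, here is a minimal sketch. It assumes the values/labels pair from the question is turned into a one-row DataFrame indexed at 2005.0; that construction and the name `filled` are my own assumptions, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; row 2005.0 is missing three columns.
df = pd.DataFrame(
    {
        "HPI": [80.0, 85.0, 88.0, 85.0, np.nan],
        "Int_rate": [2.0, 3.0, 2.0, 2.0, np.nan],
        "US_GDP": [50.0, 55.0, 65.0, 55.0, np.nan],
        "LowHPI": [50, 51, 52, 50, 53],
    },
    index=[2001.0, 2002.0, 2003.0, 2004.0, 2005.0],
)

# Assumption: the list pair from the question becomes a one-row frame
# whose index label matches the row that needs filling.
another_df = pd.DataFrame(
    [[100, 3, 70]],
    columns=["HPI", "Int_rate", "US_GDP"],
    index=[2005.0],
)

# combine_first keeps every existing value in df and takes values from
# another_df only where df has NaN. It returns a new frame.
filled = df.combine_first(another_df)
print(filled)
```

Because combine_first aligns on both index and columns, the index label of another_df has to match the row being filled (2005.0 here). Column order in the result may differ from the original frame.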


Answer 2

There is also update, which overwrites values in df with all non-null values from another_df:

>>> df.update(another_df)
>>> df
      HPI  Int_rate  US_GDP  LowHPI
2001   80         2      50      50
2002   85         3      55      51
2003   88         2      65      52
2004   85         2      55      50
2005  100         3      70      53
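One difference worth knowing: update modifies df in place and returns None, and it also overwrites values that were not missing, while combine_first only fills NaN cells. A small sketch of that contrast follows; the frame `override` and the names `returned` and `kept` are hypothetical, added only for illustration.

```python
import numpy as np
import pandas as pd

# Two rows from the question are enough to show the behaviour.
df = pd.DataFrame(
    {
        "HPI": [85.0, np.nan],
        "Int_rate": [2.0, np.nan],
        "US_GDP": [55.0, np.nan],
        "LowHPI": [50, 53],
    },
    index=[2004.0, 2005.0],
)
another_df = pd.DataFrame(
    [[100, 3, 70]],
    columns=["HPI", "Int_rate", "US_GDP"],
    index=[2005.0],
)

# update works in place and returns None, so do not assign its result to df.
returned = df.update(another_df)

# Hypothetical extra frame carrying a value for a cell that is NOT missing.
override = pd.DataFrame({"HPI": [90.0]}, index=[2004.0])

kept = df.combine_first(override)  # new frame; HPI for 2004.0 stays 85.0
df.update(override)                # df itself now has 90.0 for 2004.0
print(kept.loc[2004.0, "HPI"], df.loc[2004.0, "HPI"])
```

So if the goal is strictly to fill gaps without touching existing data, combine_first is the safer choice; update fits when the other frame should win wherever it has a value.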

Is there a quick way to fill in missing data in a pandas DataFrame without changing the whole row?
