Suppose you have the following merged DataFrame. Note that the row with index 2005 is missing values in three columns.
# HPI Int_rate US_GDP LowHPI
#YEAR
#2001.0 80.0 2.0 50.0 50
#2002.0 85.0 3.0 55.0 51
#2003.0 88.0 2.0 65.0 52
#2004.0 85.0 2.0 55.0 50
#2005.0 NaN NaN NaN 53
If I have another DataFrame or Series that looks like this:
([100,3,70],['HPI', 'Int_rate','US_GDP'])
Is there a way to use it to fill in the missing columns automatically? Thanks.
Answer 0 (score: 0)
Assuming:
In [50]: df
Out[50]:
HPI Int_rate US_GDP LowHPI
2001.0 80.0 2.0 50.0 50
2002.0 85.0 3.0 55.0 51
2003.0 88.0 2.0 65.0 52
2004.0 85.0 2.0 55.0 50
2005.0 NaN NaN NaN 53
In [51]: another_df
Out[51]:
HPI Int_rate US_GDP
2005.0 100 3 70
In [52]: df = df.combine_first(another_df)
In [53]: df
Out[53]:
HPI Int_rate LowHPI US_GDP
2001.0 80.0 2.0 50 50.0
2002.0 85.0 3.0 51 55.0
2003.0 88.0 2.0 52 65.0
2004.0 85.0 2.0 50 55.0
2005.0 100.0 3.0 53 70.0
This works when another_df has an index that matches df's (here, 2005.0).
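For readers who want to reproduce this outside an IPython session, here is a minimal sketch of the same idea. It rebuilds the question's frame by hand, and it assumes the replacement values arrive as a Series labelled by column name (the name `s` and the `to_frame().T` conversion are illustrative choices, not part of the original answer). The column order is restored at the end because combine_first returns the columns sorted, as the output above shows.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question; the 2005.0 row is missing three values.
df = pd.DataFrame(
    {
        "HPI": [80.0, 85.0, 88.0, 85.0, np.nan],
        "Int_rate": [2.0, 3.0, 2.0, 2.0, np.nan],
        "US_GDP": [50.0, 55.0, 65.0, 55.0, np.nan],
        "LowHPI": [50, 51, 52, 50, 53],
    },
    index=pd.Index([2001.0, 2002.0, 2003.0, 2004.0, 2005.0], name="YEAR"),
)

# The replacement values from the question, labelled by column name.
# (The variable name `s` is an assumption made for this sketch.)
s = pd.Series([100, 3, 70], index=["HPI", "Int_rate", "US_GDP"])

# Turn the Series into a one-row frame whose index label matches the row to fill.
another_df = s.to_frame().T
another_df.index = [2005.0]

# combine_first fills only the cells that are null in df.
filled = df.combine_first(another_df)

# Put the columns back in their original order.
filled = filled[df.columns]

print(filled)
```

If the index label of another_df did not match any label in df, combine_first would add a new row instead of filling the existing one, so it is worth checking the label (including its dtype) first.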
Answer 1 (score: 0)
There is also update, which modifies df in place, overwriting its values with all non-null values from another_df:
df.update(another_df)
>>> df
HPI Int_rate US_GDP LowHPI
2001 80 2 50 50
2002 85 3 55 51
2003 88 2 65 52
2004 85 2 55 50
2005 100 3 70 53
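One practical difference between the two approaches is easy to miss: combine_first only fills cells that are null and returns a new frame, while update changes df in place and, by default, also overwrites cells that already hold a value. The sketch below illustrates this on a cut-down two-row frame; the `patch` frame and its 999.0 value are hypothetical and exist only to make the difference visible.

```python
import numpy as np
import pandas as pd

# Cut-down version of the question's frame: 2004.0 has a value, 2005.0 does not.
df = pd.DataFrame(
    {"HPI": [85.0, np.nan], "LowHPI": [50, 53]},
    index=[2004.0, 2005.0],
)

# Hypothetical patch that also carries a value for 2004.0,
# where df already has one.
patch = pd.DataFrame({"HPI": [999.0, 100.0]}, index=[2004.0, 2005.0])

# combine_first returns a new frame and leaves the existing 85.0 alone.
combined = df.combine_first(patch)

# update works in place, returns None, and replaces 85.0 with 999.0.
ret = df.update(patch)

print(combined)
print(df)
```

So if the other frame might contain rows you do not want to overwrite, combine_first is the safer choice; if the other frame should take precedence wherever it has a value, update fits better.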