Fill NaN with values from another df based on a condition

Time: 2020-05-18 05:01:12

Tags: python pandas dataframe nan

I have a df that looks like this

df1:

    Quantity     Date      Open
0       NaN    2006-01-16   NaN
1     -20.0    2006-01-17   NaN
2     -20.0    2006-01-18   NaN
3       NaN    2006-01-19   NaN
4      20.0    2006-01-20   NaN
.        .         .         .
.        .         .         .
.        .         .         .

and another dataframe that looks like this

df2
          Date       Open     Quantity
0    2006-01-16     4567.00     -20.0
1    2006-01-19     4506.00      20.0
2    2006-01-25     4495.05     -20.0
3    2006-01-27     4609.80      20.0
4    2006-02-01     4574.05     -20.0   

What I want to do is merge df1 and df2 on the ['Quantity','Open'] columns, but only for the rows where df1.Quantity is NaN. So df1 should look like this

df1:

    Quantity     Date      Open
0     -20.0    2006-01-16   4567.00
1     -20.0    2006-01-17   NaN
2     -20.0    2006-01-18   NaN
3      20.0    2006-01-19   4506.00
4      20.0    2006-01-20   NaN

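What I tried is this code: df1.Open = df1.loc[df1['Quantity'].isna(), 'Open'].fillna(df2.EntryPrice). I did it this way because I am sure that the dates in df2 are contained in the dates in df1, and that those rows have a NaN value in df1.Quantity. However, when I run it, this is the result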

      Quantity       Date    Open
0          -20 2006-01-16  4567.0
1        -20.0 2006-01-17     NaN
2        -20.0 2006-01-18     NaN
3           20 2006-01-19  4609.8
4         20.0 2006-01-20     NaN
...        ...        ...     ...
3317     -20.0 2017-05-23     NaN
3318       NaN 2017-05-23     NaN
3319      20.0 2017-05-24     NaN
3320      20.0 2017-05-25     NaN
3321      20.0 2017-05-26     NaN

As you can see, at row 3318 the NaN values in the 'Quantity' and 'Open' columns are still not filled. Can anyone help me?
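
A likely explanation for the output above is that `fillna` with a Series aligns on index labels, not on dates. With the default integer index, row 3 of df1 receives the Open of row 3 of df2 (4609.80, dated 2006-01-27) rather than the value for 2006-01-19, and a row such as 3318 stays NaN when df2 has no row with that label. The sketch below reproduces this with the five sample rows. It assumes the `Open` column of df2 in place of `EntryPrice`, which does not appear in the df2 shown.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": ["2006-01-16", "2006-01-17", "2006-01-18", "2006-01-19", "2006-01-20"],
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": ["2006-01-16", "2006-01-19", "2006-01-25", "2006-01-27", "2006-02-01"],
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

mask = df1["Quantity"].isna()

# Assumption: df2["Open"] stands in for df2.EntryPrice from the question.
# fillna matches on the integer labels 0 and 3 and ignores the Date column,
# so label 3 picks up df2's row 3 (2006-01-27), not the row for 2006-01-19.
filled = df1.loc[mask, "Open"].fillna(df2["Open"])
print(filled)
```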

1 answer:

Answer 0: (score: 0)

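Set Date as the index in both DataFrames so that fillna aligns on dates instead of row positions. Then replace Open only in the filtered rows, and finally fill the missing values of Quantity in all rows where it is missing: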

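# Index both frames by Date so that fillna aligns on dates
df1 = df1.set_index('Date')
df2 = df2.set_index('Date')
# Rows of df1 whose Quantity is missing
mask = df1['Quantity'].isna()

# Fill Open only in the masked rows; the other rows keep their values
df1.loc[mask, 'Open'] = df1.loc[mask, 'Open'].fillna(df2['Open'])
# Fill the missing Quantity values from df2
df1['Quantity'] = df1['Quantity'].fillna(df2['Quantity'])
df1 = df1.reset_index()
print(df1)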
         Date  Quantity    Open
0  2006-01-16     -20.0  4567.0
1  2006-01-17     -20.0     NaN
2  2006-01-18     -20.0     NaN
3  2006-01-19      20.0  4506.0
4  2006-01-20      20.0     NaN
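
As an alternative that leaves the index of df1 untouched, the dates of the masked rows can be looked up in df2 with `Series.map`. This is a sketch of a different technique from the answer above, not the answerer's method. It rebuilds the five sample rows from the question and assumes that each date occurs only once in df2. The name `lookup` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": ["2006-01-16", "2006-01-17", "2006-01-18", "2006-01-19", "2006-01-20"],
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": ["2006-01-16", "2006-01-19", "2006-01-25", "2006-01-27", "2006-02-01"],
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

# Hypothetical helper: df2 keyed by Date (assumes dates in df2 are unique)
lookup = df2.set_index("Date")

# Rows of df1 whose Quantity is missing
mask = df1["Quantity"].isna()

# Translate each masked date into the matching df2 value;
# dates that are absent from df2 map to NaN
df1.loc[mask, "Open"] = df1.loc[mask, "Date"].map(lookup["Open"])
df1.loc[mask, "Quantity"] = df1.loc[mask, "Date"].map(lookup["Quantity"])
print(df1)
```

Because the match is made on the Date values themselves, this also works when df1 contains the same date more than once, as rows 3317 and 3318 in the question do.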