I have a DataFrame that looks like this:
df1:
Quantity Date Open
0 NaN 2006-01-16 NaN
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 NaN 2006-01-19 NaN
4 20.0 2006-01-20 NaN
. . . .
. . . .
. . . .
and another DataFrame that looks like this:
df2:
Date Open Quantity
0 2006-01-16 4567.00 -20.0
1 2006-01-19 4506.00 20.0
2 2006-01-25 4495.05 -20.0
3 2006-01-27 4609.80 20.0
4 2006-02-01 4574.05 -20.0
What I want to do is merge df2 into df1 on the ['Quantity', 'Open'] columns, but only on the rows where df1.Quantity is NaN. So df1 should end up looking like this:
df1:
Quantity Date Open
0 -20.0 2006-01-16 4567.00
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 20.0 2006-01-19 4506.00
4 20.0 2006-01-20 NaN
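What I tried is this code:
df1.Open = df1.loc[df1['Quantity'].isna(), 'Open'].fillna(df2.EntryPrice)
I did it this way because I am sure that the dates in df2 are contained in the dates in df1, and that those dates have NaN values in df1.Quantity. However, when I run it, this is the result: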
Quantity Date Open
0 -20 2006-01-16 4567.0
1 -20.0 2006-01-17 NaN
2 -20.0 2006-01-18 NaN
3 20 2006-01-19 4609.8
4 20.0 2006-01-20 NaN
... ... ... ...
3317 -20.0 2017-05-23 NaN
3318 NaN 2017-05-23 NaN
3319 20.0 2017-05-24 NaN
3320 20.0 2017-05-25 NaN
3321 20.0 2017-05-26 NaN
As you can see, at row 3318 the NaN values in the Quantity and Open columns are still not filled. Can someone help me?
Answer 0 (score: 0)
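Set Date as the index (a DatetimeIndex) of both DataFrames so that fillna aligns rows by date. Then fill Open only in the filtered rows (those where Quantity is missing), and afterwards fill the missing values in Quantity: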
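# Index both frames by Date so that fillna aligns rows on dates
df1 = df1.set_index('Date')
df2 = df2.set_index('Date')
# Rows of df1 where Quantity is missing
mask = df1['Quantity'].isna()
# Fill Open only in the masked rows; other rows keep their existing values
df1.loc[mask, 'Open'] = df1.loc[mask, 'Open'].fillna(df2['Open'])
# Fill the missing Quantity values from df2
df1['Quantity'] = df1['Quantity'].fillna(df2['Quantity'])
df1 = df1.reset_index()
print(df1)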
Date Quantity Open
0 2006-01-16 -20.0 4567.0
1 2006-01-17 -20.0 NaN
2 2006-01-18 -20.0 NaN
3 2006-01-19 20.0 4506.0
4 2006-01-20 20.0 NaN
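
The wrong value in the question (4609.8 at row 3) is consistent with how fillna works when it is given a Series: it aligns on index labels, not on the Date column. With the default integer index, row 3 of df1 is paired with row 3 of df2, and any df1 row whose label does not exist in df2 stays NaN. The sketch below reproduces this on the five sample rows from the question and then shows one possible alternative that keeps df1's integer index and looks values up by date with Series.map. It assumes the dates in df2 are unique; the names by_position, open_by_date, qty_by_date and fixed are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question (first five rows of each frame).
df1 = pd.DataFrame({
    "Quantity": [np.nan, -20.0, -20.0, np.nan, 20.0],
    "Date": pd.to_datetime(["2006-01-16", "2006-01-17", "2006-01-18",
                            "2006-01-19", "2006-01-20"]),
    "Open": [np.nan] * 5,
})
df2 = pd.DataFrame({
    "Date": pd.to_datetime(["2006-01-16", "2006-01-19", "2006-01-25",
                            "2006-01-27", "2006-02-01"]),
    "Open": [4567.00, 4506.00, 4495.05, 4609.80, 4574.05],
    "Quantity": [-20.0, 20.0, -20.0, 20.0, -20.0],
})

# Rows of df1 that still need values.
mask = df1["Quantity"].isna()

# 1) Filling with the default integer index: label 3 in df1 is matched
#    with label 3 in df2 (2006-01-27), not with the same date.
by_position = df1.loc[mask, "Open"].fillna(df2["Open"])
print(by_position.loc[3])  # 4609.8, the wrong value seen in the question

# 2) Date-based lookup (assumes df2 has one row per date).
open_by_date = df2.set_index("Date")["Open"]
qty_by_date = df2.set_index("Date")["Quantity"]

fixed = df1.copy()
# Fill Open only where Quantity was missing, looking the value up by date.
fixed.loc[mask, "Open"] = fixed.loc[mask, "Date"].map(open_by_date)
# Fill the missing Quantity values the same way.
fixed["Quantity"] = fixed["Quantity"].fillna(fixed["Date"].map(qty_by_date))
print(fixed)
```

Because this variant never moves Date into the index of df1, it does not depend on the dates in df1 being unique, which may matter since the output in the question shows 2017-05-23 twice.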