Fill NaNs in a subset of a df from another df based on a condition

Time: 2020-08-19 13:08:16

Tags: python pandas

Given a reference table fallback

       Morning  Afternoon  Evening
Red          4        6.0       13
Blue         7        NaN        9
Green        9        1.0        2

and a data table players

  Player  Morning  Afternoon  Evening   Team  Total
0   Bill      4.0        NaN     13.0    Red   17.0
1   Emma      NaN        NaN      NaN   Blue    0.0
2   Mike      NaN        1.0      NaN  Green    1.0
3   Jill      NaN        NaN      NaN    Red    0.0

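I want to fill the NaNs in the players data according to the following rule: for a player who is missing data in all three of Morning, Afternoon, and Evening (i.e. Total is zero), fill those three columns with the values from the fallback row that matches the player's Team. Desired result: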

  Player  Morning  Afternoon  Evening   Team
0   Bill      4.0        NaN     13.0    Red
1   Emma      7.0        NaN      9.0   Blue
2   Mike      NaN        1.0      NaN  Green
3   Jill      4.0        6.0     13.0    Red

Code to generate the sample data:

import numpy as np
import pandas as pd

fallback = pd.DataFrame(
    {
        'Morning': [4, 7, 9],
        'Afternoon': [6, np.nan, 1],
        'Evening': [13, 9, 2]
    },
    index=['Red', 'Blue', 'Green'])

players = pd.DataFrame({
    'Player': ['Bill', 'Emma', 'Mike', 'Jill'],
    'Morning': [4, np.nan, np.nan, np.nan],
    'Afternoon': [np.nan, np.nan, 1, np.nan],
    'Evening': [13, np.nan, np.nan, np.nan],
    'Team': ['Red', 'Blue', 'Green', 'Red']
})
players['Total'] = players[['Morning', 'Afternoon', 'Evening']].sum(axis=1)

outcome = pd.DataFrame({
    'Player': ['Bill', 'Emma', 'Mike', 'Jill'],
    'Morning': [4, 7, np.nan, 4],
    'Afternoon': [np.nan, np.nan, 1, 6],
    'Evening': [13, 9, np.nan, 13],
    'Team': ['Red', 'Blue', 'Green', 'Red']
})
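
The selection rule stated in the question (all three of Morning, Afternoon, and Evening missing) can be tested directly or through Total. The sketch below assumes the sample players data from the question; the names cols, all_missing, and total_zero are introduced here only for illustration. The two tests agree on this sample, but Total == 0 would also match a player whose recorded values happen to sum to zero, so the direct test is probably the safer one.

```python
import numpy as np
import pandas as pd

cols = ['Morning', 'Afternoon', 'Evening']  # illustrative name, not from the question

players = pd.DataFrame({
    'Player': ['Bill', 'Emma', 'Mike', 'Jill'],
    'Morning': [4, np.nan, np.nan, np.nan],
    'Afternoon': [np.nan, np.nan, 1, np.nan],
    'Evening': [13, np.nan, np.nan, np.nan],
    'Team': ['Red', 'Blue', 'Green', 'Red']
})
# sum() skips NaN by default, so an all-NaN row gets a Total of 0.0
players['Total'] = players[cols].sum(axis=1)

# Direct test: every one of the three columns is NaN
all_missing = players[cols].isna().all(axis=1)
# Indirect test used in the question: Total equals zero
total_zero = players['Total'].eq(0)

print(players.loc[all_missing, 'Player'].tolist())  # ['Emma', 'Jill']
print(bool((all_missing == total_zero).all()))      # True
```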

2 Answers:

Answer 0: (score: 1)

Use DataFrame.combine_first on the rows that satisfy the condition: turn Team into the index, and test for missing values with DataFrame.all.
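
A minimal sketch of the approach this answer describes, assuming the sample data from the question. It is a reconstruction for illustration, not the answerer's own code, and the names cols, mask, and fill are assumptions introduced here. Since fallback is already indexed by team name, the sketch looks up each selected player's team there and relabels the rows before calling combine_first.

```python
import numpy as np
import pandas as pd

cols = ['Morning', 'Afternoon', 'Evening']  # illustrative name

fallback = pd.DataFrame(
    {'Morning': [4, 7, 9], 'Afternoon': [6, np.nan, 1], 'Evening': [13, 9, 2]},
    index=['Red', 'Blue', 'Green'])

players = pd.DataFrame({
    'Player': ['Bill', 'Emma', 'Mike', 'Jill'],
    'Morning': [4, np.nan, np.nan, np.nan],
    'Afternoon': [np.nan, np.nan, 1, np.nan],
    'Evening': [13, np.nan, np.nan, np.nan],
    'Team': ['Red', 'Blue', 'Green', 'Red']
})

# Condition: rows where all three columns are missing
mask = players[cols].isna().all(axis=1)

# Look up each selected player's team in fallback, then relabel the rows
# so they line up with the corresponding rows of players
fill = fallback.reindex(players.loc[mask, 'Team'])
fill.index = players.loc[mask].index

# combine_first keeps existing values and takes fill only where they are NaN
players.loc[mask, cols] = players.loc[mask, cols].combine_first(fill)

print(players)
```

Rows that already hold at least one value (Bill and Mike here) are excluded by mask and stay untouched, and a NaN in fallback (Blue, Afternoon) simply remains NaN in the result.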

Answer 1: (score: 1)

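We can slice out the rows where isna is true for all three columns, reindex fallback by those rows' Team and relabel it with the target row index, and then call update.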

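# Rows where all three time-of-day columns are missing
players2 = players[players[['Morning', 'Afternoon', 'Evening']].isna().all(axis=1)]
# Look up each such player's team in fallback and relabel rows to match players
fill = fallback.reindex(players2['Team']).reset_index(drop=True)
fill.index = players2.index
# update() modifies players in place and ignores NaN values in fill
players.update(fill)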
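
Because update works in place, it can be convenient to wrap the steps above in a function that returns a filled copy and leaves the inputs alone. The following is a sketch under that assumption; fill_from_fallback, COLS, and the key parameter are hypothetical names introduced here, not part of the answer or of pandas.

```python
import numpy as np
import pandas as pd

COLS = ['Morning', 'Afternoon', 'Evening']  # hypothetical constant


def fill_from_fallback(players, fallback, cols=COLS, key='Team'):
    """Hypothetical helper: return a copy of players in which rows that are
    missing every column in cols are filled from fallback, matched on key."""
    result = players.copy()
    # Rows where all of the given columns are NaN
    empty = result[result[cols].isna().all(axis=1)]
    # One fallback row per selected player, relabelled to the players index
    fill = fallback.reindex(empty[key])[cols].reset_index(drop=True)
    fill.index = empty.index
    # update() overwrites in place and skips NaN values in fill
    result.update(fill)
    return result


fallback = pd.DataFrame(
    {'Morning': [4, 7, 9], 'Afternoon': [6, np.nan, 1], 'Evening': [13, 9, 2]},
    index=['Red', 'Blue', 'Green'])

players = pd.DataFrame({
    'Player': ['Bill', 'Emma', 'Mike', 'Jill'],
    'Morning': [4, np.nan, np.nan, np.nan],
    'Afternoon': [np.nan, np.nan, 1, np.nan],
    'Evening': [13, np.nan, np.nan, np.nan],
    'Team': ['Red', 'Blue', 'Green', 'Red']
})

filled = fill_from_fallback(players, fallback)
print(filled)
```

Comparing filled with the outcome frame from the question, for example with pd.testing.assert_frame_equal, is one way to confirm the result.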