Question

I'm trying to write a fillna() call or a lambda function in Pandas that checks whether the 'user_score' column is NaN and, if so, uses the column data from another DataFrame. I tried two options:

games_data['user_score'].fillna(
    genre_score[games_data['genre']]['user_score']
    if np.isnan(games_data['user_score'])
    else games_data['user_score'],
    inplace = True
)

# but this raises 'ValueError: The truth value of a Series is ambiguous'

and

games_data['user_score'] = games_data.apply(
    lambda row: 
    genre_score[row['genre']]['user_score'] 
    if np.isnan(row['user_score'])
    else row['user_score'],
    axis=1
)

# but this raises a 'KeyError' with another column from games_data

My DataFrames:

games_data

genre_score

I would be glad for any help!

Answer 1

You can also call fillna() directly with the user_score_by_genre mapping:

user_score_by_genre = games_data.genre.map(genre_score.user_score)
games_data.user_score = games_data.user_score.fillna(user_score_by_genre)
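
For a concrete picture, here is a minimal sketch with made-up data. It assumes genre_score is indexed by genre name and has a user_score column, which the question does not show explicitly; the sample names and values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real frames are not shown in the question.
games_data = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Sports", "Racing", "Sports", "Puzzle"],
    "user_score": [8.0, np.nan, np.nan, 7.5],
})
# Assumption: genre_score is indexed by genre.
genre_score = pd.DataFrame(
    {"user_score": [6.5, 7.0, 8.2]},
    index=pd.Index(["Sports", "Racing", "Puzzle"], name="genre"),
)

# Look up each row's genre in genre_score; the result shares games_data's index.
user_score_by_genre = games_data["genre"].map(genre_score["user_score"])

# Only NaN entries are replaced; existing scores are kept.
games_data["user_score"] = games_data["user_score"].fillna(user_score_by_genre)
print(games_data["user_score"].tolist())
```

If genre were an ordinary column of genre_score rather than its index, something like genre_score.set_index('genre') would be needed before the map step.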

By the way, if games_data.user_score never deviates from the genre_score values, you can skip fillna() and assign directly to games_data.user_score:

games_data.user_score = games_data.genre.map(genre_score.user_score)
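
Note that this shortcut overwrites every row, not just the missing ones. A small sketch with hypothetical values, again assuming genre_score is indexed by genre, shows the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row has its own score that differs from the genre value.
games_data = pd.DataFrame({
    "genre": ["Sports", "Sports"],
    "user_score": [9.0, np.nan],
})
genre_score = pd.DataFrame({"user_score": [6.5]}, index=["Sports"])

mapped = games_data["genre"].map(genre_score["user_score"])

filled = games_data["user_score"].fillna(mapped)  # keeps the existing 9.0
overwritten = mapped                              # replaces 9.0 with 6.5
print(filled.tolist(), overwritten.tolist())
```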

~~Pandas' built-in Series.where also works and is more concise:~~

~~df1.user_score.where(df1.user_score.isna(), df2.user_score, inplace=True)~~

Answer 2

Use numpy.where:

import numpy as np

df1['user_score'] = np.where(df1['user_score'].isna(), df2['user_score'], df1['user_score'])
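
One caveat worth spelling out: np.where compares arrays position by position and ignores index labels, so df2['user_score'] has to supply one value per row of df1. A sketch with hypothetical frames of equal length:

```python
import numpy as np
import pandas as pd

# Hypothetical frames with the same number of rows, in the same order.
df1 = pd.DataFrame({"user_score": [8.0, np.nan, np.nan]})
df2 = pd.DataFrame({"user_score": [1.0, 2.0, 3.0]})

# Where df1 is NaN take df2's value at the same position, otherwise keep df1's.
df1["user_score"] = np.where(
    df1["user_score"].isna(), df2["user_score"], df1["user_score"]
)
print(df1["user_score"].tolist())
```

If the second frame is a per-genre lookup table, as genre_score appears to be in the question, it would first have to be expanded to one value per row, for example with Series.map.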

Answer 3

I found part of the solution here

I used Series.map:

user_score_by_genre = games_data['genre'].map(genre_score['user_score'])

Then I used @MayankPorwal's answer:

games_data['user_score'] = np.where(games_data['user_score'].isna(), user_score_by_genre, games_data['user_score'])

I'm not sure this is the best approach, but it works for me.

Replacing NaN values in Pandas using fillna() and a lambda function
