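I'm trying to write a fillna() call or a lambda function in Pandas that checks whether the "user_score" column is NaN and, if it is, uses the column data from another DataFrame instead. I tried two options: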
games_data['user_score'].fillna(
    genre_score[games_data['genre']]['user_score']
    if np.isnan(games_data['user_score'])
    else games_data['user_score'],
    inplace=True
)
# but this raises 'ValueError: The truth value of a Series is ambiguous'
and
games_data['user_score'] = games_data.apply(
    lambda row:
        genre_score[row['genre']]['user_score']
        if np.isnan(row['user_score'])
        else row['user_score'],
    axis=1
)
# but this raises a 'KeyError' that mentions another column from games_data
My DataFrames:
games_data
genre_score
I would be grateful for any help!
Answer 0 (score: 2)
You can also map each genre to its score and pass the resulting user_score_by_genre Series directly to fillna():
user_score_by_genre = games_data.genre.map(genre_score.user_score)
games_data.user_score = games_data.user_score.fillna(user_score_by_genre)
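For reference, here is a minimal, self-contained sketch of the same idea. The sample data is hypothetical, since the real frames are not shown in the question, and it assumes that genre_score is indexed by genre and has a user_score column:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real frames.
games_data = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Action", "Sports", "Action", "Puzzle"],
    "user_score": [8.0, np.nan, np.nan, 6.5],
})
# Assumption: genre_score uses the genre as its index.
genre_score = pd.DataFrame(
    {"user_score": [7.1, 6.4, 7.8]},
    index=pd.Index(["Action", "Sports", "Puzzle"], name="genre"),
)

# Look up the per-genre score for every row of games_data.
user_score_by_genre = games_data["genre"].map(genre_score["user_score"])

# Fill only the missing scores; existing scores are left untouched.
games_data["user_score"] = games_data["user_score"].fillna(user_score_by_genre)

print(games_data["user_score"].tolist())  # [8.0, 6.4, 7.1, 6.5]
```

Because map() returns a Series with the same index as games_data, fillna() can line the two up row by row. If genre_score kept the genre in a regular column instead, something like genre_score.set_index('genre') would probably be needed first.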
By the way, if games_data.user_score never deviates from the genre_score values, you can skip fillna() and assign directly to games_data.user_score:
games_data.user_score = games_data.genre.map(genre_score.user_score)
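Pandas' built-in Series.where can also be used and is more concise. Note that where() keeps the values where the condition is True, so the condition has to be notna():
df1['user_score'] = df1['user_score'].where(df1['user_score'].notna(), df2['user_score'])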
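A small sketch of how Series.where and Series.mask treat the condition, using made-up values (s and fallback are hypothetical names, not from the question):

```python
import numpy as np
import pandas as pd

s = pd.Series([8.0, np.nan, 6.5])
fallback = pd.Series([7.0, 6.4, 7.8])

# where(): keep s where the condition is True, take fallback elsewhere.
kept = s.where(s.notna(), fallback)

# mask(): the mirror image, replace s where the condition is True.
masked = s.mask(s.isna(), fallback)

# Passing isna() to where() does the opposite of filling:
# it keeps the NaN and overwrites the valid scores.
inverted = s.where(s.isna(), fallback)

print(kept.tolist())    # [8.0, 6.4, 6.5]
print(masked.tolist())  # [8.0, 6.4, 6.5]
```

Both methods align the replacement on the index, so the fallback Series should share the index of the Series being filled.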
Answer 1 (score: 1)
Use numpy.where:
import numpy as np
df1['user_score'] = np.where(df1['user_score'].isna(), df2['user_score'], df1['user_score'])
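One thing to keep in mind: np.where works on the underlying arrays position by position and ignores index labels, so the replacement column must already have one value per row of df1, in the same order. A minimal sketch with hypothetical data that makes this visible by giving the two frames different indexes:

```python
import numpy as np
import pandas as pd

# Hypothetical frames with deliberately different index labels.
df1 = pd.DataFrame({"user_score": [8.0, np.nan, 6.5]}, index=[10, 11, 12])
df2 = pd.DataFrame({"user_score": [7.0, 6.4, 7.8]}, index=[0, 1, 2])

# np.where pairs the rows by position, not by index label.
df1["user_score"] = np.where(
    df1["user_score"].isna(), df2["user_score"], df1["user_score"]
)

print(df1["user_score"].tolist())  # [8.0, 6.4, 6.5]
```

When the second frame holds one row per category rather than one row per record, the values need to be expanded to the length of df1 first, for example with Series.map.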
Answer 2 (score: 1)
I found part of the solution here.
I used Series.map:
user_score_by_genre = games_data['genre'].map(genre_score['user_score'])
Then I used @MayankPorwal's answer:
games_data['user_score'] = np.where(games_data['user_score'].isna(), user_score_by_genre, games_data['user_score'])
I'm not sure this is the best approach, but it works for me.