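Replacing NaN values in Pandas with fillna() and a lambda function

Date: 2021-03-28 15:35:46

Tags: python pandas

I'm trying to write a fillna() call or a lambda function in Pandas that checks whether the "user_score" column is NaN and, if it is, takes the value from a column in another DataFrame. I tried two options: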

games_data['user_score'].fillna(
    genre_score[games_data['genre']]['user_score']
    if np.isnan(games_data['user_score'])
    else games_data['user_score'],
    inplace = True
)

# but this raises 'ValueError: The truth value of a Series is ambiguous'

games_data['user_score'] = games_data.apply(
    lambda row: 
    genre_score[row['genre']]['user_score'] 
    if np.isnan(row['user_score'])
    else row['user_score'],
    axis=1
)

# but this raises a 'KeyError' involving another column from games_data
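A minimal sketch of where the two errors above likely come from. The sample rows are hypothetical, and it assumes genre_score is indexed by genre (the real frames were only posted as screenshots). np.isnan on a whole column returns a boolean Series, which `if` cannot reduce to a single value, and genre_score[...] with one key looks up a column rather than a row.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the frames in the question.
games_data = pd.DataFrame({
    "genre": ["Action", "Sports"],
    "user_score": [np.nan, 7.0],
})
genre_score = pd.DataFrame({"user_score": [7.1, 6.4]}, index=["Action", "Sports"])

# 1) A conditional expression needs one bool, but np.isnan returns a Series.
try:
    if np.isnan(games_data["user_score"]):
        pass
except ValueError as err:
    first_error = type(err).__name__

# 2) genre_score["Action"] searches the column labels, not the row index.
try:
    genre_score["Action"]["user_score"]
except KeyError as err:
    second_error = type(err).__name__

# Row-wise version that looks the genre up as a row label via .loc.
fixed = games_data.apply(
    lambda row: genre_score.loc[row["genre"], "user_score"]
    if pd.isna(row["user_score"])
    else row["user_score"],
    axis=1,
)
print(first_error, second_error, fixed.tolist())
```

The row-wise apply works but is usually slower than a vectorized lookup on large frames.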

My DataFrames:

games_data

[screenshot: games_data]

genre_score

[screenshot: genre_score]

I would appreciate any help!

3 answers:

Answer 0 (score: 2)

You can also pass the user_score_by_genre mapping directly to fillna():

user_score_by_genre = games_data.genre.map(genre_score.user_score)
games_data.user_score = games_data.user_score.fillna(user_score_by_genre)
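For anyone who wants to try this without the original data, here is a minimal sketch. The sample rows are hypothetical, and it assumes genre_score is indexed by genre, since the real frames were only posted as screenshots.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the screenshots.
games_data = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "genre": ["Action", "Sports", "Action", "Puzzle"],
    "user_score": [8.0, np.nan, np.nan, 6.5],
})
# Assumption: genre_score has one row per genre, with genre as the index.
genre_score = pd.DataFrame(
    {"user_score": [7.1, 6.4, 7.8]},
    index=pd.Index(["Action", "Sports", "Puzzle"], name="genre"),
)

# Look up each row's genre in genre_score; the result shares games_data's index.
user_score_by_genre = games_data["genre"].map(genre_score["user_score"])

# Only NaN entries are replaced; existing scores are kept.
games_data["user_score"] = games_data["user_score"].fillna(user_score_by_genre)
print(games_data["user_score"].tolist())  # [8.0, 6.4, 7.1, 6.5]
```

Because map() returns a Series with the same index as games_data, fillna() can line the replacement values up row by row.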

By the way, if games_data.user_score should never deviate from the genre_score values, you can skip fillna() and assign directly to games_data.user_score:

games_data.user_score = games_data.genre.map(genre_score.user_score)

Pandas' built-in Series.where also works and is a bit more concise:

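# where() keeps values where the condition is True, so test notna(); df2 must share df1's index
df1['user_score'] = df1['user_score'].where(df1['user_score'].notna(), df2['user_score'])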
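A small sketch of how Series.where treats its condition: values are kept where the condition is True and replaced elsewhere, so the condition that fills missing values is notna(). Series.mask is the inverse and reads more naturally with isna(). The two Series below are made-up examples.

```python
import numpy as np
import pandas as pd

scores = pd.Series([8.0, np.nan, 6.5])
fallback = pd.Series([1.0, 2.0, 3.0])  # same index as scores

# Keep existing scores, take the fallback only where the score is missing.
kept = scores.where(scores.notna(), fallback)

# Equivalent formulation: replace where the condition is True.
masked = scores.mask(scores.isna(), fallback)

# Using isna() with where() does the opposite: it keeps only the NaN
# and overwrites every real score with the fallback.
wrong = scores.where(scores.isna(), fallback)

print(kept.tolist())  # [8.0, 2.0, 6.5]
```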

Answer 1 (score: 1)

Use numpy.where:

import numpy as np

df1['user_score'] = np.where(df1['user_score'].isna(), df2['user_score'], df1['user_score'])
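One thing to keep in mind: np.where pairs elements by position and returns a plain NumPy array, so both inputs need the same length and row order. A minimal sketch with made-up frames whose indexes deliberately disagree shows the difference from the label-aligned fillna():

```python
import numpy as np
import pandas as pd

# Hypothetical frames: same length, but df2's index is in a different order.
df1 = pd.DataFrame({"user_score": [8.0, np.nan, 6.5]}, index=[0, 1, 2])
df2 = pd.DataFrame({"user_score": [1.0, 2.0, 3.0]}, index=[1, 2, 0])

# np.where ignores index labels: the NaN in position 1 gets df2's position 1.
by_position = np.where(df1["user_score"].isna(), df2["user_score"], df1["user_score"])

# fillna aligns on labels: the NaN at label 1 gets df2's value at label 1.
by_label = df1["user_score"].fillna(df2["user_score"])

print(by_position.tolist())  # [8.0, 2.0, 6.5]
print(by_label.tolist())     # [8.0, 1.0, 6.5]
```

If the two frames are not already row-aligned, map the lookup values onto df1's rows first.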

Answer 2 (score: 1)

I found part of the solution here

I used Series.map:

user_score_by_genre = games_data['genre'].map(genre_score['user_score'])

Then I used @MayankPorwal's answer:

games_data['user_score'] = np.where(games_data['user_score'].isna(), user_score_by_genre, games_data['user_score'])
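A quick check, on hypothetical data, that this map + np.where combination gives the same values as fillna() with the mapped Series. It also shows that a genre missing from genre_score (here the made-up "Racing") stays NaN either way. It assumes genre_score is indexed by genre.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; "Racing" has no entry in genre_score.
games_data = pd.DataFrame({
    "genre": ["Action", "Sports", "Racing"],
    "user_score": [np.nan, 7.0, np.nan],
})
genre_score = pd.DataFrame({"user_score": [7.1, 6.4]}, index=["Action", "Sports"])

# Per-row lookup; genres not found in genre_score map to NaN.
user_score_by_genre = games_data["genre"].map(genre_score["user_score"])

via_where = np.where(
    games_data["user_score"].isna(),
    user_score_by_genre,
    games_data["user_score"],
)
via_fillna = games_data["user_score"].fillna(user_score_by_genre)

# Both approaches agree, including the NaN left for the unmapped genre.
print(np.allclose(via_where, via_fillna.to_numpy(), equal_nan=True))  # True
```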

I'm not sure this is the best way, but it works for me.