I have a DataFrame, problem_data, that contains NaN values in some cells. I ran the following code:
problem_data[problem_data['level_type'] == 5.0]
The result is:
     problem_id  level_type  points  tags
5     prob_1479         5.0     NaN  NaN
31    prob_2092         5.0     NaN  NaN
38    prob_4395         5.0     NaN  combinatorics,constructive algorithms,dfs
43    prob_5653         5.0     NaN  NaN
48    prob_2735         5.0  2750.0  NaN
52    prob_1054         5.0  2000.0  combinatorics,dp
64    prob_2610         5.0     NaN  NaN
65    prob_1649         5.0     NaN  NaN
70    prob_4675         5.0     NaN  dp,games
74     prob_445         5.0     NaN  NaN
81    prob_6481         5.0  2500.0  combinatorics,dp,implementation,number theory
134   prob_2964         5.0  2500.0  games
161    prob_948         5.0  2000.0  dp,games
182    prob_642         5.0     NaN  NaN
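Then I ran the following command to fill the NaN values in the 'points' column:
problem_data.loc[problem_data['level_type'] == 5.0, 'points'].fillna(value=2500, inplace=True)
When I run problem_data[problem_data['level_type'] == 5.0] again, the output is the same as before.
Can you tell me why fillna does not work here? What should I do to fix it?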
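Answer 0 (score: 1)
fillna with inplace=True has no effect on a sub-slice of a DataFrame. Selecting rows with a boolean mask through .loc returns a new object, so the in-place fill changes that temporary copy and problem_data itself is left untouched. Assign the filled result back to the same selection instead: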
mask = problem_data['level_type'] == 5.0
problem_data.loc[mask, 'points'] = problem_data.loc[mask, 'points'].fillna(value=2500)
problem_data.loc[mask, 'points']
5 2500.0
31 2500.0
38 2500.0
43 2500.0
48 2750.0
52 2000.0
64 2500.0
65 2500.0
70 2500.0
74 2500.0
81 2500.0
134 2500.0
161 2000.0
182 2500.0
Name: points, dtype: float64
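To see both behaviours side by side, here is a minimal self-contained sketch. It assumes a miniature stand-in for problem_data: three rows are taken from the question, while the row with index 99 and problem_id prob_0001 is hypothetical and only there to show that rows outside the mask are not touched. The warnings filter is a precaution, since some pandas versions warn when an in-place method is called on a temporary object.

```python
import warnings

import numpy as np
import pandas as pd

# Miniature stand-in for problem_data. Rows 5, 48 and 52 come from the
# question; row 99 (prob_0001) is a hypothetical row outside the mask.
problem_data = pd.DataFrame(
    {
        "problem_id": ["prob_1479", "prob_2735", "prob_1054", "prob_0001"],
        "level_type": [5.0, 5.0, 5.0, 3.0],
        "points": [np.nan, 2750.0, 2000.0, np.nan],
    },
    index=[5, 48, 52, 99],
)

mask = problem_data["level_type"] == 5.0

# Attempt 1: in-place fill on the selection. The boolean-mask selection is
# a copy, so only that temporary object is filled.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # some pandas versions warn here
    problem_data.loc[mask, "points"].fillna(value=2500, inplace=True)
nan_after_inplace = int(problem_data.loc[mask, "points"].isna().sum())

# Attempt 2: fill the selection and assign the result back.
problem_data.loc[mask, "points"] = problem_data.loc[mask, "points"].fillna(value=2500)
nan_after_assign = int(problem_data.loc[mask, "points"].isna().sum())

print(nan_after_inplace, nan_after_assign)
```

The first count stays above zero because the original frame never changed; the second drops to zero once the result is assigned back. The hypothetical row 99 keeps its NaN in both cases, since it falls outside the mask.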