KeyError exception when using a DataFrame

Asked: 2017-06-15 16:53:11

Tags: python pandas dataframe

I get a KeyError traceback when I try to run the code below. I tried changing the axis in the apply() function after remove_na_scores, but nothing has worked.

import pandas as pd
import pprint

shows = pd.read_csv('/Users/WilliamStevens/Downloads/netflix_shows.csv')

pprint.pprint(shows.head())
shows.info()

shows_df = shows.groupby(['ratingDescription']).mean()
print(shows_df)

missing_user_scores = shows[shows['user rating score'].isnull()]
mean_scores = shows.groupby(['ratingDescription'])['user rating score'].mean()

def remove_na_scores(row):
    if pd.isnull(row['user rating score']):
        return mean_scores[row['rating']]
    else:
        return row['user rating score']

shows['user rating score'] = shows.apply(remove_na_scores)
print(shows['user rating score'])

The full traceback is below:

Traceback (most recent call last):
  File "netflix.py", line 28, in <module>
    shows['user rating score'] = shows.apply(remove_na_scores)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/frame.py", line 4163, in apply
    return self._apply_standard(f, axis, reduce=reduce)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/frame.py", line 4259, in _apply_standard
    results[i] = func(v)
  File "netflix.py", line 23, in remove_na_scores
    if pd.isnull(row['user rating score']):
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/core/series.py", line 601, in __getitem__
    result = self.index.get_value(self, key)
  File "/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/site-packages/pandas/indexes/base.py", line 2169, in get_value
    tz=getattr(series.dtype, 'tz', None))
  File "pandas/index.pyx", line 105, in pandas.index.IndexEngine.get_value (pandas/index.c:3567)
  File "pandas/index.pyx", line 113, in pandas.index.IndexEngine.get_value (pandas/index.c:3250)
  File "pandas/index.pyx", line 163, in pandas.index.IndexEngine.get_loc (pandas/index.c:4373)
KeyError: ('user rating score', 'occurred at index title')

1 answer:

Answer 0 (score: 0)

The error is raised on this line:

if pd.isnull(row['user rating score']):

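This happens because remove_na_scores is applied along the wrong axis. By default, apply uses axis=0 and passes each column to the function as a Series indexed by row labels, so 'user rating score' is not a valid key; the "occurred at index title" part of the message shows that the title column was being processed when the lookup failed. Adding , axis=1 to shows.apply passes each row instead and resolves the problem in the traceback you posted: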

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame(np.random.rand(4, 2), columns=['title', 'user rating score'])
>>> df.apply(lambda x: x['user rating score'])
...
KeyError: ('user rating score', 'occurred at index title')

>>> df.apply(lambda x: x['user rating score'], axis=1)
0    0.083195
1    0.243666
2    0.572457
3    0.885327
dtype: float64
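
To see why the lookup fails, it can help to check what apply actually hands to the function on each axis. This is a minimal sketch on a made-up two-row frame (the titles and scores are invented for illustration, not taken from the Netflix file):

```python
import pandas as pd

# Hypothetical two-row frame; the values are invented for illustration.
df = pd.DataFrame({'title': ['A', 'B'], 'user rating score': [80.0, 95.0]})

# axis=0 (the default): the function receives one column at a time,
# so the Series index holds row labels (0, 1), not column names.
has_key_axis0 = df.apply(lambda s: 'user rating score' in s.index)

# axis=1: the function receives one row at a time,
# so the Series index holds the column names.
has_key_axis1 = df.apply(lambda s: 'user rating score' in s.index, axis=1)

print(has_key_axis0.tolist())  # [False, False]
print(has_key_axis1.tolist())  # [True, True]
```

With the default axis, the column name is never a valid key inside the function, which matches the KeyError in the traceback.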

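Since you mentioned that specifying the axis did not work, I suspect you are getting a different error after that change. If you describe the new problem I can edit my post, but this should resolve the problem as described.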
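
One possible source of a follow-up error, offered as a guess rather than a diagnosis: mean_scores is keyed by ratingDescription, while remove_na_scores looks values up with row['rating'], so the lookup may raise another KeyError once axis=1 is in place. The sketch below assumes that is the case and uses a small made-up frame (all values are invented). It shows the row-wise version with a consistent key, followed by a vectorized alternative built on groupby(...).transform('mean') and fillna, which is a different technique from the apply call in the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for netflix_shows.csv; all values are invented.
shows = pd.DataFrame({
    'title': ['A', 'B', 'C', 'D'],
    'rating': ['PG', 'PG', 'TV-MA', 'TV-MA'],
    'ratingDescription': [60, 60, 110, 110],
    'user rating score': [80.0, np.nan, 90.0, np.nan],
})

mean_scores = shows.groupby('ratingDescription')['user rating score'].mean()

def remove_na_scores(row):
    # Look up the group mean with the same column that was used in groupby.
    if pd.isnull(row['user rating score']):
        return mean_scores[row['ratingDescription']]
    return row['user rating score']

# Row-wise version: axis=1 passes each row to the function.
filled_apply = shows.apply(remove_na_scores, axis=1)

# Vectorized alternative: broadcast each group's mean back onto its rows,
# then use it only where the score is missing.
group_means = shows.groupby('ratingDescription')['user rating score'].transform('mean')
filled_vectorized = shows['user rating score'].fillna(group_means)

print(filled_apply.tolist())       # [80.0, 80.0, 90.0, 90.0]
print(filled_vectorized.tolist())  # [80.0, 80.0, 90.0, 90.0]
```

The vectorized form avoids calling a Python function once per row, which tends to matter on larger frames.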