Error when trying to group by an attribute with pandas in Python

Time: 2019-03-27 13:31:03

Tags: python pandas pandas-groupby recommendation-engine

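I want to build a recommender system and am following a tutorial. I am trying to group by these columns, but I get a series of strange errors and I don't understand why.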

import numpy as np
import pandas as pd
import math
import random
import sklearn

interactions_df = pd.read_csv('C:/Users/Rao/Desktop/Recommender System/users_interactions.csv')
interactions_df.head(3)

print(interactions_df.groupby(['personId', 'contentId']).size().groupby('personId').size())

I want this output:

print (interactions_df.groupby(['personId', 'contentId']).size())
personId  contentId
W         a            1
          b            1
X         a            2
Y         a            2
Z         a            1
          b            1
dtype: int64

But I get:

TypeError                                 Traceback (most recent call last)
C:\Program Files\Anaconda3\lib\site-packages\pandas\indexes\multi.py in get_value(self, series, key)
    617             try:
--> 618                 return _index.get_value_at(s, k)
    619             except IndexError:

pandas\index.pyx in pandas.index.get_value_at (pandas\index.c:2549)()

pandas\src\util.pxd in util.get_value_at (pandas\index.c:15951)()

TypeError: 'str' object cannot be interpreted as an integer

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
<ipython-input-25-22789f9d1e69> in <module>()
----> 1 print(interactions_df.groupby(['personId', 'contentId']).size().groupby('personId').size())
      2 #print (interactions_df.groupby(['personId', 'contentId']).size())
      3 #print (interactions_df.groupby(['personId', 'contentId']).size().groupby('personId').size())

C:\Program Files\Anaconda3\lib\site-packages\pandas\core\generic.py in groupby(self, by, axis, level, as_index, sort, group_keys, squeeze, **kwargs)
   3776         return groupby(self, by=by, axis=axis, level=level, as_index=as_index,
   3777                        sort=sort, group_keys=group_keys, squeeze=squeeze,
-> 3778                        **kwargs)
   3779 
   3780     def asfreq(self, freq, method=None, how=None, normalize=False):


pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:3332)()

pandas\index.pyx in pandas.index.IndexEngine.get_value (pandas\index.c:3035)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'personId'
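
A minimal sketch of one likely cause and workaround, not a confirmed diagnosis. The result of the first `groupby([...]).size()` is a Series in which `personId` and `contentId` are index levels rather than columns. The traceback paths (`pandas\indexes\multi.py`) suggest an older pandas release, where a string passed to `Series.groupby` appears to be looked up as a key in the Series itself, which would explain `KeyError: 'personId'`. Passing `level='personId'` names the index level explicitly. The small DataFrame below is made-up sample data chosen to match the desired output shown above; it only stands in for `users_interactions.csv`.

```python
import pandas as pd

# Hypothetical sample data standing in for users_interactions.csv
# (assumption: only the two columns used in the question matter here).
interactions_df = pd.DataFrame({
    'personId':  ['W', 'W', 'X', 'X', 'Y', 'Y', 'Z', 'Z'],
    'contentId': ['a', 'b', 'a', 'a', 'a', 'a', 'a', 'b'],
})

# Number of interactions per (personId, contentId) pair.
# The result is a Series with a two-level MultiIndex.
pair_counts = interactions_df.groupby(['personId', 'contentId']).size()

# Group the Series by its 'personId' index level to count
# how many distinct contentIds each person interacted with.
items_per_person = pair_counts.groupby(level='personId').size()

# An alternative that skips the intermediate Series entirely.
items_per_person_alt = interactions_df.groupby('personId')['contentId'].nunique()

print(pair_counts)
print(items_per_person)
```

Both `items_per_person` and `items_per_person_alt` give one row per person. Newer pandas releases reportedly also accept an index level name as the `by` argument, so upgrading pandas may make the original call work as written.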

0 answers:

No answers yet