User-by-item matrix in pandas

Time: 2017-01-26 02:08:16

Tags: pandas recommendation-engine

I am working on a recommender system. I followed this to build a user-by-item matrix, but I ran into the error IndexError: index 8928358160 is out of bounds for axis 0 with size 5

Below is a sample of the dataset.

import pandas as pd
import numpy as np

df = pd.read_csv('APRIL.csv')
df = df.drop(['BASKETID'], axis=1)
df = df.head(10)
df
Out[89]:
     MEMBERID       SKU  QTY
0  8928358161  37101163    2
1  8928358161  36618858    1
2  8928358161  40855129    1
3  8933444371  35010078    1
4  8932505053  36335949    1
5  8932505053  92100668    1
6  8932505053  36529730    2
7  8921161362  61814893    1
8  8915688100  34732853    1
9  8915688100  35122457    1


n_users = df.MEMBERID.unique().shape[0]
n_items = df.SKU.unique().shape[0]
print(str(n_users) + ' users')
print(str(n_items) + ' items')
5 users
10 items

ratings = np.zeros((n_users, n_items))
for row in df.itertuples():
    ratings[row[1]-1, row[2]-1] = row[3]
ratings
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-92-0a393963bf4c> in <module>()
      1 ratings = np.zeros((n_users, n_items))
      2 for row in df.itertuples():
----> 3     ratings[row[1]-1, row[2]-1] = row[3]
      4 ratings

IndexError: index 8928358160 is out of bounds for axis 0 with size 5

I still don't understand where index 8928358160 comes from.
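
For reference, the number in the traceback can be traced back to the loop itself. In a tuple produced by `itertuples()`, position 0 is the DataFrame index label, so `row[1]` is the MEMBERID value and `row[1]-1` is 8928358161 - 1 = 8928358160. A minimal sketch, using two of the sample rows above rather than the full APRIL.csv file:

```python
import pandas as pd

# Two rows copied from the sample output in the question.
df = pd.DataFrame({
    "MEMBERID": [8928358161, 8933444371],
    "SKU": [37101163, 35010078],
    "QTY": [2, 1],
})

first = next(df.itertuples())

# first[0] is the index label, first[1] is MEMBERID,
# first[2] is SKU and first[3] is QTY.
bad_row_index = first[1] - 1
print(first)
print(bad_row_index)
```

The raw member ID is therefore being used as a row position in an array that has only 5 rows, which is what NumPy reports as out of bounds.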

1 Answer:

Answer 0: (score: 0)

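Why not convert the values to strings? Even though they are integers, the computer may treat them as numbers in scientific notation, so they end up as floating-point values.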

Try this:

Convert cust_id and item_number from float values to strings:

mergedfinal['cust_id'] = mergedfinal['cust_id'].astype(str)
mergedfinal['item_number'] = mergedfinal['item_number'].astype(str)
mergedfinal['SKU'] = mergedfinal['SKU'].astype(str)

mergedfinal is my DataFrame.
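
Note that casting the IDs to strings does not by itself produce positions that a NumPy array can be indexed with; each ID still has to be mapped to a position in the range 0..n-1. One way to do that is `pandas.factorize`. The following is a minimal sketch, assuming the column names from the question (MEMBERID, SKU, QTY) and the ten sample rows shown there, not the answerer's `mergedfinal` frame:

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "MEMBERID": [8928358161, 8928358161, 8928358161, 8933444371, 8932505053,
                 8932505053, 8932505053, 8921161362, 8915688100, 8915688100],
    "SKU": [37101163, 36618858, 40855129, 35010078, 36335949,
            92100668, 36529730, 61814893, 34732853, 35122457],
    "QTY": [2, 1, 1, 1, 1, 1, 2, 1, 1, 1],
})

# factorize assigns each distinct ID a code 0..n-1 in order of first
# appearance and also returns the distinct IDs themselves.
user_idx, user_ids = pd.factorize(df["MEMBERID"])
item_idx, item_ids = pd.factorize(df["SKU"])

ratings = np.zeros((len(user_ids), len(item_ids)))

# np.add.at sums quantities if the same (user, item) pair occurs more than once.
np.add.at(ratings, (user_idx, item_idx), df["QTY"].to_numpy())

print(ratings.shape)
print(ratings)
```

Row i of `ratings` then corresponds to `user_ids[i]` and column j to `item_ids[j]`, so the original IDs can be recovered from the positions.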