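I am learning to build a user-user collaborative filtering recommender system, and I read implicit data from a MySQL database using Python with MySQL Connector. From the "purchase" data I am trying to create a user × item rating matrix, and to do that I need to pivot the rows (700,000 rows) into columns with pandas. When I run the pivot on the whole DataFrame, I get the following error: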
"ValueError: Unstacked DataFrame is too big, causing int32 overflow"
import mysql.connector
import pandas as pd
import numpy as np
from mysql.connector import errorcode

def readData():
    # Initialise both names so the finally block is safe even if connect() fails
    mySQLConnection = None
    cursor = None
    try:
        mySQLConnection = mysql.connector.connect(host='localhost',
                                                  database='testdb',
                                                  user='user',
                                                  password='pwd')
        cursor = mySQLConnection.cursor(prepared=True)
        sql_select_query = """"""  # Removed the select query
        cursor.execute(sql_select_query)
        record = cursor.fetchall()
        return record
    except mysql.connector.Error as error:
        print("Failed to get record from database: {}".format(error))
    finally:
        # closing database connection.
        if mySQLConnection is not None and mySQLConnection.is_connected():
            if cursor is not None:
                cursor.close()
            mySQLConnection.close()
            print("connection is closed")

data = readData()
df = pd.DataFrame(data,columns=['user_id','product_id','purchase_count'])
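# Pivot purchase_count into a user x product matrix (this is the call that raises the error)
data_pivot = pd.pivot_table(df, index='user_id', columns='product_id', values='purchase_count')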
#print(data_pivot.to_string())
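
To see how large the dense pivot would be, the number of cells can be estimated before calling pivot_table. This is only a sketch: as far as I can tell, the error is raised when rows × columns of the unstacked result no longer fits in a 32-bit integer. The helper name `dense_pivot_cells` and the sample data are made up for illustration; the column names are the ones used above.

```python
import numpy as np
import pandas as pd

# Largest value a signed 32-bit integer can hold
INT32_MAX = np.iinfo(np.int32).max


def dense_pivot_cells(df, index_col="user_id", columns_col="product_id"):
    """Return how many cells a dense pivot of df would contain.

    Hypothetical helper, not part of pandas.
    """
    n_rows = df[index_col].nunique()
    n_cols = df[columns_col].nunique()
    # Plain Python ints, so this product cannot overflow
    return n_rows * n_cols


# Tiny made-up sample standing in for the real purchase data
sample = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "product_id": [10, 11, 10, 12],
    "purchase_count": [2, 1, 5, 1],
})

cells = dense_pivot_cells(sample)
fits_int32 = cells <= INT32_MAX
print(cells, fits_int32)
```

If the result for the real data is above INT32_MAX, a dense pivot of the whole frame is the likely cause of the error.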
python_version: 3.6, OS: Win7, RAM: 16GB, pandas_version: 0.24.2
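
One possible way around the dense pivot, sketched here under assumptions, is to build the user × item matrix as a SciPy sparse matrix, since most users buy only a few products. SciPy is not used in the code above, and `build_sparse_user_item` and the sample data are hypothetical. Note that duplicate (user, product) pairs are summed here, whereas pivot_table averages them by default.

```python
import pandas as pd
from scipy.sparse import csr_matrix


def build_sparse_user_item(df):
    """Build a sparse user x item matrix from long-format purchase data.

    Hypothetical helper; expects columns user_id, product_id, purchase_count.
    """
    # Categorical codes give each user / product a dense 0-based position
    users = df["user_id"].astype("category")
    items = df["product_id"].astype("category")
    rows = users.cat.codes.to_numpy()
    cols = items.cat.codes.to_numpy()
    values = df["purchase_count"].to_numpy()
    # Duplicate (row, col) entries are summed when the matrix is built
    matrix = csr_matrix(
        (values, (rows, cols)),
        shape=(len(users.cat.categories), len(items.cat.categories)),
    )
    return matrix, users.cat.categories, items.cat.categories


# Tiny made-up sample standing in for the real purchase data
sample = pd.DataFrame({
    "user_id": [1, 1, 2, 3],
    "product_id": [10, 11, 10, 12],
    "purchase_count": [2, 1, 5, 1],
})

matrix, user_ids, product_ids = build_sparse_user_item(sample)
print(matrix.shape, matrix.nnz)
```

The returned `user_ids` and `product_ids` map matrix row and column positions back to the original IDs.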