OperationalError: (sqlite3.OperationalError) too many SQL variables when running SQL on a DataFrame

Time: 2018-05-24 10:07:54

Tags: python pandas dataframe

I have a pandas DataFrame like the one below.

       activity         User_Id  \
0  VIEWED MOVIE  158d292ec18a49   
1  VIEWED MOVIE  158d292ec18a49   
2  VIEWED MOVIE  158d292ec18a49   
3  VIEWED MOVIE  158d292ec18a49   
4  VIEWED MOVIE  158e00978d7a6c   

                                         Media_Title Media_Type User_Rating  
0  20th Asian Athletics Championship-2013 Held At...                     NA  
1                                 Tu Majha Saangaati                     NA  
2                                       Home Cooking                     NA  
3                                         Mix Dil Se                     NA  
4                  Value, Virtues, Ethics & Morality                     NA

I am trying to run a SQL query against it with the sqldf function from pandasql, as shown below.

distinct_activity_user = pandasql.sqldf(" select User_Id from pmm_activity", locals())

The error I get is:

OperationalError: (sqlite3.OperationalError) too many SQL variables [SQL: 'INSERT INTO pmm_activity (activity, "User_Id", "Media_Title", "Media_Type", "User_Rating") VALUES

1 Answer:

Answer 0: (score: 0)

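This may be caused by whitespace in the column names; I ran into the same thing when I tried the data you provided. Here is an example using sqlite3 that should solve your problem: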

import sqlite3 as sql
import pandas as pd

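# The path is elided in the original answer; point it at your own CSV.
file = "..../movie.csv"
# The file is semicolon-delimited; read every column as text.
df = pd.read_csv(file, sep=";", dtype=str)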

This is what the DataFrame looks like:

[image: screenshot of the DataFrame]

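# Write the DataFrame to a SQLite file, then query it back with pandas.
conn = sql.connect('movie2.db')
df.to_sql('movie', conn)
# In this CSV the column name has trailing spaces, hence the quoted "User_Id  ".
Movie = pd.read_sql('SELECT DISTINCT "User_Id  " FROM movie', conn)
conn.close()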
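If stray whitespace in the column names really is the culprit, an alternative to quoting the padded name in every query is to strip the names once, before the frame is written to SQL. A minimal sketch, assuming a small made-up frame whose `User_Id  ` column carries two trailing spaces (the sample rows and the padded name are illustrative, not taken from the asker's file):

```python
import pandas as pd

# Hypothetical frame: the second column name carries two trailing spaces.
df = pd.DataFrame({
    "activity": ["VIEWED MOVIE", "VIEWED MOVIE"],
    "User_Id  ": ["158d292ec18a49", "158e00978d7a6c"],
})

# Strip leading and trailing whitespace from every column name.
df.columns = df.columns.str.strip()

print(list(df.columns))
```

After this, the column can be referenced as plain `User_Id` in SQL without quoting.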

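The error text itself comes from SQLite, which caps the number of bound parameters allowed in a single statement, so a multi-row INSERT over a large frame can exceed that cap. Whether that is what happened in the question is an assumption, not something the thread confirms. The sketch below sidesteps pandasql by writing the frame to SQLite directly and asking pandas to insert in small batches; the in-memory database, the `chunksize=2` value, and the two-column sample frame are all illustrative choices:

```python
import sqlite3

import pandas as pd

# Sample rows modeled on the question's output; the real frame is larger.
pmm_activity = pd.DataFrame({
    "activity": ["VIEWED MOVIE"] * 5,
    "User_Id": ["158d292ec18a49"] * 4 + ["158e00978d7a6c"],
})

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# chunksize bounds how many rows go into each insert batch (2 is arbitrary).
pmm_activity.to_sql("pmm_activity", conn, index=False, chunksize=2)

# Same intent as the question's query: the distinct user ids.
distinct_users = pd.read_sql(
    "SELECT DISTINCT User_Id FROM pmm_activity ORDER BY User_Id", conn
)
conn.close()

print(distinct_users)
```

On a real dataset a much larger `chunksize` would normally be used; the point is only that each batch stays bounded.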