Trying to transfer a pandas DataFrame to a MySQL database

Time: 2016-12-07 22:06:03

Tags: mysql pandas dataframe

I am using pandas to parse a JSON file: https://data.cityofnewyork.us/api/views/kpav-sd4t/rows.json?accessType=DOWNLOAD

Everything went well until the last step, where I tried to transfer my data from pandas to SQL.

I ran:

df.to_sql('table', con, chunksize=20000)

but the result was:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1399             else:
-> 1400                 cur.execute(*args)
   1401             return cur

/usr/lib/python3/dist-packages/MySQLdb/cursors.py in execute(self, query, args)
    209                 query = query.decode(db.unicode_literal.charset)
--> 210             query = query % args
    211 

TypeError: not all arguments converted during string formatting

During handling of the above exception, another exception occurred:

DatabaseError                             Traceback (most recent call last)
<ipython-input-25-a485028ed4c0> in <module>()
----> 1 df.to_sql('table', con, chunksize=20000)

/usr/local/lib/python3.5/dist-packages/pandas/core/generic.py in to_sql(self, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
   1199         sql.to_sql(self, name, con, flavor=flavor, schema=schema,
   1200                    if_exists=if_exists, index=index, index_label=index_label,
-> 1201                    chunksize=chunksize, dtype=dtype)
   1202 
   1203     def to_pickle(self, path):

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in to_sql(frame, name, con, flavor, schema, if_exists, index, index_label, chunksize, dtype)
    468     pandas_sql.to_sql(frame, name, if_exists=if_exists, index=index,
    469                       index_label=index_label, schema=schema,
--> 470                       chunksize=chunksize, dtype=dtype)
    471 
    472 

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in to_sql(self, frame, name, if_exists, index, index_label, schema, chunksize, dtype)
   1499                             if_exists=if_exists, index_label=index_label,
   1500                             dtype=dtype)
-> 1501         table.create()
   1502         table.insert(chunksize)
   1503 

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in create(self)
    581 
    582     def create(self):
--> 583         if self.exists():
    584             if self.if_exists == 'fail':
    585                 raise ValueError("Table '%s' already exists." % self.name)

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in exists(self)
    569 
    570     def exists(self):
--> 571         return self.pd_sql.has_table(self.name, self.schema)
    572 
    573     def sql_schema(self):

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in has_table(self, name, schema)
   1511                  "WHERE type='table' AND name=%s;") % wld
   1512 
-> 1513         return len(self.execute(query, [name, ]).fetchall()) > 0
   1514 
   1515     def get_table(self, table_name, schema=None):

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1410             ex = DatabaseError(
   1411                 "Execution failed on sql '%s': %s" % (args[0], exc))
-> 1412             raise_with_traceback(ex)
   1413 
   1414     @staticmethod

/usr/local/lib/python3.5/dist-packages/pandas/compat/__init__.py in raise_with_traceback(exc, traceback)
    337         if traceback == Ellipsis:
    338             _, _, traceback = sys.exc_info()
--> 339         raise exc.with_traceback(traceback)
    340 else:
    341     # this version of raise is a syntax error in Python 3

/usr/local/lib/python3.5/dist-packages/pandas/io/sql.py in execute(self, *args, **kwargs)
   1398                 cur.execute(*args, **kwargs)
   1399             else:
-> 1400                 cur.execute(*args)
   1401             return cur
   1402         except Exception as exc:

/usr/lib/python3/dist-packages/MySQLdb/cursors.py in execute(self, query, args)
    208             if not PY2 and isinstance(query, bytes):
    209                 query = query.decode(db.unicode_literal.charset)
--> 210             query = query % args
    211 
    212         if isinstance(query, unicode):

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

I connect to my server using:
con = mdb.connect(host = 'localhost', 
                      user = 'root', 
                      passwd = 'dwdstudent2015', 
                      charset = 'utf8', use_unicode=True) 

engine = con

I don't understand why it doesn't work.

I have seen other examples, but they did not carry over to my case.

1 Answer:

Answer 0: (score: 1)

The con argument of DataFrame.to_sql can be either a SQLAlchemy engine or an sqlite connection.
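As a minimal sketch of those two accepted forms, the snippet below writes the same frame once through a plain sqlite3 connection and once through a SQLAlchemy engine. It assumes pandas and SQLAlchemy are installed and uses an in-memory SQLite database so that it runs without a MySQL server. The table name demo and the sample rows are made up for illustration and are not taken from the question's dataset.

```python
import sqlite3

import pandas as pd
from sqlalchemy import create_engine

# Made-up sample data, only for illustration.
df = pd.DataFrame({"borough": ["Bronx", "Queens"], "total": [1, 2]})

# Form 1: a plain sqlite3 DBAPI connection (the only raw connection pandas supports).
conn = sqlite3.connect(":memory:")
df.to_sql("demo", conn, index=False)
from_conn = pd.read_sql("SELECT * FROM demo", conn)

# Form 2: a SQLAlchemy engine. SQLite is used here too, but the engine URL
# is where a different backend such as MySQL would be selected.
engine = create_engine("sqlite://")
df.to_sql("demo", engine, index=False)
from_engine = pd.read_sql("SELECT * FROM demo", engine)

print(from_conn)
print(from_engine)
```

Both calls store the same rows. What differs is how pandas decides which SQL dialect to emit.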

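If you are using MySQL (and the MySQLdb Python adapter), then you must connect to it using a SQLAlchemy engine:

import sqlalchemy as SA
# USER, PASSWORD, HOST and DATABASE are placeholders for your own MySQL settings
engine = SA.create_engine('mysql+mysqldb://{u}:{p}@{h}/{d}'.format(
                          u=USER, p=PASSWORD, h=HOST, d=DATABASE))
df.to_sql('table', engine, chunksize=20000)

Note that the SQL statement in the error message refers to sqlite_master because pandas is assuming the connection is a sqlite database. pandas generates sqlite-centric SQL if it is passed a plain connection, and it uses SQLAlchemy to generate SQL if it is passed a SQLAlchemy engine. The MySQLdb cursor fills in parameters with Python's % operator, so the sqlite-style ? placeholder leaves the argument unused and raises the TypeError.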
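The formatting failure can be reproduced without any database. The sketch below uses a hypothetical render helper that stands in for the query % args line shown in the MySQLdb traceback. It is not MySQLdb's actual code, and the SHOW TABLES query is just an illustrative %s-style statement.

```python
def render(query, args):
    """Hypothetical stand-in for MySQLdb's `query = query % args` step."""
    return query % tuple(args)


# sqlite-style query: uses "?" placeholders, which the % operator ignores.
sqlite_style = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"
# %s-style query: the placeholder form the % operator understands.
mysql_style = "SHOW TABLES LIKE %s"

try:
    render(sqlite_style, ["'table'"])
    error_message = None
except TypeError as exc:
    # No "%s" in the query, so the single argument is left over.
    error_message = str(exc)

rendered = render(mysql_style, ["'table'"])

print(error_message)
print(rendered)
```

The first call fails with the same TypeError text that appears in the traceback, while the second call interpolates the argument as expected.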