pandas to_sql raises "Identifier name is too long" error

Date: 2016-04-10 14:34:32

Tags: python mysql pandas identifier

I am running Python 2.7 with the Anaconda package manager. Calling pandas to_sql raises an "Identifier name is too long" error.

The code that raises the error:

def write_to_sql(self, pdata):
    pdata.to_sql(self._tblname, self._db.get_connection(), flavor='mysql',
                 if_exists='replace', index=True, index_label=[COLUMN_ALLIANCERANK, COLUMN_ALLIANCEID,
                 COLUMN_ALLIANCENAME, COLUMN_PLAYERID, COLUMN_NICK, COLUMN_LASTUPDATED])

The input DataFrame pdata has the following layout. Every field except the last numeric one (101, 102, ...) is part of the DataFrame's index.

COLUMN_ALLIANCERANK    ...    COLUMN_LASTUPDATED  
value a1                ...    value x1               101
value a2                ...    value x2               102

Here is the error dump (relevant part only):

  Traceback (most recent call last):
... ...
  File "D:\Workspace\python\lnk\datasourceActivityTrackerChange.py", line 92, in write_to_sql
    COLUMN_ALLIANCENAME, COLUMN_PLAYERID, COLUMN_NICK, COLUMN_LASTUPDATED])  
  File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 1003, in to_sql
    dtype=dtype)
  File "C:\Python27\lib\site-packages\pandas\io\sql.py", line 569, in to_sql
    chunksize=chunksize, dtype=dtype)
  File "C:\Python27\lib\site-packages\pandas\io\sql.py", line 1633, in to_sql
    table.create()
  File "C:\Python27\lib\site-packages\pandas\io\sql.py", line 690, in create
    self._execute_create()
  File "C:\Python27\lib\site-packages\pandas\io\sql.py", line 1400, in _execute_create
    conn.execute(stmt)
  File "C:\Python27\lib\site-packages\MySQLdb\cursors.py", line 205, in execute
    self.errorhandler(self, exc, value)
  File "C:\Python27\lib\site-packages\MySQLdb\connections.py", line 36, in defaulterrorhandler
    raise errorclass, errorvalue
_mysql_exceptions.OperationalError: (1059, "Identifier name 'ix_tbl_us3_activity_tracker_allianceRank_allianceId_allianceName_playerID_nick_lastUpdated' is too long")

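From what I found on internet forums, MySQL limits identifiers to 64 characters. So far I have worked around the error by using if_exists='append' instead of 'replace' and creating the table directly in MySQL, and by shortening the table name and/or the primary/foreign keys passed to to_sql. But this severely limits my flexibility and makes things messier than they should be (I store part of the data in JSON files just to avoid these errors).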
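
To make the limit concrete, here is a minimal sketch that checks identifiers against the 64-character limit mentioned above. The index name is copied from the traceback; the table name is only inferred from that index name, and the helper `find_too_long` is a hypothetical name used for illustration.

```python
# Limit taken from the paragraph above (MySQL identifier length).
MYSQL_MAX_IDENTIFIER_LENGTH = 64


def find_too_long(identifiers, limit=MYSQL_MAX_IDENTIFIER_LENGTH):
    """Return (identifier, length) pairs for names longer than the limit."""
    return [(name, len(name)) for name in identifiers if len(name) > limit]


# The index name comes from the traceback; the table name is an assumption
# inferred from the "ix_<table>_<columns>" shape of that index name.
candidates = [
    "tbl_us3_activity_tracker",
    "ix_tbl_us3_activity_tracker_allianceRank_allianceId_allianceName_playerID_nick_lastUpdated",
]

offenders = find_too_long(candidates)
for name, length in offenders:
    print("%s: %d characters (limit %d)" % (name, length, MYSQL_MAX_IDENTIFIER_LENGTH))
```

Only the generated index name is reported here, which suggests the table and column names themselves are not the problem.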

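My questions are:
1. Is there another way to use if_exists='replace' without being restricted to short table/column names in order to satisfy MySQL's 64-character identifier limit?
2. If there is a better way to achieve this, please share it.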

1 Answer:

Answer 0: (score: 1)

You can look at the source code in .../site-packages/pandas/io/sql.py, which builds the CREATE INDEX statement on the MySQL side:

ix_cols = [cname for cname, _, is_index in column_names_and_types
           if is_index]
if len(ix_cols):
    cnames = "_".join(ix_cols)
    cnames_br = ",".join([escape(c) for c in ix_cols])
    create_stmts.append(
        "CREATE INDEX " + escape("ix_" + self.name + "_" + cnames) +
        "ON " + escape(self.name) + " (" + cnames_br + ")")

IMO you should either create the index yourself in MySQL or make sure the index name does not exceed 64 characters.
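
A rough sketch of the first option, under a few assumptions: the helper names (`predicted_index_name`, `short_index_name`, `create_index_sql`) are hypothetical, the table and column names are inferred from the error message, and the hash-suffix scheme for shortening is just one possible choice, not something pandas or MySQL provides. It reproduces the name pattern from the excerpt above, shortens it to fit the limit, and builds a CREATE INDEX statement you could run yourself.

```python
import hashlib

MYSQL_MAX_IDENTIFIER_LENGTH = 64  # limit cited in the question


def predicted_index_name(table_name, index_columns):
    """Mirror the "ix_" + table + "_" + joined columns pattern from the excerpt."""
    return "ix_" + table_name + "_" + "_".join(index_columns)


def short_index_name(table_name, index_columns, limit=MYSQL_MAX_IDENTIFIER_LENGTH):
    """Return the predicted name, truncated with a hash suffix if it is too long."""
    full = predicted_index_name(table_name, index_columns)
    if len(full) <= limit:
        return full
    # An 8-character digest keeps distinct long names from colliding after truncation.
    digest = hashlib.sha1(full.encode("utf-8")).hexdigest()[:8]
    return full[:limit - len(digest) - 1] + "_" + digest


def quote(identifier):
    """Quote a MySQL identifier with backticks, doubling any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def create_index_sql(table_name, index_columns):
    """Build a CREATE INDEX statement that uses the shortened index name."""
    columns_sql = ", ".join(quote(c) for c in index_columns)
    return "CREATE INDEX %s ON %s (%s)" % (
        quote(short_index_name(table_name, index_columns)),
        quote(table_name),
        columns_sql,
    )


# Names inferred from the error message in the question (assumption).
table = "tbl_us3_activity_tracker"
columns = ["allianceRank", "allianceId", "allianceName",
           "playerID", "nick", "lastUpdated"]

print(len(predicted_index_name(table, columns)))
print(create_index_sql(table, columns))
```

To use it, you would presumably write the frame without a pandas-generated index, for example `pdata.reset_index().to_sql(..., if_exists='replace', index=False)`, and then execute the generated statement on the same connection. I have not tested this against your setup.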