"Unknown column" error when importing a pandas DataFrame into a MySQL database

Time: 2016-03-25 15:14:29

Tags: python pandas sqlalchemy pythonanywhere

I am trying to use pandas and SQLAlchemy to import a CSV file hosted on PythonAnywhere, and then write the DataFrame to a SQL table with the to_sql function.

Every step works except the last one.

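import pandas as pd
from sqlalchemy import create_engine

house_prices = pd.read_csv("twenty_year_change_data.csv")
# The {...} parts of the URL are placeholders for the real credentials
engine = create_engine("mysql+mysqlconnector://{username}:{password}@{hostname}/{databasename}", echo=False)
# Append the rows to the house_prices table (this is the step that fails)
house_prices.to_sql(name='house_prices', con=engine, if_exists='append', index=False)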

The error is:

File "/usr/local/lib/python2.7/dist-packages/mysql/connector/connection.py", line 508, in _handle_result
    raise errors.get_exception(packet)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) 1054 (42S22): Unknown column 'percent_change' in 'field list' u'INSERT INTO hous
e_prices (pcode_district, `1995`, `2015`, percent_change) VALUES (%(pcode_district)s, %(1995)s, %(2015)s, %(percent_change)s)' ({'20
15': 427500.0, '1995': 79700.0, 'pcode_district': 'AL1 1', 'percent_change': 436.38644919999996}, {'2015': 384250.0, '1995': 78125.0
, 'pcode_district': 'AL1 2', 'percent_change': 391.84}, {'2015': 306500.0, '1995': 66500.0, 'pcode_district': 'AL1 3', 'percent_chan
ge': 360.90225560000005}, {'2015': 575000.0, '1995': 98000.0, 'pcode_district': 'AL1 4', 'percent_change': 486.7346938999999}, {'201
5': 365250.0, '1995': 72000.0, 'pcode_district': 'AL1 5', 'percent_change': 407.2916667}, {'2015': 229950.0, '1995': 58500.0, 'pcode
_district': 'AL10 0', 'percent_change': 293.0769231}, {'2015': 245000.0, '1995': 59500.0, 'pcode_district': 'AL10 8', 'percent_change': 311.76470589999997}, {'2015': 270000.0, '1995': 58250.0, 'pcode_district': 'AL10 9', 'percent_change': 363.51931329999996}  ... displaying 10 of 8826 total bound parameter sets ...  {'2015': 28100000.0, '1995': nan, 'pcode_district': 'YO90 1', 'percent_change'
: nan}, {'2015': 210000.0, '1995': 58500.0, 'pcode_district': 'n', 'percent_change': 258.974359})
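For reference, with if_exists='append' pandas inserts into the table that already exists and uses the DataFrame's own column names, so a 1054 error on 'percent_change' suggests the existing house_prices table has no such column. The following is a minimal sketch of how that mismatch could be checked. It makes several assumptions: an in-memory SQLite database stands in for MySQL, the table schema is hypothetical, and the two rows are copied from the parameter sets in the error message.

```python
import sqlite3

import pandas as pd

# Stand-in for the MySQL database: an in-memory SQLite table that already
# exists but has no percent_change column (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE house_prices (pcode_district TEXT, "1995" REAL, "2015" REAL)'
)

# Two rows copied from the bound parameters shown in the error message.
house_prices = pd.DataFrame({
    "pcode_district": ["AL1 1", "AL1 2"],
    "1995": [79700.0, 78125.0],
    "2015": [427500.0, 384250.0],
    "percent_change": [436.38644919999996, 391.84],
})

# Read the existing table's column names back from the database itself.
table_columns = set(
    pd.read_sql_query("SELECT * FROM house_prices LIMIT 0", conn).columns
)

# Columns the DataFrame would try to insert that the table does not have.
missing_in_table = [c for c in house_prices.columns if c not in table_columns]
print(missing_in_table)
```

If the list is non-empty, either the table needs those columns or the DataFrame needs renaming before the append.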

The original CSV can be seen at the following link.

What am I doing wrong?

Update: it looks like I had not defined what the underlying table should look like.

from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
class Postcode(Base):

    __tablename__ = "h_prices"
    id = db.Column(db.Integer, primary_key=True)
    pcode = db.Column(db.String(15))
    price_1995 = db.Column(db.Integer)
    price_2015 = db.Column(db.Integer)
    change = db.Column(db.Numeric)

house_prices = pd.read_csv("twenty_year_change_data.csv")

engine = create_engine(SQLALCHEMY_DATABASE_URI, echo=False, pool_recycle=280)
Base.metadata.create_all(engine)
house_prices.to_sql(name="h_prices", con=engine, if_exists = 'append', index=False)

Now the error is:

File "/usr/local/lib/python2.7/dist-packages/mysql/connector/connection.py", line 508, in _handle_result
    raise errors.get_exception(packet)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) 1054 (42S22): Unknown column 'nan' in 'field list' u'INSERT INTO h_prices (pcode
_district, `1995`, `2015`, percent_change) VALUES (%(pcode_district)s, %(1995)s, %(2015)s, %(percent_change)s)' ({'2015': 427500.0, 
'1995': 79700.0, 'pcode_district': 'AL1 1', 'percent_change': 436.38644919999996}, {'2015': 384250.0, '1995': 78125.0, 'pcode_distri
ct': 'AL1 2', 'percent_change': 391.84}, {'2015': 306500.0, '1995': 66500.0, 'pcode_district': 'AL1 3', 'percent_change': 360.902255
60000005}, {'2015': 575000.0, '1995': 98000.0, 'pcode_district': 'AL1 4', 'percent_change': 486.7346938999999}, {'2015': 365250.0, '
1995': 72000.0, 'pcode_district': 'AL1 5', 'percent_change': 407.2916667}, {'2015': 229950.0, '1995': 58500.0, 'pcode_district': 'AL
10 0', 'percent_change': 293.0769231}, {'2015': 245000.0, '1995': 59500.0, 'pcode_district': 'AL10 8', 'percent_change': 311.7647058
9999997}, {'2015': 270000.0, '1995': 58250.0, 'pcode_district': 'AL10 9', 'percent_change': 363.51931329999996}  ... displaying 10 o
f 8826 total bound parameter sets ...  {'2015': 28100000.0, '1995': nan, 'pcode_district': 'YO90 1', 'percent_change': nan}, {'2015'
: 210000.0, '1995': 58
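One possible reading of this second error, offered as an assumption since the thread has no confirmed answer: the float NaN values visible in the parameter sets (for example the 'YO90 1' row) reach MySQL as the bare token nan, which the server then parses as a column name. Separately, the INSERT still uses the CSV headers (pcode_district, 1995, 2015, percent_change) rather than the names declared on the Postcode model. The sketch below renames the columns to match the model and replaces NaN with None so that missing values are sent as SQL NULL. An in-memory SQLite database stands in for MySQL here, and the two sample rows are taken from the error output.

```python
import sqlite3

import pandas as pd

# Two rows taken from the error output, including the one with NaN values.
house_prices = pd.DataFrame({
    "pcode_district": ["AL1 1", "YO90 1"],
    "1995": [79700.0, float("nan")],
    "2015": [427500.0, 28100000.0],
    "percent_change": [436.38644919999996, float("nan")],
})

# Map the CSV headers to the column names declared on the Postcode model.
renamed = house_prices.rename(columns={
    "pcode_district": "pcode",
    "1995": "price_1995",
    "2015": "price_2015",
    "percent_change": "change",
})

# Cast to object first so that None is not coerced back to NaN in the
# float columns, then replace every missing value with None.
cleaned = renamed.astype(object).where(renamed.notnull(), None)

# SQLite stand-in for the h_prices table created by Base.metadata.create_all.
conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE h_prices (id INTEGER PRIMARY KEY, pcode TEXT, '
    'price_1995 INTEGER, price_2015 INTEGER, "change" NUMERIC)'
)
cleaned.to_sql("h_prices", conn, if_exists="append", index=False)

rows = conn.execute(
    'SELECT pcode, price_1995, "change" FROM h_prices ORDER BY id'
).fetchall()
print(rows)
```

Against the real MySQL engine the same cleaned DataFrame would be passed to to_sql with con=engine. Whether that resolves the error on PythonAnywhere is not something the original post confirms.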

0 answers:

No answers