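I'm trying to use pandas and sqlalchemy to import a CSV file stored on PythonAnywhere, and then use the to_sql function to write the DataFrame to a SQL table.
Every step works except the last one.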
import pandas as pd
from sqlalchemy import create_engine
house_prices = pd.read_csv("twenty_year_change_data.csv")
engine = create_engine("mysql+mysqlconnector://{username}:{password}@{hostname}/{databasename}", echo=False)
house_prices.to_sql(name='house_prices', con=engine, if_exists='append', index=False)
The error is:
File "/usr/local/lib/python2.7/dist-packages/mysql/connector/connection.py", line 508, in _handle_result
raise errors.get_exception(packet)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) 1054 (42S22): Unknown column 'percent_change' in 'field list' u'INSERT INTO hous
e_prices (pcode_district, `1995`, `2015`, percent_change) VALUES (%(pcode_district)s, %(1995)s, %(2015)s, %(percent_change)s)' ({'20
15': 427500.0, '1995': 79700.0, 'pcode_district': 'AL1 1', 'percent_change': 436.38644919999996}, {'2015': 384250.0, '1995': 78125.0
, 'pcode_district': 'AL1 2', 'percent_change': 391.84}, {'2015': 306500.0, '1995': 66500.0, 'pcode_district': 'AL1 3', 'percent_chan
ge': 360.90225560000005}, {'2015': 575000.0, '1995': 98000.0, 'pcode_district': 'AL1 4', 'percent_change': 486.7346938999999}, {'201
5': 365250.0, '1995': 72000.0, 'pcode_district': 'AL1 5', 'percent_change': 407.2916667}, {'2015': 229950.0, '1995': 58500.0, 'pcode
_district': 'AL10 0', 'percent_change': 293.0769231}, {'2015': 245000.0, '1995': 59500.0, 'pcode_district': 'AL10 8', 'percent_change': 311.76470589999997}, {'2015': 270000.0, '1995': 58250.0, 'pcode_district': 'AL10 9', 'percent_change': 363.51931329999996} ... displaying 10 of 8826 total bound parameter sets ... {'2015': 28100000.0, '1995': nan, 'pcode_district': 'YO90 1', 'percent_change'
: nan}, {'2015': 210000.0, '1995': 58500.0, 'pcode_district': 'n', 'percent_change': 258.974359})
The original CSV can be seen at the following link.
What am I doing wrong?
Update: it looks like I hadn't defined what the underlying table should look like.
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
# db is not defined in this snippet; it is assumed to be the Flask-SQLAlchemy object from the app
class Postcode(Base):
    __tablename__ = "h_prices"
    id = db.Column(db.Integer, primary_key=True)
    pcode = db.Column(db.String(15))
    price_1995 = db.Column(db.Integer)
    price_2015 = db.Column(db.Integer)
    change = db.Column(db.Numeric)
house_prices = pd.read_csv("twenty_year_change_data.csv")
engine = create_engine(SQLALCHEMY_DATABASE_URI, echo=False, pool_recycle=280)
Base.metadata.create_all(engine)
house_prices.to_sql(name="h_prices", con=engine, if_exists='append', index=False)
Now the error is:
File "/usr/local/lib/python2.7/dist-packages/mysql/connector/connection.py", line 508, in _handle_result
raise errors.get_exception(packet)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) 1054 (42S22): Unknown column 'nan' in 'field list' u'INSERT INTO h_prices (pcode
_district, `1995`, `2015`, percent_change) VALUES (%(pcode_district)s, %(1995)s, %(2015)s, %(percent_change)s)' ({'2015': 427500.0,
'1995': 79700.0, 'pcode_district': 'AL1 1', 'percent_change': 436.38644919999996}, {'2015': 384250.0, '1995': 78125.0, 'pcode_distri
ct': 'AL1 2', 'percent_change': 391.84}, {'2015': 306500.0, '1995': 66500.0, 'pcode_district': 'AL1 3', 'percent_change': 360.902255
60000005}, {'2015': 575000.0, '1995': 98000.0, 'pcode_district': 'AL1 4', 'percent_change': 486.7346938999999}, {'2015': 365250.0, '
1995': 72000.0, 'pcode_district': 'AL1 5', 'percent_change': 407.2916667}, {'2015': 229950.0, '1995': 58500.0, 'pcode_district': 'AL
10 0', 'percent_change': 293.0769231}, {'2015': 245000.0, '1995': 59500.0, 'pcode_district': 'AL10 8', 'percent_change': 311.7647058
9999997}, {'2015': 270000.0, '1995': 58250.0, 'pcode_district': 'AL10 9', 'percent_change': 363.51931329999996} ... displaying 10 o
f 8826 total bound parameter sets ... {'2015': 28100000.0, '1995': nan, 'pcode_district': 'YO90 1', 'percent_change': nan}, {'2015'
: 210000.0, '1995': 58
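
Two things stand out in the tracebacks above. The INSERT statement uses the CSV headers (pcode_district, 1995, 2015, percent_change) rather than the column names declared on the h_prices model, and some rows carry nan values. The following is only a minimal sketch of one way to line these up before calling to_sql, not a confirmed fix. It assumes an in-memory SQLite database as a stand-in for the MySQL database on PythonAnywhere, a two-row DataFrame built from values shown in the traceback in place of the real CSV, and a hypothetical column_map dictionary. Newer pandas versions may already convert NaN to NULL on insert, so the explicit conversion is shown for illustration.

```python
import sqlite3

import numpy as np
import pandas as pd

# Hypothetical stand-in for twenty_year_change_data.csv, using the column
# names and two of the rows visible in the traceback.
house_prices = pd.DataFrame({
    "pcode_district": ["AL1 1", "YO90 1"],
    "1995": [79700.0, np.nan],
    "2015": [427500.0, 28100000.0],
    "percent_change": [436.38644919999996, np.nan],
})

# Hypothetical mapping from the CSV headers to the model's column names.
column_map = {
    "pcode_district": "pcode",
    "1995": "price_1995",
    "2015": "price_2015",
    "percent_change": "change",
}
renamed = house_prices.rename(columns=column_map)

# Cast to object dtype so NaN can be replaced by None, which the
# database driver sends as SQL NULL.
renamed = renamed.astype(object).where(renamed.notna(), None)

# Assumption: in-memory SQLite stands in for the MySQL database.
# The table mirrors the columns declared on the h_prices model.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE h_prices ("
    "id INTEGER PRIMARY KEY, pcode VARCHAR(15), "
    "price_1995 INTEGER, price_2015 INTEGER, change NUMERIC)"
)

# Append into the existing table; the DataFrame columns now match it.
renamed.to_sql("h_prices", conn, if_exists="append", index=False)

rows = conn.execute(
    "SELECT pcode, price_1995, price_2015, change FROM h_prices ORDER BY id"
).fetchall()
print(rows)
```

With the real setup, the same rename and NaN handling would be applied to the DataFrame read from the CSV, and the SQLAlchemy engine would be passed as con instead of the SQLite connection.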