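sqlalchemy: inserting an HTML table into a MySQL db

Time: 2018-02-20 14:01:25

Tags: python sqlalchemy

I'm new to Python (3) and am trying the following:

I am collecting data from a website with pandas and would like to store the results in a MySQL database, like this: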

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql://python:"+'pw'+"@localhost/test?charset=utf8")

url = r'http://www.boerse-frankfurt.de/devisen'
dfs = pd.read_html(url,header=0,index_col=0,encoding="UTF-8")
devisen = dfs[9] #Select the right table
devisen.to_sql(name='table_fx', con=engine, if_exists='append', index=False)

I get the following error:

....    _mysql.connection.query(self, query)
sqlalchemy.exc.OperationalError: (_mysql_exceptions.OperationalError) (1054, "Unknown column '\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBezeichnung\n\t\t\t\t\t\t\t\n\t\t\t\t' in 'field list'") [SQL: 'INSERT INTO tbl_fx (\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBezeichnung\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tzum Vortag\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tLetzter Stand\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tTageshoch\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tTagestief\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t52-Wochenhoch\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\t52-Wochentief\n\t\t\t\t\t\t\t\n\t\t\t\t\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tDatum\n\t\t\t\t\t\t\t\n\t\t\t\t\nAktionen\t\t\t\t) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s)'] [parameters: (('VAE Dirham', '-0,5421%', 45321.0, 45512.0, 45306.0, 46080.0, 38550.0, '20.02.2018 14:29:00', None), ('Armenischer Dram', '-0,0403%', 5965339.0, 5970149.0, 5961011.0, 6043443.0, 5108265.0, '20.02.2018 01:12:00', None), ....

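How can I get sqlalchemy to insert the data into table_fx? The problem is the column headers, which contain multiple \n and \t characters.

The MySQL table has the following structure: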

(
    name varchar(10) COLLATE utf8_unicode_ci DEFAULT NULL,
    bezeichnung varchar(150) COLLATE utf8_unicode_ci DEFAULT NULL,
    diff_vortag varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
    last double DEFAULT NULL,
    day_high double DEFAULT NULL,
    day_low double DEFAULT NULL,
    52_week_high double DEFAULT NULL,
    52_week_low double DEFAULT NULL,
    date_time varchar(20) COLLATE utf8_unicode_ci DEFAULT NULL,
    unnamed varchar(200) COLLATE utf8_unicode_ci DEFAULT NULL
)

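Any help is very welcome.

Many thanks in advance

Andreas

1 Answer:

Answer 0: (score: 1)

This should do it. Once you have a DataFrame, you can rename its columns before writing to the database. The "dfs" object you are creating is actually a list of DataFrame objects, so you need to pick the one you want first.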

import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("mysql://python:"+'pw'+"@localhost/test?charset=utf8")

url = r'http://www.boerse-frankfurt.de/devisen'
dfs = pd.read_html(url,header=0,index_col=0,encoding="UTF-8")

devisen = dfs[9].dropna(axis=0, thresh=4) # Select right table and make a DF

devisen.columns = devisen.columns.str.strip() # Strip extraneous characters

devisen.to_sql(name='table_fx', con=engine, if_exists='append', index=False)
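
Stripping removes the \n and \t padding, but names such as "zum Vortag" may still not match the columns of the existing table (for example diff_vortag), so a rename step might also be needed before appending. Below is a minimal sketch of that idea. The mapping COLUMN_MAP and the helpers clean_header and to_db_column are hypothetical; the pairing of German headers with table columns is an assumption based only on the order of the columns in the question.

```python
import re

# Hypothetical mapping from the scraped headers to the MySQL column names.
# The pairing is assumed from the column order shown in the question.
COLUMN_MAP = {
    "Bezeichnung": "bezeichnung",
    "zum Vortag": "diff_vortag",
    "Letzter Stand": "last",
    "Tageshoch": "day_high",
    "Tagestief": "day_low",
    "52-Wochenhoch": "52_week_high",
    "52-Wochentief": "52_week_low",
    "Datum": "date_time",
    "Aktionen": "unnamed",
}


def clean_header(raw):
    """Collapse runs of whitespace (\\n, \\t, spaces) and trim both ends."""
    return re.sub(r"\s+", " ", raw).strip()


def to_db_column(raw):
    """Clean a scraped header and translate it to the table's column name.

    Headers without an entry in COLUMN_MAP are returned cleaned but unchanged.
    """
    cleaned = clean_header(raw)
    return COLUMN_MAP.get(cleaned, cleaned)


# Three raw headers in the shape seen in the error message
raw_headers = [
    "\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tBezeichnung\n\t\t\t\t\t\t\t\n\t\t\t\t",
    "\n\t\t\t\t\t\t\t\n\t\t\t\t\t\t\t\tzum Vortag\n\t\t\t\t\t\t\t\n\t\t\t\t",
    "\nAktionen\t\t\t\t",
]
db_columns = [to_db_column(h) for h in raw_headers]
print(db_columns)  # ['bezeichnung', 'diff_vortag', 'unnamed']
```

With the DataFrame from the answer, the same function could be applied as devisen = devisen.rename(columns=to_db_column) before calling to_sql, since DataFrame.rename accepts a callable for columns.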