I am trying to insert data into an Oracle database via pandas to_sql.
from sqlalchemy import create_engine, types
import cx_Oracle
import pandas as pd

# Build the SQLAlchemy connection string; the {placeholders} are filled in
# with .format() below.
oracle_connection_string = ('oracle+cx_oracle://{username}:{password}@' +
                            cx_Oracle.makedsn('{hostname}', '{port}',
                                              service_name='{service_name}'))

engine = create_engine(oracle_connection_string.format(
    username='test',
    password='test',
    hostname='test.server.com',
    port='1521',
    service_name='test.server.net'
))  # Default encoding is utf-8 as I am using Python 3
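# Sanity check I can run first (only a sketch, assuming the account can read the
# nls_database_parameters data dictionary view): confirm the database character
# set, to rule out a server-side encoding issue before the bulk load.
from sqlalchemy import text

with engine.connect() as conn:
    db_charset = conn.execute(text(
        "SELECT value FROM nls_database_parameters "
        "WHERE parameter = 'NLS_CHARACTERSET'"
    )).scalar()
    print(db_charset)  # e.g. AL32UTF8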
df = pd.DataFrame(["Create Buying Opportunity as Underlying ",
                   "strategic optimisation to drive more",
                   "Deliver Growth Without Sacrificing Margins",
                   "while expanding operating margins"], columns=["column1"])
df.column1 = df.column1.str.encode('utf8')
### This is required because I have some non-ASCII characters in my text
### Below is the output of the dataframe df:
column1
0 b'Create Buying Opportunity as Underlying '
1 b'strategic optimisation to drive more'
2 b'Deliver Growth Without Sacrificing Margins'
3 b'while expanding operating margins'
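Just to spell out what the b'...' prefixes above mean: after the .str.encode('utf8') step the column holds Python bytes objects rather than str. A quick check (nothing beyond what the dataframe already shows):

print(df['column1'].map(type).unique())
# [<class 'bytes'>]  -- every cell is a bytes object, not a str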
If I push the data straight to the Oracle DB it works fine; the only challenge is that inserting 145K records with the code below takes forever:
df.to_sql(name='TEST_TABLE', con=engine, if_exists='append', index=False)
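One variation I could also try is writing in chunks so each round trip is smaller (a rough sketch using pandas' chunksize parameter; I have not measured whether it actually helps here):

df.to_sql(name='TEST_TABLE', con=engine, if_exists='append',
          index=False, chunksize=10000)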
Since the plain to_sql call above was taking too long, following suggestions from other users on GitHub I changed it as follows:
dtyp = {c: types.VARCHAR(df[c].str.len().max())
        for c in df.columns[df.dtypes == 'object'].tolist()}
df.to_sql(name='TEST_TABLE', con=engine, if_exists='append', index=False, dtype=dtyp)
Now the data gets inserted into the table as "415041432053656D69636F6E647563746F7" instead of the text "Create Buying Opportunity as Underlying".
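If the earlier .str.encode('utf8') step is what makes the driver bind bytes (and the column then ends up stored as hex), one thing I could try is decoding the column back to str and binding an explicit Unicode type. This is only a sketch of that idea, not something I have verified:

df_str = df.copy()
df_str['column1'] = df_str['column1'].str.decode('utf-8')   # bytes -> str

dtyp = {c: types.NVARCHAR(df_str[c].str.len().max())
        for c in df_str.columns[df_str.dtypes == 'object'].tolist()}

df_str.to_sql(name='TEST_TABLE', con=engine, if_exists='append',
              index=False, dtype=dtyp)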