Python SQLAlchemy types.VARCHAR via DataFrame to_sql inserts alphanumeric strings instead of text

Time: 2019-07-01 16:56:53

Tags: python sql oracle pandas sqlalchemy

I am trying to insert data into an Oracle database using pandas to_sql:

from sqlalchemy import create_engine, types
import cx_Oracle
import pandas as pd

# URL template; the {placeholders} are filled in by .format() below
oracle_connection_string = (
    'oracle+cx_oracle://{username}:{password}@' +
    cx_Oracle.makedsn('{hostname}', '{port}', service_name='{service_name}')
)

engine = create_engine(oracle_connection_string.format(
    username='test',
    password='test',
    hostname='test.server.com',
    port='1521',
    service_name='test.server.net',
))  # Default encoding is UTF-8, as I am using Python 3

df = pd.DataFrame(
    ["Create Buying Opportunity as Underlying ",
     "strategic optimisation to drive more",
     "Deliver Growth Without Sacrificing Margins",
     "while expanding operating margins"],
    columns=["column1"],
)

df.column1 = df.column1.str.encode('utf8')
# This is required because I have some non-ASCII characters in my text

# Below is the output of the DataFrame df:
column1
0   b'Create Buying Opportunity as Underlying '
1   b'strategic optimisation to drive more'
2   b'Deliver Growth Without Sacrificing Margins'
3   b'while expanding operating margins'

If I push the data to the Oracle DB directly, it works fine. The only problem is that inserting 145K records with the code below takes forever:

df.to_sql(name='TEST_TABLE', con=engine, if_exists='append', index=False)

Because the command above takes so long, I made the following change, as suggested by other users on GitHub:

dtyp = {c:types.VARCHAR(df[c].str.len().max())
    for c in df.columns[df.dtypes == 'object'].tolist()}


df.to_sql(name='TEST_TABLE', con=engine, if_exists='append', index=False, dtype=dtyp)
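
For comparison, here is a minimal sketch of the same dtype mapping built without converting the column to bytes. It rests on assumptions not confirmed in this post: that the column can stay as Python str and the driver can bind it directly, and that the VARCHAR length should be sized by UTF-8 byte length. `varchar_dtypes` is a hypothetical helper name, not part of pandas or SQLAlchemy, and the sketch has not been run against an Oracle server.

```python
import pandas as pd
from sqlalchemy import types

df = pd.DataFrame(
    ["Create Buying Opportunity as Underlying ",
     "strategic optimisation to drive more",
     "Deliver Growth Without Sacrificing Margins",
     "while expanding operating margins"],
    columns=["column1"],
)


def varchar_dtypes(frame):
    """Hypothetical helper: map each text column to VARCHAR(longest UTF-8 byte length)."""
    mapping = {}
    for col in frame.columns:
        if pd.api.types.is_string_dtype(frame[col]):
            # Encode only to measure the size; the DataFrame itself keeps str values.
            byte_len = frame[col].str.encode("utf8").str.len().max()
            mapping[col] = types.VARCHAR(int(byte_len))
    return mapping


dtyp = varchar_dtypes(df)
print(dtyp["column1"].length)

# With a live engine (not created in this sketch), the call would be the same as above:
# df.to_sql(name='TEST_TABLE', con=engine, if_exists='append', index=False, dtype=dtyp)
```

The difference from the mapping above is that the values handed to to_sql remain str; the encoded form is used only to pick a column width.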

With this change, the data is inserted into the table as "415041432053656D69636F6E647563746F7" instead of text such as "Create Buying Opportunity as Underlying".
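
The stored value looks like the hexadecimal form of UTF-8 bytes, which can be checked with the standard library alone. The sketch below decodes the reported value (it has an odd number of hex digits, so it appears truncated, and the dangling digit is dropped) and shows that hex-dumping encoded text gives the same kind of string. The decoded text is not one of the four sample rows, so it presumably comes from the real data. One plausible explanation, offered as an assumption rather than a confirmed diagnosis, is that the bytes objects produced by str.encode are bound as binary data and end up rendered as hex in the VARCHAR column.

```python
# Value reported in the question; its odd length suggests it was cut off.
reported = "415041432053656D69636F6E647563746F7"

# Keep only whole bytes (pairs of hex digits), then decode them as UTF-8.
whole_bytes = reported[: len(reported) // 2 * 2]
decoded = bytes.fromhex(whole_bytes).decode("utf8")
print(decoded)

# The reverse direction: encode a sample row and hex-dump it.
sample = "Create Buying Opportunity as Underlying "
as_hex = sample.encode("utf8").hex().upper()
print(as_hex)
```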

0 answers:

No answers yet