Reading the table into a pandas DataFrame with the cx_Oracle connector works fine, for example:
import pandas as pd
import cx_Oracle
conn_str = u'username/password@host:port/service_name'
conn = cx_Oracle.connect(conn_str)
tablequery="select * from largetable where rownum <= 5000000"
pd_df = pd.read_sql(tablequery, conn)
But trying to read the same table into a Dask DataFrame...
import dask.dataframe as dd
sqlalchemy_uri_orcl = "oracle:////username:password@host:port//service_name"
The URI format is taken from here; this is on Windows 10. Then:
dask_df = dd.read_sql_table(table = tablequery, uri = sqlalchemy_uri_orcl, index = "IDX")
The dd call, taken from here, produces the following error:
Error message: DatabaseError: (cx_Oracle.DatabaseError) ORA-12545: Connect failed because target host or object does not exist
Without escaping the '/' in the URI, the error is slightly different:
NoSuchTableError:
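I'm not sure how to pass the cx_Oracle connector to the Dask call, if that is what's needed.
Thanks
Answer 0 (score: 0)
I'm not sure your URI is correct. Other answers use the following, which I think should also work with Dask: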
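import cx_Oracle

# Placeholders: substitute your own connection details
host = 'hostname'
port = 1521  # placeholder; 1521 is Oracle's default listener port
sid = 'sid'
user = 'username'
password = 'password'

# Build the DSN descriptor; a separate name avoids overwriting sid
dsn = cx_Oracle.makedsn(host, port, sid=sid)
uri = 'oracle://{user}:{password}@{dsn}'.format(
    user=user,
    password=password,
    dsn=dsn
)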
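If the database is addressed by service name rather than SID, as in the question, the following is a minimal sketch of one alternative. It assumes SQLAlchemy's `oracle+cx_oracle://user:pass@host:port/?service_name=...` URL form. `build_oracle_uri` and `read_largetable` are hypothetical helpers written for illustration, and the credentials are placeholders. Note also that `dd.read_sql_table` expects a table name rather than a SELECT statement, which is likely the source of the NoSuchTableError, and that the index column is passed as `index_col`.

```python
from urllib.parse import quote_plus


def build_oracle_uri(user, password, host, port, service_name):
    """Build a SQLAlchemy URI for the cx_Oracle dialect using a service name.

    Hypothetical helper, not part of any library. User, password, and
    service name are URL-encoded so characters such as '@' or '/' do not
    break URI parsing.
    """
    template = "oracle+cx_oracle://{user}:{password}@{host}:{port}/?service_name={service}"
    return template.format(
        user=quote_plus(user),
        password=quote_plus(password),
        host=host,
        port=port,
        service=quote_plus(service_name),
    )


def read_largetable(uri):
    """Read the table with Dask; needs dask, sqlalchemy, and cx_Oracle installed."""
    import dask.dataframe as dd  # imported lazily so the URI helper works without Dask

    # Pass the bare table name (not a SELECT statement) and the index column.
    return dd.read_sql_table("largetable", uri, index_col="IDX")


# Placeholder credentials; the password deliberately contains '@' and '/'.
uri = build_oracle_uri("username", "p@ss/word", "host", 1521, "service_name")
print(uri)
```

Calling `read_largetable(uri)` would then attempt the actual read. Dask partitions on the index column, so `IDX` should be an indexed, sortable column in the table.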