I have this code:
import teradata
import dask.dataframe as dd

login = "login"      # placeholder for the real user name
pwd = "password"     # placeholder for the real password

udaExec = teradata.UdaExec(appName="CAF", version="1.0",
                           logConsole=False)
session = udaExec.connect(method="odbc", DSN="Teradata",
                          USEREGIONALSETTINGS='N', username=login,
                          password=pwd, authentication="LDAP")
The connection works.
I want a simple dataframe. I have tried this:
sqlStmt = "SOME SQL STATEMENT"
df = dd.read_sql_table(sqlStmt, session, index_col='id')
I get this error message:
AttributeError: 'UdaExecConnection' object has no attribute '_instantiate_plugins'
Does anyone have a suggestion?
Thanks.
Answer 0 (score: 0)
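read_sql_table expects a SQLAlchemy connection string, not the "session" object you are passing. I have not heard of Teradata being used through SQLAlchemy, but apparently there is at least one dialect package you can install, and there may be other solutions that go through the generic ODBC driver.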
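As a rough illustration of what "a connection string" means here, the sketch below builds a SQLAlchemy-style URI and passes it to read_sql_table together with a table name. The dialect prefix, host name and table name are assumptions made up for the example; the right prefix depends on which Teradata dialect package is installed, and none of this has been tested against a Teradata server.

```python
from urllib.parse import quote


def make_uri(dialect, user, password, host):
    """Build a SQLAlchemy-style URI, percent-encoding the credentials."""
    # safe="" makes quote() escape every reserved character, including "@" and "/"
    return f"{dialect}://{quote(user, safe='')}:{quote(password, safe='')}@{host}"


def load_table(table_name, uri, index_col="id"):
    """Read a whole table lazily; the second argument is a URI string, not a session."""
    import dask.dataframe as dd  # imported here so make_uri works without dask
    return dd.read_sql_table(table_name, uri, index_col=index_col)


# "teradatasql" and "dbhost" are hypothetical placeholders
uri = make_uri("teradatasql", "login", "p@ss word", "dbhost")
```

A call such as load_table("my_table", uri) would then replace the original dd.read_sql_table(sqlStmt, session, ...) line; "my_table" is again only a placeholder.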
However, you may prefer a more direct approach using delayed, for example:
from dask import delayed
import dask.dataframe as dd

# make one statement per partition; boundslist holds (low, high) pairs
statements = [sqlStmt + " where id > {} and id <= {}".format(*bounds)
              for bounds in boundslist]  # I don't know the syntax for Teradata

def get_part(statement):
    # however you make a concrete dataframe from a SQL statement
    udaExec = ...
    session = ...
    df = ...
    return df

# ideally you should provide the meta and divisions info here
df = dd.from_delayed([delayed(get_part)(stm) for stm in statements],
                     meta=..., divisions=...)  # fill these in
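One possible way to produce the per-partition statements and the matching divisions is sketched below. It is a variation on the snippet above rather than the same thing: it uses half-open ranges (low <= id < high, with the last range closed) so that the boundary values can double as dask divisions, and it wraps the base query in a derived table so the base query may already contain its own WHERE clause. The helper names, the "base_q" alias and the example query are assumptions for illustration, and the SQL has not been checked against Teradata.

```python
def partition_statements(base_sql, edges, column="id"):
    """Return (statements, divisions) for the sorted boundary values in `edges`."""
    if len(edges) < 2:
        raise ValueError("edges must contain at least two boundary values")
    pairs = list(zip(edges[:-1], edges[1:]))
    statements = []
    for i, (low, high) in enumerate(pairs):
        # only the final range includes its upper bound
        upper_op = "<=" if i == len(pairs) - 1 else "<"
        statements.append(
            f"SELECT * FROM ({base_sql}) AS base_q "
            f"WHERE {column} >= {low} AND {column} {upper_op} {high}"
        )
    return statements, tuple(edges)


def build_frame(statements, divisions, load_partition, meta):
    """Assemble a lazy dask dataframe; `load_partition` is supplied by the caller."""
    from dask import delayed       # imported here so the helper above
    import dask.dataframe as dd    # can be used without dask installed
    parts = [delayed(load_partition)(stmt) for stmt in statements]
    return dd.from_delayed(parts, meta=meta, divisions=divisions)


statements, divisions = partition_statements(
    "SELECT id, amount FROM sales", [0, 1000, 2000]
)
```

For the divisions to be valid, each partition returned by load_partition (the role get_part plays above) should be indexed by the id column and sorted on it; if that cannot be guaranteed, leaving divisions out is the safer choice.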
We would be glad to hear how you get on.