Question

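Objective:

Using pandas' read_sql_query() and to_sql() methods, my Python 3.7 script is meant to ETL multiple tables from one server to another by reading a .sql file. The connection passed to both methods is built with sqlalchemy's create_engine.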

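Error raised:

After the first sets of tables and transactions were extracted and loaded successfully, the fourth one failed with this error:

    sqlalchemy.exc.DBAPIError: (pyodbc.Error) ('HY090', '[HY090] [Microsoft][ODBC Driver Manager] Invalid string or buffer length (0) (SQLExecDirectW)')

See below for more details.

Process:

Each extracted table is written as one SQL transaction in a SQL file, and the transactions are separated by ; so the script can split on it.

    'ExtractTables.sql'

    SET NOCOUNT ON
    SELECT [ID1]
         , [Name]
         , [LastUpdated]
         , [UpdatedBy]
    INTO #table1
    FROM DB1.dbo.table1
    SELECT * FROM #table1
    ;
    SET NOCOUNT ON
    SELECT [ID1]
         , [ID2]
         , [Descr]
    INTO #table2
    FROM DB1.dbo.table2
    SELECT * FROM #table2

ODBC parameters:

Both servers are MS SQL Server instances. Based on my research, I think the error comes from the fast_executemany parameter.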

    'connection.py'

    import pyodbc
    import urllib.parse
    from sqlalchemy import create_engine

    # Use a trusted connection to connect to each server.
    # fast_executemany is mssql specific. Allows for large data loads.
    params_H = urllib.parse.quote_plus("DRIVER=ODBC Driver 17 for SQL Server;SERVER=SERVER1;DATABASE=DB1;Trusted_Connection=yes")
    engine_H = create_engine(f'mssql+pyodbc:///?odbc_connect={params_H}', fast_executemany=True)

    params_b = urllib.parse.quote_plus("DRIVER=ODBC Driver 17 for SQL Server;SERVER=SERVER2;DATABASE=DB2;Trusted_Connection=yes")
    engine_b = create_engine(f'mssql+pyodbc:///?odbc_connect={params_b}', fast_executemany=True)

The Python script iterates over each command in the SQL file, extracts it with read_sql_query(), and loads each resulting table with to_sql().
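The split-and-iterate step described above can be sketched without a database. This is only an illustration: the helper names split_sql_batches and pair_batches_with_tables are hypothetical and not part of the original script. It drops empty pieces (for example after a trailing ;) and checks that the number of batches matches the number of target tables before anything is loaded.

```python
from typing import List, Tuple


def split_sql_batches(sql_text: str) -> List[str]:
    """Split a SQL script on ';' and drop empty or whitespace-only pieces.

    Naive on purpose: a ';' inside a string literal would also split.
    """
    return [part.strip() for part in sql_text.split(";") if part.strip()]


def pair_batches_with_tables(batches: List[str], tables: List[str]) -> List[Tuple[str, str]]:
    """Pair each target table with its batch, failing early on a count mismatch."""
    if len(batches) != len(tables):
        raise ValueError(
            f"{len(batches)} SQL batches but {len(tables)} target tables"
        )
    return list(zip(tables, batches))


# Shortened stand-in for ExtractTables.sql, with a trailing ';' to show filtering.
script = """
SET NOCOUNT ON SELECT [ID1] INTO #table1 FROM DB1.dbo.table1 SELECT * FROM #table1
;
SET NOCOUNT ON SELECT [ID1] INTO #table2 FROM DB1.dbo.table2 SELECT * FROM #table2
;
"""

pairs = pair_batches_with_tables(
    split_sql_batches(script), ["stg_table1", "stg_table2"]
)
for table_name, batch in pairs:
    print(table_name, "<-", batch[:40])
```

Pairing up front avoids a manual index counter running past the end of the table list when the file holds more batches than there are table names.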

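Error:

For the purposes of StackOverflow I limited my SQL code to 2 batches, but the real file has 4. The first 3 were read and written successfully; the fourth raised the error on write. The main difference between the failing batch and the others is its size (7 million rows x 8 columns); the second largest is 1.5 million rows x 6 columns.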
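To put those sizes in perspective, here is a small arithmetic sketch. It assumes each to_sql() chunk is sent as one batch of chunksize rows, using the chunksize=5000 value from the script. The function name chunk_plan is made up for illustration, and whether the number of values per batch has anything to do with HY090 is not established here.

```python
import math
from typing import Tuple


def chunk_plan(n_rows: int, n_cols: int, chunksize: int) -> Tuple[int, int]:
    """Return (number of chunks, values bound in one full chunk)."""
    n_chunks = math.ceil(n_rows / chunksize)
    values_per_full_chunk = min(n_rows, chunksize) * n_cols
    return n_chunks, values_per_full_chunk


print(chunk_plan(7_000_000, 8, 5000))  # (1400, 40000)  failing table
print(chunk_plan(1_500_000, 6, 5000))  # (300, 30000)   largest table that loaded
```

Under that assumption a single batch of the failing table is only modestly larger than a batch of the table that loaded, so the total row count alone may not explain the failure. The larger table simply has many more chunks in which an awkward value could appear.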

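Troubleshooting:

Everything I have found about this error points to an ODBC connection problem. Both servers are 64-bit, I am using pyodbc 4.0.25, and I also tested extracting only fields that hold integer values. The fact that the first transactions load successfully tells me that most of the setup works, but something in the last one prevents it from loading. I assumed the size would be handled by chunksize=5000, and the error points to the bound parameters.

https://github.com/mkleehammer/pyodbc/issues/548

    'LoadTables.py'

    import pandas as pd
    import connection as conn  # engines are defined in connection.py

    def readSQLFile_makeTables(filename):
        # Open and read the SQL file
        with open(filename, 'r') as open_file:
            sql_file = open_file.read()

        # All SQL commands (split on ';')
        sql_commands = sql_file.split(';')

        # Execute every command from the file and load the result
        sql_tables = ['stg_table1', 'stg_table2', 'stg_table3']
        i = 0
        for command in sql_commands:
            table = pd.read_sql_query(command, con=conn.engine_H)
            print(table)
            table.to_sql(sql_tables[i], con=conn.engine_b, chunksize=5000, index=False, if_exists='append')
            i += 1
        print('think this ran')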
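One way to narrow this down might be to write the large table chunk by chunk and record where the first failure occurs. The sketch below shows only that control flow, with a stand-in writer instead of a real database call. The names iter_chunks, first_failing_chunk and fake_write are hypothetical, and the simulated failure on a None value is just for the demo, not a claim about what triggers HY090.

```python
from typing import Callable, Iterator, Optional, Sequence, Tuple


def iter_chunks(rows: Sequence, chunksize: int) -> Iterator[Tuple[int, Sequence]]:
    """Yield (start offset, slice of rows) for consecutive chunks."""
    for start in range(0, len(rows), chunksize):
        yield start, rows[start:start + chunksize]


def first_failing_chunk(
    rows: Sequence,
    chunksize: int,
    write_chunk: Callable[[Sequence], None],
) -> Optional[Tuple[int, int, Exception]]:
    """Write chunks in order; return (start, stop, error) for the first failure."""
    for start, chunk in iter_chunks(rows, chunksize):
        try:
            write_chunk(chunk)
        except Exception as exc:  # broad on purpose: we only want the location
            return start, start + len(chunk), exc
    return None


def fake_write(chunk: Sequence) -> None:
    """Stand-in for the real load step; fails on a row containing None."""
    if any(row[1] is None for row in chunk):
        raise RuntimeError("simulated driver error")


rows = [(i, "x") for i in range(12)]
rows[7] = (7, None)  # plant one problematic row

result = first_failing_chunk(rows, 5, fake_write)
print(result[:2])  # (5, 10)
```

In the real script, write_chunk would presumably wrap a to_sql() call on a slice of the extracted table. Once the failing range is known, halving the chunk size over that range narrows it further, and rerunning the same range with fast_executemany disabled would show whether that option is involved.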

Traceback:

pyodbc error: 'HY090', Invalid string or buffer length (0)

0 answers: