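I am using Python (version 3.4.4), pandas (version 0.19.1) and SQLAlchemy (version 1.1.4) to read a large SQL table in chunks, preprocess each chunk, and write it to a different SQL table.
Reading chunk by chunk with pd.read_sql_query(verses_sql, conn, chunksize=10), where pd is the pandas import, verses_sql is the SQL query and conn is the DB-API connection, works fine if I do this: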
import pandas as pd
from sqlalchemy import create_engine
engine = create_engine('mssql+pymssql://<username>:<password>@<database>:1433/<FirstTable>')
conn = engine.connect()
verses_sql = '''SELECT [KA_Lang] FROM [dbo].[<FirstTable>]'''
for chunk in pd.read_sql_query(verses_sql, conn, chunksize=10):
    # Replace every character that is not a Latin letter with a space.
    chunk['KA_Lang'] = chunk['KA_Lang'].str.replace(r'[^a-zA-Z\u00C0-\u02AF]', " ")
    # Collapse runs of whitespace into a single space, then lowercase.
    chunk['KA_Lang'] = chunk['KA_Lang'].str.replace(r'\s\s+', " ")
    chunk['KA_Lang'] = chunk['KA_Lang'].str.lower()
    print(chunk['KA_Lang'].head(1))
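To make the preprocessing concrete, here is a minimal sketch that applies the same three steps to a single string with the standard-library re module instead of the pandas .str accessor. The helper name clean is hypothetical, and the sketch assumes that .str.replace treats its pattern as a regular expression, as it does in the pandas version named in the question.

```python
import re


def clean(text):
    """Apply the three steps from the loop body to one string."""
    # Step 1: anything that is not an ASCII letter or a character in
    # the range U+00C0..U+02AF becomes a space.
    text = re.sub(r'[^a-zA-Z\u00C0-\u02AF]', " ", text)
    # Step 2: two or more whitespace characters collapse into one space.
    text = re.sub(r'\s\s+', " ", text)
    # Step 3: lowercase the result.
    return text.lower()


if __name__ == "__main__":
    print(repr(clean("Hello,  World 42!")))  # prints 'hello world '
```

Note that a single trailing space survives, because the second pattern only matches runs of at least two whitespace characters.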
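The problem: if I try to write the preprocessed chunk chunk['KA_Lang'] to a second SQL table, call it SecondTable, only a single chunk of 10 rows is transferred. The iteration stops there. Here is the adapted code: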
import pandas as pd
from sqlalchemy import create_engine
from sqlalchemy import Table, Column, Integer, String, MetaData
engine = create_engine('mssql+pymssql://<username>:<password>@<database>:1433/<FirstTable>')
conn = engine.connect()
verses_sql = '''SELECT [KA_Lang] FROM [dbo].[<FirstTable>]'''
for chunk in pd.read_sql_query(verses_sql, conn, chunksize=10):
    chunk['KA_Lang'] = chunk['KA_Lang'].str.replace(r'[^a-zA-Z\u00C0-\u02AF]', " ")
    chunk['KA_Lang'] = chunk['KA_Lang'].str.replace(r'\s\s+', " ")
    chunk['KA_Lang'] = chunk['KA_Lang'].str.lower()
    print(chunk['KA_Lang'].head(1))
    # Writing on the same connection that is still being read from.
    chunk.to_sql('<SecondTable>', conn, if_exists='append', index=False)
conn.close()
How can I keep reading chunk by chunk from one SQL table and write each chunk to another SQL table when I include chunk.to_sql('<SecondTable>', conn, if_exists= 'append', index= False)?
Answer 0 (score: 1)
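After several days of trying different workarounds, I solved the problem. It is quite simple. To read chunk by chunk from one SQL table and write to another SQL table, you need to define two different connections: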
engine = create_engine('mssql+pymssql://<username>:<password>@<database>:1433/<FirstTable>')
engine1 = create_engine('mssql+pymssql://<username>:<password>@<database>:1433/<FirstTable>')
conn = engine.connect()
conn1 = engine1.connect()
The line that writes chunk to the second table needs to be adapted accordingly:
chunk.to_sql('<SecondTable>', conn1, if_exists= 'append', index= False)
Done!
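The answer does not say why a second connection helps. A plausible explanation, offered here as an assumption, is that the driver cannot keep a partially read result set open while another statement runs on the same connection. The pattern itself (one connection that only reads, a second that only writes) can be sketched without a database server. The example below swaps in the standard-library sqlite3 module and fetchmany for pymssql and pandas, so it illustrates the structure rather than reproducing the original failure. The table names first_table and second_table and the helpers clean, copy_in_chunks and demo are hypothetical, and the WAL journal mode is a SQLite-specific detail that lets a writer commit while a reader is still active.

```python
import os
import re
import sqlite3
import tempfile


def clean(text):
    """Same three preprocessing steps as in the question."""
    text = re.sub(r'[^a-zA-Z\u00C0-\u02AF]', " ", text)
    text = re.sub(r'\s\s+', " ", text)
    return text.lower()


def copy_in_chunks(db_path, chunksize=10):
    """Read first_table on one connection, write second_table on another."""
    read_conn = sqlite3.connect(db_path)   # only used for the SELECT
    write_conn = sqlite3.connect(db_path)  # only used for the INSERTs
    try:
        cursor = read_conn.execute("SELECT KA_Lang FROM first_table")
        chunks = 0
        while True:
            rows = cursor.fetchmany(chunksize)
            if not rows:
                break
            cleaned = [(clean(text),) for (text,) in rows]
            write_conn.executemany(
                "INSERT INTO second_table (KA_Lang) VALUES (?)", cleaned)
            write_conn.commit()
            chunks += 1
        return chunks
    finally:
        read_conn.close()
        write_conn.close()


def demo():
    """Build a throwaway database with 25 rows and copy it in chunks of 10."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    try:
        setup = sqlite3.connect(path)
        # WAL mode (SQLite-specific) allows commits during an open read.
        setup.execute("PRAGMA journal_mode=WAL")
        setup.execute("CREATE TABLE first_table (KA_Lang TEXT)")
        setup.execute("CREATE TABLE second_table (KA_Lang TEXT)")
        setup.executemany("INSERT INTO first_table VALUES (?)",
                          [("Verse %d!" % i,) for i in range(25)])
        setup.commit()
        setup.close()

        chunks = copy_in_chunks(path, chunksize=10)

        check = sqlite3.connect(path)
        copied = check.execute(
            "SELECT COUNT(*) FROM second_table").fetchone()[0]
        check.close()
        return chunks, copied
    finally:
        for suffix in ("", "-wal", "-shm"):
            if os.path.exists(path + suffix):
                os.remove(path + suffix)


if __name__ == "__main__":
    print(demo())
```

With 25 source rows and a chunk size of 10, the loop runs three times and all 25 rows reach the second table. In the pandas version from the answer, conn plays the role of read_conn and conn1 the role of write_conn.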