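What is the most efficient way to insert data dumped from database A into database B? Normally I would use mysqldump
for a task like this, but because the query is complex I had to take a different approach. At the moment I have the following inefficient solution: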
from sqlalchemy import create_engine, Column, INTEGER, CHAR, VARCHAR
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
SessFactory = sessionmaker()
print('## Configure database connections')
db_one = create_engine('mysql://root:pwd1@127.0.0.1/db_one', echo=True).connect()
sess_one = SessFactory(bind=db_one)
db_two = create_engine('mysql://root:pwd2@127.0.0.2/db_two', echo=True).connect()
sess_two = SessFactory(bind=db_two)
## Declare query to dump data
dump_query = (
    'SELECT A.id, A.name, B.address '
    'FROM table_a A JOIN table_b B '
    'ON A.id = B.id_c WHERE '
    'A.deleted = 0'
)
print('## Fetch data on db_one')
data = db_one.execute(dump_query).fetchall()
## Declare table on db_two
class cstm_table(Base):
    __tablename__ = 'cstm_table'
    pk = Column(INTEGER, primary_key=True)
    id = Column(CHAR(36), nullable=False)
    name = Column(VARCHAR(150), default=None)
    address = Column(VARCHAR(150), default=None)
print('## Recreate "cstm_table" on db_two')
cstm_table.__table__.drop(bind=db_two, checkfirst=True)
cstm_table.__table__.create(bind=db_two)
print('## Insert dumped data into the "cstm_table" on db_two')
for row in data:
    insert = cstm_table.__table__.insert().values(row)
    db_two.execute(insert)
This executes 100K inserts one at a time (horrible).
I have also tried:
with db_two.connect() as conn:
    with conn.begin() as trans:
        row_as_dict = [dict(row.items()) for row in data]
        try:
            conn.execute(cstm_table.__table__.insert(), row_as_dict)
        except:
            trans.rollback()
            raise
        else:
            trans.commit()
But after inserting ~20 rows I get the error:
OperationalError: (_mysql_exceptions.OperationalError) (2006, 'MySQL server has gone away')
The following also does the job, but I'm not sure it is the most efficient:
sess_two.add_all([cstm_table(**dict(row.items())) for row in data])
sess_two.flush()
sess_two.commit()
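A possible variation on the second attempt, as a sketch only: error 2006 is often associated with a statement larger than the server's max_allowed_packet or with a connection that timed out, so sending the rows in fixed-size chunks might avoid it. Whether that is the cause here is an assumption. The helper names chunked and insert_in_chunks and the default chunk size of 1000 are hypothetical and untuned; conn is assumed to be a SQLAlchemy Connection and table a Table such as cstm_table.__table__.

```python
from itertools import islice


def chunked(rows, size):
    """Yield successive lists of at most `size` items from `rows`."""
    if size < 1:
        raise ValueError("size must be at least 1")
    it = iter(rows)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def insert_in_chunks(conn, table, rows, chunk_size=1000):
    """Issue one executemany-style INSERT per chunk and return the row count.

    Assumptions: `conn` behaves like a SQLAlchemy Connection, `table` like a
    Table, and `rows` is an iterable of dicts keyed by column name.
    """
    total = 0
    for chunk in chunked(rows, chunk_size):
        # One round trip per chunk instead of one per row or one giant statement.
        conn.execute(table.insert(), chunk)
        total += len(chunk)
    return total
```

Inside the transaction block above, the single conn.execute(...) call would become insert_in_chunks(conn, cstm_table.__table__, row_as_dict). A smaller chunk size keeps each statement smaller at the cost of more round trips.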