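I'm using sqlalchemy to write to a mysql database, in which I index some files and store their contents. I need to write the file first, and then write the index entries, which have a foreign key into the files table. However, sqlalchemy does not seem to issue the INSERT statements in order.
Here is a minimal working example that demonstrates the problem using mock random data (minus the config file, which contains server-specific information):
Index/ORM.py: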
#!/bin/env python2.7
from __future__ import print_function

import os
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.dialects.mysql import LONGBLOB, INTEGER
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker
from sqlalchemy import create_engine

from Index import load_cfg


class Base(object):
    """
    Basic MySQL table settings
    """
    __table_args__ = {
        'mysql_engine': 'InnoDB',
        'mysql_collate': 'latin1_general_cs'
    }


Base = declarative_base(cls=Base)


class CoverageIndex(Base):
    """
    Class for coverage_index table objects
    """
    __tablename__ = 'coverage_index'
    filename = Column(String(45), primary_key=True)
    #filename = Column(String(45), ForeignKey("files.filename"), primary_key=True)
    sequence_id = Column(String(45), primary_key=True, index=True)

    def __init__(self, filename, sequence_id):
        self.filename = filename
        self.sequence_id = sequence_id


class FileRow(Base):
    """
    Class for files stored in db
    """
    __tablename__ = 'files'
    filename = Column(String(45), primary_key=True)
    contents = Column(LONGBLOB)

    def __init__(self, filename, contents):
        self.filename = filename
        self.contents = contents


cfg = load_cfg()
db_string = 'mysql://%(user)s:%(passwd)s@%(host)s/%(db)s' % cfg['db_config']
engine = create_engine(db_string, echo=True)
Base.metadata.create_all(engine)

if __name__ == '__main__':
    pass
index.py:
#!/usr/bin/env python2.7
from __future__ import print_function

import os
import sys
from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine
from sqlalchemy.exc import IntegrityError

from Index.ORM import Base, CoverageIndex, FileRow, engine as db_engine

if __name__ == '__main__':
    import string, random
    data = {}
    for i in range(0,10):
        file = 'file' + str(i)
        data[file] = {
            'seqs': ['seqa' + str(i), 'seqb' + str(i)],
            'contents': '\n'.join([''.join([random.choice(string.letters) for x in range (0, 80)]) for y in range (0, 2500)])}
    #print (data)

    Base.metadata.bind = db_engine
    DBSession = sessionmaker(bind=db_engine)
    session = DBSession()
    for file, datum in data.iteritems():
        file_query = session.query(FileRow).filter(FileRow.filename == file)
        if file_query.count() > 0:
            session.query(CoverageIndex).filter(CoverageIndex.filename == file).delete(synchronize_session='fetch')
            file_query.delete(synchronize_session='fetch')
        for i in datum['seqs']:
            # Write to DB
            fqc = file_query.count()
            print ("No. of files: " + str(fqc))
            if fqc == 0:
                print ("Adding: ")
                fr = FileRow(
                    filename = file,
                    contents = datum['contents']
                )
                session.add(fr)
            cov = CoverageIndex(
                filename = file,
                sequence_id = i)
            session.add(cov)
        try:
            session.commit()
        except:
            #print ("SQL Commit Failed: %s" % file)
            session.rollback()
            session.close()
            raise
    session.close()
Here is part of the output from one run. I'd like to draw your attention to the lines stamped 2018-03-13 16:05:40,291 and ...,292:
...
2018-03-13 16:05:40,287 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2018-03-13 16:05:40,288 INFO sqlalchemy.engine.base.Engine SELECT count(*) AS count_1
FROM (SELECT files.filename AS files_filename, files.contents AS files_contents
FROM files
WHERE files.filename = %s) AS anon_1
2018-03-13 16:05:40,288 INFO sqlalchemy.engine.base.Engine ('file1',)
2018-03-13 16:05:40,290 INFO sqlalchemy.engine.base.Engine SELECT count(*) AS count_1
FROM (SELECT files.filename AS files_filename, files.contents AS files_contents
FROM files
WHERE files.filename = %s) AS anon_1
2018-03-13 16:05:40,290 INFO sqlalchemy.engine.base.Engine ('file1',)
No. of files: 0
Adding:
2018-03-13 16:05:40,291 INFO sqlalchemy.engine.base.Engine INSERT INTO coverage_index (filename, sequence_id) VALUES (%s, %s)
2018-03-13 16:05:40,291 INFO sqlalchemy.engine.base.Engine ('file1', 'seqa1')
2018-03-13 16:05:40,292 INFO sqlalchemy.engine.base.Engine INSERT INTO files (filename, contents) VALUES (%s, %s)
2018-03-13 16:05:40,292 INFO sqlalchemy.engine.base.Engine ('file1', 'BkTsRJTcNEigPFjofFxDmwVZDXRAsPECawRUjiFZTDGWWoLZzLnGlCwQQeAFyXhLqKjPAJmme
mFNfVzF\nJlZSvwGAdoImTnBAmcrSdMRDvxNYnnMfbQXdfuXulqufiIYpqjFUgfElZSrVkvBvPTg ... (204700 characters truncated) ... trwtYOycEOuDTVxsXeGoNYKAqHlE
LGPqcimwzwAFAEsCZGBBnGzYMHgabgnGZaGmQsn\nSNjYvBwSVdXVKbmJpKdSHSXCDKKvDlkyLxOxsEfOtmlCRruqzaiPhYRocKZQEJSVrtSHncFMBMTEpWUX')
2018-03-13 16:05:40,310 INFO sqlalchemy.engine.base.Engine SELECT count(*) AS count_1
FROM (SELECT files.filename AS files_filename, files.contents AS files_contents
FROM files
WHERE files.filename = %s) AS anon_1
2018-03-13 16:05:40,310 INFO sqlalchemy.engine.base.Engine ('file1',)
No. of files: 1
2018-03-13 16:05:40,311 INFO sqlalchemy.engine.base.Engine INSERT INTO coverage_index (filename, sequence_id) VALUES (%s, %s)
2018-03-13 16:05:40,311 INFO sqlalchemy.engine.base.Engine ('file1', 'seqb1')
2018-03-13 16:05:40,312 INFO sqlalchemy.engine.base.Engine COMMIT
...
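Here you can see that sqlalchemy inserts the coverage_index object before it inserts the files row. I think this is because the file object is larger and takes some time to prepare, so the engine decides to run the later INSERT first, asynchronously.
However, the files entry needs to be inserted first, because filename in coverage_index is supposed to be a foreign key to files. (If I do this with the foreign key constraint defined, an exception is thrown.)
I know I could commit after adding to files, but I want the files and coverage_index INSERTs to be in the same transaction so that they stay in sync.
So the question is: is there a way to force sqlalchemy to execute synchronously within a transaction?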
Answer 0 (score: 0)
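Not sure this is the best way, but calling session.flush() seems to achieve what I want. From the documentation for flush():
Flush all the object changes to the database.
Writes out all pending object creations, deletions and modifications to the database as INSERTs, DELETEs, UPDATEs, etc. Operations are automatically ordered by the Session's unit of work dependency solver.
Database operations will be issued in the current transactional context and do not affect the state of the transaction, unless an error occurs, in which case the entire transaction is rolled back. You may flush() as often as you like within a transaction to move changes from Python to the database's transaction buffer.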
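A minimal sketch of where flush() could go: between the two add() calls, so the files row is written before the index row while both remain in a single transaction. This is an illustration, not the exact code from the question. It assumes an in-memory SQLite database instead of the MySQL server, Text in place of LONGBLOB, and a hypothetical record_inserts listener that is only there to make the statement order visible.

```python
from sqlalchemy import Column, String, Text, create_engine, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker

Base = declarative_base()


class FileRow(Base):
    __tablename__ = 'files'
    filename = Column(String(45), primary_key=True)
    contents = Column(Text)  # assumption: Text stands in for LONGBLOB


class CoverageIndex(Base):
    __tablename__ = 'coverage_index'
    filename = Column(String(45), primary_key=True)
    sequence_id = Column(String(45), primary_key=True)


# Assumption: in-memory SQLite so the sketch runs without a server.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

insert_order = []


@event.listens_for(engine, 'before_cursor_execute')
def record_inserts(conn, cursor, statement, parameters, context, executemany):
    # Hypothetical helper: remember each INSERT in the order it is emitted.
    if statement.startswith('INSERT'):
        insert_order.append(statement)


session = sessionmaker(bind=engine)()
try:
    session.add(FileRow(filename='file1', contents='example contents'))
    session.flush()  # emits the files INSERT now; nothing is committed yet
    session.add(CoverageIndex(filename='file1', sequence_id='seqa1'))
    session.commit()  # flushes the index row, then commits both rows together
except Exception:
    session.rollback()
    raise
finally:
    session.close()
```

If the commit fails, the rollback discards the flushed files row as well, so the two tables stay in sync.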
Thanks to:
Is SQLAlchemy saves order in adding objects to session?
http://www.aosabook.org/en/sqlalchemy.html - Section 20.9, Unit of Work
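An alternative that relies on the unit of work itself rather than on manual flushes is to declare the foreign key together with a relationship() and link the objects through it. The session then knows the parent row has to be written first. The sketch below is a hedged illustration, not the original schema: the relationship names index_entries and file are made up here, the default declarative constructor replaces the custom __init__ methods, and in-memory SQLite with Text stands in for MySQL with LONGBLOB.

```python
from sqlalchemy import Column, ForeignKey, String, Text, create_engine, event
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship, sessionmaker

Base = declarative_base()


class FileRow(Base):
    __tablename__ = 'files'
    filename = Column(String(45), primary_key=True)
    contents = Column(Text)  # assumption: Text stands in for LONGBLOB
    # Hypothetical relationship name, not part of the original schema.
    index_entries = relationship('CoverageIndex', back_populates='file')


class CoverageIndex(Base):
    __tablename__ = 'coverage_index'
    filename = Column(String(45), ForeignKey('files.filename'),
                      primary_key=True)
    sequence_id = Column(String(45), primary_key=True)
    file = relationship('FileRow', back_populates='index_entries')


# Assumption: in-memory SQLite so the sketch runs without a server.
engine = create_engine('sqlite://')
Base.metadata.create_all(engine)

insert_order = []


@event.listens_for(engine, 'before_cursor_execute')
def record_inserts(conn, cursor, statement, parameters, context, executemany):
    # Hypothetical helper: remember each INSERT in the order it is emitted.
    if statement.startswith('INSERT'):
        insert_order.append(statement)


session = sessionmaker(bind=engine)()
fr = FileRow(filename='file1', contents='example contents')
# Linking through the relationship is what tells the session about the
# dependency; filename on the index row is filled in from fr at flush time.
cov = CoverageIndex(sequence_id='seqa1', file=fr)
session.add(cov)  # child added first on purpose
session.add(fr)
session.commit()
stored_filename = cov.filename
session.close()
```

Declaring the ForeignKey column alone, without linking the objects through the relationship, should not be expected to give the same ordering guarantee.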