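Calling SQLAlchemy's session.rollback() after hitting a database integrity error (which SQLAlchemy requires) causes all session objects to be discarded. This includes objects created by earlier selects.
Looking at the generated SQL, our query data is caught up in an implicit transaction that is not completed until we call session.close.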
Begin (implicit) transaction
SELECT Data
# Should be end transaction here
# Should Start new Transaction
INSERT New Data
# hit integrity error
rollback
try:
    self.session.add(insertable['cluster'])
    self.session.commit()
except IntegrityError:
    print('Prevented from inserting duplicate to cluster table')
    self.session.rollback()
# insertable is now an empty object and can't be used without repopulating
This means that in the worst case, where we hit an integrity error on every insert, we have to re-select the row IDs for every insert.
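One commonly suggested way to keep a failed insert from taking the whole transaction down with it is to wrap each insert in a SAVEPOINT via session.begin_nested(). The following is only a minimal sketch under assumptions, not the code from this question: the Cluster model is a cut-down hypothetical stand-in with an invented surrogate id column, insert_ignoring_duplicates is a made-up helper name, and an in-memory SQLite database stands in for the real one. It assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Cluster(Base):
    # Hypothetical stand-in for models.Cluster; the id column is invented.
    __tablename__ = "cluster"
    id = Column(Integer, primary_key=True)
    guid = Column(String, unique=True, nullable=False)


def insert_ignoring_duplicates(session, obj):
    """Try to insert obj inside a SAVEPOINT; return True if it was inserted."""
    try:
        with session.begin_nested():  # emits SAVEPOINT
            session.add(obj)
            session.flush()  # send the INSERT now so a duplicate fails here
    except IntegrityError:
        # Only the SAVEPOINT is rolled back; the outer transaction stays open.
        return False
    return True


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

first = insert_ignoring_duplicates(session, Cluster(guid="abc"))
second = insert_ignoring_duplicates(session, Cluster(guid="abc"))  # duplicate guid
session.commit()
count = session.query(Cluster).count()
```

Whether SAVEPOINT is available, and how much it costs per insert, depends on the database and driver in use, so this would need checking against the real backend.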
We tried adding a session.commit after the select:
def find_entry(self, inpath):
    inpath = inpath.rstrip('/')
    inpath = '/'.join(inpath.split('/')[:-1])
    entry = self.session.query(
        models.Gather).filter_by(path=inpath).first()
    self.session.commit()
    return entry
But this doesn't actually commit the session transaction, so when we roll back we still lose the data.
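For performance reasons, we have a set of code that selects data from the database, takes row IDs from two tables in that DB, and then uses the selected row IDs as foreign keys when inserting.
This is mainly so that we don't have to query for every insert.
The problem is that if we hit a constraint or integrity error on insert, we have to do a session.rollback. We found that this session.rollback is killing our earlier queries, even though logically they should be in a different transaction.
Besides the data we selected, if we have objects that were inserted successfully, we want to reference their IDs, and these are also dropped after session.rollback.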
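To make the goal above concrete, here is a hedged sketch of two ways the already-fetched IDs might be kept usable across a rollback: copying the ID into a plain Python value, and detaching the object from the session with session.expunge() before the risky insert. This is an illustration under assumptions rather than a confirmed fix for the code in this post: the Gather model is a two-column hypothetical stand-in, the values "g1", "g2" and "/a/b" are invented, and an in-memory SQLite database is used. It assumes SQLAlchemy 1.4 or later.

```python
from sqlalchemy import Column, String, create_engine
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Gather(Base):
    # Hypothetical stand-in for models.Gather with only two columns.
    __tablename__ = "gather"
    gather_id = Column(String, primary_key=True)
    path = Column(String, unique=True)


engine = create_engine("sqlite://")  # in-memory database, for illustration only
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Seed one row so there is something to select.
session.add(Gather(gather_id="g1", path="/a/b"))
session.commit()

entry = session.query(Gather).filter_by(path="/a/b").first()

# Option 1: copy the ID into a plain string, which the session does not track.
saved_id = entry.gather_id

# Option 2: detach the loaded object so the session no longer manages it.
session.expunge(entry)

hit_integrity_error = False
try:
    session.add(Gather(gather_id="g2", path="/a/b"))  # duplicate path
    session.commit()
except IntegrityError:
    hit_integrity_error = True
    session.rollback()

# Both the plain copy and the detached object still carry the ID.
id_from_detached = entry.gather_id
```

A detached object only carries the attributes that were already loaded when it was expunged, so this fits read-only lookups such as foreign key IDs better than objects that still need to be updated.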
from sqlalchemy.exc import IntegrityError
from sqlalchemy.orm import sessionmaker

# `models` is the project's own module of mapped classes; its import is not shown.


class DBInserter:
    def __init__(self):
        # connection info here
        Session = sessionmaker(bind=self.engine)
        self.session = Session()

    def __del__(self):
        self.session.close()
        self.engine.dispose()

    def find_entry(self, inpath):
        # Look up the Gather row for the parent directory of inpath.
        inpath = inpath.rstrip('/')
        inpath = '/'.join(inpath.split('/')[:-1])
        entry = self.session.query(
            models.Gather).filter_by(path=inpath).first()
        return entry

    def build_insertable(self, jsonin, gather, lnn):
        """
        Build sqlalchemy object that is ready to be inserted into the db
        """
        object_dict = {}
        cluster = models.Cluster(
            guid=jsonin.get('guid'),
        )
        gather_id = None
        if gather:
            gather_id = gather.gather_id
            cluster.gather_id = gather_id
            gather.cluster_guid = jsonin.get('guid')
            gather.cluster_name = jsonin.get('name')
            object_dict['gather'] = gather
        node_gather = models.NodeGather(
            gather_id=gather_id,
            lnn=lnn,
            checksum=jsonin.get('checksum'),
            checksum_valid=jsonin.get('checksum_valid'),
            compliance=jsonin.get('compliance'),
            encoding=jsonin.get('encoding'),
            joinmode=jsonin.get('joinmode'),
            master=jsonin.get('master'),
            maxid=jsonin.get('maxid'),
            timezone=jsonin.get('timezone'),
        )
        object_dict['cluster'] = cluster
        object_dict['node_gather'] = node_gather
        return object_dict

    def insert(self, insertable):
        """
        insert prepared sqlalchemy object into the db
        """
        # Not doing batch inserts until we get single case to work properly.
        try:
            self.session.add(insertable['cluster'])
            self.session.commit()
        except IntegrityError:
            print('Prevented from inserting duplicate to cluster table')
            self.session.rollback()
        if insertable.get('gather'):
            try:
                self.session.add(insertable.get('gather'))
                self.session.commit()
            except IntegrityError:
                print('Prevented from inserting duplicate to gather table')
                self.session.rollback()

    def __call__(self, jsonin, path):
        # The trailing number of the last path component is the node's lnn.
        path = path.rstrip('/')
        lnn = int(path.split('/')[-1].split('-')[-1])
        out = self.build_insertable(jsonin, self.find_entry(path), lnn)
        return out
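For readers following the path handling in find_entry and __call__, this small standalone sketch repeats the same string operations outside the class. The function names parent_path and parse_lnn and the leaf name node-3 are invented for illustration; only the parent directory matches the path seen in the log.

```python
def parent_path(inpath):
    """Return inpath without its last component, as find_entry computes it."""
    inpath = inpath.rstrip('/')
    return '/'.join(inpath.split('/')[:-1])


def parse_lnn(path):
    """Return the integer after the last '-' in the final path component."""
    path = path.rstrip('/')
    return int(path.split('/')[-1].split('-')[-1])


# Hypothetical input; the leaf directory name 'node-3' is an assumption.
example = '/mnt/logs/REALPAGE/2015-12-14-005/node-3/'
gather_path = parent_path(example)  # the value used in filter_by(path=...)
lnn = parse_lnn(example)
```

With this input, gather_path is the parent directory that the SELECT filters on, and lnn is the trailing number of the leaf directory.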
2016-03-08 01:38:52,438 INFO sqlalchemy.engine.base.Engine SELECT gather.gather_id AS gather_gather_id, gather.cluster_guid AS gather_cluster_guid, gather.path AS gather_path, gather.cluster_name AS gather_cluster_name, gather.gather_date AS gather_gather_date, gather.unfurl_start AS gather_unfurl_start, gather.unfurl_end AS gather_unfurl_end, gather.upload_date AS gather_upload_date, gather.source_lnn AS gather_source_lnn, gather.last_full AS gather_last_full, gather.path_exists AS gather_path_exists, gather.type AS gather_type
FROM gather
WHERE gather.path = %(path_1)s
LIMIT %(param_1)s
2016-03-08 01:38:52,438 INFO sqlalchemy.engine.base.Engine {'param_1': 1, 'path_1': '/mnt/logs/REALPAGE/2015-12-14-005'}
2016-03-08 01:38:52,440 INFO sqlalchemy.engine.base.Engine COMMIT
2016-03-08 01:38:52,441 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2016-03-08 01:38:52,441 INFO sqlalchemy.engine.base.Engine SELECT gather.gather_id AS gather_gather_id, gather.cluster_guid AS gather_cluster_guid, gather.path AS gather_path, gather.cluster_name AS gather_cluster_name, gather.gather_date AS gather_gather_date, gather.unfurl_start AS gather_unfurl_start, gather.unfurl_end AS gather_unfurl_end, gather.upload_date AS gather_upload_date, gather.source_lnn AS gather_source_lnn, gather.last_full AS gather_last_full, gather.path_exists AS gather_path_exists, gather.type AS gather_type
FROM gather
WHERE gather.gather_id = %(param_1)s
2016-03-08 01:38:52,441 INFO sqlalchemy.engine.base.Engine {'param_1': 'd284c3f7983f94bac95e024038820f05475feddb2f24aec2cb52d42c343194dd'}
2016-03-08 01:38:52,444 INFO sqlalchemy.engine.base.Engine INSERT INTO cluster (guid, site_id) VALUES (%(guid)s, %(site_id)s)
2016-03-08 01:38:52,444 INFO sqlalchemy.engine.base.Engine {'guid': '00074309e06ace7817523b06c7cbf76f7c08', 'site_id': None}
2016-03-08 01:38:52,445 INFO sqlalchemy.engine.base.Engine ROLLBACK
Note that there is no commit between the select and the rollback.
HALP.