Bulk update a database using values from dictionaries

Time: 2018-04-03 08:22:18

Tags: python sqlalchemy flask-sqlalchemy

I am trying to do a bulk update, but it fails. This is what my table looks like:

class Hashes(db.Model):
    __tablename__ = 'Hashes'
    id = db.Column(db.Integer, primary_key=True)
    hash_val = db.Column(db.String(1024), unique=True)
    hash_salt = db.Column(db.String(256))
    hash_plain = db.Column(db.String(256))

The objects come as a list of dictionaries:

[
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'hash_plain': '287742velornesjo'},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'hash_plain': 'Mister King'},
    ...
]

This is how I try to save them to the DB:

dbobjects = [
    Hashes(hash_val=x['hash_val'], hash_plain=x['hash_plain']) for x in hash_good
]
db.session.bulk_insert_objects(dbobjects, update_changed_only=True)
db.session.commit()

The error I get is an IntegrityError, and judging by the full traceback, no UPDATE statements are generated at all, only INSERTs. How can I fix this?

Traceback (most recent call last)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1997, in __call__
return self.wsgi_app(environ, start_response)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1985, in wsgi_app
response = self.handle_exception(e)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1540, in handle_exception
reraise(exc_type, exc_value, tb)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1982, in wsgi_app
response = self.full_dispatch_request()
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1614, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1517, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/home/user/envs/project/lib/python2.7/site-packages/flask/app.py", line 1612, in full_dispatch_request
rv = self.dispatch_request()
File "/home/user/envs/project/lib/python2.7/site-packages/flask_debugtoolbar/__init__.py", line 125, in dispatch_request
return view_func(**req.view_args)
File "/app/views.py", line 317, in upload
db.session.bulk_save_objects(dbobjects, update_changed_only=True)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/scoping.py", line 153, in do
return getattr(self.registry(), name)(*args, **kwargs)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2461, in bulk_save_objects
return_defaults, update_changed_only, False)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2625, in _bulk_save_mappings
transaction.rollback(_capture_exception=True)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 66, in __exit__
compat.reraise(exc_type, exc_value, exc_tb)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 2620, in _bulk_save_mappings
isstates, return_defaults, render_nulls)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 69, in _bulk_insert
bookkeeping=return_defaults)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 830, in _emit_insert_statements
execute(statement, multiparams)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 948, in execute
return meth(self, multiparams, params)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 269, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1060, in _execute_clauseelement
compiled_sql, distilled_params
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1200, in _execute_context
context)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1413, in _handle_dbapi_exception
exc_info
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 203, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb, cause=cause)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1170, in _execute_context
context)
File "/home/user/envs/project/lib/python2.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 105, in do_executemany
rowcount = cursor.executemany(statement, parameters)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/cursors.py", line 192, in executemany
self._get_db().encoding)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/cursors.py", line 229, in _do_execute_many
rows += self.execute(sql + postfix)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/cursors.py", line 165, in execute
result = self._query(query)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/cursors.py", line 321, in _query
conn.query(q)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/connections.py", line 860, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/connections.py", line 1061, in _read_query_result
result.read()
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/connections.py", line 1349, in read
first_packet = self.connection._read_packet()
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/connections.py", line 1018, in _read_packet
packet.check_error()
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/connections.py", line 384, in check_error
err.raise_mysql_exception(self._data)
File "/home/user/envs/project/lib/python2.7/site-packages/pymysql/err.py", line 107, in raise_mysql_exception
raise errorclass(errno, errval)
IntegrityError: (pymysql.err.IntegrityError) (1062, u"Duplicate entry '40350254ba198f1efcc9f8dc042fd15b' for key 'hash_val'") [SQL: u'INSERT INTO `Hashes` (hash_val, hash_plain) VALUES (%(hash_val)s, %(hash_plain)s)'] [parameters: ({'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'hash_plain': '287742velornesjo'}, {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'hash_plain': 'Mister King'}, {'hash_val': '8ecd03e1fe66c8ee543ff048298af20c', 'hash_plain': 'Farringdon456@'}, {'hash_val': '932d08bfbf87d40ed903f7629a8b3afe', 'hash_plain': '2chilledwater'}, {'hash_val': '8eb47a40e5fdc0e7818d32e0c7fba5b9', 'hash_plain': '1867327El10'}, {'hash_val': '06516de2e5a707621fa6d847e667ce84', 'hash_plain': 'wisky_chocolat$'}, {'hash_val': '7fd1a5943d5b47d98190fc9888131669', 'hash_plain': 'apolloniaime'}, {'hash_val': '93060afe8e7d49dcab6f12c3a9dd146a', 'hash_plain': 'maronmilan89'} ... displaying 10 of 33 total bound parameter sets ... {'hash_val': '14a356c3c4c3775ab183ea24c6a527ce', 'hash_plain': 'dik14424905'}, {'hash_val': 'a2abb9bdff9c730888e64129b89d36aa', 'hash_plain': 'imago2798'})] (Background on this error at: http://sqlalche.me/e/gkpj)

2 Answers:

Answer 0 (score: 1)

Following the suggestion Ilia E. made in the comments, I used the following solution:

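from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

engine = create_engine(app.config['SQLALCHEMY_DATABASE_URI'], echo=False)
DBSession = scoped_session(sessionmaker())
DBSession.remove()
DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
# Each dictionary in z must include the primary key ('id') of the row to update.
DBSession.bulk_update_mappings(
    Hashes,
    z
)
DBSession.commit()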

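'z' is my new list of dictionaries. One problem I ran into here is that SQLAlchemy needs the primary key, so I had to prepare two lists of dictionaries and merge them. The first is the already familiar one: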

[
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'hash_plain': '287742velornesjo'},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'hash_plain': 'Mister King'},
]

The second one, produced by

ids = hashesindb = db.session.query(Hashes.id, Hashes.hash_val).filter(Hashes.hash_val.in_([x['hash_val'] for x in hash_good])).all()

gives the following values:

[
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'id': '10'},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'id': '10'},

]
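Note that query(...).all() with column arguments returns tuple-like rows rather than dictionaries, so a conversion step is implied here. A minimal sketch of that step, where `rows` is a hypothetical stand-in for the query result and the id values are made up for illustration:

```python
# Hypothetical stand-in for the query result: (id, hash_val) rows.
# The id values below are invented for this example.
rows = [
    (10, '40350254ba198f1efcc9f8dc042fd15b'),
    (11, 'a75b1ef3e16f0a5cae736e48137d7c8b'),
]

# Turn each row into a dictionary keyed by column name.
ids = [{'id': row_id, 'hash_val': hash_val} for row_id, hash_val in rows]
```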

Then merge them:

def merge_lists(l1, l2, key):
    # https://mmxgroup.net/2012/04/12/merging-python-list-of-dictionaries-based-on-specific-key/
    """ returns new list with dictionaries merged from l1 and l2 using key as common value """
    merged = {}
    for item in l1+l2:
        if item[key] in merged:
            merged[item[key]].update(item)
        else:
            merged[item[key]] = item
    return [val for (_, val) in merged.items()]
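Since only the id has to be attached to each dictionary, a plain lookup table might do the same job. The sketch below is an alternative to merge_lists, not the function used above; `attach_ids` is a hypothetical name and the id values are made up. Unlike merge_lists it leaves the input dictionaries unmodified and drops entries that have no matching row, which is probably what a list passed to bulk_update_mappings needs.

```python
def attach_ids(values, id_rows, key='hash_val'):
    """Return copies of the dicts in `values` with the matching 'id' added.

    Hypothetical helper: entries whose key has no match in `id_rows`
    are skipped, because they have no primary key to update by.
    """
    id_by_key = {row[key]: row['id'] for row in id_rows}
    return [
        dict(item, id=id_by_key[item[key]])
        for item in values
        if item[key] in id_by_key
    ]


hash_good = [
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'hash_plain': '287742velornesjo'},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'hash_plain': 'Mister King'},
]
# Made-up ids for illustration only.
ids = [
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'id': 10},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'id': 11},
]

z = attach_ids(hash_good, ids)
```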

That's it. The UPDATE statement works, though it can probably be improved further (performance-wise).

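EDIT: Again following Illja's comments and great support, here is a cleaner and faster solution that requires neither querying the IDs nor merging dictionaries. These are the dictionaries: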

[
    {'b_hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'b_hash_plain': '287742velornesjo'},
    {'b_hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'b_hash_plain': 'Mister King'},
]

And here is the clean UPDATE statement:

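from sqlalchemy import bindparam, create_engine

engine = create_engine(app.config['SQLALCHEMY_DATABASE_URI'], echo=False)
conn = engine.connect()
# The bind parameter names match the keys of the dictionaries in hash_good.
stmt = Hashes.__table__.update().\
    where(Hashes.hash_val == bindparam('b_hash_val')).\
    values({
        'hash_plain': bindparam('b_hash_plain')
    })
# Passing a list of dictionaries runs the UPDATE once per dictionary (executemany).
conn.execute(stmt, hash_good)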
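To see what this amounts to at the database level, here is a rough self-contained sketch. It makes two assumptions that are not part of my setup: it uses the standard-library sqlite3 module and an in-memory database instead of MySQL, and it writes the SQL by hand instead of building it with SQLAlchemy. The idea is the same: one parameterized UPDATE matched on the unique hash_val column, executed once per dictionary.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the real MySQL table.
conn = sqlite3.connect(':memory:')
conn.execute(
    'CREATE TABLE Hashes ('
    'id INTEGER PRIMARY KEY, '
    'hash_val VARCHAR(1024) UNIQUE, '
    'hash_salt VARCHAR(256), '
    'hash_plain VARCHAR(256))'
)
# Rows that already exist, without a plaintext value yet.
conn.executemany(
    'INSERT INTO Hashes (hash_val) VALUES (?)',
    [
        ('40350254ba198f1efcc9f8dc042fd15b',),
        ('a75b1ef3e16f0a5cae736e48137d7c8b',),
    ],
)

hash_good = [
    {'b_hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'b_hash_plain': '287742velornesjo'},
    {'b_hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'b_hash_plain': 'Mister King'},
]

# One UPDATE per dictionary, matched on hash_val, so no id is needed.
conn.executemany(
    'UPDATE Hashes SET hash_plain = :b_hash_plain WHERE hash_val = :b_hash_val',
    hash_good,
)
conn.commit()

result = conn.execute(
    'SELECT hash_val, hash_plain FROM Hashes ORDER BY id'
).fetchall()
```

The 'b_' prefix on the dictionary keys keeps the bind parameter names distinct from the column names.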

Answer 1 (score: 0)

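From what I can see in the SQLAlchemy documentation, bulk_insert_objects() does not try to distinguish between existing and new objects. That is why you get the duplicate-key error for the hashes in your mapping: your table definition says hash_val must be unique.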

Perhaps you can use bulk_save_objects() instead; as far as I can tell from the documentation, it generates the appropriate INSERT / UPDATE statements as needed:

http://docs.sqlalchemy.org/en/latest/orm/session_api.html#sqlalchemy.orm.session.Session.bulk_insert_mappings

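Also note that the original problem seems to lie in your mapping itself. It does not seem to make sense for it to contain duplicate entries. If the hash_vals are equal, then either the hash_plains are equal too (and you should not update the hash at all), or the hash algorithm is broken and produces the same hash_val for different hash_plains.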
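One way to act on this, as a rough sketch under assumptions: separate the incoming dictionaries whose hash_val is already stored from the genuinely new ones, so that only the new ones are inserted. `split_new_and_existing` is a hypothetical helper, and `existing` is assumed to be a set of hash_val values already fetched from the table.

```python
def split_new_and_existing(items, existing, key='hash_val'):
    """Split `items` into (new, known) by whether item[key] is in `existing`.

    Hypothetical helper; `existing` is assumed to hold the hash_val
    values that are already present in the table.
    """
    new, known = [], []
    for item in items:
        if item[key] in existing:
            known.append(item)
        else:
            new.append(item)
    return new, known


hash_good = [
    {'hash_val': '40350254ba198f1efcc9f8dc042fd15b', 'hash_plain': '287742velornesjo'},
    {'hash_val': 'a75b1ef3e16f0a5cae736e48137d7c8b', 'hash_plain': 'Mister King'},
]
# Assume only the first hash is already stored.
existing = {'40350254ba198f1efcc9f8dc042fd15b'}

new, known = split_new_and_existing(hash_good, existing)
```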