Slow bulk_save_objects in SQLAlchemy

Time: 2016-12-23 07:53:08

Tags: python postgresql sqlalchemy

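I'm running into a very slow insert: I'm inserting 223 rows and it takes more than 20 seconds to execute. Any suggestions as to what I'm doing wrong and why it is so slow? I'm using PostgreSQL 9.4.8.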

Here is the table schema:

                                   Table "public.trial_locations"
   Column    |          Type          |                          Modifiers                           
-------------+------------------------+--------------------------------------------------------------
 id          | integer                | not null default nextval('trial_locations_id_seq'::regclass)
 status      | character varying(255) | 
 trial_id    | integer                | 
 location_id | integer                | 
 active      | boolean                | 
Indexes:
    "trial_locations_pkey" PRIMARY KEY, btree (id)
    "trial_locations_unique_1" UNIQUE CONSTRAINT, btree (trial_id, location_id)
Foreign-key constraints:
    "trial_locations_location_id_fkey" FOREIGN KEY (location_id) REFERENCES locations(id)
    "trial_locations_trial_id_fkey" FOREIGN KEY (trial_id) REFERENCES trials(id)

The code:

for key, unique_new_location in unique_locations_hash.iteritems():
    trial_location_inserts.append(TrialLocations(
        trial_id=current_trial.id,
        location_id=unique_new_location['location_id'],
        active=True,
        status=unique_new_location['status'],
    ))

LOG_OUTPUT('==========PRE BULK==========', True)
db_session.bulk_save_objects(trial_location_inserts)
LOG_OUTPUT('==========POST BULK==========', True)

Here is the log from SQLAlchemy echo:

2016-12-23 07:37:52.570: ==========PRE BULK==========
2016-12-22 23:37:52,572 INFO sqlalchemy.engine.base.Engine INSERT INTO trial_locations (status, trial_id, location_id, active) VALUES (%(status)s, %(trial_id)s, %(location_id)s, %(active)s)
2016-12-22 23:37:52,572 INFO sqlalchemy.engine.base.Engine ({'status': u'Completed', 'active': True, 'location_id': 733, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 716, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1033, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1548, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1283, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1556, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 4271, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1567, 'trial_id': 126625}  ... displaying 10 of 223 total bound parameter sets ...  {'status': u'Completed', 'active': True, 'location_id': 1528, 'trial_id': 126625}, {'status': u'Completed', 'active': True, 'location_id': 1529, 'trial_id': 126625})
2016-12-23 07:38:14.270: ==========POST BULK==========
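For scale, the two LOG_OUTPUT timestamps above can be turned into a per-row cost. The sketch below only does that arithmetic; the `elapsed_seconds` helper and the `FMT` constant are names made up for illustration, and the row count comes from the "223 total bound parameter sets" line in the log.

```python
from datetime import datetime

# Format of the LOG_OUTPUT timestamps shown in the log above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Return the seconds between two LOG_OUTPUT-style timestamps."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


ROWS = 223  # "displaying 10 of 223 total bound parameter sets"

# PRE BULK and POST BULK markers around bulk_save_objects.
bulk_total = elapsed_seconds("2016-12-23 07:37:52.570", "2016-12-23 07:38:14.270")
per_row_ms = bulk_total / ROWS * 1000

print("bulk_save_objects: %.3f s total, %.1f ms per row" % (bulk_total, per_row_ms))
```

That works out to about 21.7 seconds in total, or roughly 97 ms per row. One possible reading, which is an assumption and not something the log proves: the `%(status)s` parameter style suggests psycopg2, whose `executemany` sends the statement once per parameter set, so a per-row cost in this range would be consistent with one network round trip per row to a distant database rather than with overhead inside SQLAlchemy itself.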

Edit:

Also, for comparison, I rewrote it in SQLAlchemy Core:

if len(trial_location_inserts) > 0:
    LOG_OUTPUT('==========PRE BULK==========', True)
    engine.execute(
        TrialLocations.__table__.insert().values(
            trial_location_core_inserts
        )
    )
    # db_session.bulk_save_objects(trial_location_inserts)
    LOG_OUTPUT('==========POST BULK==========', True)

It runs in under a second; the timestamps below are 0.928 seconds apart:

2016-12-23 08:11:26.097: ==========PRE BULK==========
...
2016-12-23 08:11:27.025: ==========POST BULK==========

I would like to keep this in the session for the sake of transactions, but if Core is the only way, I guess that's what it has to be.
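Using a Core statement does not have to mean leaving the session: `Session.execute()` accepts Core statements and runs them on the session's own connection and transaction. The following is a minimal sketch of that idea, not a drop-in fix. It assumes SQLAlchemy 1.4 or later, swaps in an in-memory SQLite database so it can run anywhere, uses a simplified `TrialLocations` model without the foreign keys, and the `insert_rows` helper and sample rows are hypothetical.

```python
from sqlalchemy import Boolean, Column, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class TrialLocations(Base):
    # Simplified stand-in for the table in the question (no foreign keys).
    __tablename__ = "trial_locations"
    id = Column(Integer, primary_key=True)
    status = Column(String(255))
    trial_id = Column(Integer)
    location_id = Column(Integer)
    active = Column(Boolean)


# Assumption: in-memory SQLite instead of the PostgreSQL server in the question.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
db_session = sessionmaker(bind=engine)()

# Plain dicts, one per row (hypothetical sample values).
rows = [
    {"trial_id": 126625, "location_id": loc, "status": "Completed", "active": True}
    for loc in (733, 716, 1033)
]


def insert_rows(session, rows):
    """Emit one multi-row INSERT inside the session's current transaction."""
    if rows:
        session.execute(TrialLocations.__table__.insert().values(rows))


# The insert takes part in the session's transaction, so rollback undoes it.
insert_rows(db_session, rows)
db_session.rollback()
count_after_rollback = db_session.query(TrialLocations).count()

# The same insert followed by commit persists the rows.
insert_rows(db_session, rows)
db_session.commit()
count_after_commit = db_session.query(TrialLocations).count()

print(count_after_rollback, count_after_commit)
```

The trade-off is that rows inserted this way bypass the unit of work: no `TrialLocations` objects are created or tracked, so anything that needs ORM instances would have to query them back afterwards.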

Any help is appreciated!

0 answers:

No answers yet.