Django pickle.dumps(model.query) hits the db

Time: 2016-07-27 07:52:23

Tags: python django

I am trying to pickle a Django Query object so that I can save it in Redis.

materials = Material.objects.prefetch_related('tags_applied').prefetch_related('materialdata_set').prefetch_related('source')
materials_ids = MaterialData.objects.filter(tag_id__in=tags).values_list('material_id', flat=True)
materials = materials.filter(pk__in=materials_ids)
key_name = SAMPLES_UUID + ':' + str(redis_uuid)
redis_cl.set_key(key_name, pickle.dumps(materials.query))
redis_cl.expire(key_name, SAMPLES_TIMEOUT)

This is the trace from debug_panel (I use lazy pagination). The source query is:

  

SELECT "san_material"."id", "san_material"."created_at",
  "san_material"."title", "san_material"."author", "san_material"."url",
  "san_material"."publication_datetime", "san_material"."text",
  "san_material"."size", "san_material"."source_id",
  "san_material"."material_type", "san_material"."updated_at",
  "san_material"."status", "san_material"."elastic_sync",
  "san_material"."tokens", "san_material"."detection_datetime",
  "san_material"."article_title",
  "san_material"."publication_datetime_article",
  "san_material"."author_article", "san_material"."highlight_data"
FROM "san_material"
WHERE ("san_material"."detection_datetime" BETWEEN
  '2016-07-01T00:00:00+03:00'::timestamptz AND
  '2016-07-27T10:39:00+03:00'::timestamptz
  AND "san_material"."id" IN
  (SELECT U0."material_id" FROM "san_materialdata" U0 WHERE U0."tag_id" IN (660)))
ORDER BY "san_material"."detection_datetime" DESC LIMIT 51

But it is the subquery that hits the db:

  

SELECT U0."material_id" FROM "san_materialdata" U0 WHERE U0."tag_id" IN (660)

Here:

/home/maxx/analize/san/utils.py in wrapper(82)
  result = method_to_decorate(*args, **kwds)
/home/maxx/analize/san/views/flux.py in flux(111)
  redis_cl.set_key(key_name, pickle.dumps(materials.query))
/usr/lib/python2.7/pickle.py in dumps(1393)
  Pickler(file, protocol).dump(obj)
/usr/lib/python2.7/pickle.py in dump(225)
  self.save(obj)
/usr/lib/python2.7/pickle.py in save(333)
  self.save_reduce(obj=obj, *rv)
/usr/lib/python2.7/pickle.py in save_reduce(421)
  save(state)
/usr/lib/python2.7/pickle.py in save(288)
  f(self, obj) # Call unbound method with explicit self
/usr/lib/python2.7/pickle.py in save_dict(657)
  self._batch_setitems(obj.iteritems())
/usr/lib/python2.7/pickle.py in _batch_setitems(675)
  save(v)
/usr/lib/python2.7/pickle.py in save(333)
  self.save_reduce(obj=obj, *rv)
/usr/lib/python2.7/pickle.py in save_reduce(421)
  save(state)
/usr/lib/python2.7/pickle.py in save(288)
  f(self, obj) # Call unbound method with explicit self
/usr/lib/python2.7/pickle.py in save_dict(657)
  self._batch_setitems(obj.iteritems())
/usr/lib/python2.7/pickle.py in _batch_setitems(675)
  save(v)
/usr/lib/python2.7/pickle.py in save(288)
  f(self, obj) # Call unbound method with explicit self
/usr/lib/python2.7/pickle.py in save_list(604)
  self._batch_appends(iter(obj))
/usr/lib/python2.7/pickle.py in _batch_appends(620)
  save(x)
/usr/lib/python2.7/pickle.py in save(333)
  self.save_reduce(obj=obj, *rv)
/usr/lib/python2.7/pickle.py in save_reduce(421)
  save(state)
/usr/lib/python2.7/pickle.py in save(288)
  f(self, obj) # Call unbound method with explicit self
/usr/lib/python2.7/pickle.py in save_dict(657)
  self._batch_setitems(obj.iteritems())
/usr/lib/python2.7/pickle.py in _batch_setitems(675)
  save(v)
/usr/lib/python2.7/pickle.py in save(308)
  rv = reduce(self.proto)
/home/maxx/venv/analize/lib/python2.7/copy_reg.py in _reduce_ex(84)
  dict = getstate()

How can I fix this?

P.S. I measured the time taken to save each argument in def _batch_setitems:

('Save obj time:', 2.5215649604797363, 'arg:', 'rhs')
('Save obj time:', 2.5219039916992188, 'arg:', 'children')
('Save obj time:', 2.5219550132751465, 'arg:', 'where')

That is 3 times 2.5 seconds. Why?

1 answer:

Answer 0 (score: 1)

Django queries are lazy, but let me explain what you have written:

materials = Material.objects.prefetch_related('tags_applied'
    ).prefetch_related('materialdata_set').prefetch_related('source')


materials_ids = MaterialData.objects.filter(tag_id__in=tags).values_list('material_id', flat=True)

# Up to here materials_ids is still a lazy QuerySet, so it has not hit the DB.
# The next line uses materials_ids as a nested query; the DB is hit as soon as
# that nested QuerySet is evaluated (in the traceback, inside pickle.dumps).

materials = materials.filter(pk__in=materials_ids)

# So you can avoid hitting the DB if you do not need to use materials_ids
key_name = SAMPLES_UUID + ':' + str(redis_uuid)
redis_cl.set_key(key_name, pickle.dumps(materials.query))
redis_cl.expire(key_name, SAMPLES_TIMEOUT)
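
The traceback in the question ends in `dict = getstate()`, which suggests the database hit comes from pickling the nested QuerySet held inside materials.query (under where, children, rhs); Django documents that pickling a QuerySet loads all of its results first. Below is a minimal sketch of that mechanism in plain Python, not Django's own code: LazyRows and OuterQuery are hypothetical stand-ins for the inner QuerySet and the outer Query, and the lambda plays the role of the database.

```python
import pickle


class LazyRows(object):
    """Hypothetical stand-in for a lazy QuerySet: rows load on first use."""

    def __init__(self, loader):
        self._loader = loader   # callable that plays the role of the DB query
        self._cache = None      # filled on first evaluation
        self.hits = 0           # how many times the "DB" was queried

    def _fetch_all(self):
        if self._cache is None:
            self.hits += 1
            self._cache = list(self._loader())

    def __getstate__(self):
        # Pickling forces evaluation, then stores only the loaded rows.
        self._fetch_all()
        return {'_cache': self._cache, 'hits': self.hits}

    def __setstate__(self, state):
        self._loader = None
        self._cache = state['_cache']
        self.hits = state['hits']


class OuterQuery(object):
    """Hypothetical stand-in for the outer Query that embeds the inner one."""

    def __init__(self, rhs):
        self.where = {'children': [{'rhs': rhs}]}


inner = LazyRows(lambda: [1, 2, 3])
outer = OuterQuery(inner)        # building the outer object: no "DB" hit yet
hits_before = inner.hits
payload = pickle.dumps(outer)    # walks where -> children -> rhs
hits_after = inner.hits
restored = pickle.loads(payload)
```

In this sketch the counter only changes during pickle.dumps, which matches where the subquery appears in the question's traceback.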

You can correct this by using a proper join in Django:

I guess your MaterialData model has material as a foreign key to the Material model.

# List every Material field you need in values(), each prefixed with
# material__ (the three fields below are examples taken from the source query).
materials = MaterialData.objects.filter(tag_id__in=tags).prefetch_related(
    'material__tags_applied'
).prefetch_related('material__materialdata_set').prefetch_related(
    'material__source'
).values('material__id', 'material__title', 'material__url')

# To reach an attribute across a foreign key, use the field name followed by a double underscore.

key_name = SAMPLES_UUID + ':' + str(redis_uuid)
redis_cl.set_key(key_name, pickle.dumps(materials.query))
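
On the P.S. in the question: where contains children, which in turn contains rhs, so a timer placed around each save(v) call in _batch_setitems probably reports the same 2.5 seconds three times (each outer measurement includes the inner one) rather than 7.5 seconds in total. A small sketch of that effect, assuming a hypothetical timed_save function in place of the instrumented pickler and time.sleep in place of the slow subquery:

```python
import time

timings = []  # (key, seconds), appended in the order the timers finish


def timed_save(obj):
    """Hypothetical recursive walker that times each dict value it saves."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            start = time.time()
            timed_save(value)
            timings.append((key, time.time() - start))
    elif isinstance(obj, list):
        for item in obj:
            timed_save(item)
    elif callable(obj):
        obj()  # the slow leaf: plays the role of evaluating the subquery


def slow_leaf():
    time.sleep(0.2)


state = {'where': {'children': [{'rhs': slow_leaf}]}}
timed_save(state)
order = [key for key, _ in timings]
```

The innermost key finishes first, so the timers are recorded in the order rhs, children, where, the same order as in the question's output, and each one covers the single slow call.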