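I'm using Django and PostgreSQL, and I'm having a serious performance problem with this query, which takes up to 8-10 seconds.
I have a model "Publication" that stores Instagram publications together with their location. I'm trying to get the publications within a certain city, but the relationship isn't direct, so the query is: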
instagram_publications = Publication.objects.filter(location__spot__city__name=location)
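So in our models: Publication [FK] -> Location [FK] -> Spot [FK] -> City. All of these models inherit from TimeStampedModel.
Since the search is by city name, I added an index on City.name by setting db_index=True, but nothing changed.
I analyzed the query with EXPLAIN ANALYZE and I see a large cost related to sorting. It appears to sort the rows by creation date and last-modified date, which are fields inherited from TimeStampedModel. I think this sort is unnecessary, but I'm not sure how to avoid it.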
[PERFORMANCE ANALYSIS]> City filter Instagram
Sort  (cost=256874.73..257992.17 rows=446975 width=233) (actual time=294.240..343.831 rows=290637 loops=1)
  Sort Key: instanalysis_publication.modified DESC, instanalysis_publication.created DESC
  Sort Method: external merge  Disk: 60400kB
  ->  Nested Loop  (cost=1.00..114091.50 rows=446975 width=233) (actual time=0.055..110.515 rows=290637 loops=1)
        ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.044..3.145 rows=3374 loops=1)
              ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.038..0.323 rows=829 loops=1)
                    ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.011..0.013 rows=1 loops=1)
                          Filter: ((name)::text = 'Durban'::text)
                          Rows Removed by Filter: 7
                    ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.024..0.208 rows=829 loops=1)
                          Index Cond: (city_id = instanalysis_city.id)
              ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.003 rows=4 loops=829)
                    Index Cond: (spot_id = instanalysis_spot.id)
        ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..36.20 rows=485 width=233) (actual time=0.002..0.019 rows=86 loops=3374)
              Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 0.809 ms
Execution time: 355.928 ms
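The sort also appears to be done on disk. I guess that's because there are hundreds of thousands of rows, so it probably can't be done in memory.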
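As a rough sanity check on that guess, the figures in the Sort node can be multiplied out. The sketch below is only back-of-the-envelope arithmetic: `width` is the planner's estimated average row size, and the `work_mem` value is an assumption (PostgreSQL's default of 4MB), since the question does not say what the server is configured with.

```python
# Figures copied from the Sort node of the plan above.
rows = 290_637        # actual rows fed into the sort
width_bytes = 233     # planner's estimated average row width
spilled_kb = 60_400   # "Sort Method: external merge  Disk: 60400kB"

# Assumption: work_mem is left at PostgreSQL's default of 4MB (4096 kB).
work_mem_kb = 4 * 1024

# Approximate size of the data that has to be sorted, in kB.
estimated_kb = rows * width_bytes / 1024

print(f"estimated sort input: {estimated_kb:,.0f} kB")
print(f"reported spill size:  {spilled_kb:,} kB")
print(f"work_mem (assumed):   {work_mem_kb:,} kB")
print("fits in work_mem:", estimated_kb <= work_mem_kb)
```

The estimate lands in the same range as the reported spill size and well above the assumed `work_mem`, which is consistent with the sort being pushed to disk.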
The TimeStampedModel class comes from the django_extras package, and it defines an ordering in its Meta:
class TimeStampedModel(models.Model):
    """ TimeStampedModel
    An abstract base class model that provides self-managed "created" and
    "modified" fields.
    """
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))

    def save(self, **kwargs):
        self.update_modified = kwargs.pop('update_modified', getattr(self, 'update_modified', True))
        super(TimeStampedModel, self).save(**kwargs)

    class Meta:
        get_latest_by = 'modified'
        ordering = ('-modified', '-created',)
        abstract = True
Is there any way to improve this, or maybe avoid the sort step?
Thanks.
Answer 0 (score: 0)
I finally found a way to avoid the sort: I overrode the Meta class in the Publication model and set ordering = None.
class Meta:
    # Important: Override ordering inherited from TimeStampedModel to improve performance
    ordering = None
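Now I no longer see the cost of the sort step, but when I run the query and measure the time, it seems to take the same amount of time, regardless of what EXPLAIN ANALYZE says.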
[PERFORMANCE ANALYSIS]> City filter Instagram
Nested Loop  (cost=1.00..110920.28 rows=452607 width=233) (actual time=0.069..96.667 rows=290637 loops=1)
  ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.056..2.476 rows=3374 loops=1)
        ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.047..0.254 rows=829 loops=1)
              ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=1)
                    Filter: ((name)::text = 'Durban'::text)
                    Rows Removed by Filter: 7
              ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.029..0.170 rows=829 loops=1)
                    Index Cond: (city_id = instanalysis_city.id)
        ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.002 rows=4 loops=829)
              Index Cond: (spot_id = instanalysis_spot.id)
  ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..34.99 rows=491 width=233) (actual time=0.002..0.016 rows=86 loops=3374)
        Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 1.385 ms
Execution time: 103.446 ms