Sorting performance problem in a Django PostgreSQL query

Asked: 2017-03-16 17:47:42

Tags: python django performance postgresql

I am using Django with PostgreSQL, and I am running into a serious performance problem with this query: it takes up to 8-10 seconds.

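I have a model "Publication" that stores Instagram publications along with their location. I am trying to fetch the publications within a given city, but the relationship is not direct, so the query is: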

instagram_publications = Publication.objects.filter(location__spot__city__name=location)

So in our models: Publication [FK] -> Location [FK] -> Spot [FK] -> City. All of these models inherit from TimeStampedModel.

Since the search is by city name, I added an index on City.name by setting db_index=True, but nothing changed.

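I analyzed the query by calling EXPLAIN on it, and I see a large cost associated with sorting. It appears to sort the rows by creation date and last-modified date, which are fields inherited from TimeStampedModel. I think this sort is unnecessary, but I am not sure how to avoid it.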

[PERFORMANCE ANALYSIS]> City filter Instagram
Sort  (cost=256874.73..257992.17 rows=446975 width=233) (actual time=294.240..343.831 rows=290637 loops=1)
  Sort Key: instanalysis_publication.modified DESC, instanalysis_publication.created DESC
  Sort Method: external merge  Disk: 60400kB
  ->  Nested Loop  (cost=1.00..114091.50 rows=446975 width=233) (actual time=0.055..110.515 rows=290637 loops=1)
        ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.044..3.145 rows=3374 loops=1)
              ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.038..0.323 rows=829 loops=1)
                    ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.011..0.013 rows=1 loops=1)
                          Filter: ((name)::text = 'Durban'::text)
                          Rows Removed by Filter: 7
                    ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.024..0.208 rows=829 loops=1)
                          Index Cond: (city_id = instanalysis_city.id)
              ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.003 rows=4 loops=829)
                    Index Cond: (spot_id = instanalysis_spot.id)
        ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..36.20 rows=485 width=233) (actual time=0.002..0.019 rows=86 loops=3374)
              Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 0.809 ms
Execution time: 355.928 ms

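The sort also appears to be done on disk (Sort Method: external merge, Disk: 60400kB). I guess this is because there are so many rows (290637) that it cannot be done in memory.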
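To put a number on the sort, here is a rough back-of-the-envelope calculation in Python using only the figures from the plan above. It assumes the usual reading of EXPLAIN ANALYZE, where a node's "actual time" upper bound includes its children, so the sort's own time is roughly the Sort node's total minus the Nested Loop beneath it. The variable names are made up for illustration.

```python
# Figures copied from the EXPLAIN ANALYZE output above (milliseconds).
sort_node_total_ms = 343.831     # Sort: actual time=294.240..343.831
child_join_total_ms = 110.515    # top Nested Loop: actual time=0.055..110.515
execution_time_ms = 355.928      # Execution time reported by PostgreSQL

# Time attributable to the sort itself (node total minus its child).
sort_only_ms = sort_node_total_ms - child_join_total_ms

# Fraction of the whole execution spent sorting.
sort_share = sort_only_ms / execution_time_ms

print(f"sort alone: {sort_only_ms:.1f} ms ({sort_share:.0%} of execution time)")
```

Under that reading, the sort accounts for roughly two thirds of the time PostgreSQL spends executing the query.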

The TimeStampedModel class comes from the django_extras package and defines an ordering in its Meta:

class TimeStampedModel(models.Model):
    """ TimeStampedModel
    An abstract base class model that provides self-managed "created" and
    "modified" fields.
    """
    created = CreationDateTimeField(_('created'))
    modified = ModificationDateTimeField(_('modified'))

    def save(self, **kwargs):
        self.update_modified = kwargs.pop('update_modified', getattr(self, 'update_modified', True))
        super(TimeStampedModel, self).save(**kwargs)

    class Meta:
        get_latest_by = 'modified'
        ordering = ('-modified', '-created',)
        abstract = True

Is there any way to improve this, or perhaps avoid the sort step altogether?

Thanks

1 Answer:

Answer 0: (score: 0)

I finally found a way to avoid the sort: I overrode the Meta class in the Publication model and simply set ordering = None.

class Meta:
    #  Important: Override ordering inherited from TimeStampedModel to improve performance
    ordering = None

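Now I no longer see the cost of the sort step in the plan, but when I run the query and measure the time, it seems to take just as long, regardless of what EXPLAIN ANALYZE says.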

[PERFORMANCE ANALYSIS]> City filter Instagram
Nested Loop  (cost=1.00..110920.28 rows=452607 width=233) (actual time=0.069..96.667 rows=290637 loops=1)
  ->  Nested Loop  (cost=0.57..516.27 rows=2767 width=4) (actual time=0.056..2.476 rows=3374 loops=1)
        ->  Nested Loop  (cost=0.28..39.28 rows=504 width=4) (actual time=0.047..0.254 rows=829 loops=1)
              ->  Seq Scan on instanalysis_city  (cost=0.00..1.10 rows=1 width=4) (actual time=0.014..0.016 rows=1 loops=1)
                    Filter: ((name)::text = 'Durban'::text)
                    Rows Removed by Filter: 7
              ->  Index Scan using instanalysis_spot_c7141997 on instanalysis_spot  (cost=0.28..33.14 rows=504 width=8) (actual time=0.029..0.170 rows=829 loops=1)
                    Index Cond: (city_id = instanalysis_city.id)
        ->  Index Scan using instanalysis_instagramlocation_e72b53d4 on instanalysis_instagramlocation  (cost=0.29..0.89 rows=6 width=8) (actual time=0.001..0.002 rows=4 loops=829)
              Index Cond: (spot_id = instanalysis_spot.id)
  ->  Index Scan using instanalysis_publication_e274a5da on instanalysis_publication  (cost=0.43..34.99 rows=491 width=233) (actual time=0.002..0.016 rows=86 loops=3374)
        Index Cond: (location_id = instanalysis_instagramlocation.id)
Planning time: 1.385 ms
Execution time: 103.446 ms
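For what it's worth, the two reported execution times can be compared with a bit of arithmetic. The sketch below uses only numbers from the plans in this thread. The result-size figure is a rough estimate that assumes "width" is the planner's estimated average row size in bytes. The variable names are made up for illustration.

```python
# Execution times reported by EXPLAIN ANALYZE (milliseconds).
with_sort_ms = 355.928      # plan with the Sort node (default ordering)
without_sort_ms = 103.446   # plan after setting ordering = None

saved_ms = with_sort_ms - without_sort_ms
speedup = with_sort_ms / without_sort_ms

# Rough size of the result set: actual rows times estimated row width.
rows_returned = 290637
estimated_row_width_bytes = 233
approx_result_bytes = rows_returned * estimated_row_width_bytes

print(f"saved {saved_ms:.1f} ms, about {speedup:.1f}x faster inside PostgreSQL")
print(f"approximate result size: {approx_result_bytes / 1e6:.1f} MB")
```

Both execution times are well under half a second, while the request reportedly takes 8-10 seconds. One possible explanation, which nothing in this thread confirms, is that most of the wall-clock time is spent outside query execution, for example transferring about 290k rows and building a model instance for each of them.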