Question

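I'm running Django 1.7 with Postgres 9.3, served with runserver. My database has about 200 million rows, or about 80GB of data. I'm trying to debug why the same queries are reasonably fast in Postgres but slow in Django.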

The data structure looks like this:

from django.db import models

class Chemical(models.Model):
    code = models.CharField(max_length=9, primary_key=True)
    name = models.CharField(max_length=200)

class Prescription(models.Model):
    chemical = models.ForeignKey(Chemical)
    # ... other fields

The database is set up with C collation and suitable indexes:

                        Table "public.frontend_prescription"
   Column    |         Type         |                             Modifiers
-------------+----------------------+--------------------------------------------------------------------
 id          | integer              | not null default nextval('frontend_prescription_id_seq'::regclass)
 chemical_id | character varying(9) | not null
Indexes:
    "frontend_prescription_pkey" PRIMARY KEY, btree (id)
    "frontend_prescription_a69d813a" btree (chemical_id)
    "frontend_prescription_chemical_id_4619f68f65c49a8_like" btree (chemical_id varchar_pattern_ops)

This is my view:

from django.shortcuts import get_object_or_404, render

def chemical(request, bnf_code):
    # The Chemical model's lookup field is `code` (its primary key).
    c = get_object_or_404(Chemical, code=bnf_code)
    num_prescriptions = Prescription.objects.filter(chemical=c).count()
    context = {
        'num_prescriptions': num_prescriptions
    }
    return render(request, 'chemical.html', context)

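The bottleneck is the .count() call. The Django debug toolbar shows that the time taken is 2647ms (under the "Time" heading below), but the EXPLAIN section suggests the time taken should be 621ms (at the bottom):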

[Screenshot of debug toolbar]

Even stranger, if I run the same query directly in Postgres, it seems to take only 200-300ms:

# explain analyze select count(*) from frontend_prescription where chemical_id='0212000AA';

QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=279495.79..279495.80 rows=1 width=0) (actual time=296.318..296.318 rows=1 loops=1)
   ->  Bitmap Heap Scan on frontend_prescription  (cost=2104.44..279295.83 rows=79983 width=0) (actual time=162.872..276.439 rows=302389 loops=1)
         Recheck Cond: ((chemical_id)::text = '0212000AA'::text)
         ->  Bitmap Index Scan on frontend_prescription_a69d813a  (cost=0.00..2084.44 rows=79983 width=0) (actual time=126.235..126.235 rows=322252 loops=1)
               Index Cond: ((chemical_id)::text = '0212000AA'::text)
 Total runtime: 296.591 ms

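So my question is this: in the debug toolbar, the time reported by EXPLAIN differs from the actual performance in Django. And Django is slower still than the raw query in Postgres.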

Why is there this discrepancy? And how should I debug this and improve the performance of my Django app?

UPDATE: Here's another random example: 350ms for EXPLAIN, more than 10,000 to render! Help, this is making my Django app almost unusable.


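UPDATE 2: Here's the Profiling panel for another slow query (40 seconds in Django, 600ms in EXPLAIN...). If I'm reading it right, it suggests that each SQL call from my view is taking 13 seconds... is this the bottleneck?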


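What's odd is that the profiled calls are only slow for queries that return lots of results, so I don't think the delay is some Django connection overhead that applies to every call.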

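UPDATE 3: I tried rewriting the view in raw SQL, and the performance is now better some of the time, although I'm still seeing slow queries about half the time. (I have to create and re-create the cursor each time, otherwise I get an InterfaceError and a message about the cursor being dead. Not sure if this is useful for debugging. I've set CONN_MAX_AGE=1200.) Anyway, this performs OK, though obviously it's vulnerable to injection etc. as written: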

cursor = connection.cursor()
query = "SELECT * from frontend_chemical WHERE code='%s'" % code
c = cursor.execute(query)
c = cursor.fetchone()
cursor.close()

cursor = connection.cursor()
query = "SELECT count(*) FROM frontend_prescription WHERE chemical_id="
query += "'" + code + "';"
cursor.execute(query)
num_prescriptions = cursor.fetchone()[0]
cursor.close()

context = {
    'chemical': c,
    'num_prescriptions': num_prescriptions
}
return render(request, 'chemical.html', context)
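
The injection risk mentioned above can usually be avoided by passing values as bound parameters rather than formatting them into the SQL string. What follows is a minimal sketch, not the code from the view: `fetch_chemical_and_count` and `make_demo_cursor` are hypothetical helper names, and the demo uses an in-memory sqlite3 database with made-up rows as a stand-in for the real Postgres tables. With a Django cursor the placeholder is `%s`; sqlite3 uses `?`, which is why the marker is a parameter here.

```python
import sqlite3


def fetch_chemical_and_count(cursor, code, placeholder="%s"):
    """Look up a chemical and its prescription count using bound parameters.

    `placeholder` is the driver's parameter marker: "%s" for Django/psycopg2
    cursors, "?" for the sqlite3 stand-in used in the demo below.
    """
    # The value travels separately from the SQL text, so quotes inside
    # `code` cannot change the structure of the statement.
    cursor.execute(
        "SELECT * FROM frontend_chemical WHERE code = " + placeholder, [code]
    )
    chemical = cursor.fetchone()
    cursor.execute(
        "SELECT count(*) FROM frontend_prescription WHERE chemical_id = "
        + placeholder,
        [code],
    )
    num_prescriptions = cursor.fetchone()[0]
    return chemical, num_prescriptions


def make_demo_cursor():
    """Hypothetical in-memory stand-in for the real tables (demo only)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE frontend_chemical (code TEXT PRIMARY KEY, name TEXT)"
    )
    cur.execute(
        "CREATE TABLE frontend_prescription "
        "(id INTEGER PRIMARY KEY, chemical_id TEXT)"
    )
    cur.execute(
        "INSERT INTO frontend_chemical VALUES ('0212000AA', 'Example chemical')"
    )
    cur.executemany(
        "INSERT INTO frontend_prescription (chemical_id) VALUES (?)",
        [("0212000AA",), ("0212000AA",), ("0101010AA",)],
    )
    return cur


if __name__ == "__main__":
    cur = make_demo_cursor()
    print(fetch_chemical_and_count(cur, "0212000AA", placeholder="?"))
    # An injection attempt is treated as an ordinary, non-matching string.
    print(fetch_chemical_and_count(cur, "x' OR '1'='1", placeholder="?"))
```

In the view itself, the same function could be called with `connection.cursor()` from `django.db` and the default `%s` placeholder.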

Answer 1

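Most likely, when Django runs the query, the data has to be read from disk. But by the time you check why the query was slow, the data is already in memory because of the earlier query.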

The simplest solution is to buy more memory or a faster IO system.

Answer 2

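Profiling code on your development machine is unreliable (as came out in the comments, all sorts of things running on your desktop can interfere). It also won't show you real-world performance while django-debug-toolbar's checks are active. If you are interested in how this thing will perform in the wild, you have to run it on your intended infrastructure and measure with a light touch.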

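import datetime

from django.shortcuts import render

def some_view(request):
    search = get_query_parameters(request)
    before = datetime.datetime.now()
    # QuerySets are lazy: list() forces the query to run inside the timed span.
    result = list(ComplexQuery.objects.filter(**search))
    print("ComplexQuery took %s" % (datetime.datetime.now() - before))
    return render(request, "template.html", {'result': result})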

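Then you need to run it several times to warm up the caches before you can take any kind of measurement. Results will vary greatly with your setup. You may be using connection pools that need warming up, Postgres is quite a bit faster on subsequent queries of the same kind, and Django may also be set up with some local cache. All of these need to spin up before you can say for sure that the query is the problem.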
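
As a rough sketch of that warm-up approach, assuming Python 3: `time_repeated` and `summarize` are hypothetical helper names, not part of Django or of any profiling tool, and the workload in the demo is a placeholder for the real query.

```python
import statistics
import time


def time_repeated(fn, warmup=3, runs=10):
    """Call fn() `warmup` times untimed, then time `runs` further calls.

    Returns a list of wall-clock durations in seconds, one per timed call.
    """
    for _ in range(warmup):
        fn()  # let caches and connections warm up; results discarded
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


def summarize(timings):
    """Reduce a list of timings to min / median / max."""
    return {
        "min": min(timings),
        "median": statistics.median(timings),
        "max": max(timings),
    }


if __name__ == "__main__":
    # Placeholder workload standing in for the real query.
    print(summarize(time_repeated(lambda: sum(range(100000)))))
```

In a Django shell you might pass something like `lambda: Prescription.objects.filter(chemical=c).count()` as `fn`. Comparing the median with the maximum gives a first hint of whether the slowness is consistent or only hits cold runs.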

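All profiling tools report times without accounting for their own introspection slowdown, so you have to take a relative approach and use DDT (or my favourite for these problems: django-devserver) to identify the hot spots in request handlers that consistently perform badly. One other tool worth noting: linesman. It's a bit of a hassle to set up and maintain, but it is really useful.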

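I have been responsible for fairly large setups (database sizes in the tens of GB) and have not seen a simple query like this one run aground so badly. First find out whether you really have a problem (and that it's not just runserver ruining your day), then use the tools to find that hot spot, then optimize.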

Django query is 40 times slower than the identical Postgres query?
