python / django: nested for loops are really slow when iterating over querysets

Time: 2018-05-08 06:32:49

Tags: python django performance django-models python-performance

I have two models, named machine and Performance:

class machine(models.Model):
    machine_type = models.CharField(null=True, max_length=10)
    machine_no = models.IntegerField(null=True)    
    machine_name = models.CharField(null=True,max_length=255)
    machine_sis = models.CharField(null=True, max_length=255)
    store_code = models.IntegerField(null=True)
    created = models.DateTimeField(auto_now_add=True)

class Performance(models.Model):
    machine_no = models.IntegerField(null=True)
    power = models.IntegerField(null=True)
    store_code = models.IntegerField(null=True)
    created = models.DateTimeField(auto_now_add=True)

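For each machine there are multiple rows in the Performance model, and I have to find the number of rows in the Performance table that have power = some_integer. Here is my view: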

machines = machine.objects.filter(machine_type="G", machine_sis="919")
# let's say machines.count() is sometimes 100
# for each of these machines I need to count the rows in the Performance model that have power = 100
# this is what I did first, but it was really slow (one COUNT query per machine)
for obj in machines:
    print(Performance.objects.filter(machine_no=obj.machine_no, power=100).count())
# my second approach was faster than the first approach
for obj in machines:
    data = Performance.objects.filter(machine_no=obj.machine_no, power=100)
    counter = 0
    for p in data:  # ***** let's say this loop is called star-loop
        if p.power == 100:
            counter += 1

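My problem: when I have to check power = something in the Performance model for 100 machines, it is really slow.

Additional info: I am not using a foreign key in the Performance model because the actual schema is more complex. I can't use machine_no or anything else as a foreign key, because I need multiple columns of machine to uniquely identify each machine. Also, this project is in production, so I don't have much room to change it. I am using django 1.11, python 2.7 and a PostgreSQL RDS instance. I have already improved network performance by renting a better instance from AWS. Also, I used time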

3 answers:

Answer 0: (score: 3)

You can do the counting and filtering on the Python side:

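from collections import Counter

# one query: machine_no of every Performance row with power = 100
c = Counter(Performance.objects.filter(power=100).
            values_list('machine_no', flat=True))

# one query: machine_no of the selected machines, as a set for fast lookups
m = set(machine.objects.filter(machine_type="G", machine_sis="919")
        .values_list('machine_no', flat=True))

result = sum(v for k, v in c.items() if k in m)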
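
To make the counting step concrete, here is a minimal sketch in plain Python. The two collections are hypothetical stand-ins for what the two `values_list(..., flat=True)` queries would return; the sample numbers are made up for illustration only.

```python
from collections import Counter

# Hypothetical stand-ins for the two query results (sample data, not real):
# machine_no of every Performance row that has power = 100 ...
performance_machine_nos = [1, 1, 2, 3, 3, 3, 7]
# ... and machine_no of the machines selected by type and sis.
selected_machine_nos = {1, 3, 5}

# Count how many matching Performance rows each machine_no has.
counts = Counter(performance_machine_nos)

# Per-machine counts; a Counter returns 0 for machines with no matching row.
per_machine = {no: counts[no] for no in selected_machine_nos}

# Overall total across the selected machines.
total = sum(per_machine.values())

print(per_machine)
print(total)
```

Keeping the per-machine dictionary around gives one count per machine, as asked in the question, while the total corresponds to `result` in the answer.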
  

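What if I need power = 100 and also a separate list of machines with power = 99? Do I have to use two separate Counter() queries?

No, just use a Q object to add the filter to the same query, and then compute the two different results, like this: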

from collections import Counter
from django.db.models import Q

# one query: (machine_no, power) pairs for rows with power 100 or 99
c = Counter(Performance.objects.filter(Q(power=100) | Q(power=99)).
            values_list('machine_no', 'power'))

m = set(machine.objects.filter(machine_type="G", machine_sis="919")
        .values_list('machine_no', flat=True))

# keys of c are (machine_no, power) tuples
result_100 = sum(v for k, v in c.items() if k[0] in m and k[1] == 100)
result_99 = sum(v for k, v in c.items() if k[0] in m and k[1] == 99)
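
If more power values are added later, one pass over the counter can fill all totals at once instead of one `sum(...)` per value. The following is only a sketch under assumptions: `rows` is a hypothetical stand-in for the `values_list('machine_no', 'power')` result, and the data is invented.

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the (machine_no, power) tuples returned by the query.
rows = [(1, 100), (1, 100), (1, 99), (2, 100), (3, 99), (3, 99), (7, 99)]
# Hypothetical machine_no values of the selected machines.
selected_machine_nos = {1, 3}

# Count identical (machine_no, power) pairs.
pair_counts = Counter(rows)

# Accumulate one total per power value in a single pass.
totals = defaultdict(int)
for (machine_no, power), n in pair_counts.items():
    if machine_no in selected_machine_nos:
        totals[power] += n

result_100 = totals[100]
result_99 = totals[99]
print(result_100, result_99)
```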

Answer 1: (score: 0)

This looks like an N + 1 selects problem. You can do the following to reduce the query count:

machines = machine.objects.filter(machine_type="G", machine_sis="919")
machine_nos = machines.values_list('machine_no', flat=True)
# renamed so the result does not shadow the Performance model
performances = Performance.objects.filter(machine_no__in=machine_nos, power=100)

This reduces the number of queries to at most three.
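
To illustrate the N + 1 pattern outside Django, here is a small sketch that uses `sqlite3` from the standard library as a stand-in for PostgreSQL. The table layout, the sample rows and the `run` helper are assumptions made for the example, not the asker's real schema. It compares one query per machine with a single grouped query.

```python
import sqlite3

# In-memory database with simplified, assumed versions of the two tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE machine (id INTEGER PRIMARY KEY, machine_type TEXT,
                          machine_sis TEXT, machine_no INTEGER);
    CREATE TABLE performance (id INTEGER PRIMARY KEY, machine_no INTEGER,
                              power INTEGER);
    INSERT INTO machine (machine_type, machine_sis, machine_no) VALUES
        ('G', '919', 1), ('G', '919', 2), ('X', '919', 3);
    INSERT INTO performance (machine_no, power) VALUES
        (1, 100), (1, 100), (1, 50), (2, 100), (3, 100);
""")

query_count = 0

def run(sql, params=()):
    """Hypothetical helper: execute one statement and count it as one round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# N + 1 pattern: one query for the machines, then one COUNT per machine.
machine_nos = [row[0] for row in run(
    "SELECT machine_no FROM machine WHERE machine_type = ? AND machine_sis = ?",
    ("G", "919"))]
n_plus_one = {}
for no in machine_nos:
    n_plus_one[no] = run(
        "SELECT COUNT(*) FROM performance WHERE machine_no = ? AND power = ?",
        (no, 100))[0][0]
n_plus_one_queries = query_count

# Single round trip: filter with a subquery and group by machine_no.
query_count = 0
grouped = dict(run("""
    SELECT machine_no, COUNT(*) FROM performance
    WHERE power = ? AND machine_no IN (
        SELECT machine_no FROM machine
        WHERE machine_type = ? AND machine_sis = ?)
    GROUP BY machine_no
""", (100, "G", "919")))
single_queries = query_count

print(n_plus_one, n_plus_one_queries)
print(grouped, single_queries)
```

Note that the grouped query only returns machines that have at least one matching row, so machines with a count of zero have to be filled in afterwards. In the Django ORM the same grouped shape is usually written with `.values('machine_no').annotate(...)` and `Count`.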

Answer 2: (score: 0)

You can use a raw query. It could look something like this. Please update it to use your exact database table names.

 machine.objects.raw("""
      select b.*, a.power_count from machine as b
      join (select count(id) as power_count, machine_no from performance
            where power = 100 group by machine_no) as a
        on b.machine_no = a.machine_no  -- join on machine_no, not on b.id
      where b.machine_type = 'G' and b.machine_sis = '919'
 """)