Django aggregation takes a lot of time

Time: 2018-06-26 09:34:45

Tags: python django

I have a model defined as below:

from django.db import models


class Image(models.Model):
    # Stages
    STAGE_TRAIN = 'train'
    STAGE_VAL = 'val'
    STAGE_TEST = 'test'
    STAGE_TRASH = 'trash'

    STAGE_CHOICES = (
        (STAGE_TRAIN, 'Train'),
        (STAGE_VAL, 'Validation'),
        (STAGE_TEST, 'Test'),
        (STAGE_TRASH, 'Trash'),
    )
    stage = models.CharField(max_length=5, choices=STAGE_CHOICES, default=STAGE_TRAIN)
    commit = models.ForeignKey(Commit, on_delete=models.CASCADE, related_name="images", related_query_name="image")

My database holds about 170k images, and I have an endpoint that counts all the images by stage.

Currently I have something like this:

from django.db.models import Count, Q

base_query = Image.objects.filter(commit=commit_uuid).only('id', 'stage')
count_query = base_query.aggregate(count_train=Count('id', filter=Q(stage='train')),
                                   count_val=Count('id', filter=Q(stage='val')),
                                   count_trash=Count('id', filter=Q(stage='trash')))

But it takes about 40 seconds. When I look at the SQL request in the shell, it looks fine to me:

{'sql': 'SELECT COUNT("image"."id") FILTER (WHERE "image"."stage" = \'train\') AS "count_train", COUNT("image"."id") FILTER (WHERE "image"."stage" = \'val\') AS "count_val", COUNT("image"."id") FILTER (WHERE "image"."stage" = \'trash\') AS "count_trash" FROM "image" WHERE "image"."commit_id" = \'333681ff-886a-42d0-b88a-5d38f1e9fe94\'::uuid', 'time': '42.140'}

Another strange thing is that if I change the aggregate call to the following:

count_query = base_query.aggregate(count_train=Count('id', filter=Q(stage='train') & Q(commit=commit_uuid)),
                                   count_val=Count('id', filter=Q(stage='val') & Q(commit=commit_uuid)),
                                   count_trash=Count('id', filter=Q(stage='trash') & Q(commit=commit_uuid)))

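With this version the query is twice as fast (still 20 seconds), and in the generated SQL I can see that the commit filter is applied inside FILTER.

So I have two questions:

  • Can I do something else to speed up the query? Or should I store the counts somewhere and update the values every time an image changes?

  • I expected the query to filter on the commit id first and then on stage, but I get the feeling it is done the other way around.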
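The second option in the first question, storing the counts and updating them on every image change, could look roughly like the sketch below. It is plain Python rather than Django code: `StageCounts` and its method names are hypothetical, and in a real project the tallies would have to live in the database and be updated wherever images are created, deleted, or moved between stages.

```python
from collections import Counter, defaultdict


class StageCounts:
    """Hypothetical in-memory tally of images per (commit, stage)."""

    def __init__(self):
        # commit id -> Counter mapping stage name -> number of images
        self._counts = defaultdict(Counter)

    def image_added(self, commit_id, stage):
        self._counts[commit_id][stage] += 1

    def image_removed(self, commit_id, stage):
        self._counts[commit_id][stage] -= 1

    def stage_changed(self, commit_id, old_stage, new_stage):
        # Moving an image between stages is a removal plus an addition.
        self.image_removed(commit_id, old_stage)
        self.image_added(commit_id, new_stage)

    def counts_for(self, commit_id):
        # Read the stored tallies instead of scanning every image row.
        tally = self._counts.get(commit_id, Counter())
        return {stage: n for stage, n in tally.items() if n > 0}
```

The trade-off is that every write path has to keep the tallies in sync, in exchange for a read that no longer depends on the number of images.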

2 answers:

Answer 0: (score: 2)

1) You can add indexes on the fields through the model's Meta options.

Using the index_together option (see https://docs.djangoproject.com/en/2.0/ref/models/options/#django.db.models.Options.indexes):

class Image(models.Model):
    class Meta:
        index_together = [['stage'], ['stage', 'commit']]

2) Alternatively, using the indexes option:

class Image(models.Model):
    class Meta:
        indexes = [models.Index(fields=['stage', 'commit'])]

Answer 1: (score: 1)

I would try the following in your model:

stage = models.CharField(max_length=5, choices=STAGE_CHOICES, default=STAGE_TRAIN, db_index=True)

By adding an index on stage, you should avoid a full table scan.
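One way to check the claim that an index avoids a full table scan is to compare query plans with and without it. The sketch below does that with Python's built-in sqlite3 module, which is an assumption made only so the example is self-contained: the SQL in the question looks like PostgreSQL, whose planner may choose differently, so there the same comparison would be made with EXPLAIN on the real table. The table is a simplified stand-in for the model, the query counts a single stage for one commit, and the index names are hypothetical.

```python
import sqlite3


def count_plan(index_sql=None):
    """Return SQLite's query plan for a per-stage count within one commit."""
    conn = sqlite3.connect(":memory:")
    # Simplified stand-in for the Image table (assumed column types).
    conn.execute(
        "CREATE TABLE image (id INTEGER PRIMARY KEY, stage TEXT, commit_id TEXT)"
    )
    if index_sql:
        conn.execute(index_sql)
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT COUNT(*) FROM image WHERE stage = ? AND commit_id = ?",
        ("train", "c1"),
    ).fetchall()
    conn.close()
    # The last column of each plan row is the human-readable detail text.
    return " / ".join(row[-1] for row in rows)


# No index, a single-column index on stage, and a composite index.
no_index = count_plan()
stage_only = count_plan("CREATE INDEX image_stage ON image (stage)")
composite = count_plan(
    "CREATE INDEX image_stage_commit ON image (stage, commit_id)"
)

for label, plan in [("none", no_index), ("stage", stage_only),
                    ("stage+commit", composite)]:
    print(label, "->", plan)
```

Without an index the plan reports a scan of the table, while with either index it reports a search that names the index. Whether this removes the 40 seconds in the question depends on the actual database and data, which this sketch does not reproduce.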