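Indexing a JSONField in Django with PostgreSQL

Time: 2019-02-22 09:43:34

Tags: django postgresql

I am using a simple model with one attribute that stores all of the object's data in a JSONField. Think of it as a way to move NoSQL data into my PostgreSQL database. It looks roughly like this: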

from django.contrib.postgres.fields import JSONField
from django.db import models

class Document(models.Model):
    content = JSONField()

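Every Document object has (more or less) the same keys in its content field, so I query and order these documents by those keys. For querying and ordering I use Django's annotate() function. I recently came across this:

https://docs.djangoproject.com/en/2.1/ref/contrib/postgres/indexes/

I also know that PostgreSQL stores this field as JSONB, which is apparently indexable. So my question is: can I somehow index my content field so that read operations are faster for complex queries? If so, how do I do it? The documentation page I linked has no examples.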

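2 answers:

Answer 0: (score: 4)

For those who want to index a specific key, create a raw SQL migration:

  1. Run ./manage.py makemigrations --empty yourApp, where yourApp is the app containing the model whose index you want to change.

  2. Edit the generated migration, i.e.: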

operations = [
    migrations.RunSQL("CREATE INDEX idx_name ON your_table((json_field->>'json_key'));")
]

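Here idx_name is the name of the index, your_table is your table, json_field is your JSONField, and json_key is the key you want to index. The ->> operator returns the value as text, so the index is built on the text form of that key.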
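The statement above can also be generated rather than typed by hand. The following is a minimal sketch, not part of the original answer: the helper names (quote_ident, quote_literal, json_key_index_sql) and the example table, field, and index names are hypothetical, and the quoting assumes PostgreSQL's default standard_conforming_strings setting. It returns both the CREATE INDEX statement and a matching DROP INDEX, which could be passed to migrations.RunSQL as sql and reverse_sql so the migration can be undone.

```python
def quote_ident(name):
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Quote a string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def json_key_index_sql(table, field, key, index_name):
    """Return (forward_sql, reverse_sql) for an expression index on field->>'key'."""
    forward = "CREATE INDEX {idx} ON {tbl} (({col}->>{key}));".format(
        idx=quote_ident(index_name),
        tbl=quote_ident(table),
        col=quote_ident(field),
        key=quote_literal(key),
    )
    reverse = "DROP INDEX {idx};".format(idx=quote_ident(index_name))
    return forward, reverse


# Hypothetical names, for illustration only.
forward_sql, reverse_sql = json_key_index_sql(
    "myapp_document", "content", "title", "document_content_title_idx"
)
print(forward_sql)
print(reverse_sql)
```

In a migration, the two strings would be used as migrations.RunSQL(forward_sql, reverse_sql).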

That should do it, but to verify that everything worked, run the following SQL:

SELECT
    indexname,
    indexdef
FROM
    pg_indexes
WHERE
    tablename = '<your-table>';

and check whether your index is listed.
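The same check can be scripted. This is a small sketch, not from the original answer: index_exists is a hypothetical helper, and it assumes a DB-API cursor that uses the %s parameter style, as psycopg2 and Django's connection.cursor() do.

```python
def index_exists(cursor, table_name, index_name):
    """Return True if pg_indexes lists index_name for table_name."""
    cursor.execute(
        "SELECT 1 FROM pg_indexes WHERE tablename = %s AND indexname = %s",
        [table_name, index_name],
    )
    # fetchone() returns None when the query matched no rows.
    return cursor.fetchone() is not None
```

Inside a Django project this could be called from a shell session with a cursor obtained from django.db.connection.cursor().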

Answer 1: (score: 0)

There is a more general, Django-native way. You can use the following custom migration Operation:

from django.db.migrations.operations.base import Operation


class CreateJsonbObjectKeyIndex(Operation):

    reversible = True

    def __init__(self, model_name, field, key, index_type='btree', concurrently=False, name=None):
        self.model_name = model_name
        self.field = field
        self.key = key
        self.index_type = index_type
        self.concurrently = concurrently
        self.name = name

    def state_forwards(self, app_label, state):
        pass

    def get_names(self, app_label, schema_editor, from_state, to_state):
        table_name = from_state.apps.get_model(app_label, self.model_name)._meta.db_table
        index_name = schema_editor.quote_name(
            self.name or schema_editor._create_index_name(table_name, [f'{self.field}__{self.key}'])
        )
        return table_name, index_name

    def database_forwards(self, app_label, schema_editor, from_state, to_state):
        table_name, index_name = self.get_names(app_label, schema_editor, from_state, to_state)
        schema_editor.execute(f"""
            CREATE INDEX {'CONCURRENTLY' if self.concurrently else ''} {index_name} 
            ON {table_name}
            USING {self.index_type}
            (({self.field}->'{self.key}'));
        """)

    def database_backwards(self, app_label, schema_editor, from_state, to_state):
        _, index_name = self.get_names(app_label, schema_editor, from_state, to_state)
        schema_editor.execute(f"DROP INDEX {index_name};")

    def describe(self):
        return f'Creates index for JSONB object field {self.field}->{self.key} of {self.model_name} model'

    @property
    def migration_name_fragment(self):
        return f'create_index_{self.model_name}_{self.field}_{self.key}'

Usage example:

from django.db import migrations

from util.migration import CreateJsonbObjectKeyIndex


class Migration(migrations.Migration):
    atomic = False  # Required if concurrently=True for 0 downtime background index creation

    dependencies = [
        ('app_label', '00XX_prev_migration'),
    ]

    operations = [
        migrations.SeparateDatabaseAndState(
            database_operations=[
                # Operation to run custom SQL command. Check the output of `sqlmigrate` to see the auto-generated SQL
                CreateJsonbObjectKeyIndex(
                    model_name='User', field='meta', key='adid', index_type='HASH',
                    concurrently=True,
                )
            ],
        )
    ]

Tested with Django 2.2 and AWS Postgres RDS, but it should be compatible with other Django versions.
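Because the operation interpolates field, key, and index_type directly into the SQL string, it may be worth validating them first. The following is only a sketch of one way to do that, not part of the original answer: check_index_args is a hypothetical helper, and restricting field and key to identifier-like names is a deliberately conservative assumption that would reject JSON keys containing spaces or punctuation.

```python
import re

# Built-in PostgreSQL index access methods.
ALLOWED_INDEX_TYPES = {"btree", "hash", "gist", "spgist", "gin", "brin"}

# Assumption: only plain identifier-like names are accepted.
_SAFE_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def check_index_args(field, key, index_type):
    """Validate values before they are interpolated into CREATE INDEX."""
    if index_type.lower() not in ALLOWED_INDEX_TYPES:
        raise ValueError("unsupported index type: %r" % index_type)
    for label, value in (("field", field), ("key", key)):
        if not _SAFE_NAME.match(value):
            raise ValueError("unsafe %s name: %r" % (label, value))
    # Return the index type normalized to lower case.
    return field, key, index_type.lower()
```

A call such as check_index_args(self.field, self.key, self.index_type) could be placed at the start of database_forwards.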