Django RawSQL Postgres nested JSON / JSONB query

Time: 2018-03-13 18:42:44

Tags: django postgresql psycopg2 django-orm

I have a jsonb column named data in Postgres, where each row (there are about 3 million rows) looks like this:

[
    {
        "number": 100,
        "key": "this-is-your-key",
        "listr": "20 Purple block, THE-CITY, Columbia",
        "realcode": "LA40",
        "ainfo": {
            "city": "THE-CITY",
            "county": "Columbia",
            "street": "20 Purple block",
            "var_1": ""
        },
        "booleanval": true,
        "min_address": "20 Purple block, THE-CITY, Columbia LA40"
    },
    .....
]

I want to query the min_address field in the fastest way possible. In Django I tried:

APModel.objects.filter(data__0__min_address__icontains=search_term)

But this takes a very long time to finish. (Also, "THE-CITY" is uppercase, so I have to use icontains here.) I tried dropping down to raw SQL like this:

cursor.execute("""\
    SELECT * FROM "apmodel_ap_model" 
    WHERE ("apmodel_ap_model"."data" 
    #>> array['0', 'min_address'])
    @> %s \
    """,\
    [json.dumps([{'min_address': search_term}])]
)

But this raises a strange error:

LINE 4:       @> '[{"min_address": "some lane"}]'       
              ^
HINT:  No operator matches the given name and argument type(s). You might need to add explicit type casts.

I would like to know the fastest way to query the min_address field using a raw SQL cursor.

1 answer:

Answer 0: (score: 0)

Late answer, probably no longer helpful to the OP. Also, I am not an expert on Postgres / JSONB, so this may be a terrible idea.

Given this setup:

so49263641=# \d apmodel_ap_model;
         Table "public.apmodel_ap_model"
 Column | Type  | Collation | Nullable | Default
--------+-------+-----------+----------+---------
 data   | jsonb |           |          |

so49263641=# select * from apmodel_ap_model ;
                                           data
-------------------------------------------------------------------------------------------
 [{"number": 1, "min_address": "Columbia"}, {"number": 2, "min_address": "colorado"}]
 [{"number": 3, "min_address": "  columbia "}, {"number": 4, "min_address": "California"}]
(2 rows)

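The following query "expands" the objects in the data array into individual rows. It then applies pattern matching to the min_address field of each element.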

so49263641=# SELECT element->'number' as number, element->'min_address' as min_address 
    FROM apmodel_ap_model ap, JSONB_ARRAY_ELEMENTS(ap.data) element 
    WHERE element->>'min_address' ILIKE '%col%';
 number |  min_address
--------+---------------
 1      | "Columbia"
 2      | "colorado"
 3      | "  columbia "
(3 rows)

However, I suspect that it casts every min_address value to text before doing the pattern matching.
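To run the element-expanding query from Python, something like the following might work. It is only a sketch: the helper names `escape_like` and `search_min_address` are made up for illustration, the cursor is assumed to be any DB-API cursor that uses `%s` placeholders (as psycopg2 and Django's `connection.cursor()` do), and min_address is selected with `->>` so it comes back as plain text rather than a quoted JSON string.

```python
from typing import Any, List, Tuple

# Same shape as the psql query above, with the pattern passed as a parameter.
# Table and column names are taken from the question.
ELEMENT_SEARCH_SQL = """
    SELECT element->'number' AS number,
           element->>'min_address' AS min_address
    FROM apmodel_ap_model ap,
         jsonb_array_elements(ap.data) AS element
    WHERE element->>'min_address' ILIKE %s
"""


def escape_like(term: str) -> str:
    """Escape LIKE wildcards so the search term is matched literally.

    Backslash is the default LIKE escape character in Postgres.
    """
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def search_min_address(cursor: Any, search_term: str) -> List[Tuple]:
    """Return (number, min_address) rows whose address contains search_term."""
    # The wildcards live in the parameter, not in the SQL text, so the driver
    # handles quoting and no literal % has to be doubled in the query string.
    pattern = "%" + escape_like(search_term) + "%"
    cursor.execute(ELEMENT_SEARCH_SQL, [pattern])
    return cursor.fetchall()
```

In a Django project this would presumably be called as `with connection.cursor() as cursor: rows = search_min_address(cursor, "col")`, with `connection` imported from `django.db`. Note that this still inspects every array element of every row, so it is not necessarily faster than the ORM filter.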

Edit: there is some good advice on searching indexed JSONB data here: https://stackoverflow.com/a/33028467/1284043
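As for the error in the question: `#>>` returns text, and `@>` is not defined with text on the left-hand side, which is why Postgres reports that no operator matches. If an exact, case-sensitive match on min_address is acceptable (an assumption; it is not equivalent to icontains), containment can be applied to the jsonb column itself. A minimal sketch, with the hypothetical helper names `containment_param` and `find_exact_min_address`:

```python
import json
from typing import Any, List, Tuple

# jsonb @> jsonb: matches rows whose top-level array contains an object
# that has this exact min_address value (at any array position).
CONTAINMENT_SQL = """
    SELECT * FROM apmodel_ap_model
    WHERE data @> %s::jsonb
"""


def containment_param(min_address: str) -> str:
    """Build the JSON document used as the right-hand side of @>."""
    return json.dumps([{"min_address": min_address}])


def find_exact_min_address(cursor: Any, min_address: str) -> List[Tuple]:
    """Return rows containing an element with exactly this min_address."""
    cursor.execute(CONTAINMENT_SQL, [containment_param(min_address)])
    return cursor.fetchall()
```

A GIN index on the data column can support `@>`, so this form has a chance of avoiding a full scan, whereas the ILIKE approach does not benefit from such an index. On the ORM side, a `data__contains=[{"min_address": term}]` filter should express the same containment test, though I have not checked the SQL it generates for this model.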