How to extract data from JSON using an Oracle Text index

Date: 2017-05-29 14:15:13

Tags: json oracle

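I have a table with an Oracle Text index on it. I created the index because I need very fast searching. The table contains JSON data. Oracle's json_textcontains performs very poorly, so I tried using CONTAINS instead (if you look at the query plan, json_textcontains is actually rewritten to CONTAINS).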

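I want to find all JSON documents with a given class_type and id value, but Oracle appears to search across the whole JSON document without recognizing that class_type and id must sit in the same JSON section. In other words, it treats the JSON not as structured data but as one huge string.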

The well-formatted JSON looks like this:

{
     "class":[
              {
               "class_type":"ownership",
               "values":[{"nm":"id","value":"1"}]
              },
              {
               "class_type":"country",
               "values":[{"nm":"id","value":"640"}]
              },
              {
               "class_type":"features",
               "values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]
              }
             ]
    }    

The second one, which should not be found, looks like this:

{
     "class":[
              {
               "class_type":"ownership",
               "values":[{"nm":"id","value":"18"}]
              },
              {
               "class_type":"country",
               "values":[{"nm":"id","value":"11"}]
              },
              {
               "class_type":"features",
               "values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]
              }
             ]
    }

Here is how to reproduce what I am trying to achieve:

create table perso.json_data(id number, data_val blob);

insert into perso.json_data values(
1,
utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}')
);


insert into perso.json_data values(
2,
utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}')
);

commit;


ALTER TABLE perso.json_data
ADD CONSTRAINT check_is_json
 CHECK (data_val IS JSON (STRICT));

 CREATE INDEX perso.json_data_idx ON perso.json_data (data_val)
 INDEXTYPE IS CTXSYS.CONTEXT
 PARAMETERS ('section group CTXSYS.JSON_SECTION_GROUP SYNC (ON COMMIT)');


select *
from perso.json_data
where ctxsys.contains(data_val, '(640 INPATH(/class/values/value)) and (country inpath (/class/class_type))')>0;

The query returns 2 rows, but I expect to get only the record with id = 1.
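One way to see why both rows come back is to run each half of the CONTAINS expression on its own. The sketch below reuses only the table, index, and operators from the setup above. Because the combined query matches both rows, each half must match both rows as well: row 2 has country under /class/class_type and 640 under /class/values/value, just in different array elements, and INPATH alone does not tie the two terms to the same element.

```sql
-- Half 1: documents containing 640 anywhere under /class/values/value
select id
from perso.json_data
where ctxsys.contains(data_val, '640 INPATH(/class/values/value)') > 0;

-- Half 2: documents containing country anywhere under /class/class_type
select id
from perso.json_data
where ctxsys.contains(data_val, 'country INPATH(/class/class_type)') > 0;
```

The AND in the original query therefore combines two document-level matches, not two conditions on the same element of the class array.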

How can I use the full-text index, without JSON_TABLE, so that the search does not produce the false match described above?

Storing the data in relational format is not an option.

Thanks in advance.

1 answer:

Answer 0 (score: 1)

Please do not use the text index directly to try to solve this kind of problem. It is not designed for it.

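In 12.2.0.1.0 this should work for you (yes, it does use a specialized version of the text index under the covers, but it also applies selective post-filtering to ensure the results are correct):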

SQL> create table json_data(id number, data_val blob)
  2  /

Table created.

SQL> insert into json_data values(
  2    1,utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}')
  3  )
  4  /

1 row created.


Execution Plan
----------------------------------------------------------

--------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |           |     1 |   100 |     1   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | JSON_DATA |       |       |            |          |
--------------------------------------------------------------------------------------

SQL> insert into json_data values(
  2    2,utl_raw.cast_to_raw('{"class":[{"class_type":"ownership","values":[{"nm":"id","value":"18"}]},{"class_type":"country","values":[{"nm":"id","value":"11"}]},{"class_type":"features","values":[{"nm":"id","value":"7"},{"nm":"id","value":"640"}]}]}')
  3  )
  4  /

1 row created.


Execution Plan
----------------------------------------------------------

--------------------------------------------------------------------------------------
| Id  | Operation                | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------
|   0 | INSERT STATEMENT         |           |     1 |   100 |     1   (0)| 00:00:01 |
|   1 |  LOAD TABLE CONVENTIONAL | JSON_DATA |       |       |            |          |
--------------------------------------------------------------------------------------

SQL> commit
  2  /

Commit complete.

SQL> ALTER TABLE json_data
  2  ADD CONSTRAINT check_is_json
  3   CHECK (data_val IS JSON (STRICT))
  4  /

Table altered.

SQL> CREATE SEARCH INDEX json_SEARCH_idx ON json_data (data_val) for JSON
  2  /

Index created.

SQL> set autotrace on explain
SQL> --
SQL> set lines 256 trimspool on pages 50
SQL> --
SQL> select ID, json_query(data_val, '$' PRETTY)
  2    from JSON_DATA
  3  /

        ID
----------
JSON_QUERY(DATA_VAL,'$'PRETTY)
------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
----------------
         1
{
  "class" :
  [
    {
      "class_type" : "ownership",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "1"
        }
      ]
    },
    {
      "class_type" : "country",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "640"
        }
      ]
    },
    {
      "class_type" : "features",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "15"
        },
        {
          "nm" : "id",
          "value" : "20"
        }
      ]
    }
  ]
}

         2
{
  "class" :
  [
    {
      "class_type" : "ownership",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "18"
        }
      ]
    },
    {
      "class_type" : "country",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "11"
        }
      ]
    },
    {
      "class_type" : "features",
      "values" :
      [
        {
          "nm" : "id",
          "value" : "7"
        },
        {
          "nm" : "id",
          "value" : "640"
        }
      ]
    }
  ]
}



Execution Plan
----------------------------------------------------------
Plan hash value: 3213740116

-------------------------------------------------------------------------------
| Id  | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |           |     2 |  4030 |     3   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL| JSON_DATA |     2 |  4030 |     3   (0)| 00:00:01 |
-------------------------------------------------------------------------------

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

SQL> select ID, to_clob(data_val)
  2    from json_data
  3   where JSON_EXISTS(data_val,'$?(exists(@.class?(@.values.value == $VALUE && @.class_type == $TYPE)))' passing '640' as "VALUE", 'country' as "TYPE")
  4  /

        ID TO_CLOB(DATA_VAL)
---------- --------------------------------------------------------------------------------
         1 {"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_
           type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","
           values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}



Execution Plan
----------------------------------------------------------
Plan hash value: 3248304200

-----------------------------------------------------------------------------------------------
| Id  | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |                 |     1 |  2027 |     4   (0)| 00:00:01 |
|*  1 |  TABLE ACCESS BY INDEX ROWID| JSON_DATA       |     1 |  2027 |     4   (0)| 00:00:01 |
|*  2 |   DOMAIN INDEX              | JSON_SEARCH_IDX |       |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter(JSON_EXISTS2("DATA_VAL" FORMAT JSON , '$?(exists(@.class?(@.values.value
              == $VALUE && @.class_type == $TYPE)))' PASSING '640' AS "VALUE" , 'country' AS "TYPE"
              FALSE ON ERROR)=1)
   2 - access("CTXSYS"."CONTAINS"("JSON_DATA"."DATA_VAL",'{640} INPATH
              (/class/values/value) and {country} INPATH (/class/class_type)')>0)

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

SQL> select ID, TO_CLOB(DATA_VAL)
  2    from JSON_DATA d
  3   where exists (
  4           select 1
  5             from JSON_TABLE(
  6                    data_val,
  7                    '$.class'
  8                    columns (
  9                      CLASS_TYPE VARCHAR2(32) PATH '$.class_type',
 10                      NESTED PATH '$.values.value'
 11                      columns (
 12                        "VALUE"  VARCHAR2(32) path '$'
 13                      )
 14                    )
 15                  )
 16            where CLASS_TYPE = 'country' and "VALUE" = '640'
 17        )
 18  /

        ID TO_CLOB(DATA_VAL)
---------- --------------------------------------------------------------------------------
         1 {"class":[{"class_type":"ownership","values":[{"nm":"id","value":"1"}]},{"class_
           type":"country","values":[{"nm":"id","value":"640"}]},{"class_type":"features","
           values":[{"nm":"id","value":"15"},{"nm":"id","value":"20"}]}]}



Execution Plan
----------------------------------------------------------
Plan hash value: 1621266031

-------------------------------------------------------------------------------------
| Id  | Operation               | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT        |           |     1 |  2027 |    32   (0)| 00:00:01 |
|*  1 |  FILTER                 |           |       |       |            |          |
|   2 |   TABLE ACCESS FULL     | JSON_DATA |     2 |  4054 |     3   (0)| 00:00:01 |
|*  3 |   FILTER                |           |       |       |            |          |
|*  4 |    JSONTABLE EVALUATION |           |       |       |            |          |
-------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - filter( EXISTS (SELECT 0 FROM JSON_TABLE( :B1, '$.class' COLUMNS(
              "CLASS_TYPE" VARCHAR2(32) PATH '$.class_type' NULL ON ERROR , NESTED PATH
              '$.values.value' COLUMNS( "VALUE" VARCHAR2(32) PATH '$' NULL ON ERROR ) ) )
              "P" WHERE "CTXSYS"."CONTAINS"(:B2,'({country} INPATH (/class/class_type))
              and ({640} INPATH (/class/values/value))')>0 AND "P"."CLASS_TYPE"='country'
              AND "P"."VALUE"='640'))
   3 - filter("CTXSYS"."CONTAINS"(:B1,'({country} INPATH
              (/class/class_type)) and ({640} INPATH (/class/values/value))')>0)
   4 - filter("P"."CLASS_TYPE"='country' AND "P"."VALUE"='640')

Note
-----
   - dynamic statistics used: dynamic sampling (level=2)

SQL>
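
Since the goal is to search by arbitrary class_type and id values, the literals in the PASSING clause of the JSON_EXISTS query above could be supplied as bind variables. The following is an untested sketch for SQL*Plus. The variable names v_value and v_type are made up for illustration, and whether the optimizer still chooses the search index with binds is an assumption that should be checked against the execution plan.

```sql
-- Hypothetical SQL*Plus bind variables (names are illustrative only)
variable v_value varchar2(32)
variable v_type  varchar2(32)

exec :v_value := '640'
exec :v_type  := 'country'

-- Same path expression as in the transcript; only the PASSING inputs change
select ID
  from json_data
 where JSON_EXISTS(
         data_val,
         '$?(exists(@.class?(@.values.value == $VALUE && @.class_type == $TYPE)))'
         passing :v_value as "VALUE", :v_type as "TYPE"
       );
```

Keeping the path expression constant and varying only the PASSING values means the statement text stays the same across searches.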