Full-text search on a complex jsonb

Time: 2017-06-28 05:10:10

Tags: postgresql full-text-search jsonb postgresql-9.6

I have a very complex jsonb column with nested arrays and objects, and I need to run full-text search over it. Example JSON:

{
"buyer": {
    "email": "1010001419test@ekbseo.ru",
    "person": {
        "phone": "1010001419",
        "taxId": "590202081324",
        "address": "г Москва, ул Авиаторов, д 34 ",
        "lastName": "Зайцева",
        "passport": {
            "issuer": "йцукйцук",
            "deptCode": "123241",
            "issueDate": [
                1111,
                11,
                11
            ],
            "numAndSeries": "0001212810"
        },
        "birthDate": [
            1952,
            2,
            18
        ],
        "firstName": "Зоя",
        "birthPlace": "фывфыв",
        "patronymic": "Антоновна",
        "citizenship": "Россия"
    }
},
"dealNo": "05-0000004",
"created": [
    2017,
    3,
    6
],
"services": [
    "SGR"
],
"transactId": "602032128",
"dealDetails": {
    "secondary": {
        "deposit": 200000,
        "sellers": [
            {
                "bank": {
                    "bic": "044525225",
                    "city": "Москва",
                    "name": "ПУБЛИЧНОЕ АКЦИОНЕРНОЕ ОБЩЕСТВО \"СБЕРБАНК РОССИИ\"",
                    "correspondentAccount": "30101810400000000225"
                },
                "email": "dsfs@sdf.ru",
                "amount": 4800000,
                "person": {
                    "phone": "1234132512",
                    "taxId": "590202081324",
                    "address": "г Москва, ул Марьинский Парк, д 45 стр 1 ",
                    "lastName": "Трутненко",
                    "passport": {
                        "issuer": "",
                        "deptCode": "",
                        "issueDate": [
                            -999999999,
                            1,
                            1
                        ],
                        "numAndSeries": ""
                    },
                    "birthDate": [
                        1111,
                        11,
                        11
                    ],
                    "firstName": "ываыаы",
                    "birthPlace": "фывфыв",
                    "patronymic": null,
                    "citizenship": "Россия"
                },
                "account": "48213412341234234234"
            }
        ],
        "propertyAddress": "г Москва, ул Вавилова, д 19 "
    }
},
"bankContacts": {
    "bankOfficeId": 3561,
    "mortgageManager": {
        "casId": 88928,
        "email": "sbtestmik1@yandex.ru",
        "phone": "79853622342",
        "lastName": "Дзержински",
        "firstName": "Макар",
        "patronymic": "Олегович"
    },
    "mortgageDeptHead": {
        "casId": 88923,
        "email": "sbtestrcik@yandex.ru",
        "phone": "72384798798",
        "lastName": "Михрюткин",
        "firstName": "Валентин",
        "patronymic": "Геннадьевич"
    }
},
"contractInfo": {
    "city": "Москва",
    "price": 5000000,
    "cadastralNum": "65:65:76876:876",
    "contractDate": [
        1111,
        11,
        11
    ]
},
"creditContract": {
    "number": "41221312",
    "ownCapital": 1000000,
    "loanCapital": 4000000
}

}

Actually, I need to search in:

deal_no
buyer.person.phone
buyer.person.address.**.(all text values here)
dealDetails.secondary.sellers[].(all text values here)
bankContacts.(all text values here)

What is the best way to do this?

I am using PostgreSQL 9.6.

2 answers:

Answer 0: (score: 2)

This is how I handled a very similar task.

The DB table looks like:

 CREATE TABLE sites (
   id text NOT NULL,
   doc jsonb,
   PRIMARY KEY (id)
 )

The data we store in the doc column is a complex nested JSONB document:

   {
      "_id": "123",
      "type": "Site",
      "identification": "Custom ID",
      "title": "SITE 1",
      "address": "UK, London, Mr Tom's street, 2",
      "buildings": [
          {
               "uuid": "12312",
               "identification": "Custom ID",
               "name": "BUILDING 1",
               "deposits": [
                   {
                      "uuid": "12312",
                      "identification": "Custom ID",             
                      "audits": [
                          {
                             "uuid": "12312",         
                              "sample_id": "SAMPLE ID"                
                          }
                       ]
                   }
               ]
          } 
       ]
    }

So the structure of the JSONB is:

SITE 
  -> ARRAY OF BUILDINGS
     -> ARRAY OF DEPOSITS
       -> ARRAY OF AUDITS

We need to implement full-text search over certain values in each type of entry:

SITE (identification, title, address)
BUILDING (identification, name)
DEPOSIT (identification)
AUDIT (sample_id)

The SQL query should run the full-text search only over these field values.
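The query itself is not shown here. A minimal sketch of one way to restrict the search to the listed fields, assuming the sites table above, might look like the following. The english text search configuration and the search term are illustrative assumptions, and the sketch expects buildings, deposits and audits to be arrays wherever they are present.

```sql
-- Sketch only: 'english' and the search term 'london' are assumptions.
SELECT s.id
FROM sites AS s
WHERE to_tsvector('english'::regconfig,
        -- SITE fields
        coalesce(s.doc ->> 'identification', '') || ' ' ||
        coalesce(s.doc ->> 'title', '')          || ' ' ||
        coalesce(s.doc ->> 'address', '')        || ' ' ||
        -- BUILDING, DEPOSIT and AUDIT fields, one row per nested element
        coalesce((
          SELECT string_agg(
                   coalesce(b ->> 'identification', '') || ' ' ||
                   coalesce(b ->> 'name', '')           || ' ' ||
                   coalesce(d ->> 'identification', '') || ' ' ||
                   coalesce(a ->> 'sample_id', ''),
                   ' ')
          FROM jsonb_array_elements(s.doc -> 'buildings') AS b
          -- LEFT JOIN keeps buildings without deposits and deposits without audits
          LEFT JOIN LATERAL jsonb_array_elements(b -> 'deposits') AS d ON true
          LEFT JOIN LATERAL jsonb_array_elements(d -> 'audits')   AS a ON true
        ), ''))
      @@ plainto_tsquery('english'::regconfig, 'london');
```

Written this way the query scans every row. For larger tables the same expression could be moved into a function declared IMMUTABLE and indexed with GIN.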

Answer 1: (score: 1)

The best way I have found so far is:

  1. Create a function that returns a tsvector and can work with your JSON, like this:

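    CREATE OR REPLACE FUNCTION deal_tsvector(deal_no text, data jsonb)
      RETURNS tsvector AS $$
    BEGIN
      -- Concatenate the deal number with the whole document rendered as text
      RETURN to_tsvector('russian', deal_no || ' ' || data::text);
    END
    $$ LANGUAGE plpgsql IMMUTABLE;  -- IMMUTABLE is required for use in an index expression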

  2. Create the index like this:

    CREATE INDEX IF NOT EXISTS idx_deal_fts ON deal USING gin (deal_tsvector(deal_no, request_json));
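A query against this index might then look like the sketch below. It assumes the table is named deal with columns deal_no (text) and request_json (jsonb), as the index definition suggests. The search term is only an example.

```sql
-- Sketch only: the table name deal and the search term are assumptions.
-- The WHERE clause repeats the indexed expression exactly so that the
-- planner can consider the GIN index idx_deal_fts.
SELECT deal_no
FROM deal
WHERE deal_tsvector(deal_no, request_json)
      @@ plainto_tsquery('russian', 'Зайцева');
```

Because data::text serializes the whole document, key names such as email or phone are indexed along with the values, and the search is not limited to particular paths.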