How to create bulk parent-child relationships (join) in Elasticsearch PHP when creating the mapping and indexing

Date: 2018-01-10 07:08:54

Tags: php elasticsearch

I want to create documents that have a parent-child relationship.

I have the following data.

Parent document data (parent_id = null):
{
            "id": 1,                
            "workflow_name": "Diwali",
            "list_name": "number",
            "list_id": "798",
            "msgType": "Promotional - National",
            "sender_id": "MANISH",
            "submit_date": "2017-11-06 14:09:56",
            "dlrdatetime": "2017-11-06 14:10:06",
            "split_count": 1,
            "error_code": "Waiting",
            "error_text": "-",
            "currency_used": "0.2000",
            "text_type": "text",
            "error_code_status": null,
            "origin_type": "1",
            "api_response_id": 1,
            "response": null,
            "parent_id": null,
            "is_test": 0,
            "link": null,
            "type": 2,
            "message_text": "Hi This is text message",
            "status": null,
            "winner_branch": null,
            "instance_id": "724e540394481746",
            "created_at": "2017-11-06 14:10:06",
            "updated_at": "2017-11-06 14:10:06",
            "branch_id": 0
        }

Child document data (parent_id = 1):
 {
            "id": 1,                
            "workflow_name": "Diwali",
            "list_name": "number",
            "list_id": "798",
            "msgType": "Promotional - National",
            "sender_id": "MANISH",
            "submit_date": "2017-11-06 14:09:56",
            "dlrdatetime": "2017-11-06 14:10:06",
            "split_count": 1,
            "error_code": "Waiting",
            "error_text": "-",
            "currency_used": "0.2000",
            "text_type": "text",
            "error_code_status": null,
            "origin_type": "1",
            "api_response_id": 1,
            "response": null,
            "parent_id": 1,
            "is_test": 0,
            "link": null,
            "type": 2,
            "message_text": "Hi This is text message",
            "status": null,
            "winner_branch": null,
            "instance_id": "724e540394481746",
            "created_at": "2017-11-06 14:10:06",
            "updated_at": "2017-11-06 14:10:06",
            "branch_id": 0
        }

So I have a one-to-many relationship: one parent has many children.

Sample code snippet for creating the mapping:

$mapping['index'] = 'response_packets_index_v5';
    $mapping['body'] = array(
        'mappings' => array(
            'response_packets_v5' => array(
                'properties' => [
                    'id' => [
                        'type' => 'integer'
                    ],
                    'workflow_id' => [
                        'type' => 'integer'
                    ],                                            
                    'parent_id' => [
                        'type' => 'integer',                        
                    ],                       
                    'instance_id' => [
                        'type' => 'text'
                    ],
                    'created_at' => [
                        'type' => 'date',
                        'format' => 'yyyy-MM-dd HH:mm:ss'
                    ],
                    'updated_at' => [
                        'type' => 'date',
                        'format' => 'yyyy-MM-dd HH:mm:ss'
                    ],
                    'branch_id' => [
                        'type' => 'integer'
                    ]
                ]
            )
        )
    );

    $client->indices()->create($mapping);

Code snippet for bulk indexing:

for ($i = 0; $i <= $count; $i++) {
        $params['body'][] = [
            'index' => [
                '_index' => 'response_packets_index_v5',
                '_type' => 'response_packets_v5',
                'routing' => 'company',
            ]
        ];

        $params['body'][] = $documentData[$i];
 }
return $client->bulk($params);   

I have already read this article, but it did not help me: https://www.elastic.co/guide/en/elasticsearch/reference/current/parent-join.html

Can anyone help me solve this? Many thanks.

System details:

OS: Ubuntu 16.04,

PHP version: 7.1,

ES-PHP client version: 6.0

1 Answer:

Answer 0: (score: 0)

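One problem I see in your setup is that you have not defined the parent-child relationship. To do that, you need to use the "join" datatype. That is, in the mapping you need to create a field that specifies whether a document is a parent or a child. Check the following: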

"properties": {
            "doc_identifier" : {
                "type": "join",
                "relations": {
                    "parent_doc": "child_doc"
                }
            },
            "workflow_id" : {"type" : "keyword"},
            other fields...
}
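
For the ES-PHP client, the same mapping could be expressed roughly as the array below. This is a minimal sketch, not a definitive implementation: `buildJoinMapping()` is a hypothetical helper, the index and type names are taken from the question, and the field name `doc_identifier` and relation names `parent_doc` / `child_doc` are the ones used in the JSON example above.

```php
<?php
// Sketch only: builds the parameter array for $client->indices()->create().
// buildJoinMapping() is a hypothetical helper, not part of the ES-PHP client.

function buildJoinMapping(string $index, string $type): array
{
    return [
        'index' => $index,
        'body'  => [
            'mappings' => [
                $type => [
                    'properties' => [
                        // The join field declares that a parent_doc may have child_doc children.
                        'doc_identifier' => [
                            'type'      => 'join',
                            'relations' => ['parent_doc' => 'child_doc'],
                        ],
                        'id'          => ['type' => 'integer'],
                        'workflow_id' => ['type' => 'keyword'],
                        'instance_id' => ['type' => 'text'],
                        'created_at'  => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss'],
                        'updated_at'  => ['type' => 'date', 'format' => 'yyyy-MM-dd HH:mm:ss'],
                        'branch_id'   => ['type' => 'integer'],
                    ],
                ],
            ],
        ],
    ];
}

$mapping = buildJoinMapping('response_packets_index_v5', 'response_packets_v5');
// With a configured client: $client->indices()->create($mapping);
```

Note that the `parent_id` property from the original mapping is left out here, since the join field takes over that role.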

Also, you do not need to create a "parent_id" field. Instead, you specify the parent's ID in the child document, like this:

{"doc_identifier" : {"name" : "child_doc", "parent":"parent_id_value"}, "workflow_id": "foo", ..., other fields,...}

Finally, when indexing the parent, you do not need to add any information from the children.
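
Putting this together for bulk indexing, one possible way to build the request body in PHP is sketched below. It rests on several assumptions: `buildBulkBody()` is a hypothetical helper, each row carries a unique `id` that is used as the document `_id`, `parent_id` is null for parents, and the two sample rows (including the child `id` of 2) are illustrative only. Children are routed by their parent's ID because a child has to be stored on the same shard as its parent.

```php
<?php
// Sketch only: buildBulkBody() is a hypothetical helper, not part of ES-PHP.
// Assumes every row has a unique "id" and that "parent_id" is null for parents.

function buildBulkBody(array $rows, string $index, string $type): array
{
    $body = [];
    foreach ($rows as $row) {
        $parentId = $row['parent_id'] ?? null;
        unset($row['parent_id']); // replaced by the join field below

        if ($parentId === null) {
            // A parent only names its side of the relation.
            $row['doc_identifier'] = ['name' => 'parent_doc'];
            $routing = (string) $row['id'];
        } else {
            // A child names its side and points at the parent's _id.
            $row['doc_identifier'] = ['name' => 'child_doc', 'parent' => (string) $parentId];
            $routing = (string) $parentId;
        }

        // Action/metadata line, followed by the document itself.
        $body[] = [
            'index' => [
                '_index'  => $index,
                '_type'   => $type,
                '_id'     => (string) $row['id'],
                'routing' => $routing,
            ],
        ];
        $body[] = $row;
    }

    return ['body' => $body];
}

// Illustrative rows, trimmed down from the question's data.
$rows = [
    ['id' => 1, 'workflow_name' => 'Diwali', 'parent_id' => null],
    ['id' => 2, 'workflow_name' => 'Diwali', 'parent_id' => 1],
];
$params = buildBulkBody($rows, 'response_packets_index_v5', 'response_packets_v5');
// With a configured client: $client->bulk($params);
```

The constant `'routing' => 'company'` used in the question would also keep parents and children together, at the cost of sending every document to a single shard.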