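I am reindexing data from ES 5.0 (parent-child) to ES 6.2 (join type)
The data in the ES 5.0 index is stored as parent-child documents in separate types. For the reindex, I have already created a new index and mapping based on 6.2 in my new cluster.
The parent documents reindex into the new index without problems, but the child documents throw the following error: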
{
"index": "index_two",
"type": "_doc",
"id": "AVpisCkMuwDYFnQZiFXl",
"cause": {
"type": "mapper_parsing_exception",
"reason": "failed to parse",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "[routing] is missing for join field [field_relationship]"
}
},
"status": 400
}
The script used to reindex the data:
{
"source": {
"remote": {
"host": "http://myescluster.com:9200",
"socket_timeout": "1m",
"connect_timeout": "20s"
},
"index": "index_two",
"type": ["actions"],
"size": 5000,
"query":{
"bool":{
"must":[
{"term": {"client_id.raw": "cl14ous0ydao"}}
]
}
}
},
"dest": {
"index": "index_two",
"type": "_doc"
},
"script": {
"params": {
"jdata": {
"name": "actions"
}
},
"source": "ctx._routing=ctx._routing;ctx.remove('_parent');params.jdata.parent=ctx._source.user_id;ctx._source.field_relationship=params.jdata"
}
}
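I set the routing field in the Painless script, because the documents are read dynamically from the source index.
Mapping of the destination index: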
{
"index_two": {
"mappings": {
"_doc": {
"dynamic_templates": [
{
"template_actions": {
"match_mapping_type": "string",
"mapping": {
"fields": {
"raw": {
"index": true,
"ignore_above": 256,
"type": "keyword"
}
},
"type": "text"
}
}
}
],
"date_detection": false,
"properties": {
"attributes": {
"type": "nested"
},
"cl_other_params": {
"type": "nested"
},
"cl_triggered_ts": {
"type": "date"
},
"cl_utm_params": {
"type": "nested"
},
"end_ts": {
"type": "date"
},
"field_relationship": {
"type": "join",
"eager_global_ordinals": true,
"relations": {
"users": [
"actions",
"segments"
]
}
},
"ip_address": {
"type": "ip"
},
"location": {
"type": "geo_point"
},
"processed_ts": {
"type": "date"
},
"processing_time": {
"type": "date"
},
"products": {
"type": "nested",
"properties": {
"traits": {
"type": "nested"
}
}
},
"segment_id": {
"type": "integer"
},
"start_ts": {
"type": "date"
}
}
}
}
}
}
A sample source document:
{
"_index": "index_two",
"_type": "actions",
"_id": "AVvKUYcceQCc2OyLKWZ9",
"_score": 7.4023576,
"_routing": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"_parent": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"_source": {
"user_id": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
"client_id": "cl14ous0ydao",
"session_id": "CL-e0ec3941-6dad-4d2d-bc9b",
"source": "betalist",
"action": "pageview",
"action_type": "pageview",
"device": "Desktop",
"ip_address": "49.35.14.224",
"location": "20.7333 , 77",
"attributes": [
{
"key": "url",
"value": "https://www.google.com/",
"type": "string"
}
],
"products": []
}
}
Answer 0: (score: 0)
I ran into the same problem. Searching the Elasticsearch discussion forum, I found this, which works:
POST _reindex
{
"source": {
"index": "old_index",
"type": "actions"
},
"dest": {
"index": "index_two"
},
"script": {
"source": """
ctx._type = "_doc";
String routingCode = ctx._source.user_id;
Map join = new HashMap();
join.put('name', 'actions');
join.put('parent', routingCode);
ctx._source.put('field_relationship', join);
ctx._parent = null;
ctx._routing = new StringBuffer(routingCode)"""
}
}
Hope this helps :).
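For readers who want to check the transformation outside of Elasticsearch, here is a minimal Python sketch of what the Painless script above does to each child document. The function name `to_join_doc` and its parameters are hypothetical; the sketch assumes, as the script does, that `user_id` in the source holds the parent ID and that the parent ID is also the routing value.

```python
import copy


def to_join_doc(hit, join_field="field_relationship",
                relation="actions", parent_key="user_id"):
    """Return (routing, source) for indexing a 5.x child hit into a 6.x join index.

    Hypothetical helper: mirrors the Painless script, not an Elasticsearch API.
    """
    source = copy.deepcopy(hit["_source"])  # leave the original hit untouched
    parent_id = source[parent_key]
    # The join field replaces the old _parent metadata.
    source[join_field] = {"name": relation, "parent": parent_id}
    # Assumption: the child is routed by its parent's ID, as in the script.
    return parent_id, source


# Trimmed copy of the sample document from the question.
sample_hit = {
    "_type": "actions",
    "_routing": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
    "_parent": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
    "_source": {
        "user_id": "cl14ous0ydaob71ab2a1-837c-4904-a755-11e13410fb94",
        "client_id": "cl14ous0ydao",
        "action": "pageview",
    },
}

routing, source = to_join_doc(sample_hit)
```

The returned `routing` is what would be passed as the routing value when indexing `source` into the new `_doc` type.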
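Answer 1: (score: 0)
I would like to point out that the join field does not usually require routing, but you will hit this problem if a child is created before its parent.
I suggest reindexing all the parents first and the children afterwards.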
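The ordering suggested here can be sketched as a small Python helper that builds the `_reindex` request bodies, parents first. This is only an illustration under assumptions: `plan_reindex` is a hypothetical name, the source parent type is assumed to be called `users` (the question only shows the `actions` type), and the child script is adapted from Answer 0, so it still sets routing explicitly.

```python
# Hypothetical Painless snippets. The parent one assumes parents only need
# the relation name "users" taken from the destination mapping.
PARENT_SCRIPT = "ctx._type = '_doc'; ctx._source.field_relationship = 'users';"

# Adapted from Answer 0; the relation name is passed in through params.
CHILD_SCRIPT = (
    "ctx._type = '_doc';"
    " String parentId = ctx._source.user_id;"
    " Map join = new HashMap();"
    " join.put('name', params.relation);"
    " join.put('parent', parentId);"
    " ctx._source.put('field_relationship', join);"
    " ctx._parent = null;"
    " ctx._routing = new StringBuffer(parentId);"
)


def plan_reindex(source_index, dest_index, parent_type, child_types):
    """Return _reindex request bodies in the order they should be sent."""

    def body(doc_type, script):
        return {
            "source": {"index": source_index, "type": doc_type},
            "dest": {"index": dest_index},
            "script": script,
        }

    # Parents go first, then one request per child type.
    plan = [body(parent_type, {"source": PARENT_SCRIPT})]
    for child in child_types:
        plan.append(body(child, {"source": CHILD_SCRIPT,
                                 "params": {"relation": child}}))
    return plan


bodies = plan_reindex("old_index", "index_two", "users", ["actions", "segments"])
```

Each body would be sent to `POST _reindex` in list order, waiting for one request to finish before sending the next. The child script assumes every child type stores its parent ID in `user_id`, which the question only confirms for `actions`.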