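Inconsistent MapperParsingException on ElasticSearch 2.4.1

Time: 2016-11-11 04:30:00

Tags: elasticsearch mapping

I am using dynamic mapping for my index and found that one document consistently gets indexed while another does not, even though the two are essentially identical.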

MapperParsingException[failed to parse]; nested: IllegalArgumentException[mapper [$source.attributes.th.values.display] of different type, current_type [string], merged_type [double]];

This one works:

{
    "_id": "581b883cfb54c66569adfc6c",
    "$source": {
        "attributes": {
            "th": [
                {
                    "values": [
                        {
                            "display": "13.726133,100.5731003",
                            "value": "13.726133,100.5731003"
                        }
                    ],
                    "_v": 4,
                    "type": "geo",
                    "_dt": "com.7leaf.framework.Attribute",
                    "slug": "lat-long",
                    "key": "Lat / Long"
                },
                {
                    "values": [
                        {
                            "display": 34,
                            "value": 34
                        }
                    ],
                    "_v": 4,
                    "type": "number",
                    "_dt": "com.7leaf.framework.Attribute",
                    "slug": "number-of-floors",
                    "key": "จำนวนชั้น"
                }
            ]
        }
    }
}

This one does not:

{
    "_id": "5824bce9fb54c611b092eec6",
    "$source": {
        "attributes": {
            "th": [
                {
                    "values": [
                        {
                            "display": "13.726133,100.5731003",
                            "value": "13.726133,100.5731003"
                        }
                    ],
                    "type": "geo",
                    "_dt": "com.7leaf.framework.Attribute",
                    "_v": 4,
                    "slug": "lat-long",
                    "key": "Lat / Long"
                },
                {
                    "values": [
                        {
                            "display": 34,
                            "value": 34
                        }
                    ],
                    "type": "number",
                    "_dt": "com.7leaf.framework.Attribute",
                    "_v": 4,
                    "slug": "number-of-floors",
                    "key": "จำนวนชั้น"
                }
            ]
        }
    }
}

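What could be going wrong? The "display" and "value" fields can be of any type. I just don't understand why the first document gets indexed and the second one doesn't. It doesn't make much sense. Any pointers are appreciated.

Here is the mapping of the index that works. It was generated automatically.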

"values": {
  "properties": {
    "display": {
      "type": "string",
      "fields": {
        "english": {
          "type": "string",
          "analyzer": "english"
        },
        "raw": {
          "type": "string",
          "index": "not_analyzed",
          "ignore_above": 256
        }
      }
    },
    "value": {
      "type": "string",
      "fields": {
        "english": {
          "type": "string",
          "analyzer": "english"
        },
        "raw": {
          "type": "string",
          "index": "not_analyzed",
          "ignore_above": 256
        }
      }
    }
  }
}

For those who don't believe that the first one actually works, here is a screenshot. I have plenty of documents in that index.

(screenshot omitted)

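Here is the Java code I use for bulk reindexing. Nothing special.

import java.util.List;

import org.elasticsearch.action.bulk.BulkRequestBuilder;
import org.elasticsearch.action.bulk.BulkResponse;

// client, getIndexName, Constants and JSONObject belong to the surrounding class, which is not shown.
public BulkResponse bulkIndex(List<JSONObject> entries){
    if(client == null) return null;

    // Collect every document into a single bulk request.
    BulkRequestBuilder bulkRequest = client.prepareBulk();

    for(JSONObject document : entries){
        // The target index is derived from the source database and collection.
        String indexName = getIndexName(
                document.getString(Constants.DATABASE), document.getString(Constants.COLLECTION));

        String id = document.getString(Constants.ID + "@$oid");

        // The collection name doubles as the Elasticsearch type; the document is sent as a Map.
        bulkRequest.add(client.prepareIndex(
                indexName, document.getString(Constants.COLLECTION), id)
                .setSource(document.toMap()));
    }

    // Executes the bulk request and blocks until the response arrives.
    return bulkRequest.get();
}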

Here is the stack trace from ElasticSearch:

MapperParsingException[failed to parse]; nested: IllegalArgumentException[mapper [$source.attributes.th.values.display] of different type, current_type [string], merged_type [double]];
    at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:156)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:309)
    at org.elasticsearch.index.shard.IndexShard.prepareIndex(IndexShard.java:580)
    at org.elasticsearch.index.shard.IndexShard.prepareIndexOnPrimary(IndexShard.java:559)
    at org.elasticsearch.action.index.TransportIndexAction.prepareIndexOperationOnPrimary(TransportIndexAction.java:211)
    at org.elasticsearch.action.index.TransportIndexAction.executeIndexRequestOnPrimary(TransportIndexAction.java:223)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardIndexOperation(TransportShardBulkAction.java:327)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:120)
    at org.elasticsearch.action.bulk.TransportShardBulkAction.shardOperationOnPrimary(TransportShardBulkAction.java:68)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryPhase.doRun(TransportReplicationAction.java:657)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:287)
    at org.elasticsearch.action.support.replication.TransportReplicationAction$PrimaryOperationTransportHandler.messageReceived(TransportReplicationAction.java:279)
    at org.elasticsearch.transport.RequestHandlerRegistry.processMessageReceived(RequestHandlerRegistry.java:77)
    at org.elasticsearch.transport.TransportService$4.doRun(TransportService.java:376)
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: mapper [$source.attributes.th.values.display] of different type, current_type [string], merged_type [double]
    at org.elasticsearch.index.mapper.FieldMapper.doMerge(FieldMapper.java:378)
    at org.elasticsearch.index.mapper.core.StringFieldMapper.doMerge(StringFieldMapper.java:382)
    at org.elasticsearch.index.mapper.FieldMapper.merge(FieldMapper.java:364)
    at org.elasticsearch.index.mapper.FieldMapper.merge(FieldMapper.java:53)
    at org.elasticsearch.index.mapper.object.ObjectMapper.doMerge(ObjectMapper.java:528)
    at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:501)
    at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:60)
    at org.elasticsearch.index.mapper.object.ObjectMapper.doMerge(ObjectMapper.java:528)
    at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:501)
    at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:60)
    at org.elasticsearch.index.mapper.object.ObjectMapper.doMerge(ObjectMapper.java:528)
    at org.elasticsearch.index.mapper.object.ObjectMapper.merge(ObjectMapper.java:501)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:271)
    at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:308)
    at org.elasticsearch.index.mapper.DocumentParser.parseAndMergeUpdate(DocumentParser.java:740)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:354)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:254)
    at org.elasticsearch.index.mapper.DocumentParser.parseObjectOrField(DocumentParser.java:308)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:328)
    at org.elasticsearch.index.mapper.DocumentParser.parseObject(DocumentParser.java:254)
    at org.elasticsearch.index.mapper.DocumentParser.parseDocument(DocumentParser.java:124)
    ... 18 more

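Workaround

I added the mapping shown above to a default template, and that resolved the problem. However, I don't know why it works. I can now store attributes of any type in the same field.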

{
  "template": "*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "raw": {
                  "type":  "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                },
                "english": {
                  "type":  "string",
                  "analyzer": "english"
                }
              }
            }
          }
        }
      ],
      "properties": {
          "$source": {
            "properties": {
              "attributes": {
                "properties": {
                  "en": {
                    "properties": {
                      "values": {
                        "properties": {
                          "display": {
                            "type": "string",
                            "fields": {
                              "english": {
                                "type": "string",
                                "analyzer": "english"
                              },
                              "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "ignore_above": 256
                              }
                            }
                          },
                          "value": {
                            "type": "string",
                            "fields": {
                              "english": {
                                "type": "string",
                                "analyzer": "english"
                              },
                              "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "ignore_above": 256
                              }
                            }
                          }
                        }
                      }
                    }
                  },
                  "th": {
                    "properties": {
                      "values": {
                        "properties": {
                          "display": {
                            "type": "string",
                            "fields": {
                              "english": {
                                "type": "string",
                                "analyzer": "english"
                              },
                              "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "ignore_above": 256
                              }
                            }
                          },
                          "value": {
                            "type": "string",
                            "fields": {
                              "english": {
                                "type": "string",
                                "analyzer": "english"
                              },
                              "raw": {
                                "type": "string",
                                "index": "not_analyzed",
                                "ignore_above": 256
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
       }
    }
  }
}

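Any pointers on how to make this dynamic, so that I don't need to specify it for each language? The example above covers English (en) and Thai (th), but I plan to support more than 40 languages and don't want to add another mapping for each one.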

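2 Answers:

Answer 0 (score: 1)

It matters which of your documents creates the index first! If the first document that reaches the index is the "correct" one, with no mix of types for the same field, then the index is created with, let's say, a string field for display. After that you can index the "problematic" one without any issues. Index the documents one by one into a non-existent index (so that it gets created automatically) and watch how the behavior differs. I really doubt that your bulk indexing code inserts only one document at a time.

If you send a batch of documents, you cannot be sure which document actually triggers the creation of the index. Some documents are sent to one shard, some to another, and so on. The shards send a mix of mapping messages to the master node, and the first one that "wins" may be the "wrong" one or the "correct" one. As you have already found out by testing, you need a dynamic mapping template to control this.

My suggestion is to use a more targeted template that relies on a wildcard to cover all the languages you have. If your languages differ only in the name of one of the fields - th, en, etc. - then use path_match with the same template you already have, but more concisely and without repeating the same mapping: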
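The order dependence described above can be pictured with a small toy model. This is only a sketch under simplifying assumptions and not Elasticsearch code: the class `FirstTypeWinsSketch` and its methods are hypothetical, the first value seen for a path fixes its type, a string field accepts any scalar, and a numeric field rejects strings that do not parse as a number.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of "the first value wins" in dynamic mapping; not Elasticsearch code.
class FirstTypeWinsSketch {
    // field path -> type fixed by the first value seen for that path
    private final Map<String, String> mapping = new LinkedHashMap<>();

    // Guess a field type from the Java type of the value (assumed rule for this sketch).
    static String inferType(Object value) {
        if (value instanceof Integer || value instanceof Long) return "long";
        if (value instanceof Number) return "double";
        return "string";
    }

    // Returns true if the value is accepted under the current mapping.
    boolean index(String path, Object value) {
        String existing = mapping.get(path);
        if (existing == null) {
            mapping.put(path, inferType(value)); // first value fixes the type
            return true;
        }
        if (existing.equals("string")) {
            return true; // any scalar can be stored in its string form
        }
        // Numeric field: accept only values that parse as a number.
        try {
            Double.parseDouble(String.valueOf(value));
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    String typeOf(String path) {
        return mapping.get(path);
    }

    public static void main(String[] args) {
        String path = "$source.attributes.th.values.display";

        FirstTypeWinsSketch stringFirst = new FirstTypeWinsSketch();
        stringFirst.index(path, "13.726133,100.5731003");
        System.out.println(stringFirst.typeOf(path) + " accepts 34: "
                + stringFirst.index(path, 34));

        FirstTypeWinsSketch numberFirst = new FirstTypeWinsSketch();
        numberFirst.index(path, 34);
        System.out.println(numberFirst.typeOf(path) + " accepts geo string: "
                + numberFirst.index(path, "13.726133,100.5731003"));
    }
}
```

In this model the same two values succeed when the string arrives first and fail when the number arrives first, which is why a template that pins the type up front removes the dependence on arrival order.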

PUT /_template/language_control
{
  "template": "language_control*",
  "mappings": {
    "my_type": {
      "dynamic_templates": [
        {
          "display_values": {
            "path_match": "$source.attributes.*.values.*",
            "mapping": {
              "type": "string",
              "fields": {
                "english": {
                  "type": "string",
                  "analyzer": "english"
                },
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      ]
    }
  }
}

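If you look at the path_match, it covers $source.attributes.th.values.display and $source.attributes.en.values.display, as well as $source.attributes.th.values.value and $source.attributes.en.values.value, and the same fields for any other language.

Answer 1 (score: 0)

One thing you can do is extend the dynamic templates so that every type you want stored as a string is mapped to string, since you probably won't query on these fields anyway.

It basically boils down to changing your template to the one below, which works with the documents you showed above. Note that I only added a dynamic mapping for the long type, but you can add similar mappings for other types as well. The bottom line is to make sure the mapper doesn't run into any ambiguity when indexing your documents.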
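To check which field paths a wildcard pattern like this would cover, a rough stand-alone sketch can help. It assumes that `*` matches any run of characters, dots included; the class `PathMatchSketch` and its `matches` method are hypothetical helpers for illustration and not part of the Elasticsearch API.

```java
import java.util.regex.Pattern;

// Rough illustration of wildcard path matching; not Elasticsearch's own matcher.
class PathMatchSketch {
    // Assumption: '*' matches any sequence of characters, including dots.
    static boolean matches(String pattern, String path) {
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) regex.append(".*");        // each '*' becomes "anything"
            regex.append(Pattern.quote(parts[i])); // literal text is matched verbatim
        }
        return Pattern.matches(regex.toString(), path);
    }

    public static void main(String[] args) {
        String pattern = "$source.attributes.*.values.*";
        String[] paths = {
            "$source.attributes.th.values.display",
            "$source.attributes.en.values.value",
            "$source.attributes.th.slug"
        };
        for (String path : paths) {
            System.out.println(path + " -> " + matches(pattern, path));
        }
    }
}
```

Under that assumption the first two paths match and the third does not, because it has no `.values.` segment.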

{
  "template": "*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "strings": {
            "match_mapping_type": "string",
            "mapping": {
              "type": "string",
              "fields": {
                "raw": {
                  "type": "string",
                  "index": "not_analyzed",
                  "ignore_above": 256
                },
                "english": {
                  "type": "string",
                  "analyzer": "english"
                }
              }
            }
          }
        },
        {
          "longs": {
            "match_mapping_type": "long",
            "mapping": {
              "type": "string"
            }
          }
        }
      ]
    }
  }
}

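Also note that I was never able to index the first document; it always failed for me. As @Andrei correctly said, when you bulk index documents you have no guarantee which document gets indexed first into which shard, so you cannot guarantee which field will "win", i.e. which one will be indexed first and set the field type.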
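The point about adding similar mappings for other types can be sketched with a toy dispatcher. This is an illustration under stated assumptions and not Elasticsearch code: `DynamicTemplateSketch`, `rule`, `detect` and `resolve` are hypothetical names, and the detection rule (integers become long, other numbers become double) is a simplification.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of match_mapping_type rules; not Elasticsearch code.
class DynamicTemplateSketch {
    // detected type -> field type to create
    private final Map<String, String> rules = new LinkedHashMap<>();

    DynamicTemplateSketch rule(String matchMappingType, String mappedType) {
        rules.put(matchMappingType, mappedType);
        return this;
    }

    // Assumed detection rule for this sketch.
    static String detect(Object value) {
        if (value instanceof Boolean) return "boolean";
        if (value instanceof Integer || value instanceof Long) return "long";
        if (value instanceof Number) return "double";
        return "string";
    }

    // With no matching rule, the detected type is used as-is.
    String resolve(Object value) {
        String detected = detect(value);
        return rules.getOrDefault(detected, detected);
    }

    public static void main(String[] args) {
        DynamicTemplateSketch templates = new DynamicTemplateSketch()
                .rule("string", "string")
                .rule("long", "string");
        System.out.println("34 -> " + templates.resolve(34));
        System.out.println("34.0 -> " + templates.resolve(34.0));

        templates.rule("double", "string"); // extra rule for floating-point values
        System.out.println("34.0 -> " + templates.resolve(34.0));
    }
}
```

In this model a value detected as double is still mapped as double until a rule for it is added. Since the error in the question reports merged_type [double], a "double" entry next to the "longs" one may be worth considering.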