Querying nested fields with an analyzer causes an error

Date: 2017-11-02 15:50:18

Tags: elasticsearch nested

I am trying to add a synonym analyzer to an Elasticsearch type that is already working. This is the mapping for my serviceEntity:

{
"serviceentity" : {
    "properties":{          
        "ServiceLangProps" : {
            "type" : "nested",
            "properties" : {
                "NAME"          : {"type" : "string", "search_analyzer": "synonym"},
                "LONG_TEXT"     : {"type" : "string", "search_analyzer": "synonym"},
                "DESCRIPTION"   : {"type" : "string", "search_analyzer": "synonym"},
                "MATERIAL"      : {"type" : "string", "search_analyzer": "synonym"},
                "LANGUAGE_ID"   : {"type" : "string", "include_in_all": false}
            }
        },
        "LinkProps" : {
            "type" : "nested",
            "properties" : {
                "TITLE" : {"type" : "string", "search_analyzer": "synonym"},
                "LINK"  : {"type" : "string"},
                "LANGUAGE_ID"   : {"type" : "string", "include_in_all": false}
            }
        },
        "MediaProps" : {
            "type" : "nested",
            "properties" : {
                "TITLE"     : {"type" : "string", "search_analyzer": "synonym"},
                "FILENAME"  : {"type" : "string"},
                "LANGUAGE_ID"   : {"type" : "string", "include_in_all": false}
            }
        }
    }
}

}

These are my settings:

 {
  "analysis": {
    "filter": {
      "synonym": {
        "ignore_case": "true",
        "type": "synonym",
        "synonyms": [
          "lorep, spaceship",
          "ipsum, planet"
        ]
      }
    },
    "analyzer": {
      "synonym": {
        "filter": [
          "lowercase",
          "synonym"
        ],
        "tokenizer": "whitespace"
      }
    }
  }
}
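
To make the intent of these settings concrete, here is a minimal plain-Java sketch of what the analyzer chain is expected to do to a search string: split on whitespace, lowercase, then expand each token with the other members of its synonym group. This is only a simulation under simplifying assumptions (single-word synonyms, every term in a rule treated as equivalent); it is not how Elasticsearch implements the filter, and the class and method names (SynonymSketch, buildGroups, analyze) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class SynonymSketch {

    // Maps every term to the full list of terms in its rule, e.g. "lorep, spaceship".
    static Map<String, List<String>> buildGroups(List<String> rules) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        for (String rule : rules) {
            List<String> terms = new ArrayList<>();
            for (String term : rule.split(",")) {
                terms.add(term.trim().toLowerCase(Locale.ROOT));
            }
            for (String term : terms) {
                groups.put(term, terms);
            }
        }
        return groups;
    }

    // Whitespace tokenizer -> lowercase -> synonym expansion (simplified).
    static List<String> analyze(String text, Map<String, List<String>> groups) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            if (raw.isEmpty()) {
                continue;
            }
            String token = raw.toLowerCase(Locale.ROOT);
            tokens.add(token);
            for (String synonym : groups.getOrDefault(token, List.of())) {
                if (!synonym.equals(token)) {
                    tokens.add(synonym);
                }
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Map<String, List<String>> groups =
                buildGroups(List.of("lorep, spaceship", "ipsum, planet"));
        // prints [lorep, spaceship, ipsum, planet, dolor]
        System.out.println(analyze("Lorep IPSUM dolor", groups));
    }
}
```

In the real analyzer the synonyms are emitted at the same token position as the original term; the flat list here drops that detail.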

When I try to search for anything, I get this error:

Caused by: org.elasticsearch.index.query.QueryParsingException: [nested] nested object under path [ServiceLangProps] is not of nested type

I don't understand why. If I don't add any analyzer to my settings, everything works fine.
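
The error text says that the object found under the path is not mapped as nested in the index being queried, so one way to narrow this down is to compare the mapping the index actually holds (for example, the response of GET /<index>/_mapping) with the intended one. The following is a rough sketch of such a check, assuming the type mapping has already been parsed into nested Maps; NestedCheck, isNested and mappingWith are hypothetical helpers, not part of the Elasticsearch Java API.

```java
import java.util.HashMap;
import java.util.Map;

public class NestedCheck {

    // Returns true only if typeMapping.properties.<path>.type equals "nested".
    @SuppressWarnings("unchecked")
    static boolean isNested(Map<String, Object> typeMapping, String path) {
        Object properties = typeMapping.get("properties");
        if (!(properties instanceof Map)) {
            return false;
        }
        Object field = ((Map<String, Object>) properties).get(path);
        if (!(field instanceof Map)) {
            return false;
        }
        return "nested".equals(((Map<String, Object>) field).get("type"));
    }

    // Test helper: builds { "properties": { <path>: <fieldMapping> } }.
    static Map<String, Object> mappingWith(String path, Map<String, Object> fieldMapping) {
        Map<String, Object> properties = new HashMap<>();
        properties.put(path, fieldMapping);
        Map<String, Object> typeMapping = new HashMap<>();
        typeMapping.put("properties", properties);
        return typeMapping;
    }

    public static void main(String[] args) {
        Map<String, Object> nestedField = new HashMap<>();
        nestedField.put("type", "nested");

        // A plain object field carries no "type": "nested" entry.
        Map<String, Object> objectField = new HashMap<>();
        objectField.put("properties", new HashMap<String, Object>());

        System.out.println(isNested(mappingWith("LinkProps", nestedField), "LinkProps")); // true
        System.out.println(isNested(mappingWith("LinkProps", objectField), "LinkProps")); // false
    }
}
```

If the check returns false for the live index, the mapping shown above was not the one in effect when the documents were indexed.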

I use the Java API to communicate with the Elasticsearch instance. For the multi-match query, my code looks like this:

MultiMatchQueryBuilder multiMatchBuilder = QueryBuilders.multiMatchQuery(fulltextSearchString, QUERY_FIELDS).analyzer("synonym");

The query string generated by the Java API looks like this:

{
  "query" : {
    "bool" : {
      "must" : {
        "bool" : {
          "should" : [ {
            "nested" : {
              "query" : {
                "bool" : {
                  "must" : [ {
                    "match" : {
                      "ServiceLangProps.LANGUAGE_ID" : {
                        "query" : "DE",
                        "type" : "boolean"
                      }
                    }
                  }, {
                    "multi_match" : {
                      "query" : "lorem",
                      "fields" : [ "ServiceLangProps.NAME", "ServiceLangProps.DESCRIPTION", "ServiceLangProps.MATERIALKURZTEXT", "ServiceLangProps.DESCRIPTION_RICHTEXT" ],
                      "analyzer" : "synonym"
                    }
                  } ]
                }
              },
              "path" : "ServiceLangProps"
            }
          }, {
            "nested" : {
              "query" : {
                "bool" : {
                  "must" : [ {
                    "match" : {
                      "LinkProps.LANGUAGE_ID" : {
                        "query" : "DE",
                        "type" : "boolean"
                      }
                    }
                  }, {
                    "match" : {
                      "LinkProps.TITLE" : {
                        "query" : "lorem",
                        "type" : "boolean"
                      }
                    }
                  } ]
                }
              },
              "path" : "LinkProps"
            }
          }, {
            "nested" : {
              "query" : {
                "bool" : {
                  "must" : [ {
                    "match" : {
                      "MediaProps.LANGUAGE_ID" : {
                        "query" : "DE",
                        "type" : "boolean"
                      }
                    }
                  }, {
                    "match" : {
                      "MediaProps.TITLE" : {
                        "query" : "lorem",
                        "type" : "boolean"
                      }
                    }
                  } ]
                }
              },
              "path" : "MediaProps"
            }
          } ]
        }
      },
      "filter" : {
        "bool" : { }
      }
    }
  }
}

If I try this on LinkProps or MediaProps, I get the same error for the respective nested object.

Edit: I am using Elasticsearch version 2.4.6.

3 answers:

Answer 0: (score: 0)

It would also help to see the query string and to know which ES version is in use.

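I don't see a synonyms_path in your settings; that, together with the fact that you are using nested types, could be what causes the error.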

You have probably seen this already, but in case you haven't:

https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-synonym-tokenfilter.html

Answer 1: (score: 0)

I created a minimal example of what I am trying to do.

My mapping looks like this:

{
    "serviceentity" : {
        "properties":{          
            "LinkProps" : {
                "type" : "nested",
                "properties" : {
                    "TITLE" : {"type" : "string", "search_analyzer": "synonym"},
                    "LINK"  : {"type" : "string"},
                    "LANGUAGE_ID"   : {"type" : "string", "include_in_all": false}
                }
            }
        }
    }
}

My settings for the synonym analyzer in Java code:

XContentBuilder builder = jsonBuilder()
        .startObject()
            .startObject("analysis")
                .startObject("filter")
                    .startObject("synonym") // The name of the token filter
                        .field("type", "synonym") // The filter type (synonym token filter)
                        .field("ignore_case", "true") 
                        .array("synonyms", synonyms) // The synonym list
                    .endObject()
                .endObject()
                .startObject("analyzer")
                    .startObject("synonym")
                        .field("tokenizer", "whitespace")
                        .array("filter", "lowercase", "synonym")
                    .endObject()
                .endObject()
            .endObject()
        .endObject();

The index metadata shown by the ElasticSearch Head Chrome plugin looks like this:

{
  "analysis": {
    "filter": {
      "synonym": {
        "ignore_case": "true",
        "type": "synonym",
        "synonyms": [
          "Test, foo",
          "Title, bar"
        ]
      }
    },
    "analyzer": {
      "synonym": {
        "filter": [
          "lowercase",
          "synonym"
        ],
        "tokenizer": "whitespace"
      }
    }
  }
}

When I now run a search query for "Test", I get the same error as in my first post. This is the query:

{
  "query": {
    "bool": {
      "must": {
        "nested": {
          "path": "LinkProps",
          "query": {
            "multi_match": {
              "query": "Test",
              "fields": [
                "LinkProps.TITLE",
                "LinkProps.LINK"
              ],
              "analyzer": "synonym"
            }
          }
        }
      }
    }
  }
}

which results in this error:

{
  "error": {
    "root_cause": [
      {
        "type": "query_parsing_exception",
        "reason": "[nested] nested object under path [LinkProps] is not of nested type",
        "index": "minimal",
        "line": 1,
        "col": 44
      }
    ],
    "type": "search_phase_execution_exception",
    "reason": "all shards failed",
    "phase": "query",
    "grouped": true,
    "failed_shards": [
      {
        "shard": 0,
        "index": "minimal",
        "node": "6AhE4RCIQwywl49h0Q2-yw",
        "reason": {
          "type": "query_parsing_exception",
          "reason": "[nested] nested object under path [LinkProps] is not of nested type",
          "index": "minimal",
          "line": 1,
          "col": 44
        }
      }
    ]
  },
  "status": 400
}

When I check the analyzer with

GET http://localhost:9200/minimal/_analyze?text=foo&analyzer=synonym&pretty=true

I get the correct response:

{
  "tokens": [
    {
      "token": "foo",
      "start_offset": 0,
      "end_offset": 3,
      "type": "word",
      "position": 0
    },
    {
      "token": "test",
      "start_offset": 0,
      "end_offset": 3,
      "type": "SYNONYM",
      "position": 0
    }
  ]
}

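So the analyzer seems to be set up correctly. Did I mess up the mapping? I assume the problem is not caused by the nested objects, or is it?

Answer 2: (score: 0)

I just tried this: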

{
  "query": {
    "bool": {
      "must": {
        "query": {
          "multi_match": {
            "query": "foo",
            "fields": [
              "LinkProps.TITLE",
              "LinkProps.LINK"
            ],
            "analyzer": "synonym"
          }
        }
      }
    }
  }
}

As you can see, I removed the "nested" wrapper

"nested": {
          "path": "LinkProps",
          ...
}

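Now it works, at least in the sense that it returns some results (I am not sure yet whether these will turn out to be the correct ones). I am trying to apply this to the original project and will keep you posted on whether it works there too.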