Elasticsearch multi_match query does not work with synonyms and cross_fields

Date: 2017-07-17 15:25:19

Tags: elasticsearch

An Elasticsearch multi_match query of type cross_fields combined with synonyms is not working as expected.

I have the following configuration:

{
    "my_index": {
        "mappings": {
            "my_mapping": {
                "properties": {
                    "@timestamp": {
                        "type": "date"
                    },
                    "@version": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "field1": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    },
                    "field2": {
                        "type": "text",
                        "fields": {
                            "keyword": {
                                "type": "keyword",
                                "ignore_above": 256
                            }
                        }
                    }
                }
            }
        },
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        "my_synonym_filter": {
                            "type": "synonym",
                            "synonyms": [
                                "matthew,matt,matty",
                                "thomas,tom,thom,tommy"
                            ]
                        }
                    },
                    "analyzer": {
                        "my_synonyms": {
                            "filter": [
                                "lowercase",
                                "my_synonym_filter"
                            ],
                            "tokenizer": "standard"
                        }
                    }
                }
            }
        }
    }
}

And the following query:

{
    "query": {
        "bool": {
            "should": [
                {
                    "multi_match": {
                        "fields": [
                            "field1^8",
                            "field2^2"
                        ],
                        "query": "Matt And Tom Oldfield",
                        "type": "cross_fields",
                        "analyzer": "my_synonyms"
                    }
                }
            ]
        }
    }
}

But when I execute the query, the synonyms are not expanded for every field. If I analyze the query, the explanation is as follows:

(Synonym(field1:matt field1:matthew field1:matty) blended(terms:[field1:and^8.0, field2:and^2.0]) Synonym(field1:thom field1:thomas field1:tom field1:tommy) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

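So if I have "Tom Oldfield" in field1 and "Matt Oldfield" in field2, the query does not match that document. As you can see, the synonyms are expanded only for the first field (field1) and not for the other one.

If I remove the analyzer from the query, it does match the document with "Tom Oldfield" in field1 and "Matt Oldfield" in field2, and the query explanation is as follows: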

(blended(terms:[field1:matt^8.0, field2:matt^2.0]) blended(terms:[field1:and^8.0, field2:and^2.0]) blended(terms:[field1:tom^8.0, field2:tom^2.0]) blended(terms:[field1:oldfield^8.0, field2:oldfield^2.0]))

Is there a way to have the synonyms expanded for every field?
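For readers who want to inspect the rewritten query themselves: one way to get explanations like the two quoted above is the validate API with `explain=true`. The question does not say which tool was used, so this is only a sketch. The cluster address `http://localhost:9200` and the helper names `build_validate_request`, `cross_fields_query` and `print_explanations` are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: a local test cluster; adjust to your environment.
ES_URL = "http://localhost:9200"


def cross_fields_query(analyzer=None):
    """The multi_match query from the question, with an optional analyzer."""
    multi_match = {
        "fields": ["field1^8", "field2^2"],
        "query": "Matt And Tom Oldfield",
        "type": "cross_fields",
    }
    if analyzer is not None:
        multi_match["analyzer"] = analyzer
    return {"bool": {"should": [{"multi_match": multi_match}]}}


def build_validate_request(index, query, base_url=ES_URL):
    """Build (without sending) a _validate/query?explain=true request."""
    return urllib.request.Request(
        url=f"{base_url}/{index}/_validate/query?explain=true",
        data=json.dumps({"query": query}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def print_explanations(index="my_index"):
    """Send the query with and without the analyzer and print both rewrites."""
    for analyzer in ("my_synonyms", None):
        request = build_validate_request(index, cross_fields_query(analyzer))
        with urllib.request.urlopen(request) as response:
            result = json.load(response)
        for item in result.get("explanations", []):
            print(analyzer, "->", item.get("explanation"))
```

Calling `print_explanations()` against a running cluster prints one rewritten query per variant, which makes it easy to compare the two cases side by side.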

1 Answer:

Answer 0 (score: 1)

I could not reproduce your issue in my environment using Elasticsearch 5.5.0. Here is my MCVE setup:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "my_synonym_filter": {
            "type": "synonym",
            "synonyms": [
              "matthew,matt,matty",
              "thomas,tom,thom,tommy"
            ]
          }
        },
        "analyzer": {
          "my_synonyms": {
            "filter": [
              "lowercase",
              "my_synonym_filter"
            ],
            "tokenizer": "standard"
          }
        }
      }
    }
  },
  "mappings": {
    "my_mapping": {
      "properties": {
        "field1": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        },
        "field2": {
          "type": "text",
          "fields": {
            "keyword": {
              "type": "keyword",
              "ignore_above": 256
            }
          }
        }
      }
    }
  }
}

The following document was indexed:

{ "field1": "Tom Oldfield", "field2": "Matt Oldfield"}
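The setup above can be scripted roughly as follows. This is a sketch rather than the exact commands used for the answer: the cluster address, the document id `1`, the `refresh=true` parameter and the helper names are assumptions. The typed mapping (`my_mapping`) matches Elasticsearch 5.x, and later major versions removed mapping types.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: local test cluster

# Same settings and mappings as in the answer, written compactly.
INDEX_BODY = {
    "settings": {"index": {"analysis": {
        "filter": {"my_synonym_filter": {
            "type": "synonym",
            "synonyms": ["matthew,matt,matty", "thomas,tom,thom,tommy"],
        }},
        "analyzer": {"my_synonyms": {
            "tokenizer": "standard",
            "filter": ["lowercase", "my_synonym_filter"],
        }},
    }}},
    "mappings": {"my_mapping": {"properties": {
        name: {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
        for name in ("field1", "field2")
    }}},
}

DOCUMENT = {"field1": "Tom Oldfield", "field2": "Matt Oldfield"}


def put_request(path, body, base_url=ES_URL):
    """Build (without sending) a JSON PUT request."""
    return urllib.request.Request(
        url=base_url + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def reproduction_requests():
    """Create the index, then index the single test document."""
    return [
        put_request("/my_index", INDEX_BODY),
        # Document id 1 and refresh=true are arbitrary choices for the test.
        put_request("/my_index/my_mapping/1?refresh=true", DOCUMENT),
    ]


def run():
    """Send the requests in order; needs a reachable cluster."""
    for request in reproduction_requests():
        with urllib.request.urlopen(request) as response:
            print(request.full_url, response.status)
```

After `run()` succeeds, the query from the question can be sent to `my_index` unchanged.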

For the provided query, ES creates the following Lucene query:

(((field1:matt)^8.0 | (field1:matthew)^8.0 | (field1:matty)^8.0 | (field2:matt)^2.0 | (field2:matthew)^2.0 | (field2:matty)^2.0)
((field1:and)^8.0 | (field2:and)^2.0) 
((field1:tom)^8.0 | (field1:thomas)^8.0 | (field1:thom)^8.0 | (field1:tommy)^8.0 | (field2:tom)^2.0 | (field2:thomas)^2.0 | (field2:thom)^2.0 | (field2:tommy)^2.0) 
((field1:oldfield)^8.0 | (field2:oldfield)^2.0))

Here the synonyms are expanded for each field.
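To make the shape of that query easier to read, here is a toy model of "expand every term into its synonyms, once per field". It is not how Elasticsearch or Lucene build the query internally. Lowercasing plus whitespace splitting merely stands in for the `standard` tokenizer and `lowercase` filter, and the names `expand`, `per_field_clause` and `render` are made up for this sketch.

```python
SYNONYM_RULES = ["matthew,matt,matty", "thomas,tom,thom,tommy"]
FIELD_BOOSTS = {"field1": 8.0, "field2": 2.0}


def expand(term, rules=SYNONYM_RULES):
    """Return the term followed by the other members of its synonym rule."""
    for rule in rules:
        members = rule.split(",")
        if term in members:
            return [term] + [m for m in members if m != term]
    return [term]  # no rule mentions this term


def per_field_clause(term, boosts=FIELD_BOOSTS):
    """Render one query term as a disjunction over every field and synonym."""
    parts = [
        f"({field}:{variant})^{boost}"
        for field, boost in boosts.items()
        for variant in expand(term)
    ]
    return "(" + " | ".join(parts) + ")"


def render(query_text):
    """One clause per query term, one line per clause."""
    terms = query_text.lower().split()
    return "\n".join(per_field_clause(term) for term in terms)


print(render("Matt And Tom Oldfield"))
```

Each printed line corresponds to one query term, and every synonym appears under both `field1` and `field2`. That per-field expansion is the behaviour the question was asking for.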