
Question

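I am currently working on migrating from SOLR v3 to Elasticsearch v5.11. My question is: how would I convert the following query string into an Elasticsearch match / match_phrase equivalent? Is this even possible?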

(entityName:(john AND lewis OR "john lewis") 
OR entityNameText:(john AND lewis OR "john lewis")) 
AND (status:("A" OR "I"))

This is what I have tried so far, covering only the first set of brackets, but it doesn't look right:

{
"bool": {
    "should": [
        [{
            "bool": {
                "should": [
                    [{
                        "match_phrase": {
                            "entityName": "john lewis"
                        }
                    }]
                ],
                "must": [
                    [{
                        "match": {
                            "entityName": {
                                "query": "john lewis",
                                "operator": "and"
                            }
                        }
                    }]
                ]
            }
        }, {
            "bool": {
                "should": [
                    [{
                        "match_phrase": {
                            "entityNameText": "john lewis"
                        }
                    }]
                ],
                "must": [
                    [{
                        "match": {
                            "entityNameText": {
                                "query": "john lewis",
                                "operator": "and"
                            }
                        }
                    }]
                ]
            }
        }]
    ]
}

}

Thanks

Update

entityName and entityNameText are both mapped as text types with custom analyzers for both indexing and search. status is mapped as a keyword type.

Answer 1

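Posting the answer for anyone interested in the future. I'm not entirely sure why, but I wrote two alternative queries using the ES Query DSL and found them equivalent to the original Lucene query, returning exactly the same results. Not sure whether that counts for or against the ES Query DSL.

Original Lucene query: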

{
"query": {
    "query_string" : {
        "query" : "entityName:(john AND Lewis OR \"john Lewis\") OR entityNameText:(john AND Lewis OR \"john Lewis\")"
    }
}

}

Query alternative 1:

{
"bool": {
    "should": [
        [{
            "bool": {
                "should": [
                    [{
                        "match": {
                            "entityName": {
                                "query": "john Lewis",
                                "operator": "and"
                            }
                        }
                    }, {
                        "match_phrase": {
                            "entityName": "john Lewis"
                        }
                    }]
                ]
            }
        }, {
            "bool": {
                "should": [
                    [{
                        "match": {
                            "entityNameText": {
                                "query": "john Lewis",
                                "operator": "and"
                            }
                        }
                    }, {
                        "match_phrase": {
                            "entityNameText": "john Lewis"
                        }
                    }]
                ]
            }
        }]
    ]
}
}
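The per-field clause in alternative 1 is repeated verbatim for each field, so it can be generated rather than hand-written. A minimal Python sketch, assuming the query body is built as a plain dict before being sent to Elasticsearch; the helper names `name_clause` and `names_query` are hypothetical and not part of any Elasticsearch client:

```python
import json


def name_clause(field, text):
    """Build 'all terms present OR exact phrase' for a single field."""
    return {
        "bool": {
            "should": [
                # every analyzed term of `text` must occur in the field
                {"match": {field: {"query": text, "operator": "and"}}},
                # or the terms must occur as a phrase
                {"match_phrase": {field: text}},
            ]
        }
    }


def names_query(fields, text):
    """OR the per-field clauses together, mirroring alternative 1."""
    return {"bool": {"should": [name_clause(f, text) for f in fields]}}


query = names_query(["entityName", "entityNameText"], "john Lewis")
print(json.dumps({"query": query}, indent=2))
```

Adding a third field then only means extending the list passed to `names_query`.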

Query alternative 2:

{
"bool": {
    "should": [
        [{
            "multi_match": {
                "query": "john Lewis",
                "type": "most_fields",
                "fields": ["entityName", "entityNameText"],
                "operator": "and"
            }
        }, {
            "multi_match": {
                "query": "john Lewis",
                "type": "phrase",
                "fields": ["entityName", "entityNameText"]
            }
        }]
    ]
}
}

Using this mapping:

{
"entity": {
    "dynamic_templates": [{
        "catch_all": {
            "match_mapping_type": "*",
            "mapping": {
                "type": "text",
                "store": true,
                "analyzer": "phonetic_index",
                "search_analyzer": "phonetic_query"
            }
        }
    }],
    "_all": {
        "enabled": false
    },
    "properties": {
        "entityName": {
            "type": "text",
            "store": true,
            "analyzer": "indexed_index",
            "search_analyzer": "indexed_query",
            "fields": {
                "entityNameLower": {
                    "type": "text",
                    "analyzer": "lowercase"
                },
                "entityNameText": {
                    "type": "text",
                    "store": true,
                    "analyzer": "text_index",
                    "search_analyzer": "text_query"
                },
                "entityNameNgram": {
                    "type": "text",
                    "analyzer": "ngram_index",
                    "search_analyzer": "ngram_query"
                },
                "entityNamePhonetic": {
                    "type": "text",
                    "analyzer": "ngram_index",
                    "search_analyzer": "ngram_query"
                }
            }
        },
        "status": {
            "type": "keyword",
            "norms": false,
            "store": true
        }
    }
}
}
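Neither alternative covers the status part of the original query. One way it could be added, sketched here in Python under the assumption that status holds the literal values "A" and "I": since the mapping above declares status as a keyword field, its values are not analyzed, so a terms clause has to match the stored case exactly. The function name `full_query` is hypothetical; the name clause reuses alternative 2.

```python
import json


def full_query(text, fields, statuses):
    """Name match (alternative 2) restricted to the given status values."""
    name_part = {
        "bool": {
            "should": [
                {"multi_match": {"query": text, "type": "most_fields",
                                 "fields": fields, "operator": "and"}},
                {"multi_match": {"query": text, "type": "phrase",
                                 "fields": fields}},
            ]
        }
    }
    return {
        "query": {
            "bool": {
                "must": [name_part],
                # filter clauses restrict results without affecting the score
                "filter": [{"terms": {"status": statuses}}],
            }
        }
    }


body = full_query("john Lewis", ["entityName", "entityNameText"], ["A", "I"])
print(json.dumps(body, indent=2))
```

Putting the status restriction in filter context rather than must is a design choice: it keeps relevance scoring driven by the name match only.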

Answer 2

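The answer depends on how you have specified your mappings, but I'm going to assume you have zero custom mappings.

Let's break down the different parts first, then put them all back together.

status:("A" OR "I")

This is a "terms" query; think of it as a SQL "IN" clause.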

  "terms": {
    "status": [
      "a",
      "i"
    ]
  }

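entityName:(john AND lewis OR "john lewis")

Elasticsearch breaks string fields apart into separate tokens. We can take advantage of this by using another "terms" query. We don't need to specify it as 3 different parts; ES will handle it.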

"terms": {
              "entityName": [
                "john",
                "lewis"
              ]
            }
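One caveat worth checking before relying on this: a terms query matches a document when the field contains any one of the listed terms, so on its own it behaves like john OR lewis rather than john AND lewis. The following Python sketch only simulates the two matching rules on token sets to show the difference; it does not call Elasticsearch, and the function names and sample documents are made up for illustration.

```python
def terms_matches(doc_tokens, terms):
    """Mimic a terms query: true if at least one listed term is present."""
    return any(t in doc_tokens for t in terms)


def match_and_matches(doc_tokens, query_tokens):
    """Mimic match with operator 'and': every query token must be present."""
    return all(t in doc_tokens for t in query_tokens)


# hypothetical documents, already lowercased and split into tokens
docs = {
    "john lewis": {"john", "lewis"},
    "john smith": {"john", "smith"},
    "lewis hamilton": {"lewis", "hamilton"},
}

for name, tokens in docs.items():
    print(name,
          terms_matches(tokens, ["john", "lewis"]),
          match_and_matches(tokens, ["john", "lewis"]))
```

If documents containing only one of the two names must be excluded, a match query with "operator": "and" is the closer equivalent of the original clause.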

entityNameText:(john AND lewis OR "john lewis")

Exactly the same logic as above, just searching a different field.

"terms": {
              "entityNameText": [
                "john",
                "lewis"
              ]
            }

AND vs OR

In ES queries, AND = "must" and OR = "should".
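A detail that is easy to miss: when should clauses sit in the same bool as a must clause, they are optional by default and only influence scoring. To make an OR group mandatory, it can be nested in its own bool or given "minimum_should_match": 1. A small Python sketch of that composition, with hypothetical helper names `all_of` and `any_of`:

```python
import json


def all_of(*clauses):
    """AND: every clause is required."""
    return {"bool": {"must": list(clauses)}}


def any_of(*clauses):
    """OR: at least one clause is required."""
    return {"bool": {"should": list(clauses), "minimum_should_match": 1}}


# status AND (all terms OR phrase), expressed by nesting the OR group
query = all_of(
    {"terms": {"status": ["a", "i"]}},
    any_of(
        {"match": {"entityName": {"query": "john lewis", "operator": "and"}}},
        {"match_phrase": {"entityName": "john lewis"}},
    ),
)
print(json.dumps({"query": query}, indent=2))
```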

Putting it all together

GET test1/type1/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "terms": {
            "status": [
              "a",
              "i"
            ]
          }
        },
        {
          "bool": {
            "should": [
              {
                "terms": {
                  "entityName": [
                    "john",
                    "lewis"
                  ]
                }
              },
              {
                "terms": {
                  "entityNameText": [
                    "john",
                    "lewis"
                  ]
                }
              }
            ]
          }
        }
      ]
    }
  }
}

Below is a link to the full setup I used to test the query.

https://gist.github.com/jayhilden/cf251cd751ef8dce7a57df1d03396778

How to convert a Lucene query string to an Elasticsearch Match / Match_Prefix equivalent
