Question

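I have more than 10 indices on my Elasticsearch server.

Each index has one or more fields with different types of analyzers: keyword, standard, ngram, etc.

For global search I am using multi_match without specifying any explicit fields.

For querying I am using the elasticsearch-dsl library, with the following code: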

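from elasticsearch_dsl import Search

def search_for_index(indice, term, num_of_result=10):
    # Sort by relevance score, highest first.
    s = Search(index=indice).sort({"_score": "desc"})
    s = s[:num_of_result]  # limit the number of hits returned
    # No explicit fields: multi_match searches all eligible fields.
    s = s.query('multi_match', query=term, operator='and')
    response = s.execute()
    return response.to_dict()['hits']['hits']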

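I get good results and the search works fine, but sometimes someone enters a longer text and I get a maxClauseCount error.

For example, the search raises the error when the search term is:

term=We are working on your request and will keep you posted at the earliest.

Any other longer text raises the same error.

Can you help me figure out a better way to do global search so that I can avoid this error?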

Answer 1

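First of all, this limit is there for a reason. The more boolean clauses you have, the heavier the search becomes. Think of it as intersecting (AND) or uniting (OR) the subsets of document IDs matched by each clause. That is a very heavy operation, which is why it is limited to 1024 clauses by default.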
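To see why a longer search phrase can cross that limit, a rough back-of-the-envelope estimate may help. The sketch below assumes, as a simplification, that every token produced by a field's analyzer becomes one clause for that field. It is not Lucene's exact accounting, and the field counts and n-gram sizes are made-up values for illustration.

```python
def ngram_count(word, min_gram, max_gram):
    """Number of character n-grams of length min_gram..max_gram in word."""
    return sum(max(len(word) - n + 1, 0) for n in range(min_gram, max_gram + 1))


def estimate_clauses(term, ngram_fields, plain_fields, min_gram=2, max_gram=3):
    """Rough clause estimate under the simplifying assumption that
    one token == one clause per field (NOT Lucene's exact accounting)."""
    words = term.split()
    # Fields with a standard-like analyzer: about one token per word.
    plain = len(words) * plain_fields
    # Fields with an ngram analyzer: many tokens per word.
    grams = sum(ngram_count(w, min_gram, max_gram) for w in words) * ngram_fields
    return plain + grams


term = "We are working on your request and will keep you posted at the earliest."
# Hypothetical setup: 30 plain text fields and 20 ngram fields across all indices.
print(estimate_clauses(term, ngram_fields=20, plain_fields=30))
```

Under this model the estimate grows with both the number of searched fields and the length of the text, which matches the observation that only longer inputs trigger the error.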

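The general recommendation is to reduce the number of fields you search on. You may have fields that do not contain text data, or that only hold internal IDs. You can exclude them by explicitly specifying the fields section in the multi_match query.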
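Restricting the query to a handful of text fields could look roughly like this with elasticsearch-dsl. The field names (`title`, `description`, `comment`) are hypothetical placeholders, since the question does not list the actual mappings; replace them with the text fields that matter for your search.

```python
# Hypothetical field names; substitute the text fields from your own mappings.
SEARCH_FIELDS = ["title", "description", "comment"]


def multi_match_params(term, fields=SEARCH_FIELDS):
    """Build the multi_match parameters with an explicit `fields` list."""
    return {"query": term, "fields": list(fields), "operator": "and"}


def search_for_index(indice, term, num_of_result=10):
    # Imported here so the helper above can be used without the library installed.
    from elasticsearch_dsl import Search

    s = Search(index=indice).sort({"_score": "desc"})[:num_of_result]
    # Only the listed fields are passed to multi_match, not every field in the index.
    s = s.query("multi_match", **multi_match_params(term))
    return s.execute().to_dict()["hits"]["hits"]
```

Keeping the parameters in a small helper makes it easy to inspect the query body, or to maintain a different field list per index, before anything is sent to the cluster.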

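If you still decide to keep the current approach and you are using Elasticsearch 5.5 or later, you can raise the limit by adding the following line to elasticsearch.yml and restarting the instance.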

indices.query.bool.max_clause_count: 250000

If you are using a pre-5 version of Elasticsearch, the setting is called index.query.bool.max_clause_count.

How to overcome the maxClauseCount error when using a multi_match query
