
I use the following request to get the term frequencies for a document.

POST myindex/mydoc/1/_termvectors?fields=fields.bodyText&pretty=true
{
    "term_statistics": true,
    "filter": {
        "max_doc_freq": 300,
        "min_doc_freq": 50
    }
}

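My index contains 1 million documents. How can I run these statistics for every document more efficiently? By "efficiently" I mean, for example: the word "the" in doc 1 may also appear in doc 2, so when I run the statistics for doc 2 there should be no need to compute "the" again (assuming my index is not updated between requests).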
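To make the reuse idea concrete, here is a minimal client-side sketch in Python. It assumes the multi term vectors endpoint (`_mtermvectors`) accepts a list of `ids` plus shared `parameters`, and that each returned document has the same `term_vectors` layout as a single `_termvectors` response; both should be checked against the Elasticsearch version in use. The helper names `batched`, `mtermvectors_body` and `update_cache` are hypothetical, and the sketch only avoids storing index-wide statistics twice on the client; it does not show that the server does less work.

```python
from typing import Dict, Iterable, Iterator, List

FIELD = "fields.bodyText"  # field name taken from the request above


def batched(ids: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield lists of at most `size` document ids."""
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def mtermvectors_body(ids: List[str]) -> dict:
    """Build one request body covering several documents (assumed shape)."""
    return {
        "ids": ids,
        "parameters": {
            "fields": [FIELD],
            "term_statistics": True,
            "filter": {"max_doc_freq": 300, "min_doc_freq": 50},
        },
    }


def update_cache(cache: Dict[str, dict], doc: dict) -> Dict[str, int]:
    """Store index-wide stats once per term; return this doc's term_freq map."""
    terms = doc.get("term_vectors", {}).get(FIELD, {}).get("terms", {})
    per_doc: Dict[str, int] = {}
    for term, info in terms.items():
        per_doc[term] = info["term_freq"]  # per-document value, always kept
        # doc_freq and ttf are index-wide, so the first value seen is reused
        cache.setdefault(term, {"doc_freq": info["doc_freq"], "ttf": info["ttf"]})
    return per_doc
```

Each body produced by `mtermvectors_body` would be posted to `myindex/mydoc/_mtermvectors`, and every entry of the response's `docs` array passed to `update_cache`, so 1 million single requests become a much smaller number of batched ones.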

How can I run term statistics efficiently in Elasticsearch?
