Question

Here is the sample data:

In the type blog_comments, I have some comment data with the following structure:

{"blog_id": 1, "comments": "Apple", "comment_id": 1}

For blogs #1 and #2, the type blog_comments holds 6 comments in total:

{"blog_id": 1, "comments": "Apple", "comment_id": 1}
{"blog_id": 1, "comments": "Orange", "comment_id": 2}
{"blog_id": 1, "comments": "Fruit", "comment_id": 3}
{"blog_id": 2, "comments": "Apple", "comment_id": 1}
{"blog_id": 2, "comments": "Orange", "comment_id": 2}
{"blog_id": 2, "comments": "Earth", "comment_id": 3}

Question: Is it possible, using some "magic" query, to get #1 as the result when I search for "Apple Fruit" and #2 when I search for "Apple Earth"?

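I am considering joining all the comments of each blog into a new record (a new type) and then searching on that new type. But there are too many comments (about 12,000,000), and they are already indexed in Elasticsearch, so it would be best to reuse the existing data as much as possible.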

Answer 1

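Ideally, you need to change the mapping of your index so that you can search all the comments of one blog post together. You cannot really search documents and say that a specific blog ID (a field in the document) matches across multiple documents at the same time. Elasticsearch knows how to match multiple fields within the same document, not across multiple documents.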

There is a workaround, though. It depends on what else you need from this query beyond getting back JUST the blog IDs.

GET /blog/entries/_search?search_type=count
{
  "query": {
    "match": {
      "comments": "Apple Earth"
    }
  },
  "aggs": {
    "unique": {
      "terms": {
        "field": "blog_id",
        "min_doc_count": 2
      }
    }
  }
}

The query above returns something like this:

"aggregations": {
  "unique": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      {
        "key": 2,
        "doc_count": 2
      }
    ]
  }
}

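The idea of the query is to return just the blog_id ("key": 2 under buckets), which is why you see an aggregation of type terms. Depending on how many terms you search for (Apple Earth counts as two terms), you set min_doc_count to the number of terms. This means you want apple earth to match in at least two documents. The difference between your example and what this actually does is that it also matches documents that have, for example, apple earth in a single comment, not only apple in one document and earth in another.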
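To make the counting logic concrete, here is a minimal Python sketch that imitates the workaround on the six sample comments from the question. It does not call Elasticsearch; the function name blogs_matching and the simple whitespace tokenizer are assumptions made for illustration only.

```python
from collections import Counter

# Sample comments copied from the question.
COMMENTS = [
    {"blog_id": 1, "comments": "Apple", "comment_id": 1},
    {"blog_id": 1, "comments": "Orange", "comment_id": 2},
    {"blog_id": 1, "comments": "Fruit", "comment_id": 3},
    {"blog_id": 2, "comments": "Apple", "comment_id": 1},
    {"blog_id": 2, "comments": "Orange", "comment_id": 2},
    {"blog_id": 2, "comments": "Earth", "comment_id": 3},
]


def blogs_matching(query, comments):
    """Imitate a match query (OR over terms) followed by a terms
    aggregation on blog_id with min_doc_count = number of query terms."""
    # Hypothetical tokenizer: lowercase and split on whitespace.
    terms = set(query.lower().split())
    # A comment is a hit if it contains at least one query term.
    hits = [c for c in comments if terms & set(c["comments"].lower().split())]
    # Count hits per blog, like the buckets of a terms aggregation.
    counts = Counter(c["blog_id"] for c in hits)
    # Keep only blogs with at least as many hits as query terms.
    return sorted(blog for blog, n in counts.items() if n >= len(terms))


print(blogs_matching("Apple Fruit", COMMENTS))  # [1]
print(blogs_matching("Apple Earth", COMMENTS))  # [2]
```

Note that the filter counts matching comments, not distinct terms, so in this sketch a blog with two comments that both say "Apple" would also pass for a two-term query.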

But, as I said, ideally you need to change the mapping of your index.
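As a rough illustration of that ideal, assume the remapping means one merged record per blog, as the question itself proposes. The Python sketch below models that shape in memory (it is not an Elasticsearch mapping, and merge_comments and blogs_with_all_terms are hypothetical names): once all comments of a blog live in one record, requiring every search term becomes a simple per-record check.

```python
from collections import defaultdict

# Sample comments copied from the question.
COMMENTS = [
    {"blog_id": 1, "comments": "Apple", "comment_id": 1},
    {"blog_id": 1, "comments": "Orange", "comment_id": 2},
    {"blog_id": 1, "comments": "Fruit", "comment_id": 3},
    {"blog_id": 2, "comments": "Apple", "comment_id": 1},
    {"blog_id": 2, "comments": "Orange", "comment_id": 2},
    {"blog_id": 2, "comments": "Earth", "comment_id": 3},
]


def merge_comments(comments):
    """Build one record per blog holding the words of all its comments."""
    merged = defaultdict(set)
    for c in comments:
        # Hypothetical tokenizer: lowercase and split on whitespace.
        merged[c["blog_id"]].update(c["comments"].lower().split())
    return dict(merged)


def blogs_with_all_terms(query, merged):
    """Return blog ids whose merged record contains every query term."""
    terms = set(query.lower().split())
    return sorted(blog for blog, words in merged.items() if terms <= words)


merged = merge_comments(COMMENTS)
print(blogs_with_all_terms("Apple Fruit", merged))  # [1]
print(blogs_with_all_terms("Apple Earth", merged))  # [2]
```

Unlike the min_doc_count workaround, this check requires every distinct term to be present, so repeated comments containing the same word cannot produce a false match.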

How to search for some words across group-by results using Elasticsearch?
