Question

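I am trying to get all documents that share the same top-most ancestor, where a child can itself be the parent, grandparent, great-grandparent, etc. of multiple documents.

So, let's say I have a structure like this (borrowed from https://www.elastic.co/guide/en/elasticsearch/reference/5.6/parent-join.html):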

   (parent)
   question
    /    \
   /      \
comment  answer
(child)  (child)

In code:

PUT my_index
{
  "settings": {
    "mapping.single_type": true
  },
  "mappings": {
    "doc": {
      "properties": {
        "my_join_field": {
          "type": "join",
          "relations": {
            "question": ["answer", "comment"]
          }
        }
      }
    }
  }
}

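However, people can in theory keep answering comments and commenting on answers indefinitely. So say I have a question structured as follows: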

                               (id: 1)
                               question
                              /        \
                             /          \
                        answer          answer
                       (id: 5)          (id: 8)
                       /     \              |
                      /       \             |
                   comment  answer        answer
                  (id: 15) (id: 12)      (id: 9)
                  /    \        |          /   \  
                 /      \       |         /     \ 
              answer  answer  comment  answer  answer 
             (id: 16)(id: 17) (id: 19) (id: 10)(id: 11)

How do I get all the documents (ids 1, 5, 8, 9, 10, 11, 12, 15, 16, 17, 19), knowing only id 9?

Answer 1

Here is an excerpt from the Elasticsearch documentation:

Four common techniques are used to manage relational data in Elasticsearch:

Application-side joins
Data denormalization
Nested objects
Parent/child relationships 

Often the final solution will require a mixture of a few of these techniques.

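As Val suggested, you can implement application-side joins by introducing two fields: "top_most_ansestor" and "parent". This is a very reasonable and simple solution, since it does not require an Elasticsearch join field.

However, you may want to combine techniques.

If you do want to use a join field, you could define the top-most ancestor as the parent of all children, grandchildren, and so on, and maintain the tree hierarchy in your application. From Elasticsearch's point of view, you would have a shallow and wide tree (a single parent with many leaves):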
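A minimal sketch of the two-field approach, written as Python functions that build the search request bodies. It assumes that every document, the root included, stores the root id in "top_most_ansestor" (the root pointing at itself); that convention, the function names, and the `size` value are illustrative assumptions, not something taken from Val's suggestion.

```python
# Field names follow the answer: "top_most_ansestor" and "parent".
# Assumption: the root document stores its own id in "top_most_ansestor".

def lookup_body(doc_id):
    """Step 1: fetch the known document to read its top-most ancestor."""
    return {"query": {"ids": {"values": [str(doc_id)]}}}


def family_body(root_id, size=1000):
    """Step 2: fetch every document that shares that top-most ancestor.

    `size` is an arbitrary illustrative limit; the default page size
    would return only the first 10 hits.
    """
    return {
        "size": size,
        "query": {"term": {"top_most_ansestor": root_id}},
    }
```

Knowing only id 9, the application would send `lookup_body(9)`, read `top_most_ansestor` from the hit (1 in the example tree), and then send `family_body(1)`.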

question(id 1): [ids 1, 5, 8, 9, 10, 11, 12, 15, 16, 17, 19]
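The shallow layout could be sketched as follows, again as Python helpers that only build request bodies. The join field name and relation names come from the mapping in the question; the application-level "parent" field, the helper names, and the `size` value are assumptions for illustration. Note that with a join field every descendant has to be indexed with `routing` set to the root id so that it lands on the same shard as its parent.

```python
CHILD_RELATIONS = ("answer", "comment")  # child names from the question's mapping


def child_doc(relation, root_id, parent_id, text):
    """Body for a descendant: the join parent is always the root question."""
    return {
        "my_join_field": {"name": relation, "parent": str(root_id)},
        "parent": parent_id,  # application-level parent (hypothetical field)
        "text": text,
    }


def whole_tree_body(root_id, size=1000):
    """Match the root itself plus every child joined to it, in one search."""
    root = str(root_id)
    should = [{"ids": {"values": [root]}}]
    should += [{"parent_id": {"type": rel, "id": root}} for rel in CHILD_RELATIONS]
    return {"size": size, "query": {"bool": {"should": should}}}
```

Here `child_doc("answer", 1, 8, "some text")` would be the body for document 9, indexed with `routing=1`, and `whole_tree_body(1)` asks for the question and all of its joined descendants.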

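The whole tree is then retrieved with a single request. Your application will view the documents differently, as described above: as a deep tree. For example, for document 9 you would have: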

_id:9 {"parent":8,"text":"some text", "type":"answer"}
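Rebuilding the deep tree on the application side from the flat hits might look like the sketch below. It assumes each hit has been reduced to a dict with "_id" and "parent" keys (the root having no parent); the function names are hypothetical, and the (id, parent) pairs are copied from the example tree in the question.

```python
from collections import defaultdict


def build_tree(docs):
    """Group flat documents by their application-level "parent" field."""
    children = defaultdict(list)
    for doc in docs:
        children[doc.get("parent")].append(doc["_id"])
    return children


def descendants(children, root_id):
    """Depth-first list of root_id and everything below it."""
    out, stack = [], [root_id]
    while stack:
        node = stack.pop()
        out.append(node)
        stack.extend(children.get(node, []))
    return out


# (id, parent) pairs copied from the question's example tree
PAIRS = [(1, None), (5, 1), (8, 1), (15, 5), (12, 5), (9, 8),
         (16, 15), (17, 15), (19, 12), (10, 9), (11, 9)]
docs = [{"_id": i, "parent": p} for i, p in PAIRS]
tree = build_tree(docs)
```

With this, `descendants(tree, 1)` walks the whole question thread, while `descendants(tree, 9)` walks only the subtree under document 9.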

Which technique you should use depends on your other requirements and your preferences. The simpler, the better.

elasticsearch: get all documents that share the same top-most ancestor
