Question

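I'm trying out Elasticsearch for a project I'm working on, but I'm stuck on how to combine two types of documents.

For example, suppose I have 10 documents describing hotel availability and 10 documents describing flights to the destinations where those hotels are located.

In MySQL I would normally do a join on things like the date, the length of the hotel stay, and the flight.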

How would I return the hotel documents that go with the 10 cheapest flight prices?

Answer 1

The closest thing I can think of to what you want is Composite Aggregations. It isn't a true join, but it can get you close to what you need.

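Caveats:

The field must have the same name in every index
You will have to post-process the aggregated results
All result fields (the ones you care about) are aggregations of some kind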

Here is a minimal example (hacked together in the Kibana Console):

With the documents:

POST my-test1/_doc/_bulk
{"index": {}}
{"entityID":"entity1", "value": 12}
{"index": {}}
{"entityID":"entity1", "value": 22}
{"index": {}}
{"entityID":"entity2", "value": 2}
{"index": {}}
{"entityID":"entity2", "value": 12}


POST my-test2/_doc/_bulk
{"index": {}}
{"entityID":"entity1", "otherValue": 5}
{"index": {}}
{"entityID":"entity1", "otherValue": 1}
{"index": {}}
{"entityID":"entity2", "otherValue": 3}
{"index": {}}
{"entityID":"entity2", "otherValue": 7}

We will aggregate around the common entity field entityID:

GET my-test*/_search
{
  "size": 0,
  "aggs": {
    "by-entity": {
      "composite": {
        "sources": [
          {
            "entityID": {
              "terms": {
                "field": "entityID.keyword"
              }
            }
          }
        ]
      },
      "aggs": {
        "value": {
          "avg": {
            "field": "value"
          }
        },
        "otherValue": {
          "avg": {
            "field": "otherValue"
          }
        }
      }
    }
  }
}

This results in the response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 10,
    "successful" : 10,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 8,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "by-entity" : {
      "after_key" : {
        "entityID" : "entity2"
      },
      "buckets" : [
        {
          "key" : {
            "entityID" : "entity1"
          },
          "doc_count" : 4,
          "otherValue" : {
            "value" : 3.0
          },
          "value" : {
            "value" : 17.0
          }
        },
        {
          "key" : {
            "entityID" : "entity2"
          },
          "doc_count" : 4,
          "otherValue" : {
            "value" : 5.0
          },
          "value" : {
            "value" : 7.0
          }
        }
      ]
    }
  }
}
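The "post-process the aggregated results" caveat can be sketched in Python as follows. This is only an illustration: flatten_buckets is a hypothetical helper name, and the response dict is a trimmed copy of the one shown above rather than a live call to a cluster.

```python
def flatten_buckets(response, agg_name="by-entity"):
    """Turn each composite bucket into one flat dict (a pseudo-joined row)."""
    rows = []
    for bucket in response["aggregations"][agg_name]["buckets"]:
        # Start from the composite key, e.g. {"entityID": "entity1"}.
        row = dict(bucket["key"])
        for name, body in bucket.items():
            if name == "key":
                continue
            # Single-value metric aggregations (avg, min, ...) look like {"value": x}.
            if isinstance(body, dict) and "value" in body:
                row[name] = body["value"]
        rows.append(row)
    return rows


# Trimmed copy of the response above (assumption: only the parts we need).
response = {
    "aggregations": {
        "by-entity": {
            "after_key": {"entityID": "entity2"},
            "buckets": [
                {"key": {"entityID": "entity1"}, "doc_count": 4,
                 "otherValue": {"value": 3.0}, "value": {"value": 17.0}},
                {"key": {"entityID": "entity2"}, "doc_count": 4,
                 "otherValue": {"value": 5.0}, "value": {"value": 7.0}},
            ],
        }
    }
}

rows = flatten_buckets(response)
for row in rows:
    print(row)
```

Each row then carries the value from my-test1 and the otherValue from my-test2 side by side, which is as close to a joined record as this approach gets.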

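You can build a composite aggregation around many different fields and different bucket aggregations. For example, you could create a terms aggregation on hotel_id and combine it with a date_histogram on timestamp.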
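As a rough sketch, a request body for that hotel_id plus timestamp combination could be built like this. Several things here are assumptions and not from the answer: the helper name hotel_by_day_request, the aggregation name by-hotel-day, the one-day interval, and hotel_id being mapped as a keyword (otherwise hotel_id.keyword would be needed, as with entityID above). The "interval" parameter matches the older Elasticsearch version the response above appears to come from; newer versions expect "calendar_interval" or "fixed_interval" instead.

```python
import json


def hotel_by_day_request(interval="1d", page_size=100, after_key=None):
    """Build a composite aggregation keyed on (hotel_id, day)."""
    composite = {
        "size": page_size,
        "sources": [
            # One bucket per hotel ...
            {"hotel_id": {"terms": {"field": "hotel_id"}}},
            # ... split further by day of the timestamp field.
            {"day": {"date_histogram": {"field": "timestamp",
                                        "interval": interval}}},
        ],
    }
    if after_key is not None:
        # Pass the previous response's after_key to fetch the next page.
        composite["after"] = after_key
    return {"size": 0, "aggs": {"by-hotel-day": {"composite": composite}}}


body = hotel_by_day_request()
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted into the Kibana Console as the body of a GET .../_search request, with sub-aggregations added under "by-hotel-day" in the same way as in the example above.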

Answer 2

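Elasticsearch doesn't have cross-index joins (like most document databases). If you have to do this in ES, it is usually done by denormalizing the data at index time. If you can't do that, you will have to run multiple queries.

If you need relational queries, you are better off using a relational database such as MySQL or Postgres.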
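The "multiple queries" route amounts to joining in application code. Below is a minimal Python sketch of that idea, assuming the hits of a hotel query and a flight query have already been fetched into plain lists of dicts. The field names (hotel_id, destination, date, price) and the sample data are hypothetical; the question does not say what the documents look like.

```python
def cheapest_hotels_with_flights(hotels, flights, limit=10):
    """Join hotels to their cheapest matching flight, cheapest first."""
    # Keep only the cheapest flight per (destination, date).
    cheapest = {}
    for flight in flights:
        key = (flight["destination"], flight["date"])
        if key not in cheapest or flight["price"] < cheapest[key]["price"]:
            cheapest[key] = flight

    # Inner join: hotels without a matching flight are dropped.
    joined = []
    for hotel in hotels:
        flight = cheapest.get((hotel["destination"], hotel["date"]))
        if flight is not None:
            joined.append({**hotel, "flight_price": flight["price"]})

    joined.sort(key=lambda row: row["flight_price"])
    return joined[:limit]


# Hypothetical results of two separate searches.
hotels = [
    {"hotel_id": "h1", "destination": "LIS", "date": "2024-05-01"},
    {"hotel_id": "h2", "destination": "ROM", "date": "2024-05-01"},
    {"hotel_id": "h3", "destination": "OSL", "date": "2024-05-01"},
]
flights = [
    {"destination": "LIS", "date": "2024-05-01", "price": 120},
    {"destination": "LIS", "date": "2024-05-01", "price": 90},
    {"destination": "ROM", "date": "2024-05-01", "price": 60},
]

result = cheapest_hotels_with_flights(hotels, flights)
for row in result:
    print(row["hotel_id"], row["flight_price"])
```

This only works comfortably when both result sets are small enough to hold in memory; for larger data, denormalizing at index time avoids the second query altogether.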

Elasticsearch: how do I join data between two types?
