Question

I have the following index:

+-----+-----+-------+
| oid | tag | value |
+-----+-----+-------+
|  1  | t1  |  aaa  |
|  1  | t2  |  bbb  |
|  2  | t1  |  aaa  |
|  2  | t2  |  ddd  |
|  2  | t3  |  eee  |
+-----+-----+-------+

Where: oid is the object ID, tag is the property name, and value is the property value.

Mapping:

"mappings": {
    "document": {
        "_all": { "enabled": false },
        "properties": {
            "oid": { "type": "integer" },
            "tag": { "type": "text" },
            "value": { "type": "text" }
        }
    }
}

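This simple structure can store any number of properties per object, and searching by one or more properties with the OR logical operator is straightforward. For example, to get the oid of the objects where: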

(tag='t1' AND value='aaa') OR (tag='t2' AND value='ddd')

ES query:

{
  "_source": { "includes":["oid"] },
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              { "term": { "tag": "t1" } },
              { "term": { "value": "aaa" } }
            ]
          }
        },
        {
          "bool": {
            "must": [
              { "term": { "tag": "t2" } },
              { "term": { "value": "ddd" } }
            ]
          }
        }
      ],
      "minimum_should_match": "1"
    }
  }
}

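But searching by two or more properties with the AND logical operator is hard. So the question is how to join two subqueries that match two different records with an AND operator. For example, to get the oid of the objects where: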

(tag='t1' AND value='aaa') AND (tag='t2' AND value='ddd')

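In this case the result must be: {"oid": "2"}

The data being searched for is spread over two different records, so replacing SHOULD with MUST in the previous example returns nothing.

Here are two SQL statements that each express what I need: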
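
The underlying reason is that every row is indexed as a separate document, and a bool/must clause is evaluated against one document at a time. A small Python sketch over the sample table makes the difference visible. It uses plain in-memory tuples rather than an Elasticsearch call, and the helper name `matches` is purely illustrative:

```python
# Sample rows from the table above: (oid, tag, value).
rows = [
    (1, "t1", "aaa"), (1, "t2", "bbb"),
    (2, "t1", "aaa"), (2, "t2", "ddd"), (2, "t3", "eee"),
]


def matches(row, tag, value):
    """True if this single row carries the given tag/value pair."""
    return row[1] == tag and row[2] == value


# AND evaluated per row (what MUST does): no single row carries both pairs.
per_row = [r[0] for r in rows
           if matches(r, "t1", "aaa") and matches(r, "t2", "ddd")]

# AND evaluated per oid (what is actually wanted): intersect the oid sets.
per_oid = ({r[0] for r in rows if matches(r, "t1", "aaa")}
           & {r[0] for r in rows if matches(r, "t2", "ddd")})

print(per_row)  # empty list
print(per_oid)  # only oid 2 remains
```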

SELECT i1.[oid]
FROM [index] i1 INNER JOIN [index] i2 ON i1.oid = i2.oid
WHERE
    (i1.tag='t1' AND i1.value='aaa')
    AND
    (i2.tag='t2' AND i2.value='ddd')

---------

SELECT [oid] FROM [index] WHERE tag='t1' AND value='aaa'
INTERSECT   
SELECT [oid] FROM [index] WHERE tag='t2' AND value='ddd'

Executing the two requests separately and merging the results on the client is not an option.

The Elasticsearch version is 6.1.1.

Answer 1

To achieve what you want, you need to use the nested type, i.e. your mapping should look like this:

PUT my-index
{
  "mappings": {
    "doc": {
      "properties": {
        "oid": {
          "type": "keyword"
        },
        "data": {
          "type": "nested",
          "properties": {
            "tag": {
              "type": "keyword"
            },
            "value": {
              "type": "text"
            }
          }
        }
      }
    }
  }
}

The documents would then be indexed like this:

PUT /my-index/doc/_bulk
{ "index": {"_id": 1}}
{ "oid": 1, "data": [ {"tag": "t1", "value": "aaa"}, {"tag": "t2", "value": "bbb"}] }
{ "index": {"_id": 2}}
{ "oid": 2, "data": [ {"tag": "t1", "value": "aaa"}, {"tag": "t2", "value": "ddd"}, {"tag": "t3", "value": "eee"}] }
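
If the data currently lives in the flat one-row-per-property layout from the question, it has to be regrouped by oid before it can be bulk-indexed this way. A minimal Python sketch of that regrouping follows, assuming the rows arrive as (oid, tag, value) tuples. The function names `rows_to_nested_docs` and `to_bulk_body` are illustrative and not part of any Elasticsearch client; the sketch only builds the request body and does not send it.

```python
import json
from collections import OrderedDict


def rows_to_nested_docs(rows):
    """Group flat (oid, tag, value) rows into one document per oid."""
    docs = OrderedDict()
    for oid, tag, value in rows:
        doc = docs.setdefault(oid, {"oid": oid, "data": []})
        doc["data"].append({"tag": tag, "value": value})
    return list(docs.values())


def to_bulk_body(docs):
    """Render documents as newline-delimited JSON for the _bulk API."""
    lines = []
    for doc in docs:
        # Action line, using the oid as the document _id (an assumption
        # that mirrors the bulk request above).
        lines.append(json.dumps({"index": {"_id": doc["oid"]}}))
        lines.append(json.dumps(doc))
    # A bulk body must end with a trailing newline.
    return "\n".join(lines) + "\n"


rows = [
    (1, "t1", "aaa"), (1, "t2", "bbb"),
    (2, "t1", "aaa"), (2, "t2", "ddd"), (2, "t3", "eee"),
]
docs = rows_to_nested_docs(rows)
bulk_body = to_bulk_body(docs)
print(bulk_body)
```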

Then you can make your query work like this:

POST my-index/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "data",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "data.tag": "t1"
                    }
                  },
                  {
                    "term": {
                      "data.value": "aaa"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "data",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "data.tag": "t2"
                    }
                  },
                  {
                    "term": {
                      "data.value": "ddd"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  }
}
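
The query above repeats one nested clause per tag/value pair, so it can be generated for any number of pairs. The following Python sketch builds the same request body as a plain dict; `nested_pair_clause` and `build_and_query` are hypothetical helper names, and the field path `data` is taken from the mapping above.

```python
def nested_pair_clause(tag, value, path="data"):
    """One nested clause: tag and value must match within the same nested object."""
    return {
        "nested": {
            "path": path,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {path + ".tag": tag}},
                        {"term": {path + ".value": value}},
                    ]
                }
            },
        }
    }


def build_and_query(pairs, path="data"):
    """AND together one nested clause per (tag, value) pair."""
    clauses = [nested_pair_clause(tag, value, path) for tag, value in pairs]
    return {"query": {"bool": {"filter": clauses}}}


query = build_and_query([("t1", "aaa"), ("t2", "ddd")])
```

The resulting dict can be serialized with `json.dumps` and sent as the body of `POST my-index/_search`.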

Answer 2

There may be a way, although it is a bit ugly: add a terms aggregation to the query body.

{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must": [
              { "term": { "tag": "t1" } },
              { "term": { "value": "aaa" } }
            ]
          }
        },
        {
          "bool": {
            "must": [
              { "term": { "tag": "t2" } },
              { "term": { "value": "ddd" } }
            ]
          }
        }
      ],
      "minimum_should_match": "1"
    }
  },
  "size": 0,
  "aggs": {
      "find_joined_oid": {
          "terms": {
              "field": "oid.keyword"
          }
      }
  }
}

If all goes well, this will output something like:

{
  "took": 123,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 123,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "find_joined_oid": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "1",
          "doc_count": 1
        },
        {
          "key": "2",
          "doc_count": 2
        }
      ]
    }
  }
}

Here, in the "aggregations" section, "key": "1" refers to your "oid": "1", and "doc_count": 1 means that 1 document matched by the query has "oid": "1".

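Since you know how many tags you are querying for (say N), only those keys in the aggregation result whose "doc_count" equals N are the results you are after. In this example you are querying for tag: t1 (with value aaa) and tag: t2 (with value ddd), so N = 2. You can iterate over the list of buckets in the result to find the keys whose "doc_count" equals 2.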
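
That post-processing step can be sketched in a few lines of Python. The function name `oids_matching_all` is illustrative, and the response dict below is trimmed to the fields the function reads. The sketch assumes each (oid, tag, value) row is indexed only once, so a count of N cannot come from duplicates. Note also that a terms aggregation returns only the top 10 buckets unless its "size" parameter is raised.

```python
def oids_matching_all(response, n_pairs, agg_name="find_joined_oid"):
    """Return the bucket keys whose doc_count equals the number of queried pairs."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return [b["key"] for b in buckets if b["doc_count"] == n_pairs]


# Trimmed copy of the sample response above.
response = {
    "aggregations": {
        "find_joined_oid": {
            "buckets": [
                {"key": "1", "doc_count": 1},
                {"key": "2", "doc_count": 2},
            ]
        }
    }
}

matching = oids_matching_all(response, n_pairs=2)
print(matching)
```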

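However, there should be a better way. If you change the mapping to a more document-like style, i.e. store all the fields of one oid in a single doc, life becomes much easier.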

{
    "properties": {
        "oid": { "type": "integer" },
        "tag-1": { "type": "text" },
        "value-1": { "type": "text" },
        "tag-2": { "type": "text" },
        "value-2": { "type": "text" }
    }
}

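If you want to add a new tag-value pair, just fetch the original document for that oid, put the new pair into it, and write the whole new document back with the same _id you got from the original. In most cases dynamic mapping will work fine here, which means you do not need to declare the mapping for the new fields explicitly.

NoSQL databases like Elasticsearch are not designed to handle the SQL-style queries you are asking for.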
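
The modify step of that fetch, modify, and write-back cycle might look like the following Python sketch. It assumes the numbered `tag-N` / `value-N` field naming from the mapping above and operates on a plain dict; fetching the document and writing it back are left out, and `add_tag_pair` is a hypothetical helper name.

```python
import re


def add_tag_pair(doc, tag, value):
    """Return a copy of doc with the pair stored in the next free tag-N/value-N slot."""
    used = [
        int(m.group(1))
        for m in (re.fullmatch(r"tag-(\d+)", key) for key in doc)
        if m
    ]
    n = max(used, default=0) + 1
    updated = dict(doc)  # leave the fetched document untouched
    updated["tag-%d" % n] = tag
    updated["value-%d" % n] = value
    return updated


original = {"oid": 2, "tag-1": "t1", "value-1": "aaa",
            "tag-2": "t2", "value-2": "ddd"}
updated = add_tag_pair(original, "t3", "eee")
print(updated)
```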

ES6: Joining subqueries on two different rows with the AND operator