Question

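I'm new to ElasticSearch and need help with the following problem:

I have a set of documents, each containing multiple products. I want to filter on the product attribute product_brand with the value "Apple" and get the number of products that match the filter. The results should be grouped by a document ID that is also part of the document itself (test_id).

Example document: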

{
   "test" : {
      "test_id" : 19988,
      "test_name" : "Test"
   },
   "products" : [
      {
         "product_id" : 1,
         "product_brand" : "Apple"
      },
      {
         "product_id" : 2,
         "product_brand" : "Apple"
      },
      {
         "product_id" : 3,
         "product_brand" : "Samsung"
      }
   ]
}

The result should be:

{
   "key" : 19988,
   "count" : 2
}

In SQL it would look roughly like this:

SELECT test_id, COUNT(product_id) 
FROM `test` 
WHERE product_brand = 'Apple'
GROUP BY test_id;

How can I achieve this?

Answer 1

I think this should get you pretty close:

GET /test/_search
{
  "_source": {
    "includes": [
      "test.test_id",
      "_score"
    ]
  },
  "query": {
    "function_score": {
      "query": {
        "match": {
          "products.product_brand.keyword": "Apple"
        }
      },
      "functions": [
        {
          "script_score": {
            "script": {
              "source": "def matches=0; def products = params['_source']['products']; for(p in products){if(p.product_brand == params['brand']){matches++;}} return matches;",
              "params": {
                "brand": "Apple"
              }
            }
          }
        }
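      ],
      "boost_mode": "replace"
    }
  }
}

This approach uses function_score, but you could also apply the same script in a script field if you want to score differently. Setting boost_mode to replace makes _score equal to the script's match count; with the default (multiply), the count would be multiplied by the relevance score of the inner query. The query above only matches documents that have a child product object whose brand text is set to exactly "Apple".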
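For readers less familiar with Painless, the script's loop can be mirrored in plain Python. This is only an illustration of the counting logic, not code that ElasticSearch runs; the function name `count_brand_matches` and the `source` dict (copied from the question's example document) are assumptions made for this sketch.

```python
def count_brand_matches(source, brand):
    """Mirror of the Painless script: count products whose brand equals `brand`."""
    matches = 0
    for product in source.get("products", []):
        # Exact, case-sensitive string comparison, as in the script.
        if product.get("product_brand") == brand:
            matches += 1
    return matches


# Sample document taken from the question.
source = {
    "test": {"test_id": 19988, "test_name": "Test"},
    "products": [
        {"product_id": 1, "product_brand": "Apple"},
        {"product_id": 2, "product_brand": "Apple"},
        {"product_id": 3, "product_brand": "Samsung"},
    ],
}

print(count_brand_matches(source, "Apple"))  # 2
print(count_brand_matches(source, "apple"))  # 0
```

The second call shows that a differently cased brand is not counted, which is why the value passed to the script has to match the stored brand text exactly.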

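You just need to make sure the same brand value is supplied in both places where "Apple" appears. Alternatively, you could match everything in the function_score query and look only at the score. Your output might look like this: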

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 2,
    "hits": [
      {
        "_index": "test",
        "_type": "doc",
        "_id": "AV99vrBpgkgblFY6zscA",
        "_score": 2,
        "_source": {
          "test": {
            "test_id": 19988
          }
        }
      }
    ]
  }
}
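Because the brand appears twice in the request (once in the match clause and once in the script params), one way to keep the two in sync is to build the body from a single variable. The Python sketch below does that and also reshapes a response like the one above into the key/count form the question asks for. The helper names `build_query` and `to_counts` are hypothetical, the sketch sets boost_mode to replace so that _score carries the raw count, and actually sending the request is left out.

```python
import json

# The Painless script from the query above, unchanged.
SCRIPT = (
    "def matches=0; def products = params['_source']['products']; "
    "for(p in products){if(p.product_brand == params['brand']){matches++;}} "
    "return matches;"
)


def build_query(brand):
    """Build the search body, using `brand` for both the filter and the script."""
    return {
        "_source": {"includes": ["test.test_id"]},
        "query": {
            "function_score": {
                "query": {"match": {"products.product_brand.keyword": brand}},
                "functions": [
                    {"script_score": {"script": {"source": SCRIPT, "params": {"brand": brand}}}}
                ],
                # Assumption of this sketch: replace the query score with the
                # script result so that _score is the plain match count.
                "boost_mode": "replace",
            }
        },
    }


def to_counts(response):
    """Turn search hits into [{"key": test_id, "count": n}, ...]."""
    return [
        {"key": hit["_source"]["test"]["test_id"], "count": int(hit["_score"])}
        for hit in response["hits"]["hits"]
    ]


# Trimmed copy of the sample response shown above.
response = {
    "hits": {
        "total": 1,
        "max_score": 2,
        "hits": [
            {"_id": "AV99vrBpgkgblFY6zscA", "_score": 2, "_source": {"test": {"test_id": 19988}}}
        ],
    }
}

print(json.dumps(build_query("Apple"), indent=2))
print(to_counts(response))  # [{'key': 19988, 'count': 2}]
```

Keep in mind that a search returns only the top hits (10 by default), so `to_counts` covers the returned page rather than every matching document.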

The mapping of the index I used looks like this:

{
  "test": {
    "mappings": {
      "doc": {
        "properties": {
          "products": {
            "properties": {
              "product_brand": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "product_id": {
                "type": "long"
              }
            }
          },
          "test": {
            "properties": {
              "test_id": {
                "type": "long"
              },
              "test_name": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

ElasticSearch - Filter, group and count results for each group
