理想情况下,我需要计算在城市名称中使用“伦敦”的次数。但查询返回“london”,“London”和“LoNdOn”等不同的值。 我尝试使用Case Insensitive作为选项,但它没有给我所需的结果。
这是我的查询,
{
"queryType": "topN",
"dataSource": "wikiticker",
"dimension":"cityName",
"granularity": "ALL",
"metric": "count",
"threshold": 10,
"filter":
{
"type": "search",
"dimension": "cityName",
"query": {
"type": "insensitive_contains",
"value": "london",
}
},
"aggregations": [
{
"type": "longSum",
"name": "count",
"fieldName": "count"
}
],
"intervals": ["2014-10-01T00:00:00.000Z/2016-10-07T00:00:00.000Z"]
}
这是我的结果:
[ {
"timestamp" : "2015-09-12T00:46:58.771Z",
"result" : [ {
"count" : 21,
"cityName" : "London"
},
{
"count" : 10,
"cityName" : "New London"
},
{
"count" : 3,
"cityName" : "london"
},
{
"count" : 1,
"cityName" : "LoNdon"
},
{
"count" : 1,
"cityName" : "LondOn"
} ]
} ]
我应该得到类似的东西:
[ {
"timestamp" : "2015-09-12T00:46:58.771Z",
"result" : [ {
"count" : 26,
"cityName" : "London"
},
{
"count" : 10,
"cityName" : "New London"
} ]
} ]
答案 0 :(得分:0)
使用过滤的聚合器:
过滤的聚合器包装任何给定的聚合器,但只聚合给定维度过滤器匹配的值。
{
"type" : "filtered",
"filter" : {
"type" : "search",
"dimension" : cityName,
"query": {
"type":"contains",
"value":"london"
}
},
"aggregator" : {
"type": "count",
"name": "Total Count of the Name London"
}
}
<强>参考强>