I have a set of filter data in the following format:
filters = [
{key : colour, values: [red,blue,green,yellow,........]},
{key:website, values:[myntra, jabong, voonik,...........]},
{key:shape,values:[fit, maxi,bodycon, skater...................]}
,........................
,........................]
My Elasticsearch documents are structured like this:
{
"_index": "products_data",
"_type": "dresses",
"_id": "1",
"_score": 0,
"_source": {
"product_filter":{
"dress_shape": "sheath",
"pattern_type": "solid",
"discount_price": 1347,
"knit_or_woven": "knit",
"year": "2015",
"age_group": "adults-women",
"broad_category": "dress",
"fabric": "polyester",
"lining": "has a lining",
"surface_styling_or_features": "other",
"usage": "casual",
"sleeves_type": "sleeveless",
"brand": "deal jeans",
"website": "myntra",
"season": "fall",
"price": 2695,
"discount_percent": 50,
"product_line": "dresses",
"neck": "round neck",
"sleeve": "sleeveless",
"gender": "women",
"colour": "black",
"occasion": "casual",
"dress_length": "mini",
"display_name": "deal jeans black sheath dress",
"hemline": "curved",
"fabric_type": "lace or crochet"
}
}
}
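I need to find the document count for each filter value. Currently I take each filter value, generate an Elasticsearch query in the format below, and send that query to the Elasticsearch count API.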
{ "query": { "bool": { "filter":{ "term": { "product_filter.brand": "109f" } } } } }
output : 109f brand --> 2132
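Each filter list contains many more values than shown. Counting 500 filter values takes about 6 seconds. I tried the multi search API (msearch), but it also took a long time. My data size is 19787033681 bytes, and the index has 5 shards. Can anyone help me with this using Node.js code...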
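For reference, the per-filter approach described above could be batched roughly as follows. This is only a sketch under assumptions: the sample `filters` values, the helper names `buildCountQueries` and `toMsearchBody`, and the `label` field are made up for illustration, and it assumes each filter key maps directly to a `product_filter.<key>` field. It only builds the newline-delimited body for `_msearch`; sending it (a POST with `Content-Type: application/x-ndjson`) is left out.

```javascript
// Hypothetical sample of the filters list; the real lists are much longer.
const filters = [
  { key: "colour", values: ["red", "blue"] },
  { key: "website", values: ["myntra", "jabong"] },
];

// Build one count-style search per (key, value) pair.
// size: 0 asks Elasticsearch for the hit total only, not the documents.
function buildCountQueries(filters) {
  const queries = [];
  for (const { key, values } of filters) {
    for (const value of values) {
      queries.push({
        label: `${key}:${value}`, // illustrative tag for matching results later
        body: {
          size: 0,
          query: {
            bool: { filter: { term: { [`product_filter.${key}`]: value } } },
          },
        },
      });
    }
  }
  return queries;
}

// Serialize to the newline-delimited JSON that _msearch expects:
// for every search, a header line followed by a body line,
// each terminated by a newline.
function toMsearchBody(queries, index) {
  return queries
    .map((q) => JSON.stringify({ index }) + "\n" + JSON.stringify(q.body) + "\n")
    .join("");
}

const queries = buildCountQueries(filters);
const ndjson = toMsearchBody(queries, "products_data");
```

The `responses` array returned by `_msearch` is in the same order as the searches sent, so `queries[i].label` can be paired with the hit total of `responses[i]`. Depending on the Elasticsearch version, that total is either a plain number or an object with a `value` field.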
Answer 0 (score: 0)
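My suggestion is to reduce the number of shards from 5 to 1. Unless you have multiple machines serving the same data or want to run a cluster, you should use only one shard.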
{
"settings":{
"number_of_shards":1,
"number_of_replicas":0
}
}
Multiple shards and replicas are most effective when several servers/machines share the insert/search/delete load and provide redundancy.