I have a mapping like this:
"properties": {
"id": {"type": "long", "index": "not_analyzed"},
"name": {"type": "string", "index": "not_analyzed"},
"skills": {"type": "string", "index": "not_analyzed"}
}
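I want to store students' profiles in Elasticsearch using the mapping above. skills
is the list of computer skills they specified in their profile (python, javascript, ...).
Given a skill set such as ['html', 'css', 'sass', 'javascript', 'django', 'bootstrap', 'angularjs', 'backbone'],
I want to find all profiles that have at least 3 skills from this set. I don't need to know which skills they share with the wanted list; I'm only interested in the count. Is there a way to do this in Elasticsearch?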
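Answer 0 (score: 3)
There may be a better way that I haven't thought of, but you can do it with a script filter.
I set up a simplified version of the index with a few documents: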
PUT /test_index
{
"settings": {
"number_of_shards": 1
},
"mappings": {
"doc": {
"properties": {
"skills": {
"type": "string",
"index": "not_analyzed"
}
}
}
}
}
POST /test_index/doc/_bulk
{"index":{"_id":1}}
{"skills":["html","css","javascript"]}
{"index":{"_id":2}}
{"skills":["bootstrap", "angularjs", "backbone"]}
{"index":{"_id":3}}
{"skills":["python", "javascript", "ruby","java"]}
Then I ran this query:
POST /test_index/_search
{
"query": {
"filtered": {
"query": {
"match_all": {}
},
"filter": {
"script": {
"script": "count=0; for(s: doc['skills'].values){ for(x: skills){ if(s == x){ count +=1 } } } count >= 3",
"params": {
"skills": ["html", "css", "sass", "javascript", "django", "bootstrap", "angularjs", "backbone"]
}
}
}
}
}
}
and got back what I expected:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "test_index",
"_type": "doc",
"_id": "1",
"_score": 1,
"_source": {
"skills": [
"html",
"css",
"javascript"
]
}
},
{
"_index": "test_index",
"_type": "doc",
"_id": "2",
"_score": 1,
"_source": {
"skills": [
"bootstrap",
"angularjs",
"backbone"
]
}
}
]
}
}
Here is all the code:
http://sense.qbox.io/gist/1018a01f1df29cb793ea15661f22bc8b25ed3476
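For readers who find the one-line script hard to follow, the counting logic of the script filter above can be sketched in plain Python. This is only an illustration of the idea, not code that runs inside Elasticsearch; the function name `matches_at_least` and the `docs` dictionary are made up for this example, and the sample documents are the three indexed above.

```python
# Plain-Python sketch of the counting logic used by the script filter.
# matches_at_least and docs are illustrative names, not Elasticsearch APIs.

def matches_at_least(doc_skills, wanted_skills, minimum=3):
    """Return True if doc_skills shares at least `minimum` entries with wanted_skills."""
    wanted = set(wanted_skills)
    # Count every skill of the document that also appears in the wanted list.
    count = sum(1 for skill in doc_skills if skill in wanted)
    return count >= minimum


wanted = ["html", "css", "sass", "javascript", "django",
          "bootstrap", "angularjs", "backbone"]

# The three sample documents from the bulk request, keyed by _id.
docs = {
    1: ["html", "css", "javascript"],
    2: ["bootstrap", "angularjs", "backbone"],
    3: ["python", "javascript", "ruby", "java"],
}

matching_ids = [doc_id for doc_id, skills in docs.items()
                if matches_at_least(skills, wanted)]
print(matching_ids)
```

Document 3 shares only javascript with the wanted list, so it is left out, which agrees with the search response shown above.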
Answer 1 (score: 2)
You can use a query string query with the minimum_should_match option.
Example:
POST /<index>/_search
{
"query": {
"filtered": {
"filter": {
"query": {
"query_string": {
"default_field": "skills",
"query": "html css sass javascript django bootstrap angularjs backbone \"ruby on rails\" ",
"minimum_should_match" : "3"
}
}
}
}
}
}
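If the skill list comes from application code, the request body above could be assembled programmatically. The following is a minimal sketch under some assumptions: the helper name `build_skills_query` is hypothetical, multi-word skills are simply wrapped in double quotes as in the example, and reserved query string characters are not escaped. It only builds the JSON body and does not talk to a cluster.

```python
import json


def build_skills_query(skills, minimum=3, field="skills"):
    """Build a filtered query_string body like the one above (hypothetical helper)."""
    terms = []
    for skill in skills:
        # Quote multi-word skills so they are treated as a single phrase.
        terms.append('"%s"' % skill if " " in skill else skill)
    return {
        "query": {
            "filtered": {
                "filter": {
                    "query": {
                        "query_string": {
                            "default_field": field,
                            "query": " ".join(terms),
                            "minimum_should_match": str(minimum),
                        }
                    }
                }
            }
        }
    }


body = build_skills_query(["html", "css", "ruby on rails"])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of the search request shown above.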