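In a movie database, I store the ratings given by users (0 to 5 stars) with each movie. I have the following document structure indexed in Elasticsearch (version 1.2.2):
"_index": "my_index",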
"_type": "film",
"_id": "6629",
"_source": {
"id": "6629",
"title": "Fight Club",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 3 },
{ "user_id" : 4567, "rating_value" : 2 },
{ "user_id" : 7890, "rating_value" : 1 }
.....
]
}
"_index": "my_index",
"_type": "film",
"_id": "6630",
"_source": {
"id": "6630",
"title": "Pulp Fiction",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 1 },
{ "user_id" : 7654, "rating_value" : 2 },
{ "user_id" : 4321, "rating_value" : 5 }
.....
]
}
and so on...
My goal is to get, in a single search, all the movies rated by a given user (say user 1234), along with the rating_value.
If I run the following search
GET my_index/film/_search
{
"query": {
"match": {
"ratings.user_id": "1234"
}
}
}
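I get the whole document for every matching movie, and then I have to parse the entire ratings array to find out which element matched my query and which rating_value is associated with user_id 1234.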
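For illustration, that client-side step might look like the following minimal Python sketch. The function name ratings_for_user and the sample_hits data are assumptions made up for this example (hand-written to mirror the documents above, not real server output).

```python
def ratings_for_user(hits, user_id):
    """Map film title -> rating_value given by user_id, scanning each hit's ratings array."""
    result = {}
    for hit in hits:
        source = hit["_source"]
        for rating in source.get("ratings", []):
            if rating["user_id"] == user_id:
                result[source["title"]] = rating["rating_value"]
                break  # at most one rating per user per film is assumed
    return result


# Hand-written sample shaped like the documents above (not real server output).
sample_hits = [
    {"_id": "6629", "_source": {"id": "6629", "title": "Fight Club", "ratings": [
        {"user_id": 1234, "rating_value": 3},
        {"user_id": 4567, "rating_value": 2},
        {"user_id": 7890, "rating_value": 1}]}},
    {"_id": "6630", "_source": {"id": "6630", "title": "Pulp Fiction", "ratings": [
        {"user_id": 1234, "rating_value": 1},
        {"user_id": 7654, "rating_value": 2},
        {"user_id": 4321, "rating_value": 5}]}},
]

print(ratings_for_user(sample_hits, 1234))  # {'Fight Club': 3, 'Pulp Fiction': 1}
```

This works, but every matching film's full ratings array has to be transferred and scanned, which is what I would like to avoid.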
Ideally, I would like the result of this query to be
"hits": [ {
"_index": "my_index",
"_type": "film",
"_id": "6629",
"_source": {
"id": "6629",
"title": "Fight Club",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 3 }, // <= only the row that matches the query
]
}
}, {
"_index": "my_index",
"_type": "film",
"_id": "6630",
"_source": {
"id": "6630",
"title": "Pulp Fiction",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 1 }, // <= only the row that matches the query
]
}
} ]
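Thanks in advance.
Answer 0 (score: 3)
I managed to retrieve the values using aggregations, as mentioned in my previous comment.
Here is how I did it.
First, the mapping I used: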
PUT test/movie/_mapping
{
"properties": {
"title":{
"type": "string",
"index": "not_analyzed"
},
"ratings": {
"type": "nested"
}
}
}
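I chose not to analyze the title (not_analyzed), so that the terms aggregation below buckets on the whole title rather than on individual tokens. You could instead use the fields attribute and keep the not_analyzed version as a "raw" sub-field.
Then, the movies are indexed: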
PUT test/movie/6629
{
"title": "Fight Club",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 3 },
{ "user_id" : 4567, "rating_value" : 2 },
{ "user_id" : 7890, "rating_value" : 1 }
]
}
PUT test/movie/4456
{
"title": "Jumanji",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 4 },
{ "user_id" : 4567, "rating_value" : 3 },
{ "user_id" : 4630, "rating_value" : 5 }
]
}
PUT test/movie/6547
{
"title": "Hook",
"ratings" : [
{ "user_id" : 1234, "rating_value" : 4 },
{ "user_id" : 7890, "rating_value" : 1 }
]
}
The aggregation query is:
GET test/movie/_search
{
"aggs": {
"by_movie": {
"terms": {
"field": "title"
},
"aggs": {
"ratings_by_user": {
"nested": {
"path": "ratings"
},
"aggs": {
"for_user_1234": {
"filter": {
"term": {
"ratings.user_id": "1234"
}
},
"aggs": {
"rating_value": {
"terms": {
"field": "ratings.rating_value"
}
}
}
}
}
}
}
}
}
}
Finally, here is the output produced when this query is run against the documents above:
"aggregations": {
"by_movie": {
"buckets": [
{
"key": "Fight Club",
"doc_count": 1,
"ratings_by_user": {
"doc_count": 3,
"for_user_1234": {
"doc_count": 1,
"rating_value": {
"buckets": [
{
"key": 3,
"key_as_string": "3",
"doc_count": 1
}
]
}
}
}
},
{
"key": "Hook",
"doc_count": 1,
"ratings_by_user": {
"doc_count": 2,
"for_user_1234": {
"doc_count": 1,
"rating_value": {
"buckets": [
{
"key": 4,
"key_as_string": "4",
"doc_count": 1
}
]
}
}
}
},
{
"key": "Jumanji",
"doc_count": 1,
"ratings_by_user": {
"doc_count": 3,
"for_user_1234": {
"doc_count": 1,
"rating_value": {
"buckets": [
{
"key": 4,
"key_as_string": "4",
"doc_count": 1
}
]
}
}
}
}
]
}
}
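It is a bit tedious because of the nested syntax, but you will be able to retrieve, for each movie, the rating given by the specified user (here 1234).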
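Since the response is deeply nested, a small client-side helper can flatten it into a title-to-rating mapping. The following is only a sketch in Python: the names ratings_from_aggs and movie_bucket are hypothetical, the sample is a trimmed, hand-built version of the output above, and "Unrated Movie" is a made-up entry showing a film the user has not rated.

```python
def ratings_from_aggs(response, user_agg="for_user_1234"):
    """Return {movie title: rating} from the nested aggregation response."""
    ratings = {}
    for movie in response["aggregations"]["by_movie"]["buckets"]:
        value_buckets = movie["ratings_by_user"][user_agg]["rating_value"]["buckets"]
        # Movies the user has not rated still get a by_movie bucket, with no values.
        if value_buckets:
            ratings[movie["key"]] = value_buckets[0]["key"]
    return ratings


def movie_bucket(title, rating=None):
    """Hypothetical helper: build one trimmed by_movie bucket for the sample below."""
    values = [] if rating is None else [{"key": rating, "doc_count": 1}]
    return {
        "key": title,
        "ratings_by_user": {"for_user_1234": {"rating_value": {"buckets": values}}},
    }


sample = {"aggregations": {"by_movie": {"buckets": [
    movie_bucket("Fight Club", 3),
    movie_bucket("Hook", 4),
    movie_bucket("Jumanji", 4),
    movie_bucket("Unrated Movie"),  # made-up entry: no rating from user 1234
]}}}

print(ratings_from_aggs(sample))  # {'Fight Club': 3, 'Hook': 4, 'Jumanji': 4}
```

Note that a terms aggregation only returns the top buckets by default (10), so with more than a handful of movies the size of the by_movie aggregation would probably need to be raised.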
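Hope this helps!
Answer 1 (score: 2)
Store the ratings as nested documents (or as children), and then you will be able to query them individually.
A good explanation of the difference between nested documents and children can be found here: http://www.spacevatican.org/2012/6/3/fun-with-elasticsearch-s-children-and-nested-documents/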
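As a rough sketch of what querying the nested ratings individually could look like, the Python snippet below builds a nested query body for one user. Two assumptions to flag: the ratings field must be mapped as type nested, and the inner_hits option, which asks Elasticsearch to return the matching nested entries alongside each hit, is to my knowledge only available in releases newer than the 1.2.2 mentioned in the question. The function name nested_rating_query is made up for this example.

```python
import json


def nested_rating_query(user_id):
    """Build a search body matching films that have a nested rating by user_id."""
    return {
        "query": {
            "nested": {
                "path": "ratings",
                "query": {"term": {"ratings.user_id": user_id}},
                # Assumption: requires a version that supports inner_hits.
                "inner_hits": {},
            }
        }
    }


body = nested_rating_query(1234)
print(json.dumps(body, indent=2))
```

The printed JSON would be sent as the body of a search request such as GET my_index/film/_search. On versions without inner_hits, dropping that key still restricts the hits to films rated by the user, but the whole ratings array comes back in _source.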