I have an object in Elasticsearch that looks like this:
{
  "text": "something something something",
  "entities": { "hashtags": ["test", "test123"] }
}
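The problem is that not every document has the entities property set. So I want to write a query that involves three fields: text, entities, and entities.hashtags.

I tried to extract the leaf field with the query below. The problem is that I still get documents that have no entities field.

For the second part of the question: how can I extract only the entities.hashtags field? I tried something like "fields": ["entities.hashtags"], but it didn't work.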
{
  "size": 2000,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            {
              "term": {
                "text": "something"
              }
            },
            {
              "missing": {
                "field": "entities",
                "existence": true
              }
            }
          ]
        }
      }
    }
  }
}
Answer 0 (score: 1)
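If I understand you correctly, this seems to be what you want. A "term" filter on the "text" field and an "exists" filter on the "entities" field filter the documents, and a "terms" aggregation on "entities.hashtags" extracts the values. Here is the complete example I used: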
DELETE /test_index

PUT /test_index
{
  "settings": {
    "number_of_shards": 1
  }
}

PUT /test_index/doc/1
{
  "text": "something something something",
  "entities": { "hashtags": ["test", "test123"] }
}

PUT /test_index/doc/2
{
  "text": "another doc",
  "entities": { "hashtags": ["testagain", "testagain123"] }
}

PUT /test_index/doc/3
{
  "text": "doc with no entities"
}

POST /test_index/_search
{
  "size": 0,
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "bool": {
          "must": [
            { "term": { "text": "something" } },
            { "exists": { "field": "entities" } }
          ]
        }
      }
    }
  },
  "aggs": {
    "hashtags": {
      "terms": {
        "field": "entities.hashtags"
      }
    }
  }
}
...
{
  "took": 35,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "hashtags": {
      "buckets": [
        {
          "key": "test",
          "doc_count": 1
        },
        {
          "key": "test123",
          "doc_count": 1
        }
      ]
    }
  }
}
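
To see why only document 1 survives both filters, here is a rough local emulation in Python. It is only a sketch and does not talk to Elasticsearch: the `tokens` helper is a hypothetical stand-in for the analyzer (it assumes lowercasing plus a whitespace split), and `matches` and `hashtag_buckets` are illustrative names, not part of any Elasticsearch API.

```python
from collections import Counter

# The three sample documents indexed in the example above.
DOCS = [
    {"text": "something something something",
     "entities": {"hashtags": ["test", "test123"]}},
    {"text": "another doc",
     "entities": {"hashtags": ["testagain", "testagain123"]}},
    {"text": "doc with no entities"},
]


def tokens(text):
    # Assumption: lowercase + whitespace split approximates the analyzer
    # applied to an analyzed string field.
    return set(text.lower().split())


def matches(doc, term):
    # "term" filter on "text": the term must be one of the indexed tokens.
    has_term = term in tokens(doc.get("text", ""))
    # "exists" filter on "entities": the field must be present and non-empty.
    has_entities = bool(doc.get("entities"))
    return has_term and has_entities


def hashtag_buckets(docs, term):
    # "terms" aggregation: count matching documents per distinct hashtag.
    hits = [d for d in docs if matches(d, term)]
    counts = Counter()
    for doc in hits:
        # sorted(set(...)) counts each hashtag once per document,
        # in a deterministic order.
        counts.update(sorted(set(doc["entities"].get("hashtags", []))))
    return len(hits), dict(counts)


total, buckets = hashtag_buckets(DOCS, "something")
print(total, buckets)
```

Document 2 has entities but its text does not contain "something", and document 3 has no entities at all, so both are filtered out. That matches the single hit and the two buckets in the response above.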