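I'm starting to look at Elasticsearch and would like to know whether this operation can be done with it. (I did some searching, but I admit I don't know what to look for.)
I have these two contact records: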
{
"id" : "id1",
"name" : "Roger",
"phone1" : "123",
"phone2" : "",
"phone3" : "980"
}
{
"id" : "id2",
"name" : "Lucas",
"phone1" : "789",
"phone2" : "123",
"phone3" : ""
}
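I would love to know whether Elasticsearch can help me find duplicate phone numbers even when they sit in different phone fields ("123" appears in both of these records). I have seen that I can search for a string across multiple fields, so if I search for 123 I get both records back. However, I would like to be able to issue a single request that returns something like this: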
{
"phones" : {
"123" : ["id1", "id2"],
"980" : ["id1"],
"789" : ["id2"]
}
}
Or even this would be useful (the number of contacts that have each number):
{
"phones" : {
"123" : 2,
"980" : 1,
"789" : 1
}
}
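Any idea whether this is feasible? It would be great if it is.
Answer 0: (score: 4)
I agree with DrTech's suggestion to change your data structure. However, if for some reason you would rather keep it as it is, you can get the same result with a multi-field terms facet: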
curl "localhost:9200/phonefacet/_search?pretty=true&search_type=count" -d '{
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "tag" : {
      "terms" : {
        "fields" : ["phone1", "phone2", "phone3"],
        "size" : 10
      }
    }
  }
}'
The result looks like this:
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 2,
"max_score" : 0.0,
"hits" : [ ]
},
"facets" : {
"tag" : {
"_type" : "terms",
"missing" : 2,
"total" : 4,
"other" : 0,
"terms" : [ {
"term" : "123",
"count" : 2
}, {
"term" : "980",
"count" : 1
}, {
"term" : "789",
"count" : 1
} ]
}
}
}
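The facet response above can be reshaped on the client into the second form asked for in the question (number of contacts per phone number). A minimal sketch in Python, assuming the response has already been parsed into a dict with the structure shown above; the helper name `facet_counts` is hypothetical and not part of Elasticsearch:

```python
def facet_counts(response, facet_name="tag"):
    """Map each facet term to its count, wrapped under a "phones" key."""
    terms = response["facets"][facet_name]["terms"]
    return {"phones": {entry["term"]: entry["count"] for entry in terms}}


# Trimmed copy of the facet response shown above.
response = {
    "facets": {
        "tag": {
            "_type": "terms",
            "terms": [
                {"term": "123", "count": 2},
                {"term": "980", "count": 1},
                {"term": "789", "count": 1},
            ],
        }
    }
}

result = facet_counts(response)
print(result)
```

Note that the facet only reports counts, so this gives the "how many contacts" form, not the list of contact ids per number.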
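Answer 1: (score: 1)
You can get there with a terms facet, but you would have to change your data structure so that all phone numbers are held in a single field:
Create the index: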
curl -XPUT 'http://127.0.0.1:9200/test/?pretty=1'
Index your data:
curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1' -d '
{
"name" : "Roger",
"id" : "id1",
"phone" : [
"123",
"980"
]
}
'
curl -XPOST 'http://127.0.0.1:9200/test/test?pretty=1' -d '
{
"name" : "Lucas",
"id" : "id2",
"phone" : [
"789",
"123"
]
}
'
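If the existing records use the `phone1`/`phone2`/`phone3` layout from the question, they could be converted to the single-field layout before indexing. A rough sketch in Python, assuming every key that starts with `phone` holds a phone number and that empty strings mean "no number"; the helper name `merge_phone_fields` is made up for this illustration:

```python
def merge_phone_fields(contact, prefix="phone"):
    """Return a copy of contact with phone1, phone2, ... merged into one "phone" list."""
    # Keep every field that is not one of the numbered phone fields.
    merged = {key: value for key, value in contact.items()
              if not key.startswith(prefix)}
    # Collect the phone fields in name order and drop the empty ones.
    phone_keys = sorted(key for key in contact if key.startswith(prefix))
    merged["phone"] = [contact[key] for key in phone_keys if contact[key]]
    return merged


roger = {"id": "id1", "name": "Roger", "phone1": "123", "phone2": "", "phone3": "980"}
lucas = {"id": "id2", "name": "Lucas", "phone1": "789", "phone2": "123", "phone3": ""}

converted = [merge_phone_fields(roger), merge_phone_fields(lucas)]
print(converted)
```

The converted dicts have the same shape as the two documents indexed above.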
Search across everything, returning a count of the terms in phone:
curl -XGET 'http://127.0.0.1:9200/test/test/_search?pretty=1' -d '
{
  "facets" : {
    "phone" : {
      "terms" : {
        "field" : "phone"
      }
    }
  }
}
'
# {
# "hits" : {
# "hits" : [
# {
# "_source" : {
# "name" : "Roger",
# "id" : "id1",
# "phone" : [
# "123",
# "980"
# ]
# },
# "_score" : 1,
# "_index" : "test",
# "_id" : "StaJK9A5Tc6AR7zXsEKmGA",
# "_type" : "test"
# },
# {
# "_source" : {
# "name" : "Lucas",
# "id" : "id2",
# "phone" : [
# "789",
# "123"
# ]
# },
# "_score" : 1,
# "_index" : "test",
# "_id" : "x8w39F-DR9SZOQoHpJw2FQ",
# "_type" : "test"
# }
# ],
# "max_score" : 1,
# "total" : 2
# },
# "timed_out" : false,
# "_shards" : {
# "failed" : 0,
# "successful" : 5,
# "total" : 5
# },
# "facets" : {
# "phone" : {
# "other" : 0,
# "terms" : [
# {
# "count" : 2,
# "term" : "123"
# },
# {
# "count" : 1,
# "term" : "980"
# },
# {
# "count" : 1,
# "term" : "789"
# }
# ],
# "missing" : 0,
# "_type" : "terms",
# "total" : 4
# }
# },
# "took" : 5
# }
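The facet gives the counts, while the first form asked for in the question (contact ids grouped by phone number) can be built on the client from the returned hits. A minimal sketch in Python, assuming the response is parsed into a dict shaped like the output above and that each `_source` carries `id` and `phone`; the helper name `phones_to_ids` is hypothetical:

```python
from collections import defaultdict


def phones_to_ids(response):
    """Group the contact ids found in the hits by phone number."""
    grouped = defaultdict(list)
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        for number in source["phone"]:
            grouped[number].append(source["id"])
    return {"phones": dict(grouped)}


# Trimmed copy of the search response shown above.
response = {
    "hits": {
        "hits": [
            {"_source": {"name": "Roger", "id": "id1", "phone": ["123", "980"]}},
            {"_source": {"name": "Lucas", "id": "id2", "phone": ["789", "123"]}},
        ],
        "total": 2,
    }
}

result = phones_to_ids(response)
print(result)
```

This only covers the hits that actually come back in the response, so for an index larger than one page of results the request would need a larger `size` or paging.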