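Suppose I have a document like this in Elasticsearch:
{
  "videoName": "taylor.mp4",
  "type": "long"
}
I tried a full-text search using this DSL query:
{
  "query": {
    "match": {
      "videoName": "taylor"
    }
  }
}
I need to get the document above, but it is not returned. If I specify taylor.mp4 instead, the document is returned.
So I would like to know how to do full-text search across delimiters.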
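Edit after KARTHEEK's answer:
A regexp query does fetch the taylor.mp4 document. But take the case where the document in the video index is:
{
  "videoName": "Akon - smack that.mp4",
  "type": "long"
}
A query to retrieve this document could be:
{
  "query": {
    "match": {
      "videoName": "smack that"
    }
  }
}
In this case the document is retrieved, because the query string contains smack: match performs a full-text search and finds the document. But suppose I only know the keyword "that". Then match will not fetch the document, and I need to use regexp:
{
  "query": {
    "regexp": {
      "videoName": "smack.* that.*"
    }
  }
}
On the other hand, if I switch to regexp and write the whole query string as smack.* that.*, this does not retrieve any documents either. And we do not know in advance which word carries the .mp4 suffix. So my question is: we need a full-text search using match that also recognizes delimiters. Is there another way?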
Edit after Richa asked for the index mapping:
http://localhost:9200/example/videos/_mapping
{
  "example": {
    "mappings": {
      "videos": {
        "properties": {
          "query": {
            "properties": {
              "match": {
                "properties": {
                  "videoName": {
                    "type": "string"
                  }
                }
              }
            }
          },
          "type": {
            "type": "string"
          },
          "videoName": {
            "type": "string"
          }
        }
      }
    }
  }
}
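Answer 0 (score: 2)
For the query you mentioned, a regular expression can be used to get the result. The request and its result are shown below; let me know if you need anything else.
curl -XGET "http://localhost:9200/test/sample/_search" -d'
{
  "query": {
    "regexp": {
      "videoName": "taylor.*"
    }
  }
}'
Result: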
{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "sample",
        "_id": "1",
        "_score": 1,
        "_source": {
          "videoName": "taylor.mp4",
          "type": "long"
        }
      }
    ]
  }
}
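The regexp works here because the pattern is matched against whole indexed terms, and the stored term in this case is taylor.mp4. When the known keyword sits at the start of a term, a prefix query is a possible alternative. The request below is only a sketch against the same test/sample index and is not part of the original answer. Neither regexp nor prefix analyzes its input, so the value is assumed to be lowercase to match the terms produced at index time.

```
curl -XGET "http://localhost:9200/test/sample/_search" -d'
{
  "query": {
    "prefix": {
      "videoName": "taylor"
    }
  }
}'
```

This only helps when the keyword begins a term. It would not find the document from a keyword that appears after the dot, such as mp4.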
Answer 1 (score: 2)
Please use this mapping:
PUT /test_index
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "doc": {
      "properties": {
        "videoName": {
          "type": "string",
          "term_vector": "yes"
        }
      }
    }
  }
}
After that, index the document you mentioned earlier:
PUT /test_index/doc/1
{
  "videoName": "Akon - smack that.mp4",
  "type": "long"
}
Output:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "videoName": "Akon - smack that.mp4",
          "type": "long"
        }
      }
    ]
  }
}
Query to get the term vectors for the document:
GET /test_index/doc/1/_termvector?fields=videoName
Result:
{
  "_index": "test_index",
  "_type": "doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 1,
  "term_vectors": {
    "videoName": {
      "field_statistics": {
        "sum_doc_freq": 3,
        "doc_count": 1,
        "sum_ttf": 3
      },
      "terms": {
        "akon": {
          "term_freq": 1
        },
        "smack": {
          "term_freq": 1
        },
        "that.mp4": {
          "term_freq": 1
        }
      }
    }
  }
}
With this in place, we can search on "smack":
POST /test_index/_search
{
  "query": {
    "match": {
      "_all": "smack"
    }
  }
}
Result:
{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "videoName": "Akon - smack that.mp4",
          "type": "long"
        }
      }
    ]
  }
}
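The term vectors above list that.mp4 as a single term, so a match on "that" or "mp4" alone would still miss this document. One possible way around it, sketched here as an assumption rather than as part of the original answer, is a custom analyzer that turns dots into spaces before tokenizing. The names dot_to_space and filename_analyzer are hypothetical, and the index would have to be recreated and the document reindexed for the new analyzer to take effect.

```
PUT /test_index
{
  "settings": {
    "number_of_shards": 1,
    "analysis": {
      "char_filter": {
        "dot_to_space": {
          "type": "pattern_replace",
          "pattern": "\\.",
          "replacement": " "
        }
      },
      "analyzer": {
        "filename_analyzer": {
          "type": "custom",
          "char_filter": ["dot_to_space"],
          "tokenizer": "standard",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "doc": {
      "properties": {
        "videoName": {
          "type": "string",
          "analyzer": "filename_analyzer"
        }
      }
    }
  }
}

GET /test_index/_search
{
  "query": {
    "match": {
      "videoName": "that"
    }
  }
}
```

With the dot replaced before tokenizing, "Akon - smack that.mp4" should be indexed as the separate terms akon, smack, that and mp4, which can be checked with the same _termvector request. The query targets videoName rather than _all, since _all has its own analyzer setting and would not pick up the custom analyzer.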