in elasticsearch

Time: 2016-04-16 15:50:13

Tags: elasticsearch

Suppose I have a document like this in Elasticsearch:

{
  "videoName": "taylor.mp4",
  "type": "long"
}

I try to run a full-text search with this DSL query:

{
    "query": {
        "match":{
            "videoName": "taylor"
        }
    }
}

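I need to get the above document, but I do not get it. If I specify taylor.mp4, the document is returned.

So, I would like to know how to do a full-text search with delimiters.

Edit after KARTHEEK's answer:

A regular expression will fetch the taylor.mp4 document. Take the case where a document in the videos index is:
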
{
  "videoName": "Akon - smack that.mp4",
  "type": "long"
}

So, the query to retrieve this document can be:

{
    "query": {
        "match":{
            "videoName": "smack that"
        }
    }
}

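In this case the document is retrieved, because smack appears in the query string: match performs a full-text search and fetches the document. But suppose I know only that keyword; match then fails to fetch the document, and I need to use regexp: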

{
    "query": {
        "regexp":{
            "videoName": "smack.* that.*"
        }
    }
}

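On the other hand, if I switch to regexp and write all my query strings as smack.* that.*, this will not retrieve any document either. Also, we do not know which word carries the .mp4 suffix. So, my question is: we need to do a full-text search using match, and it should also detect the delimiters. Is there any other way?

Edit after Richa asked for the mapping of the index: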

http://localhost:9200/example/videos/_mapping

{
  "example": {
    "mappings": {
      "videos": {
        "properties": {
          "query": {
            "properties": {
              "match": {
                "properties": {
                  "videoName": {
                    "type": "string"
                  }
                }
              }
            }
          },
          "type": {
            "type": "string"
          },
          "videoName": {
            "type": "string"
          }
        }
      }
    }
  }
}

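2 Answers:

Answer 0: (score: 2)

Based on the query you mentioned above, we can use a regular expression to get the result. Please find the result below and let me know if you need anything else.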

curl -XGET "http://localhost:9200/test/sample/_search" -d'
{
  "query": { 
    "regexp":{
        "videoName": "taylor.*"
    }
  }
}'

Result:

{
  "took": 22,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1,
    "hits": [
      {
        "_index": "test",
        "_type": "sample",
        "_id": "1",
        "_score": 1,
        "_source": {
          "videoName": "taylor.mp4",
          "type": "long"
        }
      }
    ]
  }
}
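
It may help to see why taylor.* works while a pattern such as smack.* that.* returns nothing: a regexp query is evaluated against each indexed term, not against the original string, and the pattern has to match the whole term. The Python sketch below only imitates that behaviour with re.fullmatch. The term lists are an assumption based on what the default analyzer produces for these two names (akon, smack, that.mp4 for the second one), the helper regexp_hits is hypothetical, and Lucene's regular-expression syntax differs from Python's in general, although these simple patterns behave the same in both.

```python
import re

# Terms as the default analyzer is assumed to index them for the two example documents.
INDEXED_TERMS = {
    "taylor.mp4": ["taylor.mp4"],
    "Akon - smack that.mp4": ["akon", "smack", "that.mp4"],
}

def regexp_hits(pattern):
    """Return the documents with at least one term that the pattern matches in full."""
    compiled = re.compile(pattern)
    return [
        doc
        for doc, terms in INDEXED_TERMS.items()
        if any(compiled.fullmatch(term) for term in terms)
    ]

print(regexp_hits("taylor.*"))        # ['taylor.mp4']
print(regexp_hits("that.*"))          # ['Akon - smack that.mp4']
print(regexp_hits("smack.* that.*"))  # []
```

The last pattern finds nothing because no single term contains a space, so a regexp that spans two words can never match a term as a whole.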

Answer 1: (score: 2)

Please use this mapping:

PUT /test_index
{
   "settings": {
      "number_of_shards": 1
   },
   "mappings": {
      "doc": {
         "properties": {
            "videoName": {
               "type": "string",
               "term_vector": "yes"
            }
         }
      }
   }
}

After that, index the document mentioned earlier:

PUT test_index/doc/1
{
  "videoName": "Akon - smack that.mp4",
  "type": "long"
}

Output:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "videoName": "Akon - smack that.mp4",
          "type": "long"
        }
      }
    ]
  }
}

Query to get the result:

GET /test_index/doc/1/_termvector?fields=videoName

Result:

{
  "_index": "test_index",
  "_type": "doc",
  "_id": "1",
  "_version": 1,
  "found": true,
  "took": 1,
  "term_vectors": {
    "videoName": {
      "field_statistics": {
        "sum_doc_freq": 3,
        "doc_count": 1,
        "sum_ttf": 3
      },
      "terms": {
        "akon": {
          "term_freq": 1
        },
        "smack": {
          "term_freq": 1
        },
        "that.mp4": {
          "term_freq": 1
        }
      }
    }
  }
}
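
The term list above shows where the original match query goes wrong: that.mp4 is stored as a single term, so a match on that (or on taylor for taylor.mp4) has nothing to hit. The sketch below is only a rough Python imitation of the two behaviours, not the real tokenizer: approx_default_terms mimics the terms shown above, and split_on_non_word shows what splitting on every non-word character, dots included, would yield. Both function names are hypothetical.

```python
import re

def approx_default_terms(text):
    """Rough imitation of the terms above: split on whitespace, trim punctuation, lowercase."""
    terms = []
    for chunk in text.split():
        trimmed = chunk.strip("-.,;:!?\"'()")
        if trimmed:  # a lone "-" trims down to nothing and is dropped
            terms.append(trimmed.lower())
    return terms

def split_on_non_word(text):
    """Split on every run of non-word characters (dots included) and lowercase."""
    return [t.lower() for t in re.split(r"\W+", text) if t]

name = "Akon - smack that.mp4"
print(approx_default_terms(name))  # ['akon', 'smack', 'that.mp4']
print(split_on_non_word(name))     # ['akon', 'smack', 'that', 'mp4']
```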

With this in place, we can search for "smack":

POST /test_index/_search
{
    "query": {
        "match": {
           "_all": "smack"
        }
    }
}

Result:

{
  "took": 1,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.15342641,
    "hits": [
      {
        "_index": "test_index",
        "_type": "doc",
        "_id": "1",
        "_score": 0.15342641,
        "_source": {
          "videoName": "Akon - smack that.mp4",
          "type": "long"
        }
      }
    ]
  }
}
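
If the goal is for a plain match query on videoName to also find that or taylor, one option is to analyze the field with the built-in pattern analyzer, whose default pattern \W+ splits on every non-word character and which lowercases the terms. This is offered as an assumption, not something tested in this thread. The sketch below only builds the index-creation body in Python and prints it; the index name test_index_split and the helper build_index_body are hypothetical, and the string field type follows the mapping used above.

```python
import json

def build_index_body(field="videoName", doc_type="doc"):
    """Build an index-creation body that analyzes `field` with the pattern analyzer."""
    return {
        "settings": {"number_of_shards": 1},
        "mappings": {
            doc_type: {
                "properties": {
                    # "pattern" is a built-in analyzer; its default pattern is \W+
                    field: {"type": "string", "analyzer": "pattern"}
                }
            }
        },
    }

body = build_index_body()
print("PUT /test_index_split")  # hypothetical index name
print(json.dumps(body, indent=2))
```

After indexing the document into such an index, the same _termvector request should be a quick way to check whether that and mp4 now appear as separate terms before relying on match.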