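I was able to find another question, "Using of possessive_english stemmer in Elasticsearch", but it is three years old.
I am trying to get Elasticsearch to ignore the apostrophe (') both when indexing and when searching. For example: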
POST my_index/_doc/
{
"message" : "Mike's bike"
}
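I would like to be able to find this document by searching for "mikes", "mike's", or "mike". I looked around and thought possessive_english
should do the job, but I have not been able to get the expected results.
The index I created is: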
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"rebuilt_standard": {
"tokenizer": "standard",
"filter": [
"lowercase", "my_stemmer"
]
}
},
"filter": {
"my_stemmer":{
"type": "stemmer",
"language": "possessive_english"
}
}
}
}
}
I tested the analyzer with:
POST /my_index/_analyze
{
"analyzer": "rebuilt_standard",
"text": "Mike's bike"
}
This is the result:
{
"tokens" : [
{
"token" : "mike",
"start_offset" : 0,
"end_offset" : 6,
"type" : "<ALPHANUM>",
"position" : 0
},
{
"token" : "bike",
"start_offset" : 7,
"end_offset" : 11,
"type" : "<ALPHANUM>",
"position" : 1
}
]
}
The analyzer appears to be working. I then inserted the document:
POST my_index/_doc/
{
"message" : "Mike's bike"
}
When I searched, both of the following queries returned 0 results:
GET /my_index/_search
{
"query": {
"match": {"message": "mike"}
}
}
GET /my_index/_search
{
"query": {
"match": {"message": "mikes"}
}
}
However,
GET /my_index/_search
{
"query": {
"match": {"message": "mike's"}
}
}
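returns a result.
It seems I am missing something from the linked question about configuring the mapping, but I am not sure how to set it up.
I tested the above with Kibana, but in practice I use Rails with the repository pattern and the gems "elasticsearch-model", "elasticsearch-rails", and "elasticsearch-persistence". I am also new to Rails, so I do not know whether the fix belongs in the Rails configuration, the Elasticsearch configuration, or both.
Just in case, here is the relevant code: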
include Elasticsearch::Persistence::Repository
include Elasticsearch::Persistence::Repository::DSL
client = Elasticsearch::Client.new(url: 'http://localhost:9200', log: true)
settings index: {
number_of_shards: 1,
analysis: {
analyzer: {
custom: {
type: "custom",
tokenizer: "standard",
filter: [
"lowercase",
"english_possessive_stemmer",
]
}
},
filter: {
english_possessive_stemmer: {
type: "stemmer",
language: "possessive_english",
}
}
}
}
mappings {
indexes :icon, index: false
indexes :properties, type: 'nested' do
indexes :values
end
indexes :name
}
In the controller:
repository = Repository.new
repository.create_index!(force: true)
repository.save(json)
results = repository.search(query: { match: { name: 'Mikes' } })
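Answer 0 (score: 0)
Your analyzer works fine. I think you have not applied it to the field in the mapping, so the field is still analyzed with the default analyzer. Set it as the "analyzer" of the "message" field: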
PUT /my_index
{
"settings": {
"analysis": {
"analyzer": {
"rebuilt_standard": {
"tokenizer": "standard",
"filter": [
"lowercase", "my_stemmer","english_stemmer"
]
}
},
"filter": {
"my_stemmer":{
"type": "stemmer",
"language": "possessive_english"
},
"english_stemmer": {
"type": "stemmer",
"language": "english"
}
}
}
},
"mappings": {
"properties": {
"message":{
"type": "text",
"analyzer": "rebuilt_standard"
}
}
}
}
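For the Rails repository in the question, the equivalent change would be to name the analyzer on the `name` field. The following is a sketch only, not verified against a running cluster: the class name `MessageRepository` and the index name are assumptions, and the `english_stemmer` filter is added to mirror the settings above. The analyzer name `custom` and the other fields are taken from the question.

```ruby
require 'elasticsearch/persistence'

# Hypothetical repository class; only the extra stemmer filter and the
# `name` mapping differ from the code in the question.
class MessageRepository
  include Elasticsearch::Persistence::Repository
  include Elasticsearch::Persistence::Repository::DSL

  index_name 'my_index' # assumed index name

  settings index: {
    number_of_shards: 1,
    analysis: {
      analyzer: {
        custom: {
          type: 'custom',
          tokenizer: 'standard',
          # Possessive filter first, then the general English stemmer.
          filter: %w[lowercase english_possessive_stemmer english_stemmer]
        }
      },
      filter: {
        english_possessive_stemmer: { type: 'stemmer', language: 'possessive_english' },
        english_stemmer: { type: 'stemmer', language: 'english' }
      }
    }
  } do
    mapping do
      indexes :icon, index: false
      indexes :properties, type: 'nested' do
        indexes :values
      end
      # Attach the custom analyzer to the field that is searched.
      indexes :name, type: 'text', analyzer: 'custom'
    end
  end
end
```

Because the mapping of an existing field cannot be switched to a different analyzer in place, the index has to be recreated (as `create_index!(force: true)` in the question already does) and the documents saved again.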
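The possessive_english filter only removes the trailing "'s", so on its own it cannot make a search for "mikes" match (although it does work for "mike"). For that you also need a stemmer, such as the english_stemmer filter in the settings above, which reduces words to their base form.
I have a great article here for further reference.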
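To make the difference between the three setups concrete, here is a toy model in plain Ruby. It is not Elasticsearch code: `tokenize`, `strip_possessive`, `strip_plural`, `analyze`, and `match?` are hypothetical helpers, and `strip_plural` is a crude stand-in for the real English stemmer that handles nothing but a plural "s".

```ruby
# Toy model of the analysis chains discussed here; not Elasticsearch code.

# Rough stand-in for the standard tokenizer: an apostrophe between
# letters stays inside the token, so "Mike's" is a single token.
def tokenize(text)
  text.scan(/[[:alnum:]]+(?:'[[:alnum:]]+)*/)
end

# Stand-in for the possessive_english stemmer: drop a trailing 's.
def strip_possessive(token)
  token.sub(/'s\z/, "")
end

# Crude stand-in for the english stemmer: only strips a plural "s".
def strip_plural(token)
  return token if token.length <= 3 || token.end_with?("ss")
  token.chomp("s")
end

# Tokenize, lowercase, then run each filter in order.
def analyze(text, filters)
  tokenize(text).map do |token|
    filters.reduce(token.downcase) { |acc, name| send(name, acc) }
  end
end

# A match query hits when the document and the query share a token.
def match?(doc, query, filters)
  !(analyze(doc, filters) & analyze(query, filters)).empty?
end

chains = {
  "no custom filters"    => [],
  "possessive only"      => [:strip_possessive],
  "possessive + stemmer" => [:strip_possessive, :strip_plural]
}

chains.each do |label, filters|
  hits = ["mike", "mike's", "mikes"].select { |q| match?("Mike's bike", q, filters) }
  puts "#{label}: #{hits.inspect}"
end
```

In this model the first chain, which corresponds to a field that was never given the custom analyzer, matches only "mike's", which is the behaviour reported in the question. The possessive-only chain adds "mike", and only the chain with a stemmer also matches "mikes".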