ngram for wildcard search in Elasticsearch

Time: 2016-12-15 10:22:04

Tags: elasticsearch

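I am trying to give end users a type-ahead search that behaves more like a SQL Server LIKE query. I was able to implement an ES query for the following SQL scenario: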

 select * from table where name like '%peter tom%' and type != 'xyz'

In ES I used the ngram tokenizer to get the desired result:

PUT sample
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_ngram_analyzer": {
          "tokenizer": "my_ngram_tokenizer"
        }
      },
      "tokenizer": {
        "my_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "2",
          "max_gram": "15"
        }
      }
    }
  },
  "mappings": {
    "typename": {
      "properties": {
        "name": {
          "type": "string",
          "fields": {
            "search": {
              "type": "string",
              "analyzer": "my_ngram_analyzer"
            }
          }
        },
        "type": {
          "type": "string",
          "index": "not_analyzed"
        }
      }
    }
  }
}

{
  "query": {
    "bool": {
      "should": [
        {
          "term": {
            "name.search": "peter tom"
          }
        }
      ],
      "must_not": [
        {
          "match": {
            "type": "xyz"
          }
        },
        {
          "match": {
            "type": "abc"
          }
        }
      ]
    }
  }
}

So if my documents are:

name                              type
peter tomson                      efg
Peter tomson robert simson        efg

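The query above returns both documents. But when I type peter sims or peter simson, it does not return the second document unless I type peter tomson robert sims or peter tomson robert simson. So basically I have to type every word that comes between peter and simson to get the second document. Is there any way to get the second document with a partial match? I can use a match query with the "AND" operator, but that still only matches whole words. I am looking for a partial match: typing peter sim should give me the second document. Thanks.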
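The behaviour described above can be reproduced with a small model. The following is a minimal Python sketch, not Elasticsearch itself: it assumes the ngram tokenizer emits every substring of the whole field value (spaces included) between `min_gram` and `max_gram` characters long, and that a `term` query looks up the search text as one unanalyzed token. The helper names `char_ngrams` and `term_matches` are hypothetical, and the lowercasing is an assumption of the sketch (the analyzer in the question has no lowercase filter).

```python
def char_ngrams(text, min_gram=2, max_gram=15):
    """Return every substring of `text` whose length is in [min_gram, max_gram]."""
    grams = set()
    for size in range(min_gram, max_gram + 1):
        for start in range(len(text) - size + 1):
            grams.add(text[start:start + size])
    return grams


docs = ["peter tomson", "Peter tomson robert simson"]

# Assumption of this sketch: lowercase before tokenizing so that case
# differences do not hide the point being illustrated.
indexed = {doc: char_ngrams(doc.lower()) for doc in docs}


def term_matches(term, doc):
    """A term query does not analyze its input: it is one exact token lookup."""
    return term in indexed[doc]


for query in ["peter tom", "peter sims", "robert sims"]:
    hits = [doc for doc in docs if term_matches(query, doc)]
    print(query, "->", hits)
```

In this model a query only matches when it is a contiguous substring of the field value. "peter sims" never occurs contiguously in the second document, so it is not among that document's tokens, whereas "robert sims" is.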

1 answer:

Answer 0: (score: 0)

I am posting the solution I found as an answer, for other users' future reference:

PUT my_index
{
    "settings": {
        "analysis": {
            "analyzer": {
                "autocomplete": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase",
                        "autocomplete"
                    ]
                },
                "autocomplete_search": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "lowercase"
                    ]
                }
            },
            "filter": {
                "autocomplete": {
                    "type": "nGram",
                    "min_gram": 2,
                    "max_gram": 40
                }
            }
        }
    },
    "mappings": {
        "doc": {
            "properties": {
                "title": {
                    "type": "string",
                    "analyzer": "autocomplete",
                    "search_analyzer": "autocomplete_search"
                }
            }
        }
    }
}

PUT my_index/doc/1
{
  "title": "peter tomson" 
}

PUT my_index/doc/2
{
  "title": "Peter tomson robert simson" 
}


GET my_index/doc/_search
{
  "query": {
    "match": {
      "title": {
        "query": "Pete sim",
        "operator": "and"
      }
    }
  }
}
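
To see why this setup returns the second document for "Pete sim", here is a minimal Python sketch of the two analyzers. It is a simplified model under stated assumptions, not Elasticsearch's implementation: the index-time analyzer splits on whitespace, lowercases, and expands each word into its n-grams (2 to 40 characters); the search-time analyzer only splits and lowercases; and a `match` query with `"operator": "and"` requires every search token to be among the document's indexed tokens. The function names are hypothetical.

```python
def token_ngrams(token, min_gram=2, max_gram=40):
    """All substrings of a single word with length in [min_gram, max_gram]."""
    grams = set()
    for size in range(min_gram, min(max_gram, len(token)) + 1):
        for start in range(len(token) - size + 1):
            grams.add(token[start:start + size])
    return grams


def index_analyze(text):
    """Model of the `autocomplete` analyzer: whitespace -> lowercase -> ngram filter."""
    tokens = set()
    for word in text.split():
        tokens |= token_ngrams(word.lower())
    return tokens


def search_analyze(text):
    """Model of the `autocomplete_search` analyzer: whitespace -> lowercase."""
    return [word.lower() for word in text.split()]


def match_and(query, doc_text):
    """Model of a match query with operator "and": every query token must be indexed."""
    indexed = index_analyze(doc_text)
    return all(token in indexed for token in search_analyze(query))


docs = ["peter tomson", "Peter tomson robert simson"]
for query in ["Pete sim", "peter tom"]:
    print(query, "->", [doc for doc in docs if match_and(query, doc)])
```

Because the n-grams are built per word rather than across the whole string, each typed fragment is checked independently, so the words between "peter" and "simson" no longer need to be typed. In this model a query word shorter than `min_gram` (a single character) matches nothing.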