NEST Elasticsearch: searching the front of some fields and the back of others

Date: 2015-07-31 20:50:17

Tags: elasticsearch nest

I have a case where I need a partial match on the beginning of some properties (last name and first name) and a partial match on the end of other properties, and I'm wondering how to set up both analyzers. For example, if I have a first name of "elastic", I can currently search for "elas" and find it. But if I have an account number of abc12345678, I need to search for "5678" and find all account numbers ending in that, while a first-name search for "stic" must not find "elastic".

Here's a simplified example of my Person class:

public class Person
{       
    public string AccountNumber { get; set; }

    [ElasticProperty(IndexAnalyzer = "partial_name", SearchAnalyzer = "full_name")]
    public string LastName { get; set; }
    [ElasticProperty(IndexAnalyzer = "partial_name", SearchAnalyzer = "full_name")]
    public string FirstName { get; set; }       
}

Here's the relevant existing code where I create the index, that currently works great for searching the beginning of a word:

//Set up analyzers on some fields to allow partial, case-insensitive searches.
var partialName = new CustomAnalyzer
{
    Filter = new List<string> { "lowercase", "name_ngrams", "standard", "asciifolding" },
    Tokenizer = "standard"
};

var fullName = new CustomAnalyzer
{
    Filter = new List<string> { "standard", "lowercase", "asciifolding" },
    Tokenizer = "standard"
};

var result = client.CreateIndex("persons", c => c
    .Analysis(descriptor => descriptor
        .TokenFilters(bases => bases.Add("name_ngrams", new EdgeNGramTokenFilter
        {
            MaxGram = 15, //Allow partial match up to 15 characters.
            MinGram = 2, //Allow no smaller than 2 characters match
            Side = "front"
        }))
        .Analyzers(bases => bases
            .Add("partial_name", partialName)
            .Add("full_name", fullName))
    )
    .AddMapping<Person>(m => m.MapFromAttributes())
);

It seems like I could add another EdgeNGramTokenFilter, and make the Side = "back", but I don't want the first and last name searches to match back side searches. Can someone provide a way to do that? Thanks, Adrian

Edit

For completeness, this is the new decorator on the property that goes with the code in the accepted answer:

[ElasticProperty(IndexAnalyzer = "partial_back", SearchAnalyzer = "full_name")]
public string AccountNumber { get; set; }

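1 answer:

Answer 0 (score: 3)

You need to declare another analyzer (let's call it partialBack) dedicated to matching from the back, but you can definitely reuse your existing edgeNGram token filter, like this: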

var partialBack = new CustomAnalyzer
{
    Filter = new List<string> { "lowercase", "reverse", "name_ngrams", "reverse" },
    Tokenizer = "keyword"
};
...
                .Analyzers(bases => bases
                    .Add("partial_name", partialName)
                    .Add("partial_back", partialBack)
                    .Add("full_name", fullName))
                )

The key here is using the reverse token filter twice.

The string (abc12345678) is

  • first lowercased (abc12345678),
  • then reversed (87654321cba),
  • then edge-ngrammed (87, 876, 8765, 87654, 876543, ...),
  • and finally the tokens are reversed again (78, 678, 5678, 45678, 345678, ...).

As you can see, the string ends up tokenized "from the back", so a search for 5678 matches abc12345678.
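
To make the steps above easier to check, here is a minimal sketch that imitates the partial_back filter chain with plain string operations. It does not call NEST or Elasticsearch at all; the class PartialBackSketch and the helpers Reverse, EdgeNGrams and Analyze are hypothetical names made up for this illustration, and the defaults simply copy MinGram = 2 and MaxGram = 15 from the name_ngrams filter in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustration only: imitates the analyzer chain in memory, no Elasticsearch involved.
public static class PartialBackSketch
{
    // Hypothetical stand-in for the "reverse" token filter.
    static string Reverse(string token)
    {
        char[] chars = token.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }

    // Hypothetical stand-in for a front-side edge n-gram filter.
    static IEnumerable<string> EdgeNGrams(string token, int minGram, int maxGram)
    {
        int longest = Math.Min(maxGram, token.Length);
        for (int length = minGram; length <= longest; length++)
        {
            yield return token.Substring(0, length);
        }
    }

    // keyword tokenizer (whole input is one token) -> lowercase -> reverse
    // -> edge n-grams -> reverse again.
    public static List<string> Analyze(string input, int minGram = 2, int maxGram = 15)
    {
        string reversed = Reverse(input.ToLowerInvariant());
        return EdgeNGrams(reversed, minGram, maxGram)
            .Select(gram => Reverse(gram))
            .ToList();
    }

    public static void Main()
    {
        List<string> tokens = Analyze("abc12345678");
        Console.WriteLine(string.Join(", ", tokens));
        Console.WriteLine(tokens.Contains("5678")); // suffix is indexed
        Console.WriteLine(tokens.Contains("abc1")); // prefix is not
    }
}
```

Under these assumptions every emitted token is a suffix of the account number, which is why a search for 5678 can match while a prefix such as abc1 cannot.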