Elasticsearch search query using ngram always returns 0 results

Time: 2019-02-27 17:00:27

Tags: .net elasticsearch nest

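I am using NEST to work with Elasticsearch. I am trying to break all string fields into tokens, using ngram for the tokenization. However, when I query for hints, I always get 0 results.

This is the class I use to work with the API.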

public class Elasticsearch
{
    string index = "video-materials";
    ElasticClient client;
    public Elasticsearch()
    {
        var settings = new ConnectionSettings(new Uri("http://localhost:9200"));
        client = new ElasticClient(settings);
        if (client.IndexExists(index).Exists)
        {
            client.DeleteIndex(index);
        }
        var nGramFilters = new List<string> { "lowercase", "asciifolding", "nGram_filter" };

        var resp = client.CreateIndex(index, c => c
             .Mappings(m => m
                .Map<ElasticVideoMaterial>(mm => mm
                    .AutoMap()
                    .Properties(p => p
                        .Text(t => t
                            .Name(n => n.OriginalTitle)
                            .Fields(f => f
                                .Keyword(k => k
                                    .Name("keyword")
                                    .IgnoreAbove(256)
                                )
                                .Text(tt => tt
                                    .Name("ngram")
                                    .Analyzer("ngram_analyzer")
                                )
                            )
                        )
                    )
                )
            )
            .Settings(s => s
                .Analysis(a => a
                    .Analyzers(anz => anz
                        .Custom("ngram_analyzer", cc => cc
                            .Filters(nGramFilters)
                            .Tokenizer("ngram_tokenizer")))
                    .Tokenizers(tz => tz
                        .NGram("ngram_tokenizer", td => td
                            .MinGram(3)
                            .MaxGram(3)
                            .TokenChars(TokenChar.Letter, TokenChar.Digit)
                        )
                    )
                )
            )
        );
    }
    public void Index(IEnumerable<ElasticVideoMaterial> models)
    {
        foreach (var model in models)
        {
            client.Index(model, i => i.Index(index));
        }
    }
    public void Search(string query)
    {
        var resp = client.Search<ElasticVideoMaterial>(i => i
            .Query(q => q
                .Match(m => m
                    .Field(f => f.OriginalTitle.Suffix("ngram"))
                    .Query("Hob")
                )
            )
            .Index(index)
        ).Documents.ToList();
    }
}

I always recreate the index and then index a list of objects, using the Index() method for that. This is the class I index.

public class ElasticVideoMaterial
{
    public int ID { get; set; }
    public string Title { get; set; }
    public string OriginalTitle { get; set; }
    public float? KinopoiskRating { get; set; }
    public float? Imdb { get; set; }
    public int Duration { get; set; }
    public List<string> GenreTitles { get; set; }
    public List<string> CountryNames { get; set; }
    public DateTime? ReleaseDate { get; set; }
    public List<string> TranslationTitles { get; set; }
    public List<string> FilmMakerNames { get; set; }
    public List<string> ActorNames { get; set; }
    public List<string> ThemeNames { get; set; }
    public CompletionField Suggest { get; set; }
}

But when I try to get results with the Search() method, I get 0 results. (Typing "Hob", I expect to get the films whose names contain "Hobbit".)

1 answer:

Answer 0 (score: 0)

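ngram_analyzer is used to analyze the query input of the search request, but this analyzer is not used to analyze the OriginalTitle input of the index request.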
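To see why the index-time side matters, here is a minimal sketch in plain C# (no Elasticsearch involved) that roughly imitates a 3-gram tokenizer with letter and digit token characters followed by a lowercase filter. `TrigramSketch`, `Trigrams` and `SharesTerm` are hypothetical names introduced for illustration, and the asciifolding filter is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class TrigramSketch
{
    // Rough imitation of an ngram tokenizer with min_gram = max_gram = 3 and
    // token_chars = letter, digit, followed by a lowercase filter.
    public static List<string> Trigrams(string text)
    {
        var grams = new List<string>();
        var run = new StringBuilder();

        // The trailing space is a sentinel so the last run is flushed as well.
        foreach (var ch in text + " ")
        {
            if (char.IsLetterOrDigit(ch))
            {
                run.Append(char.ToLowerInvariant(ch));
                continue;
            }

            // A non-token character ends the run: emit every 3-character window.
            for (var i = 0; i + 3 <= run.Length; i++)
            {
                grams.Add(run.ToString(i, 3));
            }
            run.Clear();
        }

        return grams;
    }

    // A match query can only hit when a query term equals an indexed term.
    public static bool SharesTerm(IEnumerable<string> indexed, IEnumerable<string> query)
    {
        return indexed.Intersect(query).Any();
    }
}
```

With this sketch, `TrigramSketch.Trigrams("The Hobbit")` yields `the`, `hob`, `obb`, `bbi`, `bit`, and `TrigramSketch.Trigrams("Hob")` yields `hob`, so the two share a term. If the title had instead been indexed as whole words (`the`, `hobbit`), the query term `hob` would equal none of them and the search would come back empty.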

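When indexing documents, you simply need to configure the analyzer to be used for the OriginalTitle field, which can be specified with attribute mapping or fluent mapping. For example, with fluent mapping: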

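var defaultIndex = "video-materials"; // index name taken from the question
var settings = new ConnectionSettings(new Uri("http://localhost:9200"))
    .DefaultIndex(defaultIndex); // lets the Search<T> call below resolve the index
var client = new ElasticClient(settings);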

if (client.IndexExists(defaultIndex).Exists)
    client.DeleteIndex(defaultIndex);

var nGramFilters = new List<string> { "lowercase", "asciifolding", "nGram_filter" };

var resp = client.CreateIndex(defaultIndex, c => c
     .Mappings(m => m
        .Map<ElasticVideoMaterial>(mm => mm
            .AutoMap()
            .Properties(p => p
                .Text(t => t
                    .Name(n => n.OriginalTitle)
                    .Fields(f => f
                        .Keyword(k => k
                            .Name("keyword")
                            .IgnoreAbove(256)
                        )
                        .Text(tt => tt
                            .Name("ngram")
                            .Analyzer("ngram_analyzer")
                        )
                    )
                )
            )
        )
    )
    .Settings(s => s
        .Analysis(a => a
            .Analyzers(anz => anz
                .Custom("ngram_analyzer", cc => cc
                    .Filters(nGramFilters)
                    .Tokenizer("ngram_tokenizer")))
            .Tokenizers(tz => tz
                .NGram("ngram_tokenizer", td => td
                    .MinGram(3)
                    .MaxGram(3)
                    .TokenChars(TokenChar.Letter, TokenChar.Digit)
                )
            )
        )
    )
);

var searchResponse = client.Search<ElasticVideoMaterial>(i => i
    .Query(q => q
        .Match(m => m
            .Field(f => f.OriginalTitle.Suffix("ngram"))
            .Query("Hob")
        )
    )
);

This sets up OriginalTitle as a multi-field and creates a sub-field named ngram under OriginalTitle, which uses ngram_analyzer at both index time and search time for that field.
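Since the mapping only helps if the index was actually created with it, it can be worth inspecting `resp`, which the code shown never checks. Below is a minimal sketch, assuming NEST 6.x (the API generation that `IndexExists` and `CreateIndex` suggest); `ResponseChecks` and `ThrowIfInvalid` are hypothetical names introduced here, not part of NEST. This may be relevant because the analyzer lists `nGram_filter`, while no token filter with that name is defined in the settings shown. If it is not defined elsewhere, Elasticsearch would be expected to reject the index creation, and documents indexed afterwards would get a dynamic mapping without the ngram sub-field.

```csharp
using System;
using Nest;

public static class ResponseChecks
{
    // Hypothetical helper: turns an invalid NEST response into an exception
    // so that a failed call cannot go unnoticed.
    public static void ThrowIfInvalid(IResponse response, string operation)
    {
        if (!response.IsValid)
        {
            // DebugInformation contains the request/response details,
            // including the server error if Elasticsearch returned one.
            throw new InvalidOperationException(
                operation + " failed: " + response.DebugInformation,
                response.OriginalException);
        }
    }
}
```

It could be called right after creating the index, for example `ResponseChecks.ThrowIfInvalid(resp, "CreateIndex");`, and again on the search response before reading its documents.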