How can I search for all synonyms with Elasticsearch?

Asked: 2018-05-31 11:20:15

Tags: elasticsearch nest

So far I have tried using a Solr-format file:

zavesa => Gotove zavese,
zavesa, blago => Blago in dekorativno blago,
zavesa => Dodatki za zavese,
zavesa => Drogi in vodila za zavese

zavesa => Gotove zavese, Blago in dekorativno blago, Drogi in vodila za zavese, Dodatki za zavese

But I always get only the results that match "Drogi in vodila za zavese". If I remove "Drogi in vodila za zavese":

zavesa => Gotove zavese, Blago in dekorativno blago, Dodatki za zavese

then I only get results for "Blago in dekorativno blago".

I also tried writing everything in lowercase:

zavesa => gotove zavese, blago in dekorativno blago, drogi in vodila za zavese, dodatki za zavese

The result is the same.

I would like to get the results "Gotove zavese", "Blago in dekorativno blago", "Drogi in vodila za zavese", and "Dodatki za zavese" when someone searches for "zavesa".

Is this possible with Elasticsearch?

My synonym configuration:

var indexSettings = new IndexSettings
{
    NumberOfReplicas = 0, // If this is set to 1 or more, then the index becomes yellow, because it's running on a single node (development machine).
    NumberOfShards = 5
};

indexSettings.Analysis = new Analysis();
indexSettings.Analysis.Analyzers = new Analyzers();
indexSettings.Analysis.TokenFilters = new TokenFilters();

var listOfSynonyms = new[] {
    "zavesa => Gotove zavese, Blago in dekorativno blago, Drogi in vodila za zavese, Dodatki za zavese"
};

var customTokenFilterSynonyms = new SynonymTokenFilter
{
    Synonyms = listOfSynonyms,
    Format = SynonymFormat.Solr,
    Tokenizer = "standard"
};

indexSettings.Analysis.TokenFilters.Add("customTokenFilterSynonym", customTokenFilterSynonyms);

CustomAnalyzer customAnalyzer = new CustomAnalyzer
{
    Tokenizer = "standard",
    Filter = new List<string> { "lowercase", "asciifolding", "customTokenFilterSynonym" }
};

indexSettings.Analysis.Analyzers.Add("customAnalyzerLowercaseSynonymAsciifolding", customAnalyzer);

var indexConfig = new IndexState
{
    Settings = indexSettings
};

var request = new IndexExistsRequest(indexName);
var result = ElasticClient.IndexExists(request);

if (!result.Exists)
{
    var response = ElasticClient.CreateIndex(indexName, c => c
        .InitializeUsing(indexConfig)
        .Mappings(m => m
            .Map<ChildGroupModel>(mm => mm
                .Properties(p => p
                    .Completion(cp => cp
                        .Name(elasticsearchModel => elasticsearchModel.TitleAutSuggest)
                        .Analyzer("customAnalyzerLowercaseSynonymAsciifolding")
                        .SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding")
                    )
                    .Completion(cp => cp
                        .Name(elasticsearchModel => elasticsearchModel.TitleSloSuggest)
                        .Analyzer("customAnalyzerLowercaseSynonymAsciifolding")
                        .SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding")
                    )
                    .Completion(cp => cp
                        .Name(elasticsearchModel => elasticsearchModel.TitleItaSuggest)
                        .Analyzer("customAnalyzerLowercaseSynonymAsciifolding")
                        .SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding")
                    )
                    .Text(t => t.Name(model => model.TitleAut).Analyzer("customAnalyzerLowercaseSynonymAsciifolding").SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding"))
                    .Text(t => t.Name(model => model.TitleSlo).Analyzer("customAnalyzerLowercaseSynonymAsciifolding").SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding"))
                    .Text(t => t.Name(model => model.TitleIta).Analyzer("customAnalyzerLowercaseSynonymAsciifolding").SearchAnalyzer("customAnalyzerLowercaseSynonymAsciifolding"))
                )
            )
        )
    );
}

I am testing against the TitleSloSuggest field.

Model:

public class ChildGroupModel
{
    [Column("id")]
    public int Id { get; set; }

    [Column("homepage_groups_id")]
    public int GroupId { get; set; }

    [Column("title_aut")]
    public string TitleAut { get; set; }

    public CompletionField TitleAutSuggest
    {
        get
        {
            return new CompletionField
            {
                Input = new[] { TitleAut }
            };
        }
    }

    [Column("title_slo")]
    public string TitleSlo { get; set; }

    public CompletionField TitleSloSuggest
    {
        get
        {
            return new CompletionField
            {
                Input = new[] { TitleSlo }
            };
        }
    }

    [Column("title_ita")]
    public string TitleIta { get; set; }

    public CompletionField TitleItaSuggest
    {
        get
        {
            return new CompletionField
            {
                Input = new[] { TitleIta }
            };
        }
    }
}

These are the index settings:

// 20180601112924
// http://localhost:9200/child_groups_index/_settings

{
  "child_groups_index_temp_1": {
    "settings": {
      "index": {
        "number_of_shards": "5",
        "provided_name": "child_groups_index_temp_1",
        "creation_date": "1527844777425",
        "analysis": {
          "filter": {
            "customTokenFilterSynonym": {
              "format": "solr",
              "type": "synonym",
              "synonyms": [
                "zavesa => gotove zavese, blago in dekorativno blago, drogi in vodila za zavese, dodatki za zavese"
              ],
              "tokenizer": "standard"
            }
          },
          "analyzer": {
            "customAnalyzerLowercaseSynonymAsciifolding": {
              "filter": [
                "lowercase",
                "asciifolding",
                "customTokenFilterSynonym"
              ],
              "type": "custom",
              "tokenizer": "standard"
            }
          }
        },
        "number_of_replicas": "0",
        "uuid": "WsHzMHm-QSKA4Xzxp02ipQ",
        "version": {
          "created": "6020399"
        }
      }
    }
  }
}

1 answer:

Answer 0 (score: 0)

I figured out what I was doing wrong. This link helped: https://www.elastic.co/guide/en/elasticsearch/guide/current/multi-word-synonyms.html#_use_simple_contraction_for_phrase_queries

Instead of:

"zavesa => gotove zavese, blago in dekorativno blago, dodatki za zavese, drogi in vodila za zavese"

I needed:

"gotove zavese, blago in dekorativno blago, dodatki za zavese, drogi in vodila za zavese => zavesa"

Or:

"gotove zavese => zavesa",
"blago in dekorativno blago => zavesa",
"dodatki za zavese => zavesa",
"drogi in vodila za zavese => zavesa"
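
Applied to the NEST configuration from the question, the contraction rules could look roughly like the sketch below. It assumes the same `indexSettings`, `ElasticClient`, and `indexName` as in the question, and that the index is recreated and the documents reindexed afterwards, since the analyzer runs at index time as well as at search time. The `Analyze` call at the end is only a suggested way to inspect the emitted tokens, not something from the original setup.

```csharp
// Sketch only: reuses indexSettings, ElasticClient and indexName from the question.
// Contraction rules: each multi-word phrase on the left collapses
// into the single token "zavesa" on the right.
var listOfSynonyms = new[]
{
    "gotove zavese => zavesa",
    "blago in dekorativno blago => zavesa",
    "dodatki za zavese => zavesa",
    "drogi in vodila za zavese => zavesa"
};

var customTokenFilterSynonyms = new SynonymTokenFilter
{
    Synonyms = listOfSynonyms,
    Format = SynonymFormat.Solr,
    Tokenizer = "standard"
};

indexSettings.Analysis.TokenFilters.Add("customTokenFilterSynonym", customTokenFilterSynonyms);

// Assumed verification step, run after the index has been recreated:
// ask Elasticsearch which tokens the analyzer emits for one of the phrases.
var analyzeResponse = ElasticClient.Analyze(a => a
    .Index(indexName)
    .Analyzer("customAnalyzerLowercaseSynonymAsciifolding")
    .Text("Drogi in vodila za zavese"));

foreach (var token in analyzeResponse.Tokens)
{
    Console.WriteLine(token.Token);
}
```

If the contraction is applied as the linked guide describes, the phrase should come back as a single `zavesa` token. A search for "zavesa" passes through the analyzer unchanged, because it matches no left-hand side, and so it can match every title that was contracted to that token.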