Sphinx - searching multiple full-text fields lowers the score of the exact match

Date: 2018-12-27 10:18:44

Tags: php sphinx

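I am running into a strange weight problem with Sphinx. When I run a full-text search on name only, the results are weighted correctly. However, when I extend the search to address as well, the computed weights are incorrect.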

I am using Sphinx 2.2.11.

Example 1: searching on name produces correct results

$cl->SetRankingMode(SPH_RANK_SPH04);
$cl->SetSortMode(SPH_SORT_EXTENDED, '@weight desc');
$cl->SetMatchMode(SPH_MATCH_EXTENDED2);
$res = $cl->Query('@name ("la" | "comedie" | "saint" | "michel")', 'idx_name');
  

Output

Venue Name                              Address                 Weight
La Comédie Saint-Michel                 boulevard Saint-Michel  19620
La Comédie Saint-Michel - Small Hall    boulevard Saint-Michel  18649
La Comédie Saint-Michel - Grande Salle  boulevard Saint-Michel  18649
  

Words to match

[words] => Array
    (
        [la] => Array
            (
                [docs] => 26110
                [hits] => 29358
            )

        [comedie] => Array
            (
                [docs] => 89
                [hits] => 96
            )

        [saint] => Array
            (
                [docs] => 8820
                [hits] => 10171
            )

        [michel] => Array
            (
                [docs] => 314
                [hits] => 353
            )
    )

Example 2: searching on name and address produces incorrect weights

$cl->SetRankingMode(SPH_RANK_SPH04);
$cl->SetSortMode(SPH_SORT_EXTENDED, '@weight desc');
$cl->SetMatchMode(SPH_MATCH_EXTENDED2);
$res = $cl->Query('@name ("la" | "comedie" | "saint" | "michel")
                   @address ("boulevard" | "saint" | "michel")', 'idx_name');
  

Output

Venue Name                              Address                 Weight
La Comédie Saint-Michel - Small Hall    boulevard Saint-Michel  32631
La Comédie Saint-Michel - Grande Salle  boulevard Saint-Michel  32631
La Comédie Saint-Michel                 boulevard Saint-Michel  32608
  

Words to match

[words] => Array
    (
        [la] => Array
            (
                [docs] => 26110
                [hits] => 29358
            )

        [comedie] => Array
            (
                [docs] => 89
                [hits] => 96
            )

        [saint] => Array
            (
                [docs] => 8820
                [hits] => 10171
            )

        [michel] => Array
            (
                [docs] => 314
                [hits] => 353
            )

        [boulevard] => Array
            (
                [docs] => 19735
                [hits] => 19915
            )
    )

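In example 2, the third record should have the highest weight. The expected output puts the best (exact) match first. I have tried different ranking modes, but with no luck.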

0 answers:

No answers yet