Question

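Note:
This is a cross-post. I first posted it on the Sphinx forum, but I got no answer there, so I am posting it here.

First, take a look at an example.

The following is my table (used only for testing):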

+----+--------------------------+----------------------+
| Id | title                    | body                 |
+----+--------------------------+----------------------+
|  1 | National first hospital  | NASA                 |
|  2 | National second hospital | Space Administration |
|  3 | National govenment       | Support the hospital |
+----+--------------------------+----------------------+

I want to search the content of the title and body fields, so I configured sphinx.conf as follows:

--------The sphinx config file----------
source mysql
{
        type = mysql
        sql_host = localhost
        sql_user = root
        sql_pass = 0000
        sql_db = testfull
        sql_port = 3306 # optional, default is 3306
        sql_query_pre = SET NAMES utf8
        sql_query = SELECT * FROM test
}

index mysql
{
        source = mysql
        path = var/data/mysql_old_test
        docinfo = extern
        mlock = 0
        morphology = stem_en, stem_ru, soundex
        min_stemming_len = 1
        min_word_len = 1
        charset_type = utf-8
        html_strip = 0
}

indexer
{
        mem_limit = 128M
}

searchd
{
        listen = 9312
        read_timeout = 5
        max_children = 30
        max_matches = 1000
        seamless_rotate = 0
        preopen_indexes = 0
        unlink_old = 1
        pid_file = var/log/searchd_mysql.pid
        log = var/log/searchd_mysql.log
        query_log = var/log/query_mysql.log
}
------------------

Then I reindexed the database and started the searchd daemon.

On the client side, I set the attributes like this:

----------Client configuration-------------------

sc = new SphinxClient();
// ... other setup omitted ...
HashMap<String, Integer> weiMap=new HashMap<String, Integer>();
weiMap.put("title", 100);
weiMap.put("body", 0);
sc.SetFieldWeights(weiMap);

sc.SetMatchMode(SphinxClient.SPH_MATCH_ALL);

sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED,"@weight DESC");

When I search for "National hospital", I get the following output:

Query 'National hospital' retrieved 3 of 3 matches in 0.0 sec.
Query stats:
        'nation' found 3 times in 3 documents
        'hospit' found 3 times in 3 documents

Matches:
1. id=3, weight=101
2. id=1, weight=100
3. id=2, weight=100

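The number of matches (three) is right, but the order of the results is not what I wanted.

Obviously, the documents with id 1 and 2 are the closest matches to the search string ("National hospital"), so in my opinion they should be given the highest weights, but they are ranked last.

Is there any way to meet my requirement?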

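PS:

1) Please do not suggest that I set the sort mode to:

sc.SetSortMode(SphinxClient.SPH_SORT_EXTENDED,"@weight ASC");

That might work for this particular example, but it would cause other potential problems.

2) The actual content of my table is Chinese; I only used "National Hosp..l" to make up an example.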

Answer 1

1° You search for "National hospital", but Sphinx searches for "nation" and "hospit" because of

 morphology = stem_en, stem_ru, soundex

2° You assign the weights

 weiMap.put("title", 100);
 weiMap.put("body", 0);

to text fields that do not exist:

 sql_query = SELECT * FROM test

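3° Finally, my simple answer to the main question:

Results are sorted by weight, and the third row has more weight because there are no words between "nation" and "hospit".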
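
To make the numbers in the question's output easier to reason about, here is a minimal Java sketch of one possible reading of the weights. It assumes the proximity-style ranking that the Sphinx documentation describes for SPH_MATCH_ALL: for each field, take the longest run of query keywords that appears contiguously in the field, multiply it by the field weight, and sum over the fields. It also assumes that a field weight of 0 is treated as 1; that is only inferred from the 101 in the output above, not confirmed. The class name, the method names, and the hand-written stems are hypothetical and exist only for this illustration; this is not Sphinx's actual code.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ProximityWeightSketch {

    /** Length of the longest run of query tokens that appears contiguously, in order, in the field. */
    public static int longestCommonRun(List<String> query, List<String> field) {
        int best = 0;
        // run[i][j] = length of the common run ending at query token i and field token j
        int[][] run = new int[query.size() + 1][field.size() + 1];
        for (int i = 1; i <= query.size(); i++) {
            for (int j = 1; j <= field.size(); j++) {
                if (query.get(i - 1).equals(field.get(j - 1))) {
                    run[i][j] = run[i - 1][j - 1] + 1;
                    best = Math.max(best, run[i][j]);
                }
            }
        }
        return best;
    }

    /** Sum over all fields of (longest common run * field weight). */
    public static int weight(List<String> query,
                             Map<String, List<String>> fields,
                             Map<String, Integer> fieldWeights) {
        int total = 0;
        for (Map.Entry<String, List<String>> e : fields.entrySet()) {
            // ASSUMPTION: a weight of 0 (or a missing weight) counts as 1.
            int userWeight = Math.max(1, fieldWeights.getOrDefault(e.getKey(), 1));
            total += longestCommonRun(query, e.getValue()) * userWeight;
        }
        return total;
    }

    /** Builds a two-field document from already-stemmed tokens. */
    public static Map<String, List<String>> doc(List<String> title, List<String> body) {
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("title", title);
        fields.put("body", body);
        return fields;
    }

    public static void main(String[] args) {
        // Stems as reported in the query stats of the question.
        List<String> query = Arrays.asList("nation", "hospit");

        Map<String, Integer> fieldWeights = new HashMap<>();
        fieldWeights.put("title", 100);
        fieldWeights.put("body", 0);

        // Hand-written approximations of the stemmed rows of the test table.
        List<Map<String, List<String>>> docs = Arrays.asList(
            doc(Arrays.asList("nation", "first", "hospit"), Arrays.asList("nasa")),
            doc(Arrays.asList("nation", "second", "hospit"), Arrays.asList("space", "administr")),
            doc(Arrays.asList("nation", "govenment"), Arrays.asList("support", "the", "hospit")));

        for (int i = 0; i < docs.size(); i++) {
            System.out.println("id=" + (i + 1) + ", weight=" + weight(query, docs.get(i), fieldWeights));
        }
    }
}
```

Under these assumptions the sketch gives weights 100, 100 and 101 for ids 1, 2 and 3, the same values as in the question's output: rows 1 and 2 match one keyword at a time in the title only, while row 3 matches one keyword in the title and one in the body. With the same assumptions, a title that contained the two stems next to each other would score 200.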

How does Sphinx calculate the weight?
