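I have installed Sphinx on my Vagrant box running CentOS 6, and I am trying to install the Dutch libstemmer from Snowball. The installation completes successfully, but my tests fail.
I created 2 indexes with exactly the same data. My indexes are: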
index shop_products1 {
type = rt
dict = keywords
min_prefix_len = 3
rt_mem_limit = 2046M
path = /var/lib/sphinxsearch/data/shop_products2
morphology = libstemmer_nl, stem_en
html_strip = 1
html_index_attrs = img=alt,title; a=title;
preopen = 1
inplace_enable = 1
index_exact_words = 1
rt_field = name
rt_field = brand
rt_field = description
rt_field = specifications
rt_field = tags
rt_field = ourtags
rt_field = searchfield
rt_field = shop
rt_field = category
rt_field = color
rt_field = ourcolor
rt_field = gender
rt_field = material
rt_field = ean
rt_field = sku
rt_attr_string = ean
rt_attr_string = sku
rt_attr_float = price
rt_attr_float = discount
rt_attr_uint = shopid
rt_attr_uint = itemid
rt_attr_uint = deleted
rt_attr_uint = duplicate
rt_attr_uint = brandid
rt_attr_uint = duplicates
rt_attr_timestamp = updated_at
}
index shop_products2 {
type = rt
dict = keywords
min_prefix_len = 3
rt_mem_limit = 2046M
path = /var/lib/sphinxsearch/data/shop_products20
html_strip = 1
html_index_attrs = img=alt,title; a=title;
preopen = 1
inplace_enable = 1
index_exact_words = 1
rt_field = name
rt_field = brand
rt_field = description
rt_field = specifications
rt_field = tags
rt_field = ourtags
rt_field = searchfield
rt_field = shop
rt_field = category
rt_field = color
rt_field = ourcolor
rt_field = gender
rt_field = material
rt_field = ean
rt_field = sku
rt_attr_string = ean
rt_attr_string = sku
rt_attr_float = price
rt_attr_float = discount
rt_attr_uint = shopid
rt_attr_uint = itemid
rt_attr_uint = deleted
rt_attr_uint = duplicate
rt_attr_uint = brandid
rt_attr_uint = duplicates
rt_attr_timestamp = updated_at
}
searchd {
listen = 127.0.0.1:9306:mysql41
log = /var/log/sphinxsearch/searchd.log
workers = threads
binlog_path = /var/lib/sphinxsearch/rt-binlog
read_timeout = 5
client_timeout = 200
max_children = 0
# 2 hours
rt_flush_period = 7200
pid_file = /var/run/searchd.pid
}
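When I search for the Dutch word "afzuigkappen", it should give exactly the same results as "afzuigkap".
Can someone give me some information on how to make this work? PS. Sorry for my bad English.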
Answer 0: (score: 0)
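The Dutch Snowball stemmer stems afzuigkap and afzuigkappen differently, so the two words do not end up as the same keyword in the index.
To reach your goal you would have to update the stemmer algorithm. The documentation for the algorithm is here.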
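To see why the two forms can come out differently, here is a toy sketch in Python. It assumes, from my reading of the published Dutch Snowball algorithm, that the "en" suffix is removed after a consonant and that the undoubling step only covers a trailing kk, dd or tt. The function names are made up for illustration, the R1 region and all other steps are left out, and newer stemmer versions may behave differently, so treat it as an illustration rather than the real libstemmer code.

```python
VOWELS = set("aeiouy")


def undouble(word):
    # Assumed rule: only a trailing kk, dd or tt loses its last letter.
    if word.endswith(("kk", "dd", "tt")):
        return word[:-1]
    return word


def strip_en(word):
    """Toy version of the 'en' suffix step only (no R1, no other steps)."""
    if word.endswith("en") and len(word) > 5:
        stem = word[:-2]
        # The suffix is only dropped after a consonant that is not part of "gem".
        if stem[-1] not in VOWELS and not stem.endswith("gem"):
            return undouble(stem)
    return word


if __name__ == "__main__":
    for w in ("afzuigkap", "afzuigkappen", "bedden"):
        print(w, "->", strip_en(w))
```

In this toy version "bedden" is reduced to "bed" because dd is undoubled, while "afzuigkappen" keeps its double p and therefore does not collapse to "afzuigkap".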
Answer 1: (score: 0)
OK, I have created some specific tests. The indexes I created:
index test1 {
type = rt
dict = keywords
min_prefix_len = 3
rt_mem_limit = 2046M
morphology = libstemmer_nl, stem_en
path = /var/lib/sphinxsearch/data/test1
preopen = 1
inplace_enable = 1
index_exact_words = 1
rt_field = name
rt_attr_uint = shopid
rt_attr_uint = itemid
}
index test2 {
type = rt
dict = keywords
min_prefix_len = 3
rt_mem_limit = 2046M
path = /var/lib/sphinxsearch/data/test2
preopen = 1
inplace_enable = 1
index_exact_words = 1
rt_field = name
rt_attr_uint = shopid
rt_attr_uint = itemid
}
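I indexed a smaller database containing football products and searched it with Sphinx. The results: http://imgur.com/n95Ue8v
As you can see, both indexes give the same output of 53 records. If I just search in MySQL with select * from tests1 WHERE name LIKE '%keeper%' I get 360 results.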
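One possible reason for the gap between the two counts is that LIKE '%keeper%' is a substring match, while a plain full-text query matches whole keywords (or prefixes, if a wildcard such as keeper* is used together with min_prefix_len). The Python sketch below only illustrates that difference. The product names and helper functions are invented for the example, and it assumes a case-insensitive collation on the MySQL side; it does not reproduce Sphinx's tokenizer or stemming.

```python
import re

# Hypothetical product names, made up for illustration only.
NAMES = [
    "Keeper handschoenen",
    "Keepershandschoenen junior",
    "Doelkeeper shirt",
    "Voetbal maat 5",
]


def tokens(name):
    # Very rough tokenizer: lowercase runs of word characters.
    return re.findall(r"\w+", name.lower())


def like_contains(name, needle):
    # Mimics SQL LIKE '%needle%' with a case-insensitive collation.
    return needle.lower() in name.lower()


def keyword_match(name, keyword):
    # Mimics a whole-keyword match without wildcards or stemming.
    return keyword.lower() in tokens(name)


def prefix_match(name, prefix):
    # Mimics a prefix wildcard query such as keeper*.
    return any(t.startswith(prefix.lower()) for t in tokens(name))


if __name__ == "__main__":
    for label, fn in (("LIKE", like_contains),
                      ("keyword", keyword_match),
                      ("prefix", prefix_match)):
        print(label, sum(fn(n, "keeper") for n in NAMES))
```

On this made-up list the substring match finds three names, the prefix match two and the whole-keyword match one, so a lower count from the full-text query does not by itself mean that records are missing from the index.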