How can I make Sphinx recognize "auto" and "car" as similar words?
Suppose I have three database records:
Andy likes to drive auto.
Mary don't like to drive car.
Bob is going to buy automobile.
Here are some sample queries and their results:
query: car
result: Mary don't like to drive car.
-------------------------------------
query: auto
result: Andy likes to drive auto.
-------------------------------------
query: automobile
result: Bob is going to buy automobile.
...but I want Sphinx to return this instead:
query: car
result:
Andy likes to drive auto.
Mary don't like to drive car.
Bob is going to buy automobile.
-------------------------------------
query: auto
result:
Andy likes to drive auto.
Mary don't like to drive car.
Bob is going to buy automobile.
-------------------------------------
query: automobile
result:
Andy likes to drive auto.
Mary don't like to drive car.
Bob is going to buy automobile.
I know Sphinx has stopwords, but what should I put into the dictionary to make Sphinx treat these words as the same?
Thanks.
Answer 0 (score: 4)
All you need to do is point Sphinx, in its .conf file, at a correctly formatted wordforms text file.
The documentation is here: http://www.sphinxsearch.com/docs/manual-0.9.9.html#conf-wordforms
auto > car
automobile > car
four-wheeled-vehicle-intended-for-public-roads > car
cars > car
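To see why these mappings produce the results asked for in the question, here is a minimal Python sketch of the idea. It is not Sphinx's implementation; it only imitates the principle that every word form is replaced by its target both when records are indexed and when the query is parsed. The helper names (parse_wordforms, normalize, search) are hypothetical, and the tokenizer is a simplification.

```python
import re

# Same "source > target" format as the wordforms file above.
WORDFORMS = """\
auto > car
automobile > car
cars > car
"""

RECORDS = [
    "Andy likes to drive auto.",
    "Mary don't like to drive car.",
    "Bob is going to buy automobile.",
]


def parse_wordforms(text):
    """Build a dict that maps each source form to its target form."""
    forms = {}
    for line in text.splitlines():
        if ">" not in line:
            continue  # skip blank or malformed lines
        source, target = (part.strip().lower() for part in line.split(">", 1))
        forms[source] = target
    return forms


def normalize(text, forms):
    """Lowercase, split into tokens, and replace each token by its target."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return {forms.get(token, token) for token in tokens}


def search(query, records, forms):
    """Return the records that contain every normalized query token."""
    wanted = normalize(query, forms)
    return [record for record in records if wanted <= normalize(record, forms)]


forms = parse_wordforms(WORDFORMS)
for query in ("car", "auto", "automobile"):
    print(query, "->", search(query, RECORDS, forms))
```

Because "auto" and "automobile" are both rewritten to "car" on the record side and on the query side, each of the three queries matches all three records in this sketch.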
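Answer 1 (score: 0)
Let me give a wordforms example using the words "gearing" and "leverage", which are equivalent terms in finance and should be treated as synonyms (both mean "financial leverage").
Initially your "wordforms.txt" file would contain them like this: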
gear > gear
geared > gear
gearing > gear
gears > gear
...
leverage > leverage
leveraged > leverage
leverages > leverage
leveraging > leverage
This means that initially the two words are not connected. To fix that, change the contents of "wordforms.txt" like this:
gear > leverage
geared > leverage
gearing > leverage
gears > leverage
...
leveraged > leverage
leverages > leverage
leveraging > leverage
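This edit connects them (and all of their forms). After editing the "wordforms.txt" file, you must save it and rebuild the index for the change to take effect.
Now, when you search for "gearing" or "leverage", your results will contain both words and all of their forms.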
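One way to check that such an edit really joins the two families is to group the words in the file by their target form. The following Python sketch does that for the two versions of the file shown in this answer (without the elided lines). The function name groups_by_target is hypothetical, and this is only a sanity check on the file, not something Sphinx itself provides.

```python
from collections import defaultdict

# The two versions of wordforms.txt from this answer, elided lines left out.
BEFORE = """\
gear > gear
geared > gear
gearing > gear
gears > gear
leverage > leverage
leveraged > leverage
leverages > leverage
leveraging > leverage
"""

AFTER = """\
gear > leverage
geared > leverage
gearing > leverage
gears > leverage
leveraged > leverage
leverages > leverage
leveraging > leverage
"""


def groups_by_target(text):
    """Map each target form to the sorted list of words that end up as it."""
    by_target = defaultdict(set)
    for line in text.splitlines():
        if ">" not in line:
            continue  # skip blank or malformed lines
        source, target = (part.strip() for part in line.split(">", 1))
        by_target[target].update({source, target})
    return {target: sorted(words) for target, words in by_target.items()}


for label, text in (("before", BEFORE), ("after", AFTER)):
    print(label)
    for target, words in groups_by_target(text).items():
        print("  ", target, "<-", ", ".join(words))
```

Before the edit the words fall into two separate groups, one per target; after the edit there is a single group under "leverage", which is why a search for either term can match the other.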