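I'm running into an odd issue with Sphinx and stop words in hyphenated phrases. I have a large Sphinx index running the backend search for articles. The index's attributes include the article's URL slug. I'm using infix search against the slug so that users can find older items more easily.
Here's the problem. Given a slug like:
an-optional-text-string
a user searching for the full text gets no results. But if you drop the "an-" and search for just "optional-text-string", or just "optional-text" or "text-string", the document is returned as expected.
I'm thinking this might be a stop word issue? Maybe the Sphinx indexer is removing the "an-" bit, but the search query parser isn't?
Has anyone else run into this?
Here's a simplified version of my source and index config: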
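To make that guess concrete, here is a toy Python model of the mismatch I suspect. It is only a sketch of the hypothesis, not Sphinx's actual tokenizer: the stop word list, the splitting rule, and the function names (`index_tokens`, `query_tokens`, `matches`) are all assumptions of mine.

```python
import re

# Assumption: a tiny stand-in stop word list, not my real Sphinx setup.
STOP_WORDS = {"a", "an", "the"}

def tokenize(text):
    """Split on anything that is not a letter or digit, so '-' separates words."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def index_tokens(text):
    """Hypothetical indexer: tokenizes and drops stop words."""
    return [t for t in tokenize(text) if t not in STOP_WORDS]

def query_tokens(text):
    """Hypothetical query parser: tokenizes but keeps stop words."""
    return tokenize(text)

def matches(slug, query):
    """A document matches only if every query token was indexed."""
    indexed = set(index_tokens(slug))
    return all(t in indexed for t in query_tokens(query))

slug = "an-optional-text-string"
queries = ("an-optional-text-string", "optional-text-string",
           "optional-text", "text-string")
for q in queries:
    print(q, "->", matches(slug, q))
```

In this toy model only the full slug fails to match, which is the same pattern I'm seeing. That doesn't prove Sphinx behaves this way; it just shows the hypothesis would be consistent with the symptoms.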
source articleSource
{
    type = mysql
    sql_host =
    sql_user =
    sql_pass =
    sql_db =
    sql_query_pre = SET NAMES utf8
    sql_query_pre = DELETE FROM foundry_registry where name='__index_articles'
    sql_query_pre = INSERT INTO foundry_registry (SELECT (SELECT MAX(uid)+1 from foundry_registry), '__index_articles', MAX(modified), 0, '__core' FROM gryphon_articles)
    sql_query = \
        select a.uid as id, a.uid as item_id, a.headline as title, a.abstract as description, a.copy, \
        a.created, a.published, a.modified, a.status, 'article' as type, a.slug as url_slug, \
        group_concat(t.name) as tags, \
        group_concat(au.name) as authors, \
        a.workflow_id, a.section_id, a.issue_id, '0' as blog_id \
        from gryphon_articles as a \
        left join gryphon_articlesTags as at on at.article_id = a.uid \
        left join gryphon_tags as t on at.tag_id = t.uid \
        left join gryphon_articlesAuthors as aa on aa.article_id = a.uid \
        left join gryphon_authors as au on aa.author_id = au.uid \
        group by a.uid
    sql_attr_multi = uint tag from query; SELECT article_id, tag_id as tag from gryphon_articlesTags
    sql_attr_multi = uint author from query; SELECT article_id, author_id as author from gryphon_articlesAuthors
    sql_field_string = title
    sql_field_string = description
    sql_field_string = copy
    sql_field_string = type
    sql_field_string = url_slug
    sql_field_string = tags
    sql_field_string = authors
    sql_field_string = workflow_id
    sql_field_string = section_id
    sql_field_string = issue_id
    sql_attr_uint = item_id
    sql_attr_uint = created
    sql_attr_uint = published
    sql_attr_uint = modified
    sql_attr_uint = blog_id
    sql_attr_bool = status
    sql_query_info = SELECT * FROM gryphon_articles WHERE uid=$id
}
index articleIndex
{
    source = articleSource
    path = /path/to/index
    docinfo = extern
    mlock = 0
    morphology = stem_en
    min_word_len = 1
    charset_type = utf-8
    enable_star = 1
    html_strip = 0
    html_remove_elements = style, script
    min_infix_len = 3
    infix_fields = url_slug
}
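For reference, this is how I understand `min_infix_len` from the Sphinx docs, which give the example that indexing "test" with `min_infix_len = 2` stores "te", "es", "st", "tes" and "est" along with the word itself. The Python below is just my reading of that example, not Sphinx's code, and the function name `infixes` is my own.

```python
def infixes(word, min_infix_len):
    """Return the word plus every shorter substring of at least min_infix_len characters."""
    found = {word}  # the whole word is kept regardless of its length
    for length in range(min_infix_len, len(word)):
        for start in range(len(word) - length + 1):
            found.add(word[start:start + length])
    return found

# The example from the docs: "test" with min_infix_len = 2.
print(sorted(infixes("test", 2)))

# The words of my slug with min_infix_len = 3, as in the config above.
for word in "an-optional-text-string".split("-"):
    print(word, "->", len(infixes(word, 3)), "entries")
```

One thing this makes visible: "an" is shorter than my `min_infix_len` of 3, so under this reading it produces no infixes at all, only the bare word. I don't know whether that has anything to do with the missing results, but it is the one way "an" differs from the other words in the slug.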