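I am currently investigating how to integrate Sphinx into an existing website.
I am working for a client that rents out vacation homes. They have a website where customers can book online. On the front page there is a search engine that searches all of their homes (about 20k) and sorts the results on various fields. Until now we ran a MySQL query for every search. The database has grown, and the queries are now slower than is acceptable. For this reason we are looking into ways to improve the search engine. I am currently experimenting with Sphinx to see whether it suits us.
I have installed Sphinx and have the following source and index: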
source huisjesSource
{
type = mysql
sql_host = localhost
sql_user = user
sql_pass = password
sql_db = database
sql_query= SELECT a.huis_id as huis_id, a.huis_code as huis_code, a.land_code, a.regi_code, a.huis_naam, a.huis_plaats,a.multimedia,a.foto_a, a.foto_w,a.foto_a_full, a.foto_w_full, a.huis_van, a.huis_tm, a.hd, a.a_hd, st,a.sl, a.beds, a.bathr, a.baths,a.airport, a.huis_enqe_vr13_aantal, a.huis_enqe_vr13_punten, a.huis_longitude, a.huis_latitude, a.huis_catering_verplicht, 1w_min, 1w_max, 2w_min, 2w_max, 3w_min, 3w_max, wk_min, wk_max, lw_min, lw_max, mw_min, mw_max, age,(CASE WHEN vz200p239 = '1' or vz238 = '1' or vz235 = '1' THEN 1 ELSE 0 END) as relax,ph.plaatsnaam as hplaats,ph.hplaatsid,ps.subplaatsid,ps.plaatsnaam as splaats,ty20,ty30,ty40,ty50,ty60,ty70,ty90,ty160,becdir,huis_hbes_small, a.regi_oms_nl as regi_oms \
FROM as_search a \
left join bv_myisam.huis_plaats hpl on a.huis_code = hpl.huis_code \
left join bv_myisam.plaatsen_head ph on hpl.hplaatsid = ph.hplaatsid and ph.lang = 'nl' \
left join bv_myisam.plaatsen_sub ps on hpl.subplaatsid = ps.subplaatsid and hpl.subplaatsid != 'null' and ps.lang = 'nl' \
left join huis_oms o on a.huis_code = o.huis_code AND o.lang = 'nl' \
inner join huis_sort so on a.huis_code = so.huis_code \
inner join bbpr_n b on a.huis_code = b.huis_code \
WHERE a.avail = '1' AND a.demo = '0' AND a.bvdir = '1' \
GROUP BY a.huis_code
# (not needed) sql_attr_uint = huis_id # int(11)
sql_attr_string = huis_code # varchar(14)
sql_attr_string = land_code # char(2)
sql_attr_string = regi_code # varchar(10)
sql_attr_string = huis_naam # varchar(50)
sql_attr_string = huis_plaats # varchar(40)
sql_attr_bool = multimedia # enum('1','0')
sql_attr_string = foto_a # varchar(90)
sql_attr_string = foto_w # varchar(90)
sql_attr_string = foto_a_full # varchar(90)
sql_attr_string = foto_w_full # varchar(90)
sql_attr_uint = huis_van # tinyint(4)
sql_attr_uint = huis_tm # tinyint(4)
sql_attr_string = hd # char(1)
sql_attr_uint = a_hd # tinyint(4)
sql_attr_string = st # char(1)
sql_attr_uint = sl # tinyint(3) unsigned
sql_attr_uint = beds # tinyint(3) unsigned
sql_attr_uint = bathr # tinyint(3) unsigned
sql_attr_uint = baths # tinyint(3) unsigned
sql_attr_string = airport # char(3)
sql_attr_uint = huis_enqe_vr13_aantal # smallint(6)
sql_attr_uint = huis_enqe_vr13_punten # smallint(6)
sql_attr_float = huis_longitude # double(8,5)
sql_attr_float = huis_latitude # double(8,5)
sql_attr_bool = huis_catering_verplicht # enum(0,1)
sql_attr_float = 1w_min # decimal(8,2)
sql_attr_float = 1w_max # decimal(8,2)
sql_attr_float = 2w_min # decimal(8,2)
sql_attr_float = 2w_max # decimal(8,2)
sql_attr_float = 3w_min # decimal(8,2)
sql_attr_float = 3w_max # decimal(8,2)
sql_attr_float = wk_min # decimal(8,2)
sql_attr_float = wk_max # decimal(8,2)
sql_attr_float = lw_min # decimal(8,2)
sql_attr_float = lw_max # decimal(8,2)
sql_attr_float = mw_min # decimal(8,2)
sql_attr_float = mw_max # decimal(8,2)
sql_attr_uint = age # tinyint(3) unsigned
sql_attr_uint = relax # boolean
sql_attr_string = hplaats # varchar(100)
sql_attr_uint = hplaatsid # int(11)
sql_attr_uint = subplaatsid # int(11)
sql_attr_string = splaats # varchar(100)
sql_attr_bool = ty20 # enum(1,0)
sql_attr_bool = ty30 # enum(1,0)
sql_attr_bool = ty40 # enum(1,0)
sql_attr_bool = ty50 # enum(1,0)
sql_attr_bool = ty60 # enum(1,0)
sql_attr_bool = ty70 # enum(1,0)
sql_attr_bool = ty90 # enum(1,0)
sql_attr_bool = ty160 # enum(1,0)
sql_attr_bool = becdir # enum(0,1)
sql_attr_string = huis_hbes_small # varchar(2000)
sql_attr_string = regi_oms # varchar(50)
}
#############################################################################
## index definition
#############################################################################
index huisjesIndex
{
type = plain
source = huisjesSource
path = /var/lib/sphinxsearch/data/huisjes
charset_type = utf-8
preopen = 1
}
The index builds without errors:
# indexer --all
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/etc/sphinxsearch/sphinx.conf'...
indexing index 'huisjesIndex'...
collected 17059 docs, 0.0 MB
total 17059 docs, 0 bytes
total 98.422 sec, 0 bytes/sec, 173.32 docs/sec
total 1 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 27 writes, 0.008 sec, 312.9 kb/call avg, 0.3 msec/call avg
# indextool --check huisjesIndex
Sphinx 2.0.4-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/etc/sphinxsearch/sphinx.conf'...
checking index 'huisjesIndex'...
checking dictionary...
checking data...
checking kill-list...
check passed, 0.0 sec elapsed
But when I run SELECT * FROM huisjesIndex, I get an empty set, even though there should be more than 17k records. Am I doing something wrong?
# mysql -h localhost -P 9306 --protocol=tcp
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 2.0.4-release (r3135)
Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> SELECT * FROM huisjesIndex;
Empty set (0.00 sec)
Any help is appreciated! :)
Answer 0 (score: 0)
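I haven't checked closely, but it looks like you have defined every column in sql_query as an attribute.
Sphinx doesn't like that. It is a full-text search engine, so it needs at least one full-text field. Your indexer output is consistent with this: it reports 17059 docs but 0 bytes of text.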
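A simple solution (if you really do want all of those attributes) is to use sql_field_string to make at least one of your columns both a field and an attribute.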
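As a minimal sketch of that suggestion: in the source block, one of the existing sql_attr_string lines could be swapped for sql_field_string. Picking huis_naam here is only an assumption for illustration; any text column already returned by sql_query would do.

```
# was: sql_attr_string = huis_naam # varchar(50)
# Index huis_naam as a full-text field AND keep it as a string attribute.
# (huis_naam is chosen for illustration only; use whichever column you want searchable.)
sql_field_string = huis_naam
```

After rebuilding the index, the "collected ... docs" line from indexer should report a non-zero size if the column actually contains text.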
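Another thing, which may well happen to work anyway: you are using a.huis_id as the document_id but grouping by a.huis_code. If the two are not a 1:1 mapping (and unique), you will run into problems. I think grouping by the document_id is more common.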
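A sketch of what that would look like, assuming huis_id is unique per house in as_search (I can't verify that from the question): only the last line of sql_query changes, and the rest of the query stays as it is.

```
WHERE a.avail = '1' AND a.demo = '0' AND a.bvdir = '1' \
GROUP BY a.huis_id
```

If huis_id and huis_code really are 1:1, this returns the same rows as before; if they are not, it at least guarantees that each document_id reaches the indexer only once.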