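I went down a lot of dead ends trying to speed up our search engine, and in the end I decided to switch to Sphinx (for better or worse). Our site is an online store built on a custom MVC framework.
When a user searches for, let's say, "whey protein", we show him the results for those keywords along with the filtered categories and properties.
So, let's say a product named "100% Whey Protein" belongs to the category "Proteins", and this category has these properties assigned:
The product "100% Whey Protein" has the following values:
When a user searches for the phrase "whey protein", we show him all products that contain the phrase. We also filter our categories and properties so that only the ones that have products matching the phrase are shown. You can see an example at this link (you can use Google Translate, since the site is in Bulgarian).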
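To make the filtering I am describing concrete, here is a minimal plain-PHP sketch of the counts I want per category and per property item. It is not our real code: the helper name `countFacet`, the keys `ctg_ids` and `item_ids`, and the sample data are all made up for illustration.

```php
<?php
// Hypothetical helper: count how many matched products carry each facet value.
// $products is a list of matched products; $key names the array of IDs to count.
function countFacet(array $products, string $key): array
{
    $counts = array();
    foreach ($products as $product) {
        // array_unique() counts each product once per value, like COUNT(DISTINCT ...).
        foreach (array_unique($product[$key]) as $value) {
            $counts[$value] = isset($counts[$value]) ? $counts[$value] + 1 : 1;
        }
    }
    ksort($counts); // stable, predictable order by ID
    return $counts;
}

// Made-up sample: two products matching "whey protein".
$matched = array(
    array('product_id' => 1, 'ctg_ids' => array(10, 11), 'item_ids' => array(100)),
    array('product_id' => 2, 'ctg_ids' => array(10), 'item_ids' => array(100, 101)),
);

$categoryCounts = countFacet($matched, 'ctg_ids');  // array(10 => 2, 11 => 1)
$itemCounts     = countFacet($matched, 'item_ids'); // array(100 => 2, 101 => 1)
```

Only categories and items that appear as keys in these arrays would be shown in the sidebar, with the value as the product count.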
So here is my Sphinx configuration file.
    source content {
        sql_query_pre = SET NAMES utf8
        sql_query = \
            SELECT p.*, p.product_id, \
                m.man_name, m.man_image_location, m.man_seo_url, pcr.ctg_id, \
                gal.image_filelocation, pir.item_id, pav.atr_id \
            FROM 3w_products p \
            INNER JOIN 3w_products_gallery gal ON gal.product_id = p.product_id AND (gal.show_order = 1 OR gal.show_order IS NULL) \
            LEFT OUTER JOIN 3w_manufacturers m ON m.man_id = p.man_id \
            LEFT OUTER JOIN 3w_products_cat_rel pcr ON pcr.product_id = p.product_id \
            LEFT OUTER JOIN 3w_product_item_rel pir ON pir.product_id = p.product_id \
            LEFT OUTER JOIN 3w_product_attribute_values pav ON pav.product_id = p.product_id \
            ;

        # attribute declarations
        sql_attr_uint = product_id
        sql_attr_uint = man_id
        sql_attr_uint = product_notinstock
        sql_field_string = product_name
        sql_field_string = man_name
        sql_field_string = product_code
        sql_field_string = product_seo_url
        sql_field_string = image_filelocation
        sql_field_string = product_intro_plain
        sql_field_string = product_second_name
        sql_field_string = product_price
        sql_field_string = product_price_promo
        sql_field_string = product_promo_expire_date
        sql_field_string = product_views
        sql_field_string = product_rating
        sql_field_string = product_date_added
        sql_field_string = product_price_returned
        sql_field_string = man_image_location
        sql_field_string = man_seo_url
        sql_attr_uint = product_exquisite
        sql_attr_uint = product_votes
        sql_attr_uint = product_returned
        sql_attr_multi = uint item_id from field; items
        sql_attr_multi = uint atr_id from field; attributes
        sql_attr_multi = uint ctg_id from field; categories
    }
The PHP code:
    require_once("extensions/libs/sphinx/sphinxapi.php");

    // Default sort: in-stock products first, then relevance, then popularity.
    // Set up front so $order is defined even when no order is requested.
    $order = 'product_notinstock ASC, @weight DESC, product_views DESC';
    if (!empty($params['order'])) {
        // Sphinx sort clauses can only reference attributes declared in the index,
        // so the SQL table alias (p.) is dropped. product_price_sell must be
        // declared as an attribute for the price sorts to work.
        switch ($params['order']) {
            case 'newest':
                $order = 'product_notinstock ASC, product_id DESC';
                break;
            case 'priceup':
                $order = 'product_notinstock ASC, product_price_sell ASC';
                break;
            case 'pricedown':
                $order = 'product_notinstock ASC, product_price_sell DESC';
                break;
            case 'views':
                $order = 'product_notinstock ASC, product_views DESC';
                break;
            case 'promoted':
                $order = 'product_notinstock ASC, product_price_promo DESC';
                break;
            case 'exquisite':
                // Without a search phrase there is no relevance to sort by;
                // with one, the default order above is kept.
                if (empty($params['search'])) {
                    $order = 'product_notinstock ASC, product_exquisite DESC, product_views DESC';
                }
                break;
        }
    }

    $phrase = $params['search'];
    $page = isset($_GET['page']) ? $_GET['page'] : 1;

    $client = new SphinxClient();
    $client->SetLimits($params['offset'], $params['limit']);
    $client->SetSortMode(SPH_SORT_EXTENDED, $order);
    $client->SetRankingMode(SPH_RANK_SPH04);
    $client->SetMatchMode(SPH_MATCH_EXTENDED);
    $client->SetFieldWeights(array('product_code' => 30, 'product_name' => 15, 'product_meta_data' => 2));

    // Optional filters; each SetFilter() call is ANDed with the others.
    if (!empty($params['returned'])) {
        $client->SetFilter('product_returned', array($params['returned']));
    }
    if (!empty($params['man_id'])) {
        $client->SetFilter('man_id', array($params['man_id']));
    }
    if (!empty($params['ctg_id'])) {
        $client->SetFilter('ctg_id', array($params['ctg_id']));
    }
    if (!empty($params['properties'])) {
        foreach ($params['properties'] as $prop_id => $prop) {
            $client->SetFilter('item_id', array_keys($prop['items']));
        }
    }
    if (!empty($params['attribute'])) {
        $client->SetFilter('atr_id', array($params['attribute']['atr_id']));
    }

    $phrase = str_replace(" - ", "-", $phrase);
    $query = '@product_name (' . $phrase . ') | @product_meta_data (' . $phrase . ') | @product_code (' . $phrase . ')';
    $res = $client->Query($query, 'products');

    $products = array();
    // Query() returns false on failure, and 'matches' is absent when nothing is found.
    if ($res !== false && !empty($res['matches'])) {
        foreach ($res['matches'] as $id => $product) {
            $product['attrs']['product_id'] = $id;
            $products[] = $product['attrs'];
        }
    }

    $return['result'] = $products;
    $return['finded'] = ($res !== false) ? $res['total_found'] : 0;
    return $return;
This does return the correct results. However, I need to migrate some other queries to Sphinx as well, to improve the speed even further.
Query 1 - Categories:
    SELECT c.ctg_id, c.ctg_parent_id, c.ctg_name,
           c.ctg_seo_url, COUNT(DISTINCT pcr.product_id) AS products
    FROM 3w_product_categories c
    LEFT JOIN 3w_products_cat_rel pcr ON pcr.ctg_id = c.ctg_id
    LEFT JOIN 3w_products p ON p.product_id = pcr.product_id
    WHERE (MATCH (p.product_code, p.product_meta_data)
           AGAINST ('+whey* +protein*' IN BOOLEAN MODE))
    GROUP BY c.ctg_id
    ORDER BY c.ctg_name
Query 2 - Manufacturers:
    SELECT m.man_id, m.man_name,
           m.man_seo_url, COUNT(DISTINCT p.product_id) AS products
    FROM 3w_manufacturers m
    INNER JOIN 3w_products p ON p.man_id = m.man_id
    WHERE (MATCH (p.product_code, p.product_meta_data)
           AGAINST ('+whey* +protein*' IN BOOLEAN MODE))
    GROUP BY m.man_id
    ORDER BY m.man_name ASC
Query 3 - Properties (and their items):
    SELECT pi.item_id, pi.item_name, pi.item_slug, pp.prop_id,
           pp.prop_name, pp.prop_slug, COUNT(DISTINCT pir.product_id) AS products
    FROM 3w_property_items pi
    LEFT JOIN 3w_product_item_rel pir ON pir.item_id = pi.item_id
    INNER JOIN 3w_products p ON p.product_id = pir.product_id
    LEFT JOIN 3w_properties pp ON pp.prop_id = pi.prop_id
    WHERE (MATCH (p.product_code, p.product_meta_data)
           AGAINST ('+whey* +protein*' IN BOOLEAN MODE))
    GROUP BY pi.item_id
    ORDER BY pp.prop_name ASC, pi.item_name ASC
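I also have a Query 4, for the products' attributes, but it is equivalent to Query 3.
So my question is: how can I implement all of these queries against the Sphinx index? While typing this, another question came to mind: what happens when an admin inserts or edits a product? How should the Sphinx index be updated? A nightly cron job?
I don't mind rewriting the whole search engine, because I know it is not perfectly organized.
Thanks for your time!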