Magento:改进搜索引擎(变形,不相关的单词删除等)

时间:2013-03-08 01:01:23

标签: php magento

我有兴趣知道我是否可以检测到变形(例如狗/狗),删除非重要的单词(“在美国制造” - >“在”中“和”“并不重要”等)。在用户为Magento搜索引擎输入的搜索字符串中,没有在一个大的PHP代码块中硬编码这么多场景。我可以在一定程度上处理这个搜索字符串,但它看起来不卫生和丑陋。

任何使其成为“智能”搜索引擎的建议或指示?

1 个答案:

答案 0 :(得分:0)

使用此课程:

class Inflection
{
    static $plural = array(
    '/(quiz)$/i' => "$1zes",
    '/^(ox)$/i' => "$1en",
    '/([m|l])ouse$/i' => "$1ice",
    '/(matr|vert|ind)ix|ex$/i' => "$1ices",
    '/(x|ch|ss|sh)$/i' => "$1es",
    '/([^aeiouy]|qu)y$/i' => "$1ies",
    '/(hive)$/i' => "$1s",
    '/(?:([^f])fe|([lr])f)$/i' => "$1$2ves",
    '/(shea|lea|loa|thie)f$/i' => "$1ves",
    '/sis$/i' => "ses",
    '/([ti])um$/i' => "$1a",
    '/(tomat|potat|ech|her|vet)o$/i'=> "$1oes",
    '/(bu)s$/i' => "$1ses",
    '/(alias)$/i' => "$1es",
    '/(octop)us$/i' => "$1i",
    '/(ax|test)is$/i' => "$1es",
    '/(us)$/i' => "$1es",
    '/s$/i' => "s",
    '/$/' => "s"
    );

    static $singular = array(
    '/(quiz)zes$/i' => "$1",
    '/(matr)ices$/i' => "$1ix",
    '/(vert|ind)ices$/i' => "$1ex",
    '/^(ox)en$/i' => "$1",
    '/(alias)es$/i' => "$1",
    '/(octop|vir)i$/i' => "$1us",
    '/(cris|ax|test)es$/i' => "$1is",
    '/(shoe)s$/i' => "$1",
    '/(o)es$/i' => "$1",
    '/(bus)es$/i' => "$1",
    '/([m|l])ice$/i' => "$1ouse",
    '/(x|ch|ss|sh)es$/i' => "$1",
    '/(m)ovies$/i' => "$1ovie",
    '/(s)eries$/i' => "$1eries",
    '/([^aeiouy]|qu)ies$/i' => "$1y",
    '/([lr])ves$/i' => "$1f",
    '/(tive)s$/i' => "$1",
    '/(hive)s$/i' => "$1",
    '/(li|wi|kni)ves$/i' => "$1fe",
    '/(shea|loa|lea|thie)ves$/i'=> "$1f",
    '/(^analy)ses$/i' => "$1sis",
    '/((a)naly|(b)a|(d)iagno|(p)arenthe|(p)rogno|(s)ynop|(t)he)ses$/i' => "$1$2sis",
    '/([ti])a$/i' => "$1um",
    '/(n)ews$/i' => "$1ews",
    '/(h|bl)ouses$/i' => "$1ouse",
    '/(corpse)s$/i' => "$1",
    '/(us)es$/i' => "$1",
    '/s$/i' => ""
    );

    static $irregular = array(
    'move' => 'moves',
    'foot' => 'feet',
    'goose' => 'geese',
    'sex' => 'sexes',
    'child' => 'children',
    'man' => 'men',
    'tooth' => 'teeth',
    'person' => 'people',
    'admin' => 'admin'
    );

    static $uncountable = array(
    'sheep',
    'fish',
    'deer',
    'series',
    'species',
    'money',
    'rice',
    'information',
    'equipment'
    );

    public static function pluralize( $string )
    {
global $irregularWords;

// save some time in the case that singular and plural are the same
    if ( in_array( strtolower( $string ), self::$uncountable ) )
        return $string;

    // check for irregular singular forms
    foreach ( $irregularWords as $pattern => $result )
    {
        $pattern = '/' . $pattern . '$/i';

        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string);
    }

    // check for irregular singular forms
    foreach ( self::$irregular as $pattern => $result )
    {
        $pattern = '/' . $pattern . '$/i';

        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string);
    }

    // check for matches using regular expressions
    foreach ( self::$plural as $pattern => $result )
    {
        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string );
    }

    return $string;
    }

    public static function singularize( $string )
    {   
global $irregularWords;
    // save some time in the case that singular and plural are the same
    if ( in_array( strtolower( $string ), self::$uncountable ) )
        return $string;

// check for irregular words
    foreach ( $irregularWords as $result => $pattern )
    {
        $pattern = '/' . $pattern . '$/i';

        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string);
    }

// check for irregular plural forms
    foreach ( self::$irregular as $result => $pattern )
    {
        $pattern = '/' . $pattern . '$/i';

        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string);
    }

// check for matches using regular expressions
    foreach ( self::$singular as $pattern => $result )
    {
        if ( preg_match( $pattern, $string ) )
            return preg_replace( $pattern, $result, $string );
    }

    return $string;
    }

    public static function pluralize_if($count, $string)
    {
    if ($count == 1)
        return "1 $string";
    else
        return $count . " " . self::pluralize($string);
    }
}

如果您有时间使用标准方式进行变形使用:http://en.wikipedia.org/wiki/Inflection

你可以将数组与XML结合起来放置所有变形数据,看看codeigniter如何非常友好地变形:http://ellislab.com/codeigniter/user-guide/helpers/inflector_helper.html

许多框架支持内置变形,但它仅主要针对英语。对于其他语言,您应该编写自己的...或者使用unicode.org,如果需要,可以使用其他语言的一些变形标准。