RegExp problem with a LONG pattern

Date: 2013-11-18 13:20:34

Tags: php regex wordpress replace preg-replace

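When running regular expression code, the string is usually long and the pattern short, but this time it is the other way round. I have a short text of about 500 characters. In that text I want to find names that match a database of about 47,000 unique names, and add a link to each matching name. What is the best way to do this? I split the array of names into 64 partitions, because a single array is too large to be used as one pattern.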

function implode_r ($glue, $pieces){
    $parts = array();
    foreach ($pieces as $piece){
        if (is_array ($piece)){
            $sub = implode_r ($glue, $piece); // recurse into nested rows
            if ($sub !== ""){
                $parts[] = $sub;
            }
        }
        elseif (strlen($piece) > 1){
            // preg_quote() escapes every regex metacharacter, including
            // ones the manual list missed (* + | ^ $ \ and so on).
            $parts[] = preg_quote($piece);
        }
    }
    // Joining at the end avoids a leading glue character in the result.
    return implode($glue, $parts);
}

function partition( $list, $p ) {
    $listlen = count( $list );
    $partlen = floor( $listlen / $p );
    $partrem = $listlen % $p;
    $partition = array();
    $mark = 0;
    for ($px = 0; $px < $p; $px++) {
        $incr = ($px < $partrem) ? $partlen + 1 : $partlen;
        $partition[$px] = array_slice( $list, $mark, $incr );
        $mark += $incr;
    }
    return $partition;
}

add_filter( 'the_content', 'find_names_in_text');
add_filter( 'get_the_content', 'find_names_in_text');
function find_names_in_text($content){
    global $wpdb;
    $thenames = $wpdb->get_results("SELECT post_title FROM $wpdb->posts WHERE post_type='dogs' GROUP BY post_title", ARRAY_N);
    $namesparts = partition($thenames, 64);
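    foreach($namesparts as $part){
        // Drop any leading "|": an empty alternative would match at every position.
        $pattern = ltrim(implode_r("|", $part), "|");
        if ($pattern === ""){
            continue;
        }
        // "!" is the delimiter and the names sit in an explicit group, so $1 is defined.
        // implode_r() escapes "!" inside the names, so it cannot end the pattern early.
        $linked = preg_replace("!(".$pattern.")!", "<a href='$1'>$1</a>", $content);
        // preg_replace() returns null on error, e.g. if the pattern is still too large.
        if ($linked !== null){
            $content = $linked;
        }
    }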
    return $content;
}

1 answer:

Answer 0 (score: 3)

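If your text is only 500 characters long, I would do it the other way round. Split the text into parts that could be names (presumably these are words; I take it no name is spread across several words).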

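So now you have fewer than 500 words that you want to match against the database, so in the worst case you have to check for those in the database. You won't get 500 words out of 500 characters, so you can get a manageable query out of that.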
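A minimal sketch of that idea, under a few assumptions: names are single words (as the answer presumes), and the database lookup is passed in as a callable. The function names `candidate_words` and `link_names` and the `$lookup` parameter are hypothetical and only serve the illustration; they are not part of WordPress.

```php
<?php
// Split the text into unique candidate words that might be names.
function candidate_words($text) {
    // Split on anything that is not a letter, digit, apostrophe or hyphen.
    $words = preg_split("/[^\\p{L}\\p{N}'-]+/u", strip_tags($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($words === false) {
        return array(); // e.g. the text is not valid UTF-8
    }
    // Same minimum length as the question's code (more than one character).
    $words = array_filter($words, function ($w) { return strlen($w) > 1; });
    return array_values(array_unique($words));
}

// $lookup is a hypothetical callable: it receives the candidate words and
// returns the subset that exists as names in the database.
function link_names($text, callable $lookup) {
    $candidates = candidate_words($text);
    if (!$candidates) {
        return $text;
    }
    $found = $lookup($candidates);
    if (!$found) {
        return $text;
    }
    // Longest names first, so a longer name wins over a shorter prefix.
    usort($found, function ($a, $b) { return strlen($b) - strlen($a); });
    $quoted = array_map(function ($n) { return preg_quote($n, '/'); }, $found);
    // The lookarounds keep a name from matching inside a longer word.
    $pattern = '/(?<![\p{L}\p{N}])(' . implode('|', $quoted) . ')(?![\p{L}\p{N}])/u';
    $result = preg_replace($pattern, "<a href='$1'>$1</a>", $text);
    return $result === null ? $text : $result;
}
```

In WordPress, `$lookup` could be a single `SELECT ... WHERE post_title IN (...)` query built with `$wpdb->prepare()` and one `%s` placeholder per candidate. The pattern then holds only the handful of names actually found, instead of 47,000. Like the question's code, this sketch does not protect text that already sits inside HTML tags or attributes.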