PHP regex: take a list of strings, search some content, and replace any matches from the list with links

Time: 2012-02-16 20:19:37

Tags: php regex

I have a very large list of terms, such as this one (about 1600 entries, perhaps 2000 words): http://pastebin.com/6XnWBJwM

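I want to search my $content for the terms in this list and replace every occurrence found with a link in the following format: <a href="/glossary/firstinitial/term">term</a>, for example (term: abdomen) <a href="/glossary/a/abdomen">abdomen</a>.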

What is the most efficient way to do this?

I have been working from this thread but can't get it to work: at the moment it links every word in the content to "/"! I'm really poor at regular expressions!

Thanks in advance,

2 answers:

Answer 0 (score: 2)

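// the list of words: $arrayOfWords is a pipe-separated string, e.g. "abdomen|etc"
$words = explode("|", $arrayOfWords);

// iterate over the array
foreach ($words as $v) {
    // replace each whole <a ...>...</a> element in $string with the current item
    // (note: $line is overwritten on every iteration)
    $line = preg_replace('|<a\b[^>]*>(.*)</a>|Usi', $v, $string);
}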

There are many ways to build a regex and parse with it... all of them valid.

Answer 1 (score: 1)

If you want to turn, for example, abdomen into <a href="/glossary/a/abdomen">abdomen</a>, here is a suggestion:

$terms = 'abdomen|etc|parental care';
// this is the string of the terms separated by pipes

$terms = explode('|',$terms);
// split terms into an array (aka $terms)
foreach ($terms as $key => $value) {
    $terms[$key] = preg_replace('/\s+/', ' ', strtolower($value));
}
// change each into lowercase and normalize spaces

$str = 'Here\'s some example sentence using abdomen. Abdomen is a funny word and parental care is important.';

foreach ($terms as $term) {
// this will loop all the terms so it may take a while
// this looks like a requirement because you have multi-word terms in your list
    $str = preg_replace('/\b('.preg_quote($term, '/').')\b/i', '<a href="/glossary/'.$term[0].'/'.str_replace(' ','%20',$term).'">$1</a>', $str);
    // regardless of case the link is assigned the lowercase version of the term.
    // spaces are replaced by %20's
    // --------------------
    // ------- EDIT -------
    // --------------------
    //   added \b's around the term in regex to prevent, e.g.
    //   'etc' in 'ketchup' from being caught.
}

Edit: see the last comment in the code.
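
The loop above runs one preg_replace per term, and a later term can match text inside a link that an earlier iteration already inserted (for instance a short term that also appears in an href). One possible alternative, sketched here under assumptions rather than taken from the answer, is to build a single alternation from all terms and replace in one pass with preg_replace_callback. The function name linkGlossaryTerms is hypothetical, and the sketch assumes $content does not already contain HTML whose tags or attributes include glossary terms. It has not been measured against the full 1600-entry list.

```php
<?php
// Hypothetical helper: link every glossary term in $content in a single pass.
function linkGlossaryTerms(string $content, array $terms): string
{
    // Normalize terms: lowercase, collapse whitespace, drop empties and duplicates.
    $normalized = [];
    foreach ($terms as $term) {
        $term = strtolower(trim(preg_replace('/\s+/', ' ', (string) $term)));
        if ($term !== '') {
            $normalized[$term] = true;
        }
    }
    // Array keys that look like integers come back as ints, so cast to string.
    $terms = array_map('strval', array_keys($normalized));
    if ($terms === []) {
        return $content;
    }

    // Longest first, so "parental care" is tried before a shorter term such as "care".
    usort($terms, function ($a, $b) {
        return strlen($b) - strlen($a);
    });

    // preg_quote escapes regex metacharacters and the "/" delimiter in each term.
    $alternatives = array_map(function ($t) {
        return preg_quote($t, '/');
    }, $terms);
    $pattern = '/\b(?:' . implode('|', $alternatives) . ')\b/i';

    return preg_replace_callback($pattern, function ($m) {
        // The href uses the lowercase term; the link text keeps the original case.
        $term = strtolower($m[0]);
        $href = '/glossary/' . rawurlencode($term[0]) . '/' . rawurlencode($term);
        return '<a href="' . $href . '">' . $m[0] . '</a>';
    }, $content);
}

$str = "Here's some example sentence using abdomen. Abdomen is a funny word and parental care is important.";
echo linkGlossaryTerms($str, ['abdomen', 'etc', 'parental care']), "\n";
```

Because each position in the text is matched at most once, inserted links are never rescanned. The \b anchors assume every term starts and ends with a word character; terms that begin or end with punctuation would need different boundary handling.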