Improve my regex + PHP replace

Date: 2015-05-01 15:52:43

Tags: php regex html-parsing

I am trying to replace parts of a string using a regular expression. My code does the job, but is this the right way to do it?

$string = 'blabla <!-- s:D --><img src="{SMILIES_PATH}/icon_biggrin.gif" alt=":D" title="Very Happy" /><!-- s:D --> blabla <!-- scat --><img src="{SMILIES_PATH}/cat2.gif" alt="cat" title="Cat" /><!-- scat --> blabla';
$pattern = '(<!-- s(\S*) --><img src="\S*" alt="\S*" title="[^"]+" \/><!-- s\S* -->)';

preg_match_all($pattern, $string, $result);

$i = 0;
foreach ($result[0] as $match) {
    $string = str_replace($match, $result[1][$i], $string);
    $i++;
}

What I want: blabla :D blabla cat blabla

Regex test: https://regex101.com/r/fD0xI2/2

PHP test: http://ideone.com/mrS0BJ

1 answer:

Answer 0: (score: 1)

I think you can reduce the size of the regex, i.e.:

$string = 'blabla <!-- s:D --><img src="{SMILIES_PATH}/icon_biggrin.gif" alt=":D" title="Very Happy" /><!-- s:D --> blabla <!-- scat --><img src="{SMILIES_PATH}/cat2.gif" alt="cat" title="Cat" /><!-- scat --> blabla';

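// Start from an empty string so the .= below does not hit an undefined variable.
$newString = '';
preg_match_all('/(\S+) <!-- (.*?) -->/sm', $string, $matches, PREG_PATTERN_ORDER);
for ($i = 0; $i < count($matches[1]); $i++) {
    // $matches[1] holds the word before each comment, $matches[2] the comment text.
    $newString .= $matches[1][$i] . " " . $matches[2][$i] . " ";
}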

echo $newString;

Output:

blabla s:D blabla scat 

Demo:

http://ideone.com/9fons0
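
Note that the output above keeps the leading "s" of each comment and drops the trailing "blabla", so it is not quite the "blabla :D blabla cat blabla" the question asks for. One possible alternative, offered only as a sketch, is a single preg_replace call that swaps each comment/img/comment group for its captured smiley code. The pattern, the "~" delimiters and the variable name $result below are my own assumptions, not part of the original answer; the \1 backreference assumes the opening and closing comments always carry the same code. In the question's own pattern the outer parentheses act as PCRE delimiters, which is why its group 1 is the smiley code.

```php
<?php
$string = 'blabla <!-- s:D --><img src="{SMILIES_PATH}/icon_biggrin.gif" alt=":D" title="Very Happy" /><!-- s:D --> blabla <!-- scat --><img src="{SMILIES_PATH}/cat2.gif" alt="cat" title="Cat" /><!-- scat --> blabla';

// Assumed pattern (not from the original answer):
//   <!-- s(\S+) -->   opening comment, capturing the smiley code after the "s"
//   <img [^>]*/>      the whole self-closing img tag, whatever its attributes
//   <!-- s\1 -->      closing comment, which must repeat the same code
$pattern = '~<!-- s(\S+) --><img [^>]*/><!-- s\1 -->~';

// Replace every matched group with the captured code in one pass.
$result = preg_replace($pattern, '$1', $string);

echo $result, PHP_EOL;
```

This avoids the separate preg_match_all and str_replace loop, and text outside the matched groups is left untouched.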

Regex explanation:

(\S+) <!-- (.*?) -->

Options: Dot matches line breaks; ^$ match at line breaks; Greedy quantifiers

Match the regex below and capture its match into backreference number 1 «(\S+)»
   Match a single character that is NOT a “whitespace character” «\S+»
      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»
Match the character string “ <!-- ” literally « <!-- »
Match the regex below and capture its match into backreference number 2 «(.*?)»
   Match any single character «.*?»
      Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character string “ -->” literally « -->»
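
To make the two capture groups and the lazy quantifier more concrete, here is a small sketch that runs the same pattern next to a greedy variant. The input string $sample is hypothetical and chosen only for illustration, as are the variable names $lazy and $greedy.

```php
<?php
// Hypothetical sample input, chosen only to illustrate the two capture groups.
$sample = 'foo <!-- bar --> baz <!-- qux -->';

// Lazy (.*?) stops at the first " -->" it can reach, so each comment matches separately.
preg_match_all('/(\S+) <!-- (.*?) -->/s', $sample, $lazy, PREG_SET_ORDER);

// Greedy (.*) runs on to the last " -->" in the input, swallowing everything in between.
preg_match_all('/(\S+) <!-- (.*) -->/s', $sample, $greedy, PREG_SET_ORDER);

foreach ($lazy as $m) {
    echo "lazy:   [{$m[1]}] [{$m[2]}]", PHP_EOL;
}
foreach ($greedy as $m) {
    echo "greedy: [{$m[1]}] [{$m[2]}]", PHP_EOL;
}
```

With the lazy quantifier the sample yields two matches (foo/bar and baz/qux); with the greedy one it yields a single match whose second group spans both comments.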