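I'm a bit confused about preg_match and preg_replace. I have a very long content string (from a blog), and I want to find, split out, and replace all [caption] tags. Possible tags include: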
[caption]test[/caption]
[caption align="center" caption="test" width="123"]<img src="...">[/caption]
[caption caption="test" align="center" width="123"]<img src="...">[/caption]
etc.
Here is my code (but I've found it doesn't work the way I want...):
public function parse_captions($content) {
    if(preg_match("/\[caption(.*) align=\"(.*)\" width=\"(.*)\" caption=\"(.*)\"\](.*)\[\/caption\]/", $content, $c)) {
        $caption = $c[4];
        $code = "<div>Test<p class='caption-text'>" . $caption . "</p></div>";
        // Here, I'd like to ONLY replace what was found above (since there can be
        // multiple instances)
        $content = preg_replace("/\[caption(.*) width=\"(.*)\" caption=\"(.*)\"\](.*)\[\/caption\]/", $code, $content);
    }
    return $content;
}
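Answer 0: (score: 1)
The goal is to capture the caption text regardless of where it sits: in the caption attribute or between the tags. You can try this: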
$subject = <<<'LOD'
[caption]test1[/caption]
[caption align="center" caption="test2" width="123"][/caption]
[caption caption="test3" align="center" width="123"][/caption]
LOD;
$pattern = <<<'LOD'
~
\[caption                       # beginning of the tag
(?>[^]c]++|c(?!aption\b))*      # followed by anything but c and ]
                                # or c not followed by "aption"
(?|                             # alternation group
    caption="([^"]++)"[^]]*+]   # the content is inside the opening tag
  |                             # OR
    ]([^[]++)                   # outside
)                               # end of alternation group
\[/caption]                     # closing tag
~x
LOD;
$replacement = "<div>Test<p class='caption-text'>$1</p></div>";
echo htmlspecialchars(preg_replace($pattern, $replacement, $subject));
Pattern (compact version):
$pattern = '~\[caption(?>[^]c]++|c(?!aption\b))*(?|caption="([^"]++)"[^]]*+]|]([^[]++))\[/caption]~';
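To connect this back to the question, here is a minimal sketch of how the compact pattern might be dropped into a parse_captions() function. The standalone function form, the use of preg_replace_callback and the htmlspecialchars() escaping are assumptions for illustration, not part of the original answer.

```php
<?php
// Hypothetical standalone version of the asker's parse_captions();
// the pattern is the compact one from this answer.
function parse_captions(string $content): string
{
    $pattern = '~\[caption(?>[^]c]++|c(?!aption\b))*(?|caption="([^"]++)"[^]]*+]|]([^[]++))\[/caption]~';

    $result = preg_replace_callback(
        $pattern,
        function (array $m): string {
            // $m[1] holds the caption text, whichever branch matched.
            // Escaping is an assumption: drop it if captions may contain HTML.
            $caption = htmlspecialchars($m[1]);
            return "<div>Test<p class='caption-text'>" . $caption . "</p></div>";
        },
        $content
    );

    // preg_replace_callback returns null on error; fall back to the input.
    return $result ?? $content;
}

$html = parse_captions('a [caption]one[/caption] b [caption caption="two" width="123"][/caption] c');
echo $html, PHP_EOL;
```

Every tag is replaced with its own caption in a single pass, so no separate preg_match call is needed. As in the sample subject, the attribute branch expects [/caption] to follow the opening tag immediately.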
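Pattern explanation:
After the start of the tag, anything may come before either the closing ] or the caption attribute. That leading part is described by: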
(?> # atomic group
[^]c]++ # all characters that are not ] or c, 1 or more times
| # OR
c(?!aption\b) # c not followed by aption (to avoid the caption attribute)
)* # zero or more times
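To see what this group consumes, the following sketch wraps it in an extra pair of capturing parentheses (added only for this demonstration) and runs it against two opening tags. The variable names are illustrative.

```php
<?php
// Capture what the atomic group consumes; the outer parentheses
// are not part of the original pattern.
$prefix = '~\[caption((?>[^]c]++|c(?!aption\b))*)~';

preg_match($prefix, '[caption align="center" caption="test" width="123"]', $withAttr);
preg_match($prefix, '[caption]test[/caption]', $bare);

var_dump($withAttr[1]); // the text up to, but not including, caption="..."
var_dump($bare[1]);     // empty string: the next character is already ]
```

The c in "center" is consumed because it is not followed by "aption", while the c that starts the caption attribute stops the group.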
The alternation group (?| (a branch reset group) lets several capture groups share the same number:
(?|
    # case: the target is in the caption attribute #
    caption="   # (you can replace it with caption\s*+=\s*+")
    ([^"]++)    # everything that is not a ", one or more times (capture group)
    "
    [^]]*+      # everything that is not a ], zero or more times
    ]           # the square bracket closes the opening tag
  |             # OR
    # case: the target is outside the opening tag #
    ]           # the square bracket closes the opening tag
    ([^[]++)    # everything that is not a [, one or more times (capture group)
)
Both captures now share the same number, #1.
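A small sketch of the shared numbering, using the compact pattern on one tag of each shape (the variable names are illustrative):

```php
<?php
$pattern = '~\[caption(?>[^]c]++|c(?!aption\b))*(?|caption="([^"]++)"[^]]*+]|]([^[]++))\[/caption]~';

preg_match($pattern, '[caption caption="test3" align="center" width="123"][/caption]', $fromAttr);
preg_match($pattern, '[caption]test1[/caption]', $fromBody);

// Both arrays hold the whole match at index 0 and the caption at index 1.
print_r($fromAttr);
print_r($fromBody);
```

Because the caption always ends up in group 1, a single $1 in the replacement string covers both cases.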
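Note: adding the m modifier would have no effect here, since it only changes how ^ and $ behave and the pattern uses neither. The negated character classes already match across line breaks, so a caption tag that spans several lines is still found.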
Note 2: the quantifiers are possessive and I use an atomic group so that a failing match is abandoned quickly, which improves performance.
Answer 1: (score: 0)
Your best course of action is:
Match everything after caption.
preg_match("#\[caption(.*?)\]#", $q, $match)
Use the explode function to extract the values (if any) from $match[1].
explode(' ', trim($match[1]))
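Put together, the two steps might look like the sketch below. Splitting each piece on = and stripping the quotes is an addition for illustration, and the whole approach assumes that no attribute value contains a space (caption="two words" would be split apart).

```php
<?php
$q = '[caption align="center" caption="test" width="123"]<img src="...">[/caption]';

$pairs = [];
if (preg_match('#\[caption(.*?)\]#', $q, $match)) {
    // Split the attribute string on spaces, as suggested above.
    foreach (explode(' ', trim($match[1])) as $pair) {
        if ($pair === '') {
            continue; // bare [caption] tag: nothing to extract
        }
        // Assumption: each piece has the form name="value".
        $parts = explode('=', $pair, 2);
        $pairs[$parts[0]] = trim($parts[1] ?? '', '"');
    }
}

print_r($pairs);
```

The result is an associative array keyed by attribute name, so the order of the attributes in the tag no longer matters.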