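I'm trying to capture the attributes in a <pre> tag, along with an optional class attribute. I'd like to capture the value of the class attribute in a single regular expression, rather than capturing all the attributes and then searching them for the class value afterwards. Since the class attribute is optional, I tried adding a ?, but this causes the regex below to capture only with the last capturing group: the class is not captured, and neither are the attributes before it.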
// Works, but class isn't optional
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])(.*?)\3)([^\>]*)>'
// Fails to match class; the whole set of attributes is matched by the last group
'(?<!\$)<pre([^\>]*?)(\bclass\s*=\s*(["\'])?(.*?)\3)([^\>]*)>'
e.g. <pre style="..." class="some-class" title="stuff">
Edit
I ended up using this:
$wp_content = preg_replace_callback('#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi', 'CrayonWP::pre_tag', $wp_content);
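It allows whitespace inside the tag, separates what comes before and after the class attribute, and also captures all the attributes as a whole.
The callback then puts everything in place: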
public static function pre_tag($matches) {
    $pre_class = $matches[1];
    $quotes = $matches[2];
    $class = $matches[3];
    $post_class = $matches[4];
    $atts = $matches[5];
    $content = $matches[6];
    if (!empty($class)) {
        // Allow hyphenated "setting-value" style settings in the class attribute
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1='.$quotes.'$2'.$quotes, $class);
        return "[crayon $pre_class $class $post_class] $content [/crayon]";
    } else {
        return "[crayon $atts] $content [/crayon]";
    }
}
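To see how the pattern and callback cooperate, here is a minimal, self-contained sketch. It assumes a standalone function named `pre_tag_demo` (a hypothetical stand-in for `CrayonWP::pre_tag`) and two invented sample strings; the pattern itself is the one from the `preg_replace_callback` call above.

```php
<?php
// Hypothetical standalone copy of the callback, for illustration only.
function pre_tag_demo(array $matches): string {
    // Groups 1-4 come from the optional lookahead; they are empty strings
    // when the tag has no class attribute.
    [, $pre_class, $quotes, $class, $post_class, $atts, $content] = $matches;
    if (!empty($class)) {
        // Turn hyphenated "setting-value" entries into setting="value"
        $class = preg_replace('#\b([A-Za-z-]+)-(\S+)#msi', '$1=' . $quotes . '$2' . $quotes, $class);
        return "[crayon $pre_class $class $post_class] $content [/crayon]";
    }
    return "[crayon $atts] $content [/crayon]";
}

$pattern = '#(?<!\$)<\s*pre(?=(?:([^>]*)\bclass\s*=\s*(["\'])(.*?)\2([^>]*))?)([^>]*)>(.*?)<\s*/\s*pre\s*>#msi';

// Invented sample inputs: one with a class attribute, one without.
$with_class = '<pre title="x" class="lang-php">echo 1;</pre>';
$without_class = '<pre title="x">echo 1;</pre>';

$out_with = preg_replace_callback($pattern, 'pre_tag_demo', $with_class);
$out_without = preg_replace_callback($pattern, 'pre_tag_demo', $without_class);

echo $out_with, "\n";    // class value rewritten to lang="php"
echo $out_without, "\n"; // attributes passed through unchanged
```

In the first case the class value `lang-php` is rewritten to `lang="php"` and placed between the attributes that preceded and followed it; in the second case the `else` branch simply forwards the full attribute string.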
Answer 0 (score: 4)
You can put the capturing group for the class attribute inside a lookahead assertion and make it optional:
'(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>'
Now $2 will contain the value of the class attribute, if one is present.
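A quick way to check this is to run the pattern through `preg_match` on a tag with and without a class attribute. The sketch below assumes `~` as the delimiter and uses invented sample tags; neither appears in the original answer.

```php
<?php
// Pattern from the answer, wrapped in ~ delimiters (the delimiter is an assumption).
$pattern = '~(?<!\$)<pre(?=(?:[^>]*\bclass\s*=\s*(["\'])(.*?)\1)?)([^>]*)>~';

// Hypothetical sample tags.
$with_class = '<pre style="..." class="some-class" title="stuff">';
$without_class = '<pre style="..." title="stuff">';

preg_match($pattern, $with_class, $m);
$class_found = $m[2]; // value captured inside the lookahead
$all_atts = $m[3];    // full attribute string, class included

preg_match($pattern, $without_class, $n);
$class_missing = $n[2]; // empty string: the optional group did not participate

var_dump($class_found);   // string(10) "some-class"
var_dump($class_missing); // string(0) ""
```

Because the lookahead consumes no input, group 3 still receives the complete attribute list in both cases, so nothing is lost when the class attribute is absent.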
(?<!\$)              # Assert no preceding $ (why?)
<pre                 # Match <pre
(?=                  # Assert that the following can be matched:
 (?:                 # Try to match this:
  [^>]*              # any text except >
  \bclass\s*=\s*     # class =
  (["\'])            # opening quote
  (.*?)              # any text, lazy --> capture this in group no. 2
  \1                 # corresponding closing quote
 )?                  # but make the whole thing optional.
)                    # End of lookahead
([^\>]*)>            # Match the entire contents of the tag and the closing >
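The annotated breakdown above is close to a usable pattern as it stands: with the `x` (extended) modifier, PCRE ignores unescaped whitespace outside character classes and treats `#` through the end of the line as a comment. The sketch below assumes `~` as the delimiter and a nowdoc string (so quotes and backslashes need no PHP-level escaping); the comments are lightly reworded and the sample tag is invented.

```php
<?php
// Nowdoc: the text is taken literally, so the regex reads exactly as written.
$pattern = <<<'REGEX'
~
(?<!\$)              # assert no preceding dollar sign
<pre                 # match <pre
(?=                  # lookahead: consumes no input
 (?:
  [^>]*              # any text except >
  \bclass\s*=\s*     # class =
  (["'])             # opening quote (group 1)
  (.*?)              # attribute value, lazy (group 2)
  \1                 # matching closing quote
 )?                  # the whole class part is optional
)
([^>]*)>             # all attributes (group 3) and the closing >
~x
REGEX;

// Hypothetical sample tag.
$tag = '<pre style="..." class="some-class" title="stuff">';

preg_match($pattern, $tag, $m);
var_dump($m[2]); // string(10) "some-class"
```

Keeping the comments inside the pattern this way means the explanation and the code cannot drift apart, at the cost of having to avoid the delimiter character in the comment text.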