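I have a WordPress shortcode that opens with [pullquote] and closes with [/pullquote]. I'm trying to capture whatever sits between the opening and closing tags.
I'm new to regular expressions, so I started with a simple expression that captures letters, digits, and whitespace: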
\[pullquote\]([0-9a-zA-z\s]*)\[\/pullquote\]
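This works, but it doesn't account for punctuation and so on, so I tried (.*) instead.
That matches too much and isn't specific enough.
Finally, I tried this: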
\[pullquote\](^(?:\[\/pullquote\])*)\[\/pullquote\]
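I'm not sure of the terminology here, but basically I want to capture anything that starts with [pullquote], as long as it isn't [/pullquote], and ends with [/pullquote].
At least on regexr.com it doesn't work, but I assume that means I've done something wrong.
Text used on regexr: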
[/pullquote]
How can I make this work? Have I done anything else wrong here?
Thanks
Answer 0 (score: 1)
You need this:
(\[pullquote\])(.+)(\[\/pullquote\])
Use only group 2 ($2).
See it here: https://regex101.com/r/dS8eZ0/2
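One caveat with this pattern: `.+` is greedy and `.` does not match newlines by default. With two shortcodes on the same line it captures everything from the first opening tag to the last closing tag, and it misses a quote that spans several lines. Below is a minimal PHP sketch of one way around that, assuming a lazy quantifier and the `s` (dot-all) flag are acceptable for your content. The sample `$text` and the variable names are made up for illustration, and since the outer groups are dropped the content ends up in group 1 rather than group 2.

```php
<?php
// Hypothetical sample text: two shortcodes, one of them spanning two lines.
$text = "Intro [pullquote]First, with punctuation![/pullquote] middle "
      . "[pullquote]Second\nline[/pullquote] end";

// .*? is lazy, so each match stops at the nearest [/pullquote];
// the s flag lets . match newlines as well.
$pattern = '/\[pullquote\](.*?)\[\/pullquote\]/s';

preg_match_all($pattern, $text, $matches);

// $matches[1] holds the captured contents (group 1), one entry per shortcode.
$quotes = $matches[1];
print_r($quotes);
```

If nested or unclosed shortcodes can occur in your posts, this simple pattern will not handle them, so treat it as a starting point only.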
Information extracted from the link:
MATCH INFORMATION
/(\[pullquote\])(.+)(\[\/pullquote\])/g
1st Capturing group "(\[pullquote\])"
"\[" matches the character [ literally
"pullquote" matches the characters pullquote literally (case sensitive)
"\]" matches the character ] literally
2nd Capturing group "(.+)"
".+" matches any character (except newline)
"Quantifier: +" Between one and unlimited times, as many times as possible,
giving back as needed [greedy]
3rd Capturing group "(\[\/pullquote\])"
"\[" matches the character [ literally
"\/" matches the character / literally
"pullquote" matches the characters pullquote literally (case sensitive)
"\]" matches the character ] literally
"g" modifier: global. All matches (don't return on first match)
Answer 1 (score: 1)
Here is a basic search using strpos(), which you might want to try for a performance comparison.
function extract_shortcode_content($needle, $haystack) {
    if (empty($needle) || empty($haystack) || !is_string($needle) || !is_string($haystack)) {
        throw new Exception('Bad input');
    }
    // $needle is just intended to be the shortcode name (i.e. 'pullquote');
    // we will build the appropriate start and end tags
    $needle_trimmed = trim(trim($needle), '[]');
    $start_code = '[' . $needle_trimmed . ']';
    $end_code = '[/' . $needle_trimmed . ']';
    $start_code_length = strlen($start_code);
    $end_code_length = strlen($end_code);
    $haystack_length = strlen($haystack);
    // last offset at which a complete start/end tag pair could still begin
    $last_searchable_position = $haystack_length - $start_code_length - $end_code_length;
    $return_array = array();
    // iterate through haystack extracting content
    $search_offset = 0;
    while ($search_offset <= $last_searchable_position) {
        $start_code_found = strpos($haystack, $start_code, $search_offset);
        if ($start_code_found === false) {
            // no match in remainder of string
            return $return_array;
        }
        // extract content
        $content_start_position = $start_code_found + $start_code_length;
        $end_code_found = strpos($haystack, $end_code, $content_start_position);
        if ($end_code_found === false) {
            // we couldn't find a close for the current shortcode open tag.
            // we don't count this as a match, so let's just return the matches we have
            return $return_array;
        }
        $match_length = $end_code_found - $content_start_position;
        // add content to result array
        $return_array[] = substr($haystack, $content_start_position, $match_length);
        // set new search offset position for next iteration
        $search_offset = $end_code_found + $end_code_length;
    }
    return $return_array;
}
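Now, I'm not saying you should use this instead of the regex approach. The regex approach can certainly get the same result in a few lines of code. I'm only suggesting that this approach may perform better than a regex for this use case. That said, it is probably a micro-optimization for your use case and may not be worth the extra code complexity.
I just wanted to offer an alternative to regex.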
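For anyone who wants to actually run the performance comparison, here is a rough sketch of how it could be timed. Everything in it is an assumption for illustration: the helper names `extract_with_regex` and `extract_with_strpos`, the generated test text, and the iteration count. The strpos helper is a condensed stand-in for the function above, not a copy of it, and I'm not claiming any particular outcome; measure on your own content.

```php
<?php
// Hypothetical helper: regex-based extraction (lazy match, dot-all flag).
function extract_with_regex(string $tag, string $haystack): array {
    $quoted = preg_quote($tag, '/');
    preg_match_all('/\[' . $quoted . '\](.*?)\[\/' . $quoted . '\]/s', $haystack, $m);
    return $m[1];
}

// Hypothetical helper: condensed strpos()-based extraction.
function extract_with_strpos(string $tag, string $haystack): array {
    $open = '[' . $tag . ']';
    $close = '[/' . $tag . ']';
    $found = array();
    $offset = 0;
    while (($start = strpos($haystack, $open, $offset)) !== false) {
        $content_start = $start + strlen($open);
        $end = strpos($haystack, $close, $content_start);
        if ($end === false) {
            break; // unclosed tag: stop and keep what we have
        }
        $found[] = substr($haystack, $content_start, $end - $content_start);
        $offset = $end + strlen($close);
    }
    return $found;
}

// Made-up test text: 1000 short lines, each containing one pullquote.
$text = str_repeat("Some text [pullquote]A quote, with punctuation![/pullquote] more text.\n", 1000);

// Time 100 runs of each helper over the same text.
foreach (array('extract_with_regex', 'extract_with_strpos') as $fn) {
    $t0 = microtime(true);
    for ($i = 0; $i < 100; $i++) {
        $result = $fn('pullquote', $text);
    }
    $elapsed = microtime(true) - $t0;
    printf("%s: %d matches, %.4f s for 100 runs\n", $fn, count($result), $elapsed);
}
```

Timings from a loop like this depend heavily on the PHP version and the shape of the text, so they are only a rough guide.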