I want to build a way to match strings such as
abc(xyz)
abc
abc(xyz)[123]
where each bracketed group is an optional unit. What I would most like is for
preg_match_all('complicated regex', $mystring, $matches);
$matches
to return the following:
$mystring= abc(xyz)[123]R
gives $matches=array(0 => "abc", 1=> "xyz", 2=> "123", 3=> "R")
$mystring= abc(xyz)R
gives $matches=array(0 => "abc", 1=> "xyz", 2=> "", 3=> "R")
$mystring= abc[123]R
gives $matches=array(0 => "abc", 1=> "", 2=> "123", 3=> "R")
$mystring= abc(xyz)[123]
gives $matches=array(0 => "abc", 1=> "xyz", 2=> "123", 3=> "")
$mystring= abc
gives $matches=array(0 => "abc", 1=> "", 2=> "", 3=> "")
I hope that makes the idea clear. Here is what I tried:
preg_match_all("/([a-z]*)(\([a-zA-Z]\))?(\[\w\])?/", "foo(dd)[sdfgh]", $matches)
$matches[0]
is
Array
(
[0] => foo
[1] =>
[2] => dd
[3] =>
[4] =>
[5] => sdfgh
[6] =>
[7] =>
)
Why do I get the extra empty matches? How can I avoid them and get the result I need (in $matches or $matches[0] ...)?
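Answer 0 (score: 1)
You get that many results because the match restarts 8 times. Every string (including the empty string) matches the first, non-optional part of your regex: ([a-z]*). The two groups after it are optional, so the pattern as a whole can match the empty string at any position.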
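To see the restarts concretely, here is a small sketch that runs the pattern from the question and prints where each match starts. PREG_OFFSET_CAPTURE is a standard preg_match_all flag; the variable names are only for illustration.

```php
<?php
// The pattern from the question: every part of it may match the empty string.
$pattern = '/([a-z]*)(\([a-zA-Z]\))?(\[\w\])?/';
$subject = 'foo(dd)[sdfgh]';

// PREG_OFFSET_CAPTURE turns each entry into a [text, offset] pair.
preg_match_all($pattern, $subject, $matches, PREG_OFFSET_CAPTURE);

$found = [];
$offsets = [];
foreach ($matches[0] as [$text, $offset]) {
    $found[] = $text;
    $offsets[] = $offset;
    printf("offset %2d: '%s'\n", $offset, $text);
}
```

The empty matches sit at the positions of the brackets and at the end of the string, where nothing but the empty string can match.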
Corrected regex, anchored with ^ and $ so that it matches the whole string once:
preg_match_all("/^([a-z]*)(\([a-zA-Z]*\))?(\[\w*\])?$/", "foo(ddd)[sdfgh]", $matches);
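Edit (to exclude the parentheses from the second part of the subject)
We want 'ddd' instead of '(ddd)':
This regex uses a non-capturing group (?: ... ) to mark the optional part of the subject without capturing it in the matches array.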
preg_match_all("/^([a-z]*)(?:\(([a-zA-Z]*)\))?(\[\w*\])?$/", "foo(ddd)[sdfgh]", $matches);
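The interesting part is: (?:\(([a-zA-Z]*)\))?
(?: marks the start of a non-capturing subpattern
\( is an escaped literal (
( marks the start of a standard capturing subpattern
Only the contents of the third pair of parentheses show up in the $matches array.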
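Putting these pieces together for the four fields asked about in the question, a minimal sketch could look like the following. It assumes the same non-capturing trick is applied to the [...] part and that a trailing (\w*) group is added for the final letter; neither is part of the regex above, and parseParts is a hypothetical helper name.

```php
<?php
// Hypothetical helper: splits e.g. "abc(xyz)[123]R" into its four fields.
// Assumptions: the [...] part gets its own non-capturing wrapper, and a
// trailing (\w*) group collects whatever follows the brackets.
function parseParts(string $s): ?array
{
    $pattern = '/^([a-z]*)(?:\(([a-zA-Z]*)\))?(?:\[(\w*)\])?(\w*)$/';
    if (preg_match($pattern, $s, $m) !== 1) {
        return null; // the input does not have the expected shape
    }
    // $m[0] is the whole match; groups 1..4 are the fields.
    // Group 4 always takes part in the match, so skipped optional
    // groups in the middle are reported as empty strings.
    return array_slice($m, 1);
}

foreach (['abc(xyz)[123]R', 'abc(xyz)R', 'abc[123]R', 'abc(xyz)[123]', 'abc'] as $s) {
    echo $s, ' => ', json_encode(parseParts($s)), "\n";
}
```

preg_match is enough here because the anchored pattern can match at most once per string.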
Answer 1 (score: 1)
How about:
/^(\w*)(?:\((\w*)\))?(?:\[(\w*)\])(\w*)?$/
Usage:
preg_match_all("/^(\w*)(?:\((\w*)\))?(?:\[(\w*)\])(\w*)?$/", "abc[123]R", $matches);
print_r($matches);
Output:
Array
(
[0] => Array
(
[0] => abc[123]R
)
[1] => Array
(
[0] => abc
)
[2] => Array
(
[0] =>
)
[3] => Array
(
[0] => 123
)
[4] => Array
(
[0] => R
)
)
Explanation:
The regular expression:
(?-imsx:^(\w*)(?:\((\w*)\))?(?:\[(\w*)\])(\w*)?$)
matches as follows:
NODE EXPLANATION
----------------------------------------------------------------------
(?-imsx: group, but do not capture (case-sensitive)
(with ^ and $ matching normally) (with . not
matching \n) (matching whitespace and #
normally):
----------------------------------------------------------------------
^ the beginning of the string
----------------------------------------------------------------------
( group and capture to \1:
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
) end of \1
----------------------------------------------------------------------
(?: group, but do not capture (optional
(matching the most amount possible)):
----------------------------------------------------------------------
\( '('
----------------------------------------------------------------------
( group and capture to \2:
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0
or more times (matching the most
amount possible))
----------------------------------------------------------------------
) end of \2
----------------------------------------------------------------------
\) ')'
----------------------------------------------------------------------
)? end of grouping
----------------------------------------------------------------------
(?: group, but do not capture:
----------------------------------------------------------------------
\[ '['
----------------------------------------------------------------------
( group and capture to \3:
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0
or more times (matching the most
amount possible))
----------------------------------------------------------------------
) end of \3
----------------------------------------------------------------------
\] ']'
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
( group and capture to \4 (optional
(matching the most amount possible)):
----------------------------------------------------------------------
\w* word characters (a-z, A-Z, 0-9, _) (0 or
more times (matching the most amount
possible))
----------------------------------------------------------------------
)? end of \4 (NOTE: because you are using a
quantifier on this capture, only the LAST
repetition of the captured pattern will be
stored in \4)
----------------------------------------------------------------------
$ before an optional \n, and the end of the
string
----------------------------------------------------------------------
) end of grouping
----------------------------------------------------------------------
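One thing the explanation makes visible: as written, the [...] group has no ? after it, so it is mandatory and a subject without square brackets, such as abc(xyz)R, does not match. The sketch below compares the pattern as given with an assumed variant that adds the quantifier; the variant is my adjustment, not part of the answer.

```php
<?php
// Pattern exactly as given in the answer: the [...] group is mandatory.
$asGiven = '/^(\w*)(?:\((\w*)\))?(?:\[(\w*)\])(\w*)?$/';
// Assumed variant: "?" added after the [...] group so it becomes optional.
$variant = '/^(\w*)(?:\((\w*)\))?(?:\[(\w*)\])?(\w*)$/';

$subject = 'abc(xyz)R';
$givenResult = preg_match($asGiven, $subject, $m1);   // no "[" in the subject
$variantResult = preg_match($variant, $subject, $m2);

var_dump($givenResult, $variantResult);
print_r(array_slice($m2, 1)); // the four fields, without the whole match
```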
Answer 2 (score: 0)
Why not use preg_split()?
$string = 'abc(xyz)[123]';
$array = preg_split('/\]?\(|\)\[?|\[|\]/', $string);
print_r($array);
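For reference, a short sketch of what this split produces, plus the standard PREG_SPLIT_NO_EMPTY flag to drop the empty trailing piece. The third call illustrates a caveat: the pieces no longer record which kind of bracket they came from, so this may not fit the positional result the question asks for.

```php
<?php
$delimiters = '/\]?\(|\)\[?|\[|\]/';

// Without flags, the closing "]" leaves an empty last element.
$plain = preg_split($delimiters, 'abc(xyz)[123]');

// PREG_SPLIT_NO_EMPTY drops empty pieces.
$compact = preg_split($delimiters, 'abc(xyz)[123]', -1, PREG_SPLIT_NO_EMPTY);

// Caveat: nothing here says that "123" came from [...] rather than (...).
$ambiguous = preg_split($delimiters, 'abc[123]R', -1, PREG_SPLIT_NO_EMPTY);

print_r($plain);
print_r($compact);
print_r($ambiguous);
```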
Answer 3 (score: 0)
Try this simple regex:
[a-zA-Z0-9]+
With preg_match_all, it finds every substring that matches the pattern, treating any parentheses, brackets, or other characters between them as separators.
preg_match_all("/[a-zA-Z0-9]+/", "foo(dd)[sdfgh]", $matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => foo
[1] => dd
[2] => sdfgh
)
)
If for some reason you need the parentheses and brackets separately, you can use grouping like this:
([\(\)\[\]])?([a-zA-Z0-9]+)([\(\)\[\]])?
preg_match_all("/([\(\)\[\]])?([a-zA-Z0-9]+)([\(\)\[\]])?/", "foo(dd)[sdfgh]", $matches);
print_r($matches);
Array
(
[0] => Array
(
[0] => foo(
[1] => dd)
[2] => [sdfgh]
)
[1] => Array
(
[0] =>
[1] =>
[2] => [
)
[2] => Array
(
[0] => foo
[1] => dd
[2] => sdfgh
)
[3] => Array
(
[0] => (
[1] => )
[2] => ]
)
)
Answer 4 (score: 0)
A way to get the items you want without the empty entries:
$pattern = '~(?|\[(\w*+)]|\(([a-zA-Z]*+)\)|\b([a-z]*+)\b)~';
preg_match_all($pattern, 'foo(dd)[sdfgh]', $matches);
print_r($matches[1]);
Note: this can match empty strings inside the brackets; to avoid that, replace * with +.
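The (?| ... ) construct used here is PCRE's branch reset group. The following sketch shows its effect and the difference the note describes; the variable names and the test subject foo()[] are only for illustration.

```php
<?php
// (?| ... ) is a branch reset group: each alternative numbers its
// capture from 1, so all three kinds of piece land in $matches[1].
$star = '~(?|\[(\w*+)]|\(([a-zA-Z]*+)\)|\b([a-z]*+)\b)~';
// Same pattern with "+" in place of "*", as the note suggests.
$plus = '~(?|\[(\w++)]|\(([a-zA-Z]++)\)|\b([a-z]++)\b)~';

preg_match_all($star, 'foo()[]', $withStar);
preg_match_all($plus, 'foo()[]', $withPlus);
preg_match_all($plus, 'foo(dd)[sdfgh]', $full);

print_r($withStar[1]); // includes empty entries for "()" and "[]"
print_r($withPlus[1]); // empty brackets are skipped
print_r($full[1]);
```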
Answer 5 (score: 0)
The optional last letter throws things off a little, but this expression covers it:
function doit($s)
{
echo "==== $s ====\n";
preg_match_all('/(\w+) # first word
(?: \(([^)]+)\) )? # match optional (xyz)
(?: \[([^]]+)\])? # match optional [123]
(\w?) # match optional last char
/x', $s, $matches, PREG_SET_ORDER);
print_r($matches);
}
doit('abc(xyz)[123]R xyz(123)');
doit('abc(xyz)R');
doit('abc[123]R');
doit('abc(xyz)[123]');
Result:
==== abc(xyz)[123]R xyz(123) ====
Array
(
[0] => Array
(
[0] => abc(xyz)[123]R
[1] => abc
[2] => xyz
[3] => 123
[4] => R
)
[1] => Array
(
[0] => xyz(123)
[1] => xyz
[2] => 123
[3] =>
[4] =>
)
)
==== abc(xyz)R ====
Array
(
[0] => Array
(
[0] => abc(xyz)R
[1] => abc
[2] => xyz
[3] =>
[4] => R
)
)
==== abc[123]R ====
Array
(
[0] => Array
(
[0] => abc[123]R
[1] => abc
[2] =>
[3] => 123
[4] => R
)
)
==== abc(xyz)[123] ====
Array
(
[0] => Array
(
[0] => abc(xyz)[123]
[1] => abc
[2] => xyz
[3] => 123
[4] =>
)
)
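If only the four fields are wanted, without the whole match at index 0, the sets can be trimmed afterwards. This is a minimal sketch on top of the same pattern; fields is a hypothetical helper name, and the 'abc' case is one the runs above do not show.

```php
<?php
// Hypothetical helper built on the pattern above: returns one
// [first, paren, bracket, last] list per match found in $s.
function fields(string $s): array
{
    $pattern = '/(\w+)              # first word
                 (?: \(([^)]+)\) )? # optional (xyz)
                 (?: \[([^]]+)\] )? # optional [123]
                 (\w?)              # optional last char
                /x';
    preg_match_all($pattern, $s, $sets, PREG_SET_ORDER);
    // Drop index 0 (the whole match) from every set.
    return array_map(fn(array $set) => array_slice($set, 1), $sets);
}

print_r(fields('abc(xyz)[123]R xyz(123)'));
print_r(fields('abc'));
```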