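Here is what I want to do in PHP: given a string, get a result like this:
(a()?b|c)
Here a is a function that returns true or false. The result should be b or c, after calling a().
(a()?(b()?d|e)|c)
Same principle. The final result should be d, e, or c.
(a()?(b()?d|e)|(c()?f|g))
Same principle. The final result should be d, e, f, or g.
The problem I am facing is that a (in the previous examples) can itself be an expression, like this: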
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))
I tried to do this with a regular expression, but it does not work:
$res=preg_match_all('/\([^.\(]+\)/', $str, $matches);
In the end, I would like to call my function like this:
$final_string=compute("(a(x(y(z()?o|p)))?(b()?d|e)|(c()?f|g))");
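The final result in $final_string should be d, e, f, or g.
I am pretty sure something like this has been done before, but I cannot find it on Google. How would you do it?
More precisely, I would like to know how to analyze a string like this: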
$str =
"
(myfunction(12684444)
? {* comment *}
(
myfunction(1)|
myfunction(2)|
myfunction(80)|
myfunction(120)|
myfunction(184)|
myfunction(196)
? {* comment *}
AAAAA
{* /comment *}
|
{* Ignore all other values: *}
BBBBB
) {* /comment *}
| {* comment *}
CCCC
)";
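Answer 0 (score: 4)
I believe you are looking for something like this. Explanatory comments are interspersed along the way.
If you plan to extend the grammar much beyond what you have (and arguably even if you don't), write a proper parser rather than trying to do everything in a single regular expression. This is a fun exercise that shows off some of the power of PCRE, but it can very easily become an unmaintainable mess.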
$tests = [
"a",
"a()",
"a(b)",
"(a?b|c)",
"(a()?(b()?d|e)|(c()?f|g))",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(a(d(f))?b(e(f))|c)"
];
These test strings are used later.
$regex = <<<'REGEX'
/
(?(DEFINE)
# An expression is any function, ternary, or string.
(?<expression>
(?&function) | (?&ternary) | (?&string)
)
)
^(?<expr>
# A function is a function name (consisting of one or more word characters)
# followed by an opening parenthesis, an optional parameter (expression),
# and a closing parenthesis.
# Optional space is allowed around the parentheses.
(?<function>
(?<func_name> \w+ )
\s*\(\s*
(?<parameter> (?&expression)? )
\s*\)\s*
)
|
# A ternary is an opening parenthesis followed by an 'if' expression,
# a question mark, an expression evaluated when the 'if' is true,
# a pipe, an expression evaluated when the 'if' is false, and a closing
# parenthesis.
# Whitespace is allowed after '('; surrounding '?' and '|'; and before ')'.
(?<ternary>
\(\s*
(?<if> (?&expression) )
\s*\?\s*
(?<true> (?&expression) )
\s*\|\s*
(?<false> (?&expression) )
\s*\)
)
|
# A string, for simplicity's sake here, we'll call a sequence of word
# characters.
(?<string> \w+ )
)$
/x
REGEX;
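Liberal use of named capture groups helps a lot, and the x (PCRE_EXTENDED) modifier allows comments and whitespace inside the pattern. The (?(DEFINE)...) block lets you define subpatterns that are used by reference only.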
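To see the (?(DEFINE)...) mechanism in isolation, here is a minimal sketch that is separate from the expression grammar above. The subpattern name group and the sample inputs are made up for illustration. A subpattern defined inside DEFINE is never matched where it is written; it only takes part in the match when referenced with (?&name), and it may reference itself recursively.

```php
<?php
// Hypothetical mini-grammar: a "group" is a parenthesized run of
// word characters and nested groups.
$balanced = <<<'REGEX'
/
  (?(DEFINE)
    (?<group> \( (?: \w+ | (?&group) )* \) )
  )
  ^ (?&group) $
/x
REGEX;

var_dump(preg_match($balanced, '(a(b)(c(d)))')); // int(1): balanced
var_dump(preg_match($balanced, '(a(b)'));        // int(0): missing ')'
```

Note that a subroutine call such as (?&group) re-runs the subpattern; it does not fill the named capture for the caller, which is why the main pattern above repeats the function, ternary, and string groups outside the DEFINE block.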
foreach ($tests as $test) {
if (preg_match($regex, $test, $m)) {
echo "expression: $m[expr]\n";
if ($m['function']) {
echo "function: $m[function]\n",
"function name: $m[func_name]\n",
"parameter: $m[parameter]\n";
} elseif ($m['ternary']) {
echo "ternary: $m[ternary]\n",
"if: $m[if]\n",
"true: $m[true]\n",
"false: $m[false]\n";
} else {
echo "string: $m[string]\n";
}
echo "\n";
}
}
expression: a
string: a
expression: a()
function: a()
function name: a
parameter:
expression: a(b)
function: a(b)
function name: a
parameter: b
expression: (a?b|c)
ternary: (a?b|c)
if: a
true: b
false: c
expression: (a()?(b()?d|e)|(c()?f|g))
ternary: (a()?(b()?d|e)|(c()?f|g))
if: a()
true: (b()?d|e)
false: (c()?f|g)
expression: ((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))
ternary: ((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))
if: (h() ? a | i)
true: (b() ? d | e)
false: (c() ? f | g)
expression: (a(d(f))?b(e(f))|c)
ternary: (a(d(f))?b(e(f))|c)
if: a(d(f))
true: b(e(f))
false: c
A bit verbose, but it shows nicely what is being matched.
The compute() function:
function compute($expr) {
$regex = '/.../x'; // regex from above
if (!preg_match($regex, $expr, $m)) {
return false;
}
if ($m['function']) {
if ($m['parameter']) {
return $m['func_name'](compute($m['parameter']));
} else {
return $m['func_name']();
}
}
if ($m['ternary']) {
return compute($m['if']) ? compute($m['true']) : compute($m['false']);
}
return $m['string'];
}
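Pretty straightforward: execute a matched function, evaluate a matched ternary expression, or return a matched string, recursing where appropriate.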
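One caveat: compute() calls whatever function name appears in the input string. If the strings can come from an untrusted source, it may be worth routing calls through a lookup table instead. The following is only a sketch under that assumption; the helper call_allowed() and the $allowed table are hypothetical names, not part of the code above.

```php
<?php
// Hypothetical lookup table: only names listed here can be called.
$allowed = [
    'a' => function ($arg = null) { return true; },
    'b' => function ($arg = null) { return false; },
];

// Call $name with an optional argument, refusing anything not in the table.
function call_allowed(array $allowed, string $name, $arg = null)
{
    if (!isset($allowed[$name])) {
        throw new InvalidArgumentException("Function not allowed: $name");
    }
    return $allowed[$name]($arg);
}

var_dump(call_allowed($allowed, 'a'));      // bool(true)
var_dump(call_allowed($allowed, 'b', 'x')); // bool(false)

try {
    call_allowed($allowed, 'phpinfo');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";            // Function not allowed: phpinfo
}
```

Inside compute(), the two $m['func_name'](...) calls could then be replaced with calls to such a helper.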
compute() demo:
function a() {return true;}
function b() {return false;}
function d() {return true;}
function e() {return false;}
function h() {return true;}
foreach ($tests as $test) {
$result = compute($test);
echo "$test returns: ";
var_dump($result);
}
a returns: string(1) "a"
a() returns: bool(true)
a(b) returns: bool(true)
(a?b|c) returns: string(1) "b"
(a()?(b()?d|e)|(c()?f|g)) returns: string(1) "e"
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)) returns: string(1) "e"
(a(d(f))?b(e(f))|c) returns: bool(false)
I'm pretty sure that's right.
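For comparison, the "proper parser" route recommended at the start of this answer might look roughly like the sketch below. It is not the regex approach shown above: it is a small hand-written recursive-descent parser for the same grammar (string, function call with an optional argument, parenthesized ternary). The class name TernaryParser, the array-based node layout, the evaluate() helper, and the $functions table are all assumptions made for this example, and it needs PHP 7.4 or later for the typed properties.

```php
<?php
final class TernaryParser
{
    private string $src;
    private int $pos = 0;

    // Parse the whole input into a nested array (a tiny AST).
    public function parse(string $src): array
    {
        $this->src = $src;
        $this->pos = 0;
        $node = $this->expression();
        $this->skipSpace();
        if ($this->pos !== strlen($this->src)) {
            throw new RuntimeException("Unexpected input at offset {$this->pos}");
        }
        return $node;
    }

    private function expression(): array
    {
        $this->skipSpace();
        if ($this->peek() === '(') {            // ternary: ( if ? true | false )
            $this->expect('(');
            $if = $this->expression();
            $this->expect('?');
            $true = $this->expression();
            $this->expect('|');
            $false = $this->expression();
            $this->expect(')');
            return ['ternary', $if, $true, $false];
        }
        if (!preg_match('/\G\w+/', $this->src, $m, 0, $this->pos)) {
            throw new RuntimeException("Expected a name at offset {$this->pos}");
        }
        $this->pos += strlen($m[0]);
        $this->skipSpace();
        if ($this->peek() !== '(') {
            return ['string', $m[0]];
        }
        $this->expect('(');                     // function call, optional argument
        $this->skipSpace();
        $arg = $this->peek() === ')' ? null : $this->expression();
        $this->expect(')');
        return ['call', $m[0], $arg];
    }

    private function peek(): string
    {
        return $this->src[$this->pos] ?? '';
    }

    private function expect(string $char): void
    {
        $this->skipSpace();
        if ($this->peek() !== $char) {
            throw new RuntimeException("Expected '$char' at offset {$this->pos}");
        }
        $this->pos++;
    }

    private function skipSpace(): void
    {
        $this->pos += strspn($this->src, " \t\r\n", $this->pos);
    }
}

// Walk the tree; for a ternary, only the selected branch is evaluated.
function evaluate(array $node, array $functions)
{
    switch ($node[0]) {
        case 'string':
            return $node[1];
        case 'call':
            $arg = $node[2] === null ? null : evaluate($node[2], $functions);
            return $functions[$node[1]]($arg);
        default:
            return evaluate($node[1], $functions)
                ? evaluate($node[2], $functions)
                : evaluate($node[3], $functions);
    }
}

$functions = [
    'a' => function ($x) { return true; },
    'b' => function ($x) { return false; },
];
$ast = (new TernaryParser())->parse('(a() ? (b() ? d | e) | (c() ? f | g))');
var_dump(evaluate($ast, $functions)); // string(1) "e"
```

Separating parsing from evaluation means c() is parsed but never called here, just as in the regex version, and a syntax error reports the offset where parsing stopped instead of a bare false.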
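Answer 1 (score: 2)
Expanding on @PaulCrovella's regex here. This version allows any level of nested parentheses enclosing any expression, parses them, and trims them accordingly. Whitespace is trimmed as well.
PHP example: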
$rx =
'/
^
(?<found> # (1 start)
\h*
(?:
# A function is a function name (consisting of one or more word characters)
# followed by an opening parenthesis, an optional parameter (expression),
# and a closing parenthesis.
(?<function> # (2 start)
(?>
(?<func_name> \w+ ) # (3)
\h* \(
(?<parameter> # (4 start)
(?&expression)
|
) # (4 end)
\h* \)
)
) # (2 end)
|
# A ternary is an opening \'if\' expression,
# a question mark, an expression evaluated when the \'if\' is true,
# a pipe, an expression evaluated when the \'if\' is false.
(?<pt> \( )? # (5)
(?<ternary> # (6 start)
(?>
(?<if> # (7 start)
(?&expression)
) # (7 end)
\h* \? \h*
(?<true> # (8 start)
(?&expression)
) # (8 end)
\h* \| \h*
(?<false> # (9 start)
(?&expression)
) # (9 end)
)
) # (6 end)
(?(\'pt\') \h* \) )
|
# A string, for simplicity\'s sake here, we\'ll call a sequence of word
# characters.
(?<string> # (10 start)
(?> \w+ )
) # (10 end)
|
(?<parens> # (11 start)
(?>
\( \h*
(?<parens_core> # (12 start)
\h*
(?&p_expression)
) # (12 end)
\h* \)
)
) # (11 end)
)
) # (1 end)
\h*
$
(?(DEFINE)
# expression is any function, parenthesized-ternary, or string.
(?<expression> # (13 start)
\h*
(?:
(?&function)
| \( (?&ternary) \h* \)
| (?&string)
| (?&parens)
)
) # (13 end)
# p_expression is any parenthesized - function, ternary, or string.
(?<p_expression> # (14 start)
\h*
(?:
(?&function)
| (?= . )
(?&ternary)
| (?&string)
| (?&parens)
)
) # (14 end)
)
/x';
function compute($expr) {
global $rx;
if (!preg_match($rx, $expr, $m)) {
return false;
}
if ($m['function']) {
if ($m['parameter']) {
return $m['func_name'](compute($m['parameter']));
} else {
return $m['func_name']( '' );
}
}
if ($m['ternary']) {
return compute($m['if']) ? compute($m['true']) : compute($m['false']);
}
if ($m['parens']) {
return compute($m['parens_core']);
}
return $m['string'];
}
function a() {return true; }
function b() {return false;}
function d() {return true;}
function e() {return false;}
function h() {return true;}
function intro($p) {if ($p) return 'intro'; return false;}
function type($p) {if ($p) return 'type'; return false;}
function insist($p) {if ($p) return 'insist'; return false;}
$tests = array(
"a",
"a()",
"a(b)",
"(a?b|c)",
"(a()?(b()?d|e)|(c()?f|g))",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(a(d(f))?b(e(f))|c)",
"------------",
"a?b|c",
"(a?b|c)",
" ( ( ( ( ( a ) ) ? ( ( b ) ) | ( ( c ) ) ) ) ) ",
"b( (oo ? p |u) ) ? x | y",
"a ?b() | c",
" a? ( b ? t | r) | d ",
"a()",
"a? (bhh ) |(c)",
"(a) ? ((b(oo) ? x | y )) | (c)",
"a(((b)))",
"a? (bhh ) |((c))",
"(a()?(b()?d|e)|(c()?f|g))",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)))",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(a(d(f))?b(e(f))|c)",
"------------",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(a(d(f))?b(e(f))|c)",
'(intro(intro(type(insist(poou))))?toutou|tutu)',
'type()intro(intro(type(insist(poou))))?type()|tutu'
);
foreach ($tests as $test) {
$result = compute($test);
echo "$test returns: ";
var_dump($result);
}
Output:
a() returns: bool(true)
a(b) returns: bool(true)
(a?b|c) returns: string(1) "b"
(a()?(b()?d|e)|(c()?f|g)) returns: string(1) "e"
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)) returns: string(1) "e"
(a(d(f))?b(e(f))|c) returns: bool(false)
------------ returns: bool(false)
a?b|c returns: string(1) "b"
(a?b|c) returns: string(1) "b"
( ( ( ( ( a ) ) ? ( ( b ) ) | ( ( c ) ) ) ) ) returns: string(1) "b"
b( (oo ? p |u) ) ? x | y returns: string(1) "y"
a ?b() | c returns: bool(false)
a? ( b ? t | r) | d returns: string(1) "t"
a() returns: bool(true)
a? (bhh ) |(c) returns: string(3) "bhh"
(a) ? ((b(oo) ? x | y )) | (c) returns: bool(false)
a(((b))) returns: bool(true)
a? (bhh ) |((c)) returns: string(3) "bhh"
(a()?(b()?d|e)|(c()?f|g)) returns: string(1) "e"
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)) returns: string(1) "e"
(((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))) returns: string(1) "e"
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)) returns: string(1) "e"
(a(d(f))?b(e(f))|c) returns: bool(false)
------------ returns: bool(false)
((h() ? a | i) ? (b() ? d | e) | (c() ? f | g)) returns: string(1) "e"
(a(d(f))?b(e(f))|c) returns: bool(false)
(intro(intro(type(insist(poou))))?toutou|tutu) returns: string(6) "toutou"
type()intro(intro(type(insist(poou))))?type()|tutu returns: bool(false)
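Answer 2 (score: 0)
Here is my "basic" version, which works and gets the job done. It is "half" recursive (a loop that can call a function), and the improvements I plan to make (handling a "+" separator to "add" the return values of two functions, and handling "=" to set variables as short aliases for the values returned by functions) seem easy to implement in the _compute() function... maybe because I wrote the code myself, and maybe because, as Paul Crovella said, I am not using PCRE, since that can very easily become an unmaintainable mess...
Note: this code could easily be optimized, and it is not perfect (there are cases where it does not work, such as (a()+b()))... but if anyone is willing to finish it, he or she is welcome!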
class Parser
{
private $ref = array(
'a' => array( 'type' => 'fn', 'val' => '_a'),
'b' => array( 'type' => 'fn', 'val' => '_b'),
'c' => array( 'type' => 'fn', 'val' => '_c'),
'd' => array( 'type' => 'fn', 'val' => '_d'),
'e' => array( 'type' => 'fn', 'val' => '_e'),
'f' => array( 'type' => 'fn', 'val' => '_f'),
'intro' => array( 'type' => 'fn', 'val' => '_getIntro'),
'insist' => array( 'type' => 'fn', 'val' => '_insist'),
'summoner_name' => array( 'type' => 'fn', 'val' => '_getSummonerName'),
'type' => array( 'type' => 'fn', 'val' => '_getEtat'),
' ' => array( 'type' => 'str', 'val' => ' ')
);
private function _a($p) { return 'valfnA'; }
private function _b($p) { return 'valfnB'; }
private function _c($p) { return 'valfnC'; }
private function _d($p) { return 'valfnD'; }
private function _e($p) { return 'valfnE'; }
private function _f($p) { return 'valfnF'; }
private function _getIntro($p) { return 'valGetIntro'; }
private function _insist($p) { return 'valInsist'; }
private function _getSummonerName($p) { return 'valGetSqmmonerName'; }
private function _getEtat($p) { return 'valGetEtat'; }
private function _convertKey($key, $params=false)
{
$retour = 'indéfini';
if (isset($this->ref[$key])) {
$val = $this->ref[$key];
switch ($val['type']) {
case 'fn':
$val=$val['val'];
if (method_exists($this, $val)) {
$retour = $this->$val($params);
}
break;
default:
if (isset($val['val'])) { // read the looked-up entry, not a nonexistent property
$retour = $val['val'];
}
break;
}
}
return $retour;
}
private function _compute($str)
{
$p=strpos($str, '?');
if ($p===false) {
$p=strpos($str, '=');
if ($p===false) {
return $str;
}
} else {
$or=strpos($str, '|');
if ($or===false) {
return false;
}
$s=substr($str,0,$p);
if (empty($s) || (strtolower($s)=='false')) {
return substr($str, $or+1);
}
return substr($str, $p+1, ($or-$p)-1);
}
return $str;
}
private function _getTexte($str, $i, $level)
{
if (empty($str)) {
return $str;
}
$level++;
$f = (strlen($str)-$i);
$val = substr($str, $i);
do {
$d = $i;
do {
$p=$d;
$d=strpos($str, '(', $p+1);
if (($p==$i) && ($d===false)) {
$retour = $this->_compute($str);
return $retour;
} elseif (($d===false) && ($p>$i)) {
$f=strpos($str, ')', $p+1);
if ($f===false) {
return false;
}
$d=$p;
while((--$d)>=$i) {
if (($str[$d]!=' ')
&& ($str[$d]!='_')
&& (!ctype_alnum($str[$d]))
) {
break;
}
}
if ($d>=$i) {
$d++;
} else {
$d=$i;
}
$val=substr($str, $d, ($f-$d)+1);
$fn=substr($str, $d, $p-$d);
$param=$this->_getTexte(
substr($str, $p+1, ($f-$p)-1), 0, $level+1
);
if (!empty($fn)) {
$val = $this->_convertKey($fn, $param);
} else {
$val = $this->_compute($param);
}
$str = substr($str, 0, $d).$val.substr($str, $f+1);
break;
} elseif ($d===false) {
break;
}
} while (true);
} while (true);
}
public function parse($str)
{
$retour=preg_replace('/{\*[^.{]+\*}/', '', $str); //}
$retour=str_replace("\n", "", $retour);
$retour=str_replace("\r", "", $retour);
// Collapse runs of spaces; searching for a single space would loop forever.
while (strpos($retour, '  ')!==false) {
$retour=str_replace("  ", " ", $retour);
}
return trim($this->_getTexte($retour, 0, 0));
}
}
$p=new Parser();
$tests = [
"a",
"a()",
"a(b)",
"(a?b|c)",
"(a()?(b()?d|e)|(c()?f|g))",
"(a()?(b()?d|e)|(c()?f()|g))",
"((h() ? a | i) ? (b() ? d | e) | (c() ? f | g))",
"(a(d(f))?b(e(f))|c)",
'(intro(intro(type(insist(poou))))?toutou|tutu)',
'type()intro(intro(type(insist(poou))))?type()|tutu'
];
foreach ($tests as $test) {
$res=$p->parse($test);
echo $test.' = '.var_export($res,true)."\n";
}
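The clean-up step at the top of parse() can also be tried on its own. The sketch below is a variation, not the code above: it assumes comments are always written as {* ... *} and never nested, and it uses a non-greedy match so that a comment containing a period is also removed (the character class [^.{] in parse() would skip such a comment). The sample string is made up.

```php
<?php
$src = "(a() ? {* comment. with a dot *}\n  b\n| {* other *} c)";

// Remove {* ... *} comments; the s modifier lets . span line breaks.
$clean = preg_replace('/\{\*.*?\*\}/s', '', $src);

// Collapse every run of whitespace (including newlines) to one space.
$clean = trim(preg_replace('/\s+/', ' ', $clean));

var_dump($clean); // string(13) "(a() ? b | c)"
```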