Here is an example string:
a (b, c(d and/or e, f, g), h, i[j, k]), l (m, n(o, p[q, r{s or t,u}, v]), w)
I need to parse it into this:
{
-a
-b
-c
-d
-and/or
-e
-f
-g
-h
-i
-j
-k
-l
-m
-n
-o
-p
-q
-r
-s
-t
-or
-u
-v
-w
}
I started messing around with some regular expressions, but it got ugly fast. Any suggestions?
Thanks.
Answer 0 (score: 1)
I don't know your exact rules, but this code basically does the job:
<?php
$string = 'a (b, c(d and/or e, f, g), h, i[j, k]), l (m, n(o, p[q, r{s or t,u}, v]), w)';
$indentLevel = 0;
$out = '';
echo '{'."\n";
// Split the string into an array of characters (AFAIK that is basically how most parsers work) and iterate over it
foreach (str_split($string) as $x) {
    // Determine whether this character is a token terminator
    $isTerminator = in_array($x, array(' ', ',', '(', '[', '{', ')', ']', '}'));
    if ($isTerminator && strlen($out) > 0) {
        // A terminator ends the current token, so print it, but only if there is something to print
        echo str_repeat("\t", $indentLevel).'-'.$out."\n";
        $out = '';
    } elseif (!$isTerminator) {
        // Not a terminator: append to the current token (multi-character token support)
        $out .= $x;
    }
    if (in_array($x, array('(', '[', '{'))) {
        // Opening bracket: increase the indent
        $indentLevel++;
    } elseif (in_array($x, array(')', ']', '}'))) {
        // Closing bracket: decrease the indent
        $indentLevel--;
    }
    // A negative level means a closing bracket has no matching opening bracket
    if ($indentLevel < 0) {
        die('Syntax error');
    }
}
// Print the last token if the string does not end with a terminator
if (strlen($out) > 0) {
    echo str_repeat("\t", $indentLevel).'-'.$out."\n";
}
// A non-zero level at the end means an opening bracket was never closed
if ($indentLevel !== 0) {
    die('Syntax error');
}
echo '}'."\n";
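Note that I added support for multi-character tokens, which you didn't ask for, but I think it shows the basic idea better.
I hope you get the basic idea and can extend this code into a parser for your specific needs.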
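As one possible extension, here is a rough sketch of stricter bracket checking. The loop above only counts brackets, so something like `(a]` would pass. Keeping a stack of the opening brackets catches mismatched types as well. The function name `checkBrackets` is made up for this illustration and is not part of the code above.

```php
<?php
// Hypothetical helper (an assumption, not part of the original answer):
// returns true when every bracket in $string is closed by the matching
// type and in the right order.
function checkBrackets(string $string): bool
{
    // Map each closing bracket to the opening bracket it must pair with
    $pairs = array(')' => '(', ']' => '[', '}' => '{');
    $stack = array();

    foreach (str_split($string) as $x) {
        if (in_array($x, $pairs, true)) {
            // Opening bracket: remember it
            $stack[] = $x;
        } elseif (isset($pairs[$x])) {
            // Closing bracket: must match the most recent opening bracket
            // (array_pop returns null on an empty stack, which never matches)
            if (array_pop($stack) !== $pairs[$x]) {
                return false;
            }
        }
    }

    // Anything left on the stack was never closed
    return count($stack) === 0;
}

var_dump(checkBrackets('a (b, c[d])')); // bool(true)
var_dump(checkBrackets('a (b, c[d)]')); // bool(false)
```

You could call something like this before the main loop and bail out early, then keep the loop itself focused on printing.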
Answer 1 (score: 0)
It won't win any beauty contests, but it works:
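<?php
$s = 'a (b, c(d and/or e, f, g), h, i[j, k]), l (m, n(o, p[q, r{s or t,u}, v]), w)';
$sep = array(',', ' ');
$open = array('(', '[', '{');
$close = array(')', ']', '}');
// Returns a nested array: tokens as '-token' strings, bracketed groups as sub-arrays
function parse($s)
{
    global $sep, $open, $close;
    $chars = str_split($s);
    $arr = array();
    $collect = '';
    for ($i = 0; $i < count($chars); $i++) {
        $c = $chars[$i];
        if (in_array($c, $open)) {
            // Collect everything up to the matching closing bracket
            $parens = 1;
            $inner = '';
            do {
                $i++;
                // Stop instead of looping forever when a bracket is never closed
                if ($i >= count($chars)) {
                    throw new Exception('Syntax error: unclosed bracket');
                }
                $ch = $chars[$i];
                if (in_array($ch, $open)) {
                    $parens++;
                } elseif (in_array($ch, $close)) {
                    $parens--;
                }
                if ($parens > 0) {
                    $inner .= $ch;
                }
            } while ($parens > 0);
            // Store the token that directly precedes the bracket, e.g. the "c" in "c(d"
            if ($collect !== '') {
                $arr[] = '-'.$collect;
            }
            // Parse the bracket contents recursively
            $arr[] = parse($inner);
            $collect = '';
            continue;
        }
        if (in_array($c, $sep)) {
            // A separator ends the current token, if there is one
            if ($collect === '') {
                continue;
            }
            $arr[] = '-'.$collect;
            $collect = '';
        } else {
            $collect .= $c;
        }
    }
    // Store the last token if the string does not end with a separator
    if ($collect !== '') {
        $arr[] = '-'.$collect;
    }
    return $arr;
}
print_r(parse($s));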
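If you want indented text rather than the `print_r` dump, something along these lines might work. This is only a sketch: `render` is a made-up helper name, and the sample array is written by hand in the shape `parse()` returns for `'a (b, c(d))'`, so the snippet runs on its own.

```php
<?php
// Hypothetical helper (an assumption, not part of the answer above): turns a
// nested array of '-token' strings and sub-arrays into indented text, one
// tab per nesting level.
function render(array $items, int $depth = 0): string
{
    $out = '';
    foreach ($items as $item) {
        if (is_array($item)) {
            // Nested group: recurse one level deeper
            $out .= render($item, $depth + 1);
        } else {
            // Plain token: indent it by the current depth
            $out .= str_repeat("\t", $depth) . $item . "\n";
        }
    }
    return $out;
}

// Hand-written sample in the same shape as parse('a (b, c(d))')
$sample = array('-a', array('-b', '-c', array('-d')));
echo render($sample);
```

With the real parser you would pass `parse($s)` instead of `$sample`.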