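I'd like to parse a token file like the one below to get token name/value pairs. The token/value/nesting relationships are already defined, so I can't change how the token file is produced. A context-free grammar looks like it might be the best approach, but I have no experience writing or implementing one. Is it possible to use regular expressions instead? I haven't had any luck with the nested multi-line tokens (e.g. Master1, Servant2).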
;token1 = I am a top level single line token
;token2 {
I am a top level
multiline line token
}
master1 {
;servant1 = I am Master1, Servant1 single line token
;servant2 {
I am Master1, Servant2.
A mulit line token.
}
;servant3 = I am Master1, Servant3
}
master2 {
;servant1 = I am Master2, Servant1
;servant2 {
I am Master2, Servant2
A mulit line token.
}
;servant3 = I am Master2, Servant3
}
Answer 0 (score: 3)
PHP has a function for tokenizing strings, strtok.
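As a rough sketch of how strtok is typically called (the variable names here are only illustrative): the first call takes the string and the delimiter characters, and each later call takes only the delimiters and continues where the previous call stopped, returning false once nothing is left.

```php
<?php
// Illustrative input; any string works.
$string = "This is an example string";
$words = [];

// First call: pass the string and the delimiter characters.
$tok = strtok($string, " ");
while ($tok !== false) {
    $words[] = $tok;
    // Later calls: pass only the delimiters to continue with the same string.
    $tok = strtok(" ");
}

print_r($words);
```

Note that this splits on delimiter characters only; on its own it does not keep track of nesting, so it would still need extra bookkeeping for the braces in the token file.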
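strtok splits a string (str) into smaller strings (tokens), with each token being delimited by any character from token. That is, if you have a string like "This is an example string", you could tokenize it into its individual words by using the space character as the token.
Answer 1 (score: 2)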
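Here's a fairly simple walking parser. I originally tried to write a regex for it, but the missing leading ; at the start of the multi-line masters (master1, master2) made that much harder; without the missing ; it would be fairly easy to write. I gave up and wrote this instead: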
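function getTokens($string) {
    $string = trim($string);
    $lines = explode("\n", $string);
    $data = array();
    $key = '';
    $open = 0;    // current brace nesting depth
    $buffer = ''; // raw lines collected inside the current block
    foreach ($lines as $line) {
        $line = trim($line);
        if ($line === '') {
            // Skip blank lines (a strict check, so a line containing only "0" is kept).
            continue;
        } elseif (strpos($line, '}') === 0) {
            // A closing brace ends the block at the current depth.
            $open--;
            if ($open == 0) {
                // The outermost block is closed: parse its body recursively.
                $data[$key] = getTokens($buffer);
                $buffer = '';
            } elseif ($open < 0) {
                throw new Exception('Unmatched }');
            } else {
                // Closing brace of a nested block: keep it for the recursive call.
                $buffer .= "\n" . $line;
            }
        } elseif ($open > 0) {
            // Inside a block: collect the line, tracking nested openings.
            if (strpos($line, '{') !== false) {
                $open++;
            }
            $buffer .= "\n" . $line;
        } elseif ($line[0] == ';') {
            if (strpos($line, "=") !== false) {
                // Single-line token: ;name = value
                list ($key, $value) = explode("=", $line, 2);
                $key = trim(substr($key, 1));
                $value = trim($value);
                $data[$key] = $value;
            } elseif (strpos($line, "{") !== false) {
                // Start of a multi-line token: ;name {
                $open++;
                list ($key, $value) = explode("{", $line, 2);
                $key = trim(substr($key, 1));
            } else {
                throw new Exception('Unmatched token ;');
            }
        } elseif (strpos($line, '{') !== false) {
            // Start of a block without the leading ; (e.g. master1 {)
            $open++;
            list ($key, $value) = explode("{", $line, 2);
            $key = trim($key);
        } else {
            // Plain text line: part of a multi-line value.
            $buffer .= "\n" . $line;
        }
    }
    if ($open > 0) {
        throw new Exception('Unmatched {');
    } elseif (empty($data) && !empty($buffer)) {
        // No tokens found, only text: return it as the value.
        return trim($buffer);
    }
    return $data;
}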
When I feed it your string as input, I get:
Array(
"token1" => "I am a top level single line token",
"token2" => "I am a top level
multiline line token",
"master1" => Array(
"servant1" => "I am Master1, Servant1 single line token",
"servant2" => "I am Master1, Servant2.
A mulit line token.",
"servant3" => "I am Master1, Servant3",
),
"master2" => Array(
"servant1" => "I am Master2, Servant1",
"servant2" => "I am Master2, Servant2
A mulit line token.",
"servant3" => "I am Master2, Servant3",
),
)
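Since the question asks for token name/value pairs, one possible follow-up is to flatten the nested array into a single list of pairs. This is only a sketch under stated assumptions: flattenTokens is a hypothetical helper that is not part of the answer above, the "." separator is an arbitrary choice, and the sample array is a trimmed-down copy of the output shown above.

```php
<?php
// Hypothetical helper: flattens a nested array shaped like the getTokens()
// output into "parent.child" => value pairs. The "." separator is arbitrary.
function flattenTokens(array $tokens, string $prefix = ''): array
{
    $flat = [];
    foreach ($tokens as $name => $value) {
        $path = $prefix === '' ? (string) $name : $prefix . '.' . $name;
        if (is_array($value)) {
            // Nested block: recurse and merge the child pairs.
            $flat += flattenTokens($value, $path);
        } else {
            $flat[$path] = $value;
        }
    }
    return $flat;
}

// Trimmed-down sample in the same shape as the output above.
$tokens = [
    'token1' => 'I am a top level single line token',
    'master1' => [
        'servant1' => 'I am Master1, Servant1 single line token',
        'servant3' => 'I am Master1, Servant3',
    ],
];

$pairs = flattenTokens($tokens);
foreach ($pairs as $name => $value) {
    echo $name, ' = ', $value, "\n";
}
```

With the sample above, the nested entries come out under names such as master1.servant1, which keeps the master/servant relationship visible in a flat list.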