I am trying to parse a string recursively with StringTokenizer. The string represents a tree, in the format:
[(0,1),[(00,01,02),[()],[()]]]
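where a node's information is stored inside parentheses, and the square brackets that follow it are the node's children, separated by commas. For example, the string above represents a tree whose root (0,1) has one child (00,01,02), which in turn has two children with empty parentheses.
If a node has something inside its parentheses, it is an ordinary node; if the parentheses are empty, it is a leaf.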
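I wrote the code below to parse it. It mostly works, but when the recursion unwinds the tokenizer seems to have no tokens left to analyze. The problem is that when it reaches the final brackets (]]]), it jumps straight to the end and skips the remaining brackets.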
import java.util.*;
public class ParseString
{
public void setParameters(String parameters) throws Exception {
setParameters(new StringTokenizer(parameters, "[(,)]", true));
}
public void setParameters(StringTokenizer tokenizer) throws Exception{
String buf;
try{
if (!(buf = tokenizer.nextToken()).equals("["))
throw new Exception("Malformed string, found " + buf + "instead of [");
boolean isLeaf = setWeights(tokenizer);
System.out.println("Leaf: " + isLeaf);
while (!(buf = tokenizer.nextToken()).equals("]")) {
do{
setParameters(tokenizer);
}while (!(tokenizer.nextToken().equals("]")));
if (!(buf = tokenizer.nextToken()).equals(","))
break;
}
}catch(Exception e){e.printStackTrace();}
}
public boolean setWeights(StringTokenizer tokenizer) throws
Exception{
String buf;
if(!(buf = tokenizer.nextToken()).equals("("))
throw new Exception("Malformed string, found "+ buf + "instead of (");
do{
buf = tokenizer.nextToken();
if(buf.equals(")")){
return true;
}
if(!buf.equals(","))
System.out.println(buf);
}while(!tokenizer.nextToken().equals(")"));
return false;
}
public static void main(String[] args)
{
ParseString ps = new ParseString();
try{
ps.setParameters("[(0,1),[(00,01,02),[()],[()]]]");
}catch(Exception e){e.printStackTrace();}
}
}
This is the output I get when I run it:
0
1
Leaf: false
00
01
02
Leaf: false
Leaf: true
Leaf: true
java.util.NoSuchElementException
at java.util.StringTokenizer.nextToken(StringTokenizer.java:349)
at ParseString.setParameters(ParseString.java:22)
at ParseString.setParameters(ParseString.java:7)
at ParseString.main(ParseString.java:51)
One more thing: the parser should be able to handle any generic tree, not just this one. I would be glad if someone could solve this.
Answer 0 (score: 1)
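I think that in some cases the nested loops consume ] twice, which may eat the parent's closing bracket.
I would simply make the structure more explicit, along these lines: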
// Precondition: '[' expected
// Postcondition: Matching ']' consumed
void parseNode(StringTokenizer st) {
if (!st.nextToken().equals("[")) {
throw new RuntimeException("[ expected parsing node.");
}
boolean leaf = parseWeights(st);
System.out.println("isleaf: " + leaf);
// Behind ')': Parse children if any.
String token = st.nextToken();
while (token.equals(",")) {
parseNode(st);
token = st.nextToken();
}
if (!token.equals("]")) {
throw new RuntimeException("] expected.");
}
}
// Precondition: '(' expected
// Postcondition: Matching ')' consumed
boolean parseWeights(StringTokenizer st) {
if (!st.nextToken().equals("(")) {
throw new RuntimeException("( expected parsing node weights.");
}
String token = st.nextToken();
if (token.equals(")")) {
return true;
}
while(true) {
System.out.println(token);
token = st.nextToken();
if (token.equals(")")) {
break;
}
if (!token.equals(",")) {
throw new RuntimeException(", or ) expected parsing weights.");
}
token = st.nextToken();
}
return false;
}
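Since the question asks for a parser that handles any tree, here is a minimal self-contained sketch of the same structure that returns the tree instead of printing it. The names TreeParser, Node, expect and next are my own and not from the original post, and the leaf rule (empty parentheses) is taken from the question. Treat it as one possible arrangement, not a definitive implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical names: TreeParser and Node do not appear in the original post.
class TreeParser {

    // A node holds the values found in its parentheses plus its child nodes.
    static final class Node {
        final List<String> weights = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        // Convention from the question: empty parentheses mean a leaf.
        boolean isLeaf() {
            return weights.isEmpty();
        }
    }

    static Node parse(String input) {
        StringTokenizer st = new StringTokenizer(input, "[(,)]", true);
        Node root = parseNode(st);
        if (st.hasMoreTokens()) {
            throw new IllegalArgumentException("Unexpected trailing token: " + st.nextToken());
        }
        return root;
    }

    // Precondition: next token is '['. Postcondition: matching ']' consumed.
    private static Node parseNode(StringTokenizer st) {
        expect(st, "[");
        Node node = new Node();
        parseWeights(st, node);
        String token = next(st);
        while (token.equals(",")) {          // each ',' introduces one child
            node.children.add(parseNode(st));
            token = next(st);
        }
        if (!token.equals("]")) {
            throw new IllegalArgumentException("] expected, found " + token);
        }
        return node;
    }

    // Precondition: next token is '('. Postcondition: matching ')' consumed.
    private static void parseWeights(StringTokenizer st, Node node) {
        expect(st, "(");
        String token = next(st);
        while (!token.equals(")")) {
            node.weights.add(token);
            token = next(st);
            if (token.equals(",")) {
                token = next(st);            // move on to the next weight
            } else if (!token.equals(")")) {
                throw new IllegalArgumentException(", or ) expected, found " + token);
            }
        }
    }

    private static void expect(StringTokenizer st, String expected) {
        String token = next(st);
        if (!token.equals(expected)) {
            throw new IllegalArgumentException(expected + " expected, found " + token);
        }
    }

    // Fails with a descriptive message instead of NoSuchElementException.
    private static String next(StringTokenizer st) {
        if (!st.hasMoreTokens()) {
            throw new IllegalArgumentException("Unexpected end of input");
        }
        return st.nextToken();
    }

    public static void main(String[] args) {
        Node root = parse("[(0,1),[(00,01,02),[()],[()]]]");
        System.out.println("root weights: " + root.weights);
        System.out.println("root children: " + root.children.size());
    }
}
```

Each method consumes exactly the tokens named in its postcondition, so a closing bracket is read only by the call that read the matching opening bracket.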
Answer 1 (score: 0)
You are calling tokenizer.nextToken() without checking whether another token is available (you can check by calling tokenizer.hasMoreTokens()). Check first, and if hasMoreTokens() returns false, simply exit the method with return;.
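As a rough illustration of that check, the sketch below wraps the two calls in a small helper. The class name TokenGuard and the method nextOrNull are made up for this example. Note that a guard like this only avoids the NoSuchElementException; if brackets are consumed in the wrong place, that logic still has to be fixed separately.

```java
import java.util.StringTokenizer;

// Hypothetical helper; TokenGuard and nextOrNull are illustrative names.
class TokenGuard {

    // Returns the next token, or null when the tokenizer is exhausted.
    static String nextOrNull(StringTokenizer tokenizer) {
        return tokenizer.hasMoreTokens() ? tokenizer.nextToken() : null;
    }

    public static void main(String[] args) {
        // returnDelims = true, as in the question, so delimiters are tokens too.
        StringTokenizer tokenizer = new StringTokenizer("[()]", "[(,)]", true);
        String token;
        while ((token = nextOrNull(tokenizer)) != null) {
            System.out.println(token);
        }
    }
}
```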
But IMO it is better to put all the tokens into a list first and then walk through it in a simpler way:
String s = "[(0,1),[(00,01,02),[()],[()]]]";
StringTokenizer strtok = new StringTokenizer(s, "[(,)]", true);
// put tokens in a list
List<String> list = new ArrayList<>();
while (strtok.hasMoreTokens()) {
list.add(strtok.nextToken());
}
// parse it, starting at position 0
parse(list, 0);
// parse method
public void parse(List<String> list, int position) {
if (position > list.size() - 1) {
// no more elements, stop
return;
}
String element = list.get(position);
if (")".equals(element)) { // end of node
// is leaf if previous element was the matching "("
System.out.println("Leaf:" + "(".equals(list.get(position - 1)));
} else if (!("[".equals(element) || "(".equals(element) || "]".equals(element) || ",".equals(element))) {
// print only contents of a node (ignoring delimiters)
System.out.println(element);
}
// parse next element
parse(list, position + 1);
}
The output is:
0
1
Leaf:false
00
01
02
Leaf:false
Leaf:true
Leaf:true
If you want nested/indented output, you can add a level variable to the parse method:
public void parse(List<String> list, int position, int level) {
if (position > list.size() - 1) {
return;
}
String element = list.get(position);
int nextLevel = level;
if ("[".equals(element)) {
nextLevel++;
} else if ("]".equals(element)) {
nextLevel--;
} else if (")".equals(element)) {
for (int i = 0; i < nextLevel; i++) {
System.out.print(" ");
}
System.out.println("Leaf:" + "(".equals(list.get(position - 1)));
} else if (!("(".equals(element) || "]".equals(element) || ",".equals(element))) {
for (int i = 0; i < nextLevel; i++) {
System.out.print(" ");
}
System.out.println(element);
}
parse(list, position + 1, nextLevel);
}
Then, if I call it (using the same list as above):
// starting at position zero and level zero
parse(list, 0, 0);
The output will be:
 0
 1
 Leaf:false
  00
  01
  02
  Leaf:false
   Leaf:true
   Leaf:true
All elements at the same level have the same indentation.
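One caveat with the list approach: parse makes one recursive call per token, so the call depth grows with the length of the input rather than with the depth of the tree. A plain loop does the same walk. The sketch below is my own variation, with the made-up names IterativePrinter, render and tokenize; it builds the text in a StringBuilder instead of printing directly, and uses one space per level.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical class; a loop-based variant of the level-aware parse method.
class IterativePrinter {

    static List<String> tokenize(String s) {
        StringTokenizer strtok = new StringTokenizer(s, "[(,)]", true);
        List<String> list = new ArrayList<>();
        while (strtok.hasMoreTokens()) {
            list.add(strtok.nextToken());
        }
        return list;
    }

    // Walks the token list once, tracking the bracket nesting level.
    static String render(List<String> tokens) {
        StringBuilder out = new StringBuilder();
        int level = 0;
        for (int i = 0; i < tokens.size(); i++) {
            String element = tokens.get(i);
            switch (element) {
                case "[":
                    level++;
                    break;
                case "]":
                    level--;
                    break;
                case "(":
                case ",":
                    break; // delimiters carry no content of their own
                case ")":
                    // leaf if the previous token was the matching "("
                    boolean leaf = i > 0 && "(".equals(tokens.get(i - 1));
                    indent(out, level).append("Leaf:").append(leaf).append('\n');
                    break;
                default:
                    indent(out, level).append(element).append('\n');
            }
        }
        return out.toString();
    }

    private static StringBuilder indent(StringBuilder out, int level) {
        for (int i = 0; i < level; i++) {
            out.append(' ');
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.print(render(tokenize("[(0,1),[(00,01,02),[()],[()]]]")));
    }
}
```

Like the recursive version, this does not validate that brackets are balanced; it only reports what it sees.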