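I am trying to write a small program that extracts content from nested parentheses. For example, given the string:
"content (content1 (content2, content3) content4 (content5 (content6, content7))"
I would like to get back (in an ArrayList or some other Collection):
["content", "content1", "content2, content3", "content4", "content5", "content6, content7"]
Are there any existing libraries or algorithms that could help me with this?
Thanks in advance!
Edit
Thanks for the suggestions, but content2 and content3 should be kept in the same string in the final list, because they sit inside the same pair of parentheses.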
Answer 0 (score: 2)
This seems to work for the one example given above:
import java.util.ArrayList;
import java.util.List;

public class ParseParenthesizedString {

    // STARTING: nothing read yet. TOKEN: collecting text. BETWEEN: just passed a parenthesis.
    public enum States { STARTING, TOKEN, BETWEEN }

    public static void main(String[] args) {
        ParseParenthesizedString theApp = new ParseParenthesizedString();
        theApp.answer();
    }

    public void answer() {
        String theString =
            "content (content1 (content2, content3) content4 (content5 (content6, content7))";
        // wants:
        // ["content", "content1", "content2, content3", "content4", "content5", "content6, content7"]
        States state = States.STARTING;
        List<String> theStrings = new ArrayList<>();
        StringBuilder temp = new StringBuilder();

        for (int i = 0; i < theString.length(); i++) {
            char cTemp = theString.charAt(i);
            switch (cTemp) {
                case '(':
                    // An opening parenthesis ends the token collected so far, if any.
                    if (state == States.TOKEN) {
                        addToken(temp, theStrings);
                    }
                    state = States.BETWEEN;
                    break;
                case ')':
                    if (state == States.STARTING) {
                        /* this is an error: ')' before any content */
                    } else if (state == States.TOKEN) {
                        addToken(temp, theStrings);
                        state = States.BETWEEN;
                    }
                    break;
                default:
                    // Any other character, including whitespace and commas, belongs to a token.
                    state = States.TOKEN;
                    temp.append(cTemp);
            }
        }
        // Text after the last parenthesis would otherwise be dropped.
        if (state == States.TOKEN) {
            addToken(temp, theStrings);
        }
        printList(theStrings);
    }

    // Adds the trimmed buffer to the list (skipping whitespace-only text) and clears the buffer.
    private static void addToken(StringBuilder temp, List<String> theStrings) {
        String token = temp.toString().trim();
        if (!token.isEmpty()) {
            theStrings.add(token);
        }
        temp.setLength(0);
    }

    public static void printList(List<String> theList) {
        System.out.println("The ArrayList with " + theList.size() + " elements:");
        for (int i = 0; i < theList.size(); i++) {
            System.out.println(i + ":" + theList.get(i));
        }
    }
}
Output:
The ArrayList with 6 elements:
0:content
1:content1
2:content2, content3
3:content4
4:content5
5:content6, content7
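As a side note, the sample string contains four opening parentheses but only three closing ones, so it is not actually balanced. The following is a minimal sketch, not part of the answer above, of a more compact version of the same scan that also reports how many parentheses are left open. The class name ParenGroups and the methods extract and unclosedDepth are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; the names are illustrative, not from the original answer.
public class ParenGroups {

    // Cuts the input at every '(' or ')' and returns the trimmed, non-empty pieces.
    public static List<String> extract(String input) {
        List<String> groups = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (c == '(' || c == ')') {
                flush(current, groups);
            } else {
                current.append(c);
            }
        }
        flush(current, groups); // keep text after the last parenthesis
        return groups;
    }

    // Returns the number of '(' left unclosed, or -1 if a ')' has no matching '('.
    public static int unclosedDepth(String input) {
        int depth = 0;
        for (char c : input.toCharArray()) {
            if (c == '(') {
                depth++;
            } else if (c == ')' && --depth < 0) {
                return -1;
            }
        }
        return depth;
    }

    // Moves the buffered text into the list unless it is empty or whitespace-only.
    private static void flush(StringBuilder current, List<String> groups) {
        String piece = current.toString().trim();
        if (!piece.isEmpty()) {
            groups.add(piece);
        }
        current.setLength(0);
    }

    public static void main(String[] args) {
        String s = "content (content1 (content2, content3) content4 (content5 (content6, content7))";
        for (String group : extract(s)) {
            System.out.println("[" + group + "]");
        }
        System.out.println("unclosed: " + unclosedDepth(s));
    }
}
```

Whether an unbalanced input should be rejected or tolerated depends on the use case; this sketch tolerates it and leaves the decision to the caller.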
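Answer 1 (score: 0)
Java's String.split() will do the job for you. It takes a regular expression that defines the delimiter between tokens. In your case the delimiter appears to be a parenthesis or a comma, optionally surrounded by whitespace on either side. So this should do the trick: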
String[] result = s.split("\\s*[\\(\\),]+\\s*");
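Because the character class above includes the comma, content2 and content3 come back as separate elements, which the edit to the question rules out. A possible variant, sketched here under the assumption that only parentheses should act as delimiters, drops the comma from the character class. The class name SplitOnParens is hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical wrapper class for illustration only.
public class SplitOnParens {

    // Splits on runs of parentheses with optional surrounding whitespace,
    // so commas inside a group stay part of the token.
    public static List<String> split(String s) {
        return Arrays.asList(s.split("\\s*[()]+\\s*"));
    }

    public static void main(String[] args) {
        String s = "content (content1 (content2, content3) content4 (content5 (content6, content7))";
        for (String part : split(s)) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Note that String.split() discards trailing empty strings but not a leading one, so an input that starts with a parenthesis would yield an empty first element with this approach.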
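Answer 2 (score: -1)
If the parentheses do not matter to you (that is, the result does not depend on the nesting), then String.split with a simple regular expression may be enough: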
String[] result = input.split("[ ,()]+");
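To make the effect of this pattern concrete, here is a small sketch that runs it on the string from the question. The class name FlatSplit is made up for the example. Since spaces and commas are delimiters too, every word becomes its own element.

```java
// Hypothetical demo class for illustration only.
public class FlatSplit {

    // Treats any run of spaces, commas and parentheses as a single delimiter.
    public static String[] split(String input) {
        return input.split("[ ,()]+");
    }

    public static void main(String[] args) {
        String input = "content (content1 (content2, content3) content4 (content5 (content6, content7))";
        String[] result = split(input);
        // prints: 8 tokens: content | content1 | content2 | content3 | content4 | content5 | content6 | content7
        System.out.println(result.length + " tokens: " + String.join(" | ", result));
    }
}
```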