字符串中包含3条规则:
它包含单词或组(用括号括起来),并且组可以嵌套;
如果单词或组之间有空格,则这些单词或组应附加“ +”。
例如:
"a b" needs to be "+a +b"
"a (b c)" needs to be "+a +(+b +c)"
|
,则这些单词或组应用括号括起来。
例如:"a|b" needs to be "(a b)"
"a|b|c" needs to be "(a b c)"
考虑所有规则,这是另一个示例:
"aa|bb|(cc|(ff gg)) hh" needs to be "+(aa bb (cc (+ff +gg))) +hh"
我尝试使用正则表达式,堆栈和recursive descent parser logic,但仍不能完全解决问题。
有人可以分享这个问题的逻辑或伪代码吗?
新编辑: 一个更重要的规则:竖线的优先级更高。
例如:
aa|bb hh cc|dd (a|b)
必须为+(aa bb) +hh +(cc dd) +((a b))
(aa dd)|bb|cc (ee ff)|(gg hh)
必须为+((+aa +dd) bb cc) +((+ee +ff) (+gg +hh))
新编辑: 为了解决优先级问题,我找到了一种在调用Sunil Dabburi的方法之前添加括号的方法。
例如:
aa|bb hh cc|dd (a|b)
将是(aa|bb) hh (cc|dd) (a|b)
(aa dd)|bb|cc (ee ff)|(gg hh)
将是((aa dd)|bb|cc) ((ee ff)|(gg hh))
由于性能不是我的应用程序的大问题,因此至少可以使它对我有用。我猜 JavaCC 工具可以很好地解决此问题。希望其他人可以继续讨论并贡献这个问题。
答案 0 :(得分:2)
我试图解决问题。不确定是否在所有情况下均有效。验证了问题中提供的输入,效果很好。
因此,通常我们有<left_part><parantheses_code><right_part>
。现在left_part
可以为空,相似的right_part
可以为空。我们需要处理这种情况。
另外,如果right_part
以空格开头,则需要根据空间要求在left_part
上添加'+'。
注意:我不确定(a|b)
会有什么期望。结果应为((a b))
还是(a b)
。我完全按照((a b))
的定义去做。
现在这是工作代码:
public class Test {
public static void main(String[] args) {
String input = "aa|bb hh cc|dd (a|b)";
String result = formatSpaces(formatPipes(input)).replaceAll("\\$", " ");
System.out.println(result);
}
private static String formatPipes(String input) {
while (true) {
char[] chars = input.toCharArray();
int pIndex = input.indexOf("|");
if (pIndex == -1) {
return input;
}
input = input.substring(0, pIndex) + '$' + input.substring(pIndex + 1);
int first = pIndex - 1;
int closeParenthesesCount = 0;
while (first >= 0) {
if (chars[first] == ')') {
closeParenthesesCount++;
}
if (chars[first] == '(') {
if (closeParenthesesCount > 0) {
closeParenthesesCount--;
}
}
if (chars[first] == ' ') {
if (closeParenthesesCount == 0) {
break;
}
}
first--;
}
String result;
if (first > 0) {
result = input.substring(0, first + 1) + "(";
} else {
result = "(";
}
int last = pIndex + 1;
int openParenthesesCount = 0;
while (last <= input.length() - 1) {
if (chars[last] == '(') {
openParenthesesCount++;
}
if (chars[last] == ')') {
if (openParenthesesCount > 0) {
openParenthesesCount--;
}
}
if (chars[last] == ' ') {
if (openParenthesesCount == 0) {
break;
}
}
last++;
}
if (last >= input.length() - 1) {
result = result + input.substring(first + 1) + ")";
} else {
result = result + input.substring(first + 1, last) + ")" + input.substring(last);
}
input = result;
}
}
private static String formatSpaces(String input) {
if (input.isEmpty()) {
return "";
}
int startIndex = input.indexOf("(");
if (startIndex == -1) {
if (input.contains(" ")) {
String result = input.replaceAll(" ", " +");
if (!result.trim().startsWith("+")) {
result = '+' + result;
}
return result;
} else {
return input;
}
}
int endIndex = startIndex + matchingCloseParenthesesIndex(input.substring(startIndex));
if (endIndex == -1) {
System.out.println("Invalid input!!!");
return "";
}
String first = "";
String last = "";
if (startIndex > 0) {
first = input.substring(0, startIndex);
}
if (endIndex < input.length() - 1) {
last = input.substring(endIndex + 1);
}
String result = formatSpaces(first);
String parenthesesStr = input.substring(startIndex + 1, endIndex);
if (last.startsWith(" ") && first.isEmpty()) {
result = result + "+";
}
result = result + "("
+ formatSpaces(parenthesesStr)
+ ")"
+ formatSpaces(last);
return result;
}
private static int matchingCloseParenthesesIndex(String input) {
int counter = 1;
char[] chars = input.toCharArray();
for (int i = 1; i < chars.length; i++) {
char ch = chars[i];
if (ch == '(') {
counter++;
} else if (ch == ')') {
counter--;
}
if (counter == 0) {
return i;
}
}
return -1;
}
}
答案 1 :(得分:2)
这是我的尝试。根据您的示例以及我提出的一些示例,我相信根据规则,这是正确的。我将问题分为两部分来解决。
private String transformString(String input) {
Stack<Pair<Integer, String>> childParams = new Stack<>();
String parsedInput = input;
int nextInt = Integer.MAX_VALUE;
Pattern pattern = Pattern.compile("\\((\\w|\\|| )+\\)");
Matcher matcher = pattern.matcher(parsedInput);
while (matcher.find()) {
nextInt--;
parsedInput = matcher.replaceFirst(String.valueOf(nextInt));
String childParam = matcher.group();
childParams.add(Pair.of(nextInt, childParam));
matcher = pattern.matcher(parsedInput);
}
parsedInput = transformBasic(parsedInput);
while (!childParams.empty()) {
Pair<Integer, String> childGroup = childParams.pop();
parsedInput = parsedInput.replace(childGroup.fst.toString(), transformBasic(childGroup.snd));
}
return parsedInput;
}
// Transform basic only handles strings that contain words. This allows us to simplify the problem
// and not have to worry about child groups or nested groups.
private String transformBasic(String input) {
String transformedBasic = input;
if (input.startsWith("(")) {
transformedBasic = input.substring(1, input.length() - 1);
}
// Append + in front of each word if there are multiple words.
if (transformedBasic.contains(" ")) {
transformedBasic = transformedBasic.replaceAll("( )|^", "$1+");
}
// Surround all words containing | with parenthesis.
transformedBasic = transformedBasic.replaceAll("([\\w]+\\|[\\w|]*[\\w]+)", "($1)");
// Replace pipes with spaces.
transformedBasic = transformedBasic.replace("|", " ");
if (input.startsWith("(") && !transformedBasic.startsWith("(")) {
transformedBasic = "(" + transformedBasic + ")";
}
return transformedBasic;
}
通过以下测试案例验证:
@ParameterizedTest
@CsvSource({
"a b,+a +b",
"a (b c),+a +(+b +c)",
"a|b,(a b)",
"a|b|c,(a b c)",
"aa|bb|(cc|(ff gg)) hh,+(aa bb (cc (+ff +gg))) +hh",
"(aa(bb(cc|ee)|ff) gg),(+aa(bb(cc ee) ff) +gg)",
"(a b),(+a +b)",
"(a(c|d) b),(+a(c d) +b)",
"bb(cc|ee),bb(cc ee)",
"((a|b) (a b)|b (c|d)|e),(+(a b) +((+a +b) b) +((c d) e))"
})
void testTransformString(String input, String output) {
Assertions.assertEquals(output, transformString(input));
}
@ParameterizedTest
@CsvSource({
"a b,+a +b",
"a b c,+a +b +c",
"a|b,(a b)",
"(a b),(+a +b)",
"(a|b),(a b)",
"a|b|c,(a b c)",
"(aa|bb cc|dd),(+(aa bb) +(cc dd))",
"(aa|bb|ee cc|dd),(+(aa bb ee) +(cc dd))",
"aa|bb|cc|ff gg hh,+(aa bb cc ff) +gg +hh"
})
void testTransformBasic(String input, String output) {
Assertions.assertEquals(output, transformBasic(input));
}