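I'm writing some code that has to accept calculator input from the user, so I decided to tokenize the input string with a regular expression. However, tokenizing the string fails my unit tests on decimal points and on "]".
I first used the lookahead and lookbehind approach seen here.
I wrote "((?<=[+-/*(){^}[%]π])|(?=[+-/*(){^}[%]π]))";
It compiles and runs, but it fails whenever a number contains a decimal point.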
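I then went back and tried "[+-/*\\^%(){}[]]"
(regex3 below) with the approach from the linked question, both with and without π, because my first instinct was that π was the character causing the problem. In both cases the result was Exception in thread "main" java.util.regex.PatternSyntaxException: Unclosed character class near index 41
((?<=[+-/*\^%(){}[]])|(?=[+-/*\^%(){}[]]))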
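At that point I returned to my first attempt and rearranged the terms: "((?<=[+-/*^%(){}[]π])|(?=[+-/*^%(){}[]π]))";
(regex2 below), but this one throws the same PatternSyntaxException at the last bracket.
It is probably easier to show the problem in code, so I wrote a class that runs the three regex attempts: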
import java.util.Arrays;

public class RegexProblem {
    /** This Delimiter string came from {@link https://stackoverflow.com/a/2206432/} */
    static public final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    // Split on and include + - * / ^ % ( ) [ ] { } π
    public static void main(String[] args) {
        String regex1="((?<=[+-/*(){^}[%]π])|(?=[+-/*(){^}[%]π]))";
        String regex2="((?<=[+-/*^%(){}[]π])|(?=[+-/*^%(){}[]π]))";
        String regex3="[+-/*\\^%(){}[]]";
        String str="1.2+3-4^5*6/(78%9π)+[{0+-1}*2]";
        String str2="[1.2+3]*4";
        String[] expected={"1.2","+","3","-","4","^","5","*","6","/","(","78","%",
            "9","π",")","+","[","{","0","+","-","1","}","*","2","]"};
        String[] expected2={"[","1.2","+","3","]","*","4"};

        System.out.println("Expected: ");
        System.out.print("str: ");
        System.out.println(Arrays.toString(expected));
        System.out.print("str2: ");
        System.out.println(Arrays.toString(expected2));
        System.out.println();
        System.out.println();

        System.out.println("Regex1: ");
        System.out.print("str: ");
        System.out.println(Arrays.toString(str.split(regex1)));
        System.out.print("str2: ");
        System.out.println(Arrays.toString(str2.split(regex1)));
        System.out.println();

        System.out.println("Regex2: ");
        System.out.print("str: ");
        System.out.println(Arrays.toString(str.split(regex2)));
        System.out.print("str2: ");
        System.out.println(Arrays.toString(str2.split(regex2)));
        System.out.println();

        System.out.println("Regex3: ");
        System.out.print("str: ");
        System.out.println(Arrays.toString(str.split(String.format(WITH_DELIMITER, regex3))));
        System.out.print("str2: ");
        System.out.println(Arrays.toString(str2.split(String.format(WITH_DELIMITER, regex3))));
    }
}
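Running regex2 and regex3 both fail, but what puzzles me is the behavior of regex1: although it appears to have the same number of closing brackets as the others, it runs, and it splits on "." but not on "]".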
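One way to investigate the regex1 behavior is to test individual characters against its character class. The sketch below is only a diagnostic, and the class name CharClassProbe and the helper inClass are hypothetical. It relies on two assumptions about Java's regex syntax: that +-/ inside a class is read as a range from + to / (which also covers , - and .), and that an unescaped [ inside a class opens a nested class, so [%] is a union member rather than a literal [ followed by %.

```java
import java.util.regex.Pattern;

// Hypothetical diagnostic; the class and method names are illustrative only.
class CharClassProbe {
    // The character class from regex1, unchanged (\u03C0 is the pi character).
    static final Pattern REGEX1_CLASS = Pattern.compile("[+-/*(){^}[%]\u03C0]");

    // Returns true if the single character c is a member of the class.
    static boolean inClass(char c) {
        return REGEX1_CLASS.matcher(String.valueOf(c)).matches();
    }

    public static void main(String[] args) {
        // Probe the operators, the characters inside the +-/ range, and both brackets.
        for (char c : "+,-./*(){}^%[]".toCharArray()) {
            System.out.println("'" + c + "' -> " + inClass(c));
        }
    }
}
```

If those assumptions hold, the probe should report that "," and "." are members while "[" and "]" are not, which would account for the split on the decimal point and the missing split on "]".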
Answer 0 (score: 1)
Try this:
(?<=[^\d.])|(?=[^\d.])
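Explanation:
\d is shorthand for [0-9], so any digit.
Inside a character class, . matches only a literal dot, which in the sample input always seems to be part of a number. So [\d.] is what we use to identify the characters of a number.
[^\d.] matches a non-numeric character (the caret ^ negates the character class).
(?<=[^\d.]) matches a position that is preceded by a non-numeric character.
(?=[^\d.]) matches a position that is followed by a non-numeric character.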
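To make the explanation concrete, here is a minimal sketch that applies the pattern through String.split. The class name SplitDemo and both helper methods are hypothetical. The second helper is not part of the answer: it is an alternative that keeps the explicit operator list from the question, assuming the fix is to escape [ and ] and to put - first in the class so it is not read as a range. Both helpers assume Java 8 or later, where a zero-width match at the start of the input does not produce an empty leading token.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitDemo {
    // Pattern from the answer: split just after, or just before,
    // any character that is neither a digit nor a dot.
    static final String AROUND_NON_NUMERIC = "(?<=[^\\d.])|(?=[^\\d.])";

    // Alternative (an assumption, not from the answer): the explicit operator
    // list with [ and ] escaped and - placed first. \u03C0 is the pi character.
    static final String OPERATORS = "[-+*/^%(){}\\[\\]\u03C0]";
    static final String WITH_DELIMITER = "((?<=%1$s)|(?=%1$s))";

    static String[] tokenize(String input) {
        return input.split(AROUND_NON_NUMERIC);
    }

    static String[] tokenizeExplicit(String input) {
        return input.split(String.format(WITH_DELIMITER, OPERATORS));
    }

    public static void main(String[] args) {
        String str = "1.2+3-4^5*6/(78%9\u03C0)+[{0+-1}*2]";
        String str2 = "[1.2+3]*4";
        System.out.println(Arrays.toString(tokenize(str)));
        System.out.println(Arrays.toString(tokenize(str2)));
        System.out.println(Arrays.toString(tokenizeExplicit(str)));
        System.out.println(Arrays.toString(tokenizeExplicit(str2)));
    }
}
```

One design difference to keep in mind: the answer's pattern treats every character other than a digit or a dot as its own token, including spaces and letters, while the explicit list only splits around the listed operators.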