I'm having trouble working out a pattern that ignores escaped quotes. I want this:
"10\" 2 Topping Pizza, Pasta, or Sandwich for $5 each. Valid until 2pm. Carryout only.","blah blah"
to be matched as:
1> "10\" 2 Topping Pizza, Pasta, or Sandwich for $5 each. Valid until 2pm. Carryout only."
2> "blah blah"
I have been trying this:
Pattern pattern = Pattern.compile("\"[^\"]*\"");
Matcher matcher = pattern.matcher(filteredCoupons);
but I get this:
1> "10\"
2> ","
Answer 0 (score: 2)
The regex you are looking for is
"[^"\\]*(?:\\.[^"\\]*)*"
See the demo.
In Java:
String pattern = "\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"";
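As a rough, self-contained sketch of how this pattern could be applied to the string from the question: the class name `UnrolledQuoteDemo` and the helper `findQuoted` are made up for illustration and are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnrolledQuoteDemo {
    // Java-string form of the regex  "[^"\\]*(?:\\.[^"\\]*)*"
    static final Pattern QUOTED =
            Pattern.compile("\"[^\"\\\\]*(?:\\\\.[^\"\\\\]*)*\"");

    // Hypothetical helper: collects every quoted field found in the input.
    static List<String> findQuoted(String input) {
        List<String> fields = new ArrayList<>();
        Matcher matcher = QUOTED.matcher(input);
        while (matcher.find()) {
            fields.add(matcher.group());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Same text as in the question, written as a Java string literal.
        String filteredCoupons = "\"10\\\" 2 Topping Pizza, Pasta, or Sandwich for $5 each. "
                + "Valid until 2pm. Carryout only.\",\"blah blah\"";
        List<String> fields = findQuoted(filteredCoupons);
        for (int i = 0; i < fields.size(); i++) {
            System.out.println((i + 1) + "> " + fields.get(i));
        }
    }
}
```

The pattern first consumes characters that are neither a quote nor a backslash, then repeatedly takes one escaped character followed by another such run, so an escaped quote never ends the field.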
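Answer 1 (score: 0)
It seems your regex needs to accept either non-quote characters or quotes that have a \ before them. In that case try
Pattern pattern = Pattern.compile("\"(\\\\.|[^\"])*\"");
The \\\\.|[^\"] part of this Java string (that is, the regex \\.|[^"]) will try to find
- \\. - any escaped character (a backslash followed by any character),
- | - or
- [^"] - any non-quote character.
I placed \\. before [^"] to prevent [^"] from matching the \ on its own.
In other words, for text like foo\"bar" and the regex \\.|[^"] you will get these matches:
foo\"bar"
^^^-matched by [^"]
foo\"bar"
   ^^-matched by \\.
foo\"bar"
     ^^^-matched by [^"]
foo\"bar"
        ^-can't be matched by anything since there is no \ before it
          and it is not a non-quote character
Sample: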
String filteredCoupons = "\"10\\\" 2 Topping Pizza, Pasta, or Sandwich for $5 each. Valid until 2pm. Carryout only.\",\"blah blah\"";
Pattern pattern = Pattern.compile("\"(\\\\.|[^\"])*\"");
Matcher matcher = pattern.matcher(filteredCoupons);
while (matcher.find()) {
    System.out.println(matcher.group());
}
Output:
"10\" 2 Topping Pizza, Pasta, or Sandwich for $5 each. Valid until 2pm. Carryout only."
"blah blah"
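To illustrate why the order of the two alternatives matters in this answer, here is a small sketch that runs both orderings on a shortened input. The class name `AlternationOrderDemo`, the helper `findAll` and the shortened test string are assumptions made for this example only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class AlternationOrderDemo {
    // Regex "(\\.|[^"])*" : the escaped character is tried first.
    static final String ESCAPE_FIRST = "\"(\\\\.|[^\"])*\"";
    // Regex "([^"]|\\.)*" : the non-quote class is tried first.
    static final String NON_QUOTE_FIRST = "\"([^\"]|\\\\.)*\"";

    // Hypothetical helper: returns all matches of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // The actual text is: "10\" 2 T","blah"
        String input = "\"10\\\" 2 T\",\"blah\"";
        System.out.println("escape first:    " + findAll(ESCAPE_FIRST, input));
        System.out.println("non-quote first: " + findAll(NON_QUOTE_FIRST, input));
    }
}
```

With the non-quote class first, `[^"]` consumes the backslash by itself, so the following quote ends the match early, which reproduces the broken result from the question.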
Answer 2 (score: 0)
A negative lookbehind can also be used:
(?s)".*?"(?<!\\.)
As a Java string:
"(?s)\".*?\"(?<!\\\\.)"
See test at regex101; test at regexplanet (click "Java")
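- (?<!\\.) after the closing " looks behind to check that the quote has no backslash in front of it; the . skips one character (the quote itself). This matches the same text as ".*?(?<!\\)", but the lookbehind is only tried once a " has been encountered.
- The (?s) flag makes the dot match newlines as well.
For interest, I benchmarked the different versions at regexhero.net using the sample string (thanks @stribizhev for the link!). I'm not sure whether the step counter of regex101 is accurate here.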
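The non-capturing group is only there for the benchmark. Interestingly, "(?:\\.|[^"])*" nearly doubled in performance compared with the capturing-group version "(\\.|[^"])*".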
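A small sketch of the lookbehind version from this answer, next to the non-capturing alternation, may help when choosing between them. The class name `LookbehindDemo`, the helper `findAll` and both test inputs are assumptions for illustration. The second input ends a field with an escaped backslash (\\), a case the question does not mention but which may be worth checking before relying on the lookbehind.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookbehindDemo {
    // Regex (?s)".*?"(?<!\\.) : lazy match, then reject a quote preceded by a backslash.
    static final String LOOKBEHIND = "(?s)\".*?\"(?<!\\\\.)";
    // Regex "(?:\\.|[^"])*" : consumes escaped characters as pairs.
    static final String ALTERNATION = "\"(?:\\\\.|[^\"])*\"";

    // Hypothetical helper: returns all matches of regex in input.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            matches.add(matcher.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        // Actual text: "10\" 2 T","blah"  (escaped quote inside a field)
        String escapedQuote = "\"10\\\" 2 T\",\"blah\"";
        // Actual text: "a\\","b"  (field ending in an escaped backslash)
        String escapedBackslash = "\"a\\\\\",\"b\"";

        for (String input : new String[] {escapedQuote, escapedBackslash}) {
            System.out.println("input:       " + input);
            System.out.println("lookbehind:  " + findAll(LOOKBEHIND, input));
            System.out.println("alternation: " + findAll(ALTERNATION, input));
        }
    }
}
```

Both patterns agree on the first input. On the second, the lookbehind only sees a backslash before the closing quote and cannot tell that the backslash is itself escaped, so it runs on into the next field, while the alternation treats the two backslashes as one escaped pair.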