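I need to match all characters between the last uppercase word in a String and another word.
Input text: The CLEVER fox JUMPED OVER the big and (Hole 2) wall in the night.
RegEx used: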
(?<=\b[A-Z]+\s)(.+?)(?=\sin)
The regex above gives: fox JUMPED OVER the big and (Hole 2) wall
Expected output: the big and (Hole 2) wall
Can anyone crack this?
Answer 0: (score: 2)
This may not be the most efficient solution, but it seems to work:
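import java.util.regex.Matcher;
import java.util.regex.Pattern;

String text = "The CLEVER fox JUMPED OVER the big wall in the night.";
// Group 2 captures the text between the last all-uppercase word and " in".
String regex = "(\\b[A-Z]+\\s)(?!.*\\b[A-Z]+\\b)(.+?)(\\sin)";
Matcher m = Pattern.compile(regex).matcher(text);
if (m.find()) {
    System.out.println(m.group(2)); // prints: the big wall
}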
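It uses a negative lookahead to make sure that no further uppercase word occurs in the rest of the text before it captures the wanted data.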
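As a minimal sketch (not part of the original answer), the same pattern can be run against the question's full sentence, including the mixed-case "(Hole 2)". The class name NegativeLookaheadDemo and the helper extract are hypothetical names chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class NegativeLookaheadDemo {
    // Same pattern as in the answer; group 2 holds the wanted text.
    private static final Pattern PATTERN =
            Pattern.compile("(\\b[A-Z]+\\s)(?!.*\\b[A-Z]+\\b)(.+?)(\\sin)");

    // Hypothetical helper: returns group 2, or null when nothing matches.
    static String extract(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group(2) : null;
    }

    public static void main(String[] args) {
        String input = "The CLEVER fox JUMPED OVER the big and (Hole 2) wall in the night.";
        System.out.println(extract(input)); // prints: the big and (Hole 2) wall
    }
}
```

"Hole" does not count as an uppercase word here, because \b[A-Z]+\b requires the whole word to consist of capitals, so the lookahead still succeeds right after "OVER ".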
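Answer 1: (score: 1)
You can simply exclude uppercase characters in the second match expression:
(?<=\b[A-Z]+\s)([^A-Z]+)(?=\sin)
This forces the first part to match up to "The CLEVER fox JUMPED OVER", the second match expression yields "the big wall", and the last part matches the only " in" sequence in your test sentence.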
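A minimal sketch of this idea in Java follows. It is an adaptation, not the answer's exact pattern: the look-behind is replaced by a consumed prefix plus a capturing group, an assumption made here so the sketch does not depend on how the regex engine treats variable-length look-behind. The names CharClassDemo and extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CharClassDemo {
    // Adapted pattern: an uppercase word and one whitespace character,
    // then a run of characters that contains no capital letter.
    private static final Pattern PATTERN =
            Pattern.compile("\\b[A-Z]+\\s([^A-Z]+)(?=\\sin)");

    // Hypothetical helper: returns group 1, or null when nothing matches.
    static String extract(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("The CLEVER fox JUMPED OVER the big wall in the night."));
        System.out.println(extract("The CLEVER fox JUMPED OVER the big and (Hole 2) wall in the night."));
    }
}
```

Because [^A-Z] rejects every capital letter, the second call finds no match: the H in "(Hole 2)" ends the run before " in" is reached. The approach therefore only fits inputs whose wanted part contains no capitals.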
Answer 2: (score: 1)
How about:
[A-Z][\s.](?!.*?[A-Z])(.*)\sin
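Explanation: find a capital letter followed by a whitespace character or a period, with no capital letter anywhere after it. Then capture everything up to, but not including, the whitespace followed by the given word.
This captures only the wanted part.
Regards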
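A minimal sketch of how this pattern behaves in Java, run against both sentences that appear in this thread. SingleCapitalDemo and extract are hypothetical names, not part of the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SingleCapitalDemo {
    // Same pattern as in the answer; group 1 holds the wanted text.
    private static final Pattern PATTERN =
            Pattern.compile("[A-Z][\\s.](?!.*?[A-Z])(.*)\\sin");

    // Hypothetical helper: returns group 1, or null when nothing matches.
    static String extract(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(extract("The CLEVER fox JUMPED OVER the big wall in the night."));
        System.out.println(extract("The CLEVER fox JUMPED OVER the big and (Hole 2) wall in the night."));
    }
}
```

The lookahead here tests for any single capital letter rather than for an all-uppercase word, so the H in "(Hole 2)" blocks the match on the question's original sentence, while the shorter sentence yields "the big wall".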
Answer 3: (score: 0)
How about:
^.*(?:\b[A-Z]+\b)(.+?)(?=\sin)
Explanation:
The regular expression:

(?-imsx:^.*(?:\b[A-Z]+\b)(.+?)(?=\sin))

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                      the beginning of the string
----------------------------------------------------------------------
  .*                     any character except \n (0 or more times
                         (matching the most amount possible))
----------------------------------------------------------------------
  (?:                    group, but do not capture:
----------------------------------------------------------------------
    \b                   the boundary between a word char (\w)
                         and something that is not a word char
----------------------------------------------------------------------
    [A-Z]+               any character of: 'A' to 'Z' (1 or more
                         times (matching the most amount
                         possible))
----------------------------------------------------------------------
    \b                   the boundary between a word char (\w)
                         and something that is not a word char
----------------------------------------------------------------------
  )                      end of grouping
----------------------------------------------------------------------
  (                      group and capture to \1:
----------------------------------------------------------------------
    .+?                  any character except \n (1 or more times
                         (matching the least amount possible))
----------------------------------------------------------------------
  )                      end of \1
----------------------------------------------------------------------
  (?=                    look ahead to see if there is:
----------------------------------------------------------------------
    \s                   whitespace (\n, \r, \t, \f, and " ")
----------------------------------------------------------------------
    in                   'in'
----------------------------------------------------------------------
  )                      end of look-ahead
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
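The greedy .* is what steers this pattern to the last uppercase word: it first consumes the whole line and then backtracks until \b[A-Z]+\b can match. A minimal sketch in Java, with the hypothetical names GreedyPrefixDemo and extract; the trim() call is an addition made here, because group 1 starts right after the uppercase word and so begins with a space.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyPrefixDemo {
    // Same pattern as in the answer; group 1 holds the wanted text.
    private static final Pattern PATTERN =
            Pattern.compile("^.*(?:\\b[A-Z]+\\b)(.+?)(?=\\sin)");

    // Hypothetical helper: returns group 1 without surrounding whitespace,
    // or null when nothing matches.
    static String extract(String text) {
        Matcher m = PATTERN.matcher(text);
        return m.find() ? m.group(1).trim() : null;
    }

    public static void main(String[] args) {
        String input = "The CLEVER fox JUMPED OVER the big and (Hole 2) wall in the night.";
        System.out.println(extract(input)); // prints: the big and (Hole 2) wall
    }
}
```

Without trim(), the captured group for this input would be " the big and (Hole 2) wall" with a leading space.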