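Regex to match end-of-sentence markers

Time: 2014-09-23 08:18:50

Tags: java regex

I need to match all sentence-ending symbols, such as !, ? and . (period), in a given body of text.

Can anyone help me with a regex for this?

Sample input: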

This is the f!!rst sentence! Is this the second one? The third sentence is here... And the fourth one!!

Output:

This is the f!!rst sentence Is this the second one The third sentence is here And the fourth one

3 answers:

Answer 0 (score: 0)

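You probably want to match anything (.*?), followed by one or more sentence-ending marks, followed by whitespace (\s+) or the end of the input. Inside a character class, !, ? and . are literal, so they do not need to be escaped.

For example:

import java.util.regex.Matcher;
import java.util.regex.Pattern;

// [!?.]+ also covers runs such as "..." or "!!"; $ covers a final sentence with no trailing space.
Pattern pattern = Pattern.compile("(.*?)[!?.]+(?:\\s+|$)");
Matcher matcher = pattern.matcher("one two. three! four five? ");
while (matcher.find()) {
    System.out.println(matcher.group(1)); // sentence text without its end marker
}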

This prints:

one two
three
four five
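To check the capture-group approach against the sample input from the question, here is a minimal sketch. The class name SentenceExtractor and the method extractSentences are hypothetical names chosen for illustration, and the pattern assumes that runs of marks ("...", "!!") and a final sentence without trailing whitespace should both be handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SentenceExtractor {
    // Lazy capture, then one or more end marks, then whitespace or end of input.
    private static final Pattern SENTENCE =
            Pattern.compile("(.*?)[!?.]+(?:\\s+|$)");

    /** Collects the text of each sentence without its trailing end marks. */
    static List<String> extractSentences(String text) {
        List<String> sentences = new ArrayList<>();
        Matcher matcher = SENTENCE.matcher(text);
        while (matcher.find()) {
            sentences.add(matcher.group(1));
        }
        return sentences;
    }

    public static void main(String[] args) {
        String input = "This is the f!!rst sentence! Is this the second one? "
                + "The third sentence is here... And the fourth one!!";
        // Joining with single spaces reproduces the output requested in the question.
        System.out.println(String.join(" ", extractSentences(input)));
    }
}
```

Because the marks must be followed by whitespace or the end of the input, the "!!" inside "f!!rst" is left alone. Any text after the last end marker is not captured by this sketch, so input without a closing mark would need separate handling.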

Answer 1 (score: 0)

[!?.]+(?=$|\s)

Try this. You can add more marks to the character class as needed. Replace each match with an empty string.

See the demo:

http://regex101.com/r/lS5tT3/15
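Since the question asks to match the markers rather than only remove them, the pattern above can also be used with a Matcher loop. This is a sketch under assumptions: the class name EndMarkerFinder, the method findEndMarks and the "start:text" result format are all hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndMarkerFinder {
    // The pattern from this answer, written as a Java string literal.
    private static final Pattern END_MARKS = Pattern.compile("[!?.]+(?=$|\\s)");

    /** Returns each run of end marks as "startIndex:matchedText". */
    static List<String> findEndMarks(String text) {
        List<String> found = new ArrayList<>();
        Matcher matcher = END_MARKS.matcher(text);
        while (matcher.find()) {
            found.add(matcher.start() + ":" + matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "Is this it? Yes... f!!ne!";
        System.out.println(findEndMarks(input)); // prints [10:?, 15:..., 24:!]
    }
}
```

The lookahead (?=$|\s) does not consume the following whitespace, so the spaces between sentences survive when the matches are replaced with an empty string.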

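Answer 2 (score: 0)

The regex below matches one or more characters that are neither word characters nor whitespace, and that are followed by a whitespace character or the end-of-line anchor. The replaceAll method removes every match.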

String s = "Blah! blah? blah... blah blah!!";
System.out.println(s.replaceAll("[^\\w\\s]+(?=\\s|$)", ""));

Output:

Blah blah blah blah blah

If you only want to remove the ?, . and ! characters that occur at the end of a word, you can try the following code.

String s = "This is the f!!rst sentence! Is this the second one? The third sentence is here... And the fourth one!!";
System.out.println(s.replaceAll("[!?.]+(?=\\s|$)", ""));

Output:

This is the f!!rst sentence Is this the second one The third sentence is here And the fourth one
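The two patterns in this answer behave the same on the examples shown, but they diverge once the text contains other punctuation. The sketch below compares them side by side; the class name PunctuationStripper, both method names and the sample sentence are assumptions made for illustration.

```java
class PunctuationStripper {
    /** Broad pattern: removes any run of non-word, non-space characters before whitespace or end. */
    static String stripAnyPunctuation(String text) {
        return text.replaceAll("[^\\w\\s]+(?=\\s|$)", "");
    }

    /** Narrow pattern: removes only runs of !, ? and . before whitespace or end. */
    static String stripEndMarks(String text) {
        return text.replaceAll("[!?.]+(?=\\s|$)", "");
    }

    public static void main(String[] args) {
        String input = "Well, yes! It costs $5. Done (finally)!";
        System.out.println(stripAnyPunctuation(input)); // Well yes It costs $5 Done (finally
        System.out.println(stripEndMarks(input));       // Well, yes It costs $5 Done (finally)
    }
}
```

The broad pattern also strips the comma and the closing parenthesis, which may or may not be wanted. Note too that \w in Java matches only ASCII letters, digits and underscore by default, so the broad pattern can remove accented letters at the end of a word unless Pattern.UNICODE_CHARACTER_CLASS is enabled.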