How to collapse repeated punctuation marks in a text file in Java

Time: 2013-08-19 05:49:37

Tags: java regex punctuation

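I have some text files that contain runs of repeated punctuation marks, and I need to reduce each run to a single punctuation mark.

Here is some sample text: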

They are working in London..... he is a Java developer !!!!! they are playing------ She is working_______

This is the required output:

They are working in London.he is a Java developer !they are playing- She is working_

I need some help with a Java regular expression.

Thanks

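3 answers:

Answer 0 (score: 2)

Capture one punctuation character in a group, then use a backreference (\1+) to match its repetitions.

Try the following: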

String text = "They are working in London..... he is a Java developer !!!!! they are playing------ ---- ---- She is working_______";
String replaced = text.replaceAll("(?:([-.!_])\\1+\\s*)+", "$1");
System.out.println(replaced);

This prints:

They are working in London.he is a Java developer !they are playing-She is working_
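
To make the pieces of that pattern easier to see, here is a minimal sketch that compares the inner backreference on its own with the full expression from this answer. The class name `BackreferenceDemo` and the constant names are made up for illustration, and the input is a shortened fragment of the sample text.

```java
public class BackreferenceDemo {
    // Inner pattern only: one punctuation character followed by one or more copies of itself.
    static final String RUN = "([-.!_])\\1+";
    // Full pattern from the answer: a run, optional trailing whitespace, repeated as a group.
    static final String RUNS_WITH_SPACE = "(?:([-.!_])\\1+\\s*)+";

    public static void main(String[] args) {
        String text = "playing------ ---- She";

        // Collapses each run separately and leaves the spaces alone.
        System.out.println(text.replaceAll(RUN, "$1"));
        // prints: playing- - She

        // Treats adjacent runs plus their trailing whitespace as a single match.
        System.out.println(text.replaceAll(RUNS_WITH_SPACE, "$1"));
        // prints: playing-She
    }
}
```

Because group 1 sits inside a repeated group, `$1` holds the character captured by the last run in the match, so an input such as `.... !!!!` collapses to a single `!`. The `\s*` also removes the whitespace that follows each run, which may or may not be what you want.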

Answer 1 (score: 0)

You can try this:

   String str = "They are working in London..... he is a Java developer !!!!! they are playing-----She is working_______";
   // Inside a character class "|" is a literal, not alternation, so it is left out; a leading "-" is literal too.
   String newStr = str.replaceAll("([-.!_])\\1+", "$1");
   System.out.println(newStr);


Output:

They are working in London. he is a Java developer ! they are playing-She is working_
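
The question mentions text files, while the snippets here work on a single String. A minimal sketch of applying the same replacement line by line to a file could look like the following. The class name `PunctuationFileCleaner`, the method names, and the command-line arguments are assumptions made for illustration, and UTF-8 is assumed as the file encoding.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;

public class PunctuationFileCleaner {
    // Collapses a run of the same character from the set - . ! _ into one occurrence.
    static String collapse(String line) {
        return line.replaceAll("([-.!_])\\1+", "$1");
    }

    // Reads every line of `in`, collapses repeated punctuation, and writes the result to `out`.
    static void clean(Path in, Path out) throws IOException {
        List<String> cleaned = Files.readAllLines(in, StandardCharsets.UTF_8).stream()
                .map(PunctuationFileCleaner::collapse)
                .collect(Collectors.toList());
        Files.write(out, cleaned, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Expects two paths: the file to read and the file to write (hypothetical usage).
        if (args.length != 2) {
            System.err.println("usage: java PunctuationFileCleaner <input> <output>");
            return;
        }
        clean(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Reading all lines at once keeps the sketch short; for very large files a streaming approach such as `Files.lines` would use less memory.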

Answer 2 (score: -1)

Try something like this:

/([;?!-_]){2}/$1/
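
If that substitution is carried over to Java as is, two details are worth checking. In `java.util.regex`, `!-_` inside a character class is a range (from `!` to `_`, which includes digits and uppercase letters), and `{2}` matches any two characters from the class rather than the same character twice. The sketch below shows the effect and one possible alternative using a backreference. The class and method names are hypothetical, and the Java translation of the pattern is an assumption.

```java
public class CharClassRangeDemo {
    // Literal translation of the pattern above: "!-_" is parsed as a range (U+0021 to U+005F).
    static String literal(String s) {
        return s.replaceAll("([;?!-_]){2}", "$1");
    }

    // Alternative: the hyphen goes first so it is literal, and the backreference
    // ensures that only runs of one identical character are collapsed.
    static String fixed(String s) {
        return s.replaceAll("([-;?!_])\\1+", "$1");
    }

    public static void main(String[] args) {
        System.out.println(literal("JAVA!!")); // prints: AA!
        System.out.println(fixed("JAVA!!"));   // prints: JAVA!
    }
}
```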