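Editing text in Java using specific rules

Date: 2013-09-18 13:41:38

Tags: java

I have to edit a text according to certain rules:

Repeated letters within the same word are reduced to a single letter.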

"Questions" instead of "QuestionssSsS"

Multiple spaces between words are reduced to a single space.

"go to the cinema" instead of "go    to   the     cinema"

A single letter separated from a word is joined to that word.

"first ten person" instead of "firs t ten person"

For example:

String s = "I am enouuugGh of an artis t to draw         freely upon my imagination. ImaginatioOO n is more importan t than      knowledge. KKkKkKnowledge is limited. Imagination encircles the wwWorl d.";

Expected output:

I am enough of an artist to draw freely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.

Please share your suggestions and comments.

1 answer:

Answer 0 (score: 2)

String s = "I am enouuugGh of an artis t to draw         freely upon my imagination. ImaginatioOO n is more importan t than      knowledge. KKkKkKnowledge is limited. Imagination encircles the wwWorl d.";
System.out.println(s);
System.out.println("========================================================");
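// Collapse every run of whitespace into a single space.
s = s.replaceAll("\\s+"," ");
// Collapse a character repeated in any letter case into its first occurrence.
s = s.replaceAll("(?i)(\\w)\\1+","$1");
// Join a stray single character to the preceding word when a space or . ? ! , follows it.
s = s.replaceAll("(\\w+) (\\w)(?=[ \\.\\?!,])","$1$2");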
System.out.println(s);

Output: I am enough of an artist to draw frely upon my imagination. Imagination is more important than knowledge. Knowledge is limited. Imagination encircles the world.
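
The three replacements can also be wrapped in a reusable method. What follows is only a minimal sketch: the class name TextCleaner and the method name clean are hypothetical and do not come from the answer. It applies the same patterns in the same order, so it behaves the same way, including the side effect visible in the output: the repeated-letter rule also collapses legitimate double letters, which is why "freely" comes out as "frely".

```java
import java.util.regex.Pattern;

// Hypothetical helper: the class and method names are assumptions made for illustration.
class TextCleaner {
    // One or more whitespace characters.
    private static final Pattern SPACES = Pattern.compile("\\s+");
    // A word character followed by one or more repeats of itself, ignoring case.
    private static final Pattern REPEATS = Pattern.compile("(?i)(\\w)\\1+");
    // A word, one space, then a single word character followed by a space or . ? ! ,
    private static final Pattern STRAY = Pattern.compile("(\\w+) (\\w)(?=[ .?!,])");

    static String clean(String text) {
        String s = SPACES.matcher(text).replaceAll(" ");
        s = REPEATS.matcher(s).replaceAll("$1");
        s = STRAY.matcher(s).replaceAll("$1$2");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(clean("go    to   the     cinema"));
        System.out.println(clean("firs t ten person"));
        System.out.println(clean("QuestionssSsS"));
    }
}
```

Precompiling the patterns avoids recompiling them on every call, which String.replaceAll would otherwise do. Because the last pattern joins any single word character to the word before it, a genuine one-letter word such as "a" in the middle of a sentence would also be joined.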

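 ==> \\s+ matches one or more whitespace characters
 ==> \\w matches a word character, short for [a-zA-Z_0-9]
 ==> \\w+ matches one or more characters of the \\w class
 in the third pattern, (\\w+) is the 1st capturing group and (\\w) is the 2nd
 ==> (?i) is the inline form of the Pattern.CASE_INSENSITIVE flag
 ==> \\1 inside a pattern is a backreference to group 1; $number in the replacement string inserts the text captured by that group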
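
To see these constructs in isolation, here is a small sketch. The class name RegexNotesDemo and the helpers firstMatch and isRepeat are hypothetical names introduced only for this illustration. It shows how the groups of the third pattern are numbered, that the lookahead character is not part of the match, and that the backreference matches across letter case only when (?i) is present.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: every name here is an assumption made for illustration.
class RegexNotesDemo {
    private static final Pattern STRAY = Pattern.compile("(\\w+) (\\w)(?=[ .?!,])");

    // Returns {whole match, group 1, group 2} for the first match, or null if there is none.
    static String[] firstMatch(String text) {
        Matcher m = STRAY.matcher(text);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(), m.group(1), m.group(2) };
    }

    // True if the whole text is one word character repeated, optionally ignoring case.
    static boolean isRepeat(String text, boolean ignoreCase) {
        String flag = ignoreCase ? "(?i)" : "";
        return text.matches(flag + "(\\w)\\1+");
    }

    public static void main(String[] args) {
        // Group 1 is the word and group 2 is the stray letter; the space after "t"
        // is checked by the lookahead but not included in the match.
        System.out.println(Arrays.toString(firstMatch("firs t ten")));
        // The backreference \1 is case-sensitive unless (?i) is used.
        System.out.println(isRepeat("gG", false));
        System.out.println(isRepeat("gG", true));
    }
}
```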