Suppose we have strings like these:
"abcdaaaaefghaaaaaaaaa"
"012003400000000"
I want to remove the trailing run of repeated characters, to get this:
"abcdaaaaefgh"
"0120034"
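Is there a simple way to do this with a regular expression? I'm having some trouble, and my code is starting to look like a giant monster...
Some clarifications:
What counts as a repetition?
A sequence of at least 2 identical characters at the end of the string. A single character is not considered a repetition. For example: in "aaaa", 'a' is not considered a repetition, but in "baaaa" it is. So in the case of "aaaa" we don't have to change anything in the String. Another example: "baa" must give "b".
What about strings with only one character?
A string like "a" must be returned without any change, i.e. we must return "a".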
Answer 0: (score: 9)
You can use replaceAll() with a back reference:
str = str.replaceAll("(.)\\1+$", "");
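Edit
To meet the requirement that the whole string must never be removed, I would just add a check rather than make the regex overly complicated: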
public String replaceLastRepeated(String str) {
    String replaced = str.replaceAll("(.)\\1+$", "");
    if (replaced.equals("")) {
        return str;
    }
    return replaced;
}
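As a quick sanity check, here is a minimal sketch that runs the guarded version against the cases from the question. The class name `ReplaceLastRepeatedDemo` and the `TRAILING_RUN` constant are illustrative names, not part of the answer; precompiling the pattern is an optional choice that just avoids recompiling it on every call.

```java
import java.util.regex.Pattern;

final class ReplaceLastRepeatedDemo {
    // Same regex as above: one character followed by one or more copies of it, at the end.
    // TRAILING_RUN is a hypothetical name introduced for this sketch.
    private static final Pattern TRAILING_RUN = Pattern.compile("(.)\\1+$");

    static String replaceLastRepeated(String str) {
        String replaced = TRAILING_RUN.matcher(str).replaceAll("");
        // If the whole string was a single run (e.g. "aaaa"), return it unchanged.
        return replaced.isEmpty() ? str : replaced;
    }

    public static void main(String[] args) {
        String[] inputs = {"abcdaaaaefghaaaaaaaaa", "012003400000000", "aaaa", "baa", "a"};
        for (String input : inputs) {
            System.out.println("\"" + input + "\" => \"" + replaceLastRepeated(input) + "\"");
        }
    }
}
```

Note that the guard only triggers when the entire string is one run; a string such as "baa" still loses its trailing "aa".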
Answer 1: (score: 3)
I don't think I'd use a regex for this:
public static String removeRepeatedLastCharacter(String text) {
    if (text.length() == 0) {
        return text;
    }
    char lastCharacter = text.charAt(text.length() - 1);
    // Look backwards through the string until you find anything which isn't
    // the final character
    for (int i = text.length() - 2; i >= 0; i--) {
        if (text.charAt(i) != lastCharacter) {
            // Add one to *include* index i
            return text.substring(0, i + 1);
        }
    }
    // Looks like we had a string such as "1111111111111".
    return "";
}
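Personally I find that easier to understand than a regex. It may or may not be faster - I wouldn't like to make a prediction.
Note that this will always remove the final character, whether or not it is repeated. That means single-character strings always end up as empty strings: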
"" => ""
"x" => ""
"xx" => ""
"ax" => "a"
"abcd" => "abc"
"abcdddd" => "abc"
Answer 2: (score: 3)
I wouldn't use a regex:
public class Test {
    public void test() {
        System.out.println(removeTrailingDupes("abcdaaaaefghaaaaaaaaa"));
        System.out.println(removeTrailingDupes("012003400000000"));
        System.out.println(removeTrailingDupes("0120034000000001"));
        System.out.println(removeTrailingDupes("cc"));
        System.out.println(removeTrailingDupes("c"));
    }

    private String removeTrailingDupes(String s) {
        // Is there a dupe?
        int l = s.length();
        if (l > 1 && s.charAt(l - 1) == s.charAt(l - 2)) {
            // Where to cut.
            int cut = l - 2;
            // What to cut.
            char c = s.charAt(cut);
            while (cut > 0 && s.charAt(cut - 1) == c) {
                // Cut that one too.
                cut -= 1;
            }
            // Cut off the repeats.
            return s.substring(0, cut);
        }
        // Return it untouched.
        return s;
    }

    public static void main(String[] args) {
        new Test().test();
    }
}
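Matching @JonSkeet's "spec":
Note that this will only remove the final character if it is repeated. That means single-character strings are left untouched, but a two-character string may become empty if both characters are the same: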
"" => ""
"x" => "x"
"xx" => ""
"aaaa" => ""
"ax" => "ax"
"abcd" => "abcd"
"abcdddd" => "abc"
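I wonder whether it is possible to achieve that level of control in a regex?
Added because of the comment "...but if we use this regex with aaaa, it returns nothing. It should return aaaa.":
Instead, use: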
private String removeTrailingDupes(String s) {
    // Is there a dupe?
    int l = s.length();
    if (l > 1 && s.charAt(l - 1) == s.charAt(l - 2)) {
        // Where to cut.
        int cut = l - 2;
        // What to cut.
        char c = s.charAt(cut);
        while (cut > 0 && s.charAt(cut - 1) == c) {
            // Cut that one too.
            cut -= 1;
        }
        // Cut off the repeats, unless that would remove the whole string.
        return cut > 0 ? s.substring(0, cut) : s;
    }
    // Return it untouched.
    return s;
}
which has the contract:
"" => ""
"x" => "x"
"xx" => "xx"
"aaaa" => "aaaa"
"ax" => "ax"
"abcd" => "abcd"
"abcdddd" => "abc"
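On the question of whether a regex can give the same level of control: one possible sketch is to require a different character in front of the run and keep that character in the replacement. The class name `TrailingRunRegex` and the constant `TRAILING_RUN` are illustrative, and the pattern assumes the input contains no line terminators, since `.` does not match them by default.

```java
import java.util.regex.Pattern;

final class TrailingRunRegex {
    // Group 1: the character just before the run. The lookahead (?!\1) requires
    //          the run to start with a different character.
    // Group 2: the repeated character; \2+ requires at least one more copy,
    //          and $ anchors the run at the end of the input.
    // Assumption: no line terminators in the input (otherwise consider Pattern.DOTALL).
    private static final Pattern TRAILING_RUN = Pattern.compile("(.)(?!\\1)(.)\\2+$");

    static String removeTrailingDupes(String s) {
        // Keep the preceding character (group 1) and drop the run after it.
        return TRAILING_RUN.matcher(s).replaceFirst("$1");
    }

    public static void main(String[] args) {
        String[] inputs = {"", "x", "xx", "aaaa", "ax", "abcd", "abcdddd"};
        for (String input : inputs) {
            System.out.println("\"" + input + "\" => \"" + removeTrailingDupes(input) + "\"");
        }
    }
}
```

Because the pattern needs one character before the run, a string that consists of a single run, such as "aaaa" or "xx", never matches and comes back unchanged, so no separate emptiness check is needed.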
Answer 3: (score: 0)
Replace (.)\1+$ with the empty string:
"abcddddd".replaceFirst("(.)\\1+$", ""); // returns abc
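Answer 4: (score: 0)
This should do the trick:
import java.util.regex.Pattern;

public class Remover {
    public static String removeTrailing(String toProcess)
    {
        // Nothing to strip from an empty string (charAt would throw here).
        if (toProcess.isEmpty()) {
            return toProcess;
        }
        char lastOne = toProcess.charAt(toProcess.length() - 1);
        // Quote the character so regex metacharacters such as '.' or '*' match literally.
        String trailingRun = "(?:" + Pattern.quote(String.valueOf(lastOne)) + ")+$";
        return toProcess.replaceAll(trailingRun, "");
    }

    public static void main(String[] args)
    {
        String test1 = "abcdaaaaefghaaaaaaaaa";
        String test2 = "012003400000000";
        System.out.println("Test1 without trail : " + removeTrailing(test1));
        System.out.println("Test2 without trail : " + removeTrailing(test2));
    }
}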