Question

I have an ArrayList of Strings that contains the following records:

this is a first sentence
hello my name is Chris 
what's up man what's up man
today is tuesday

I need to clean this list so that the output contains no duplicated content. For the example above, the output should be:

this is a first sentence
hello my name is Chris 
what's up man
today is tuesday

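As you can see, the third string has been modified and now contains the phrase what's up man only once instead of twice. In my list, a string is sometimes correct and sometimes doubled, as shown above.

I want to get rid of this, so I thought about iterating over the list: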

for (String s: myList) {

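But I can't find a way to eliminate the duplication, especially since the length of each string is not fixed. By that I mean there may be records such as: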

this is a very long sentence this is a very long sentence

or sometimes a short one:

single word single word

Is there a native Java function for this?

Answer 1

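Assuming each string is repeated exactly twice, with a single space between the two copies as in the examples, the following code removes the duplication: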

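for (int i = 0; i < myList.size(); i++) {
    String s = myList.get(i);
    int mid = s.length() / 2;
    // A doubled string "X X" has a space exactly in the middle; skipping
    // everything else also protects empty and one-character strings
    if (s.length() < 3 || s.charAt(mid) != ' ') {
        continue;
    }
    String fs = s.substring(0, mid);   // first half
    String ls = s.substring(mid + 1);  // second half, skipping the space
    if (fs.equals(ls)) {
        myList.set(i, fs);
    }
}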

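The code simply splits each entry of the list into two substrings at the midpoint. If the two halves are equal, it replaces the original element with one half, which removes the duplication.

I was testing the code and did not see @Brendan Robert's answer. This code follows the same logic as that answer.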

Answer 2

I suggest using a regular expression. I was able to remove the duplicates with this pattern: \b([\w\s']+) \1\b

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    static String[] phrases = {
            "this is a first sentence",
            "hello my name is Chris",
            "what's up man what's up man",
            "today is tuesday",
            "this is a very long sentence this is a very long sentence",
            "single word single word",
            "hey hey"
    };

    public static void main(String[] args) {
        // Group 1 captures a phrase; " \\1" requires the same phrase again after a space
        String duplicatePattern = "\\b([\\w\\s']+) \\1\\b";
        Pattern p = Pattern.compile(duplicatePattern);
        for (String phrase : phrases) {
            Matcher m = p.matcher(phrase);
            // matches() succeeds only when the entire phrase has the form "X X"
            if (m.matches()) {
                System.out.println(m.group(1));
            } else {
                System.out.println(phrase);
            }
        }
    }
}

Result:

this is a first sentence
hello my name is Chris
what's up man
today is tuesday
this is a very long sentence
single word
hey
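
To apply the same idea to the ArrayList from the question, one option is to rewrite each element in place. This is a minimal sketch rather than the answerer's code: the class name DedupeList, the method collapseDoubled, and the simplified anchored pattern ^(.+) \1$ are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class DedupeList {
    // Hypothetical anchored pattern: the whole string must be
    // "<phrase> <phrase>" for the backreference \1 to match.
    private static final Pattern DOUBLED = Pattern.compile("^(.+) \\1$");

    static void collapseDoubled(List<String> list) {
        // List.replaceAll (Java 8+) rewrites each element in place;
        // strings that do not match the pattern are returned unchanged.
        list.replaceAll(s -> DOUBLED.matcher(s).replaceFirst("$1"));
    }

    public static void main(String[] args) {
        List<String> myList = new ArrayList<>(Arrays.asList(
                "this is a first sentence",
                "what's up man what's up man",
                "hey hey"));
        collapseDoubled(myList);
        System.out.println(myList);
        // prints [this is a first sentence, what's up man, hey]
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regular expression for every element.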

Answer 3

Assumption:

Uppercase and lowercase words are treated as equal.

import java.util.HashSet;
import java.util.Set;

String fullString = "lol lol";
// Split on runs of non-word characters (note: this also splits on apostrophes)
String[] words = fullString.split("\\W+");
StringBuilder stringBuilder = new StringBuilder();
Set<String> wordsHashSet = new HashSet<>();

for (String word : words) {
    // Set.add returns false if the lowercased word was already seen
    if (!wordsHashSet.add(word.toLowerCase())) continue;
    stringBuilder.append(word).append(" ");
}
String nonDuplicateString = stringBuilder.toString().trim();

Answer 4

Simple logic: split the string into words on the space token, i.e. " ", add each word to a LinkedHashSet, convert the set back to a string, and replace "[", "]" and "," with "".

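import java.util.LinkedHashSet;
import java.util.Set;

String s = "I want to walk my dog I want to walk my dog";
// LinkedHashSet drops duplicates while keeping insertion order
Set<String> temp = new LinkedHashSet<>();
String[] arr = s.split(" ");

for (String ss : arr) {
    temp.add(ss);
}

// toString() yields "[I, want, to, ...]"; strip the brackets and commas
String newl = temp.toString()
        .replace("[", "")
        .replace("]", "")
        .replace(",", "");

System.out.println(newl);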

Output: I want to walk my dog
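
Stripping characters from toString() would also remove any brackets or commas that belong to the words themselves. A possible variant of the same LinkedHashSet idea, sketched here with String.join, avoids that. The class name UniqueWords and the method uniqueWords are hypothetical names chosen for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class UniqueWords {
    static String uniqueWords(String s) {
        // LinkedHashSet drops repeated words but keeps first-seen order.
        Set<String> seen = new LinkedHashSet<>(Arrays.asList(s.split(" ")));
        // Join the surviving words directly instead of editing toString().
        return String.join(" ", seen);
    }

    public static void main(String[] args) {
        System.out.println(uniqueWords("I want to walk my dog I want to walk my dog"));
        // prints: I want to walk my dog
        System.out.println(uniqueWords("to be or not to be"));
        // prints: to be or not
    }
}
```

As the second call shows, this approach removes every repeated word, not only a phrase that was doubled as a whole, so it changes sentences that legitimately reuse a word.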

Answer 5

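It depends on your situation, but assuming the string can be repeated at most twice, and not three or more times, you can take the length of the whole string, find the midpoint, and compare each index after the midpoint with the corresponding index from the start. If the string can be repeated more than twice, you need a more complex algorithm that first determines how many times the string repeats, then finds the starting index of each repetition, and truncates everything from the start of the first repetition onward. If you can provide more context about the scenarios you expect to handle, we can start putting some ideas together.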
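
One way the case of an arbitrary number of repetitions might be sketched: append a space so that every copy ends with one, then look for the shortest prefix whose repetition rebuilds the whole string. The names RepeatCollapser, collapseRepeats, and hasPeriod are hypothetical, and the sketch assumes the copies are separated by exactly one space.

```java
public class RepeatCollapser {
    // Returns the shortest phrase u such that s is u repeated two or more
    // times with single spaces in between; returns s unchanged otherwise.
    static String collapseRepeats(String s) {
        String t = s + " ";                  // now every copy ends with a space
        int n = t.length();
        for (int p = 2; p <= n / 2; p++) {   // p = candidate length of "u "
            if (n % p != 0) {
                continue;                    // copies must tile t exactly
            }
            if (hasPeriod(t, p)) {
                return t.substring(0, p - 1); // drop the trailing space
            }
        }
        return s;
    }

    // True if every character equals the one p positions before it.
    private static boolean hasPeriod(String t, int p) {
        for (int i = p; i < t.length(); i++) {
            if (t.charAt(i) != t.charAt(i - p)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("what's up man what's up man")); // what's up man
        System.out.println(collapseRepeats("hey hey hey"));                 // hey
        System.out.println(collapseRepeats("today is tuesday"));            // today is tuesday
    }
}
```

Because it returns the shortest repeating unit, a string such as hey hey hey hey collapses to hey, not hey hey. Whether that is the desired result depends on the data.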

Answer 6

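// Done in Java 8
// Requires: import java.util.Arrays; import java.util.stream.Collectors;

String str1 = "I am am am a good Good coder";
String[] arrStr = str1.split(" ");
// One-element array so the lambda can update the previously seen word
String[] element = new String[1];
// The lambda parameter must not reuse the name str1, or the code will not compile
return Arrays.stream(arrStr).filter(word -> {
    // Keep a word only if it differs (ignoring case) from the word before it
    if (!word.equalsIgnoreCase(element[0])) {
        element[0] = word;
        return true;
    }
    return false;
}).collect(Collectors.joining(" "));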

How do I eliminate duplicate words from a String in Java?
