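Removing duplicate words from a sentence in an array

Time: 2018-11-04 02:21:36

Tags: java arrays

I need to output each word of a sentence, the length of each word, and the number of times each word is repeated. I have the following code, which determines each word and its length: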

String sentence;
String charSentence;
String[] wordOutput;

private void analyzeWords(String s) {
    String[] words = sentence.split(" ");
    wordOutput = new String[words.length];
    int[] repeats = new int[words.length];

    // Increment a single repeat
    for (int i = 0; i < words.length; i++) {

        repeats[i] = 1;

        // Increment when repeated.
        for (int j = i + 1; j < words.length - 1; j++) {
            if (words[i].equalsIgnoreCase(words[j])) {
                repeats[i]++;
            }
        }

        wordOutput[i] = words[i] + "\t" + words[i].length() + "\t" + repeats[i];
    }
}

When I run the program, I get the following output:

Equal   5   2
Equal   5   1 <- This is a duplicate word and should not be here when it repeats.

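Does anyone know where my problem is? Is it something to do with my repeats array?

3 answers:

Answer 0: (score: 1)

The first problem is that in the inner for loop you are looping from i+1 to length-1. You need to loop until length. Second, you need to determine whether the word has already occurred earlier in the array, and if so, skip it with a continue statement. You could do it like this: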

outer:
for (int i = 0; i < words.length; i++) {

    repeats[i] = 1;
    for(int index = i-1; index >= 0; index--) {
        if(words[i].equals(words[index])) {
            continue outer;
        }
    }
    ...
}

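However, the problem with this is that wordOutput will contain null values at the positions of the skipped duplicates, because you allocate an array with the same length as the number of words. To fix this, you can do the following: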

 wordOutput = Arrays.stream(wordOutput).filter(e-> e!= null).toArray(String[]::new);

This will filter out the null values.

Output:

(Input String: "This is a String is a with a lot lot of this repeats repeats")

This    4   2
is      2   2
a       1   3
String  6   1
with    4   1
lot     3   2
of      2   1
this    4   1
repeats 7   2
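
For reference, the pieces of this answer can be put together as one self-contained method. This is only a sketch under a few assumptions that are not in the answer: the class name DedupWordCounter is made up, the method takes the sentence as a parameter and returns the array instead of using fields, the sentence is split on any run of whitespace, and the "seen earlier" check uses equalsIgnoreCase, so "This" and "this" are merged into one entry rather than listed separately as in the output above.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this illustration.
final class DedupWordCounter {

    // Returns one "word<TAB>length<TAB>count" entry per distinct word,
    // in order of first appearance. Case is ignored (an assumption).
    static String[] analyzeWords(String sentence) {
        String[] words = sentence.trim().split("\\s+");
        String[] wordOutput = new String[words.length];

        outer:
        for (int i = 0; i < words.length; i++) {
            // Skip this word if it already appeared earlier in the sentence.
            for (int k = i - 1; k >= 0; k--) {
                if (words[i].equalsIgnoreCase(words[k])) {
                    continue outer;
                }
            }
            // Count this occurrence plus every later one, including the last word.
            int count = 1;
            for (int j = i + 1; j < words.length; j++) {
                if (words[i].equalsIgnoreCase(words[j])) {
                    count++;
                }
            }
            wordOutput[i] = words[i] + "\t" + words[i].length() + "\t" + count;
        }
        // Skipped duplicates leave null slots behind; drop them.
        return Arrays.stream(wordOutput).filter(e -> e != null).toArray(String[]::new);
    }

    public static void main(String[] args) {
        for (String line : analyzeWords("Equal equal words")) {
            System.out.println(line);
        }
    }
}
```

Using the same comparison in both loops keeps the counts and the list of distinct words consistent with each other; switch both to equals if the comparison should be case-sensitive.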

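Answer 1: (score: 0)

Instead of incrementing the count at every index, store the count only at the last occurrence of the word; at all other indices the count stays 0. Finally, traverse the count array and, wherever the value is greater than zero, print the word and its count.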

private void analyzeWords(String s) {
    String[] words = sentence.split(" ");
    wordOutput = new String[words.length];
    int[] repeats = new int[words.length];
    for (int i = 0; i < words.length; i++) {
        int count = 1;
        int index = i;
        // Scan all remaining words, including the last one.
        for (int j = i + 1; j < words.length; j++) {
            if (words[i].equalsIgnoreCase(words[j])) {
                count++;
                index = j;
            }
        }
        // Only the first occurrence of a word finds this slot still at 0.
        if (repeats[index] == 0) {
            repeats[index] = count; // update repeats only at the last occurrence of the word
            wordOutput[i] = words[i] + "\t" + words[i].length() + "\t" + repeats[index];
        }
    }
}
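
The final traversal described in this answer is not part of the snippet, so here is one way it might look. This is a sketch with assumptions: the class name LastOccurrenceCounter is hypothetical, the method takes the sentence as a parameter and returns a List instead of filling the wordOutput field, and the inner loop runs up to words.length so the last word is included.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class name, used only for this illustration.
final class LastOccurrenceCounter {

    static List<String> analyzeWords(String sentence) {
        String[] words = sentence.trim().split("\\s+");
        int[] repeats = new int[words.length];

        for (int i = 0; i < words.length; i++) {
            int count = 1;
            int last = i;
            for (int j = i + 1; j < words.length; j++) {
                if (words[i].equalsIgnoreCase(words[j])) {
                    count++;
                    last = j;
                }
            }
            // Only the first occurrence of a word finds the slot still at 0,
            // so the full count is stored exactly once, at the last occurrence.
            if (repeats[last] == 0) {
                repeats[last] = count;
            }
        }

        // Final pass: every index with a non-zero count is one distinct word.
        List<String> lines = new ArrayList<>();
        for (int i = 0; i < words.length; i++) {
            if (repeats[i] > 0) {
                lines.add(words[i] + "\t" + words[i].length() + "\t" + repeats[i]);
            }
        }
        return lines;
    }
}
```

Because the count lives at the last occurrence, this variant lists each word with the spelling and position of its final appearance, which differs from the first-appearance order of the other answers.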

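Answer 2: (score: 0)

First, as GBlodgett mentioned, you should check all remaining words for duplicates; your current solution skips the last word. Update the second loop's termination condition to j < words.length.

Second, if you want to print each duplicated word only once, you need a condition in your solution. One example: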

boolean[] duplicates = new boolean[words.length];
// Increment a single repeat
for (int i = 0; i < words.length; i++) {
    repeats[i] = 1;
    // Check for duplicates,
    // If the word was not marked as duplicate
    if (!duplicates[i]) {
        // Increment when repeated.
        for (int j = i + 1; j < words.length; j++) {
            if (words[i].equalsIgnoreCase(words[j])) {
                repeats[i]++;
                duplicates[j] = true;
            }
        }
        wordOutput[i] = words[i] + "\t" + words[i].length() + "\t" + repeats[i];
    }
}

There is also a Java 8+ solution, for example:

Map<String, Long> m = Arrays.stream(s.split(" ")).collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));

The map will contain each word paired with its number of occurrences.
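
A slightly expanded version of the stream approach might look like the sketch below. The additions are assumptions, not part of the answer: the class name StreamWordCounter is made up, words are lower-cased so that "Equal" and "equal" count as the same word, and a LinkedHashMap is used as the map factory so the entries keep the order of first appearance.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical class name, used only for this illustration.
final class StreamWordCounter {

    // Maps each lower-cased word to its number of occurrences,
    // keeping the order in which the words first appear.
    static Map<String, Long> countWords(String sentence) {
        return Arrays.stream(sentence.trim().split("\\s+"))
                .map(w -> w.toLowerCase(Locale.ROOT))
                .collect(Collectors.groupingBy(
                        Function.identity(),
                        LinkedHashMap::new,
                        Collectors.counting()));
    }

    public static void main(String[] args) {
        // Prints word, length and count, separated by tabs.
        countWords("Equal equal words").forEach((word, count) ->
                System.out.println(word + "\t" + word.length() + "\t" + count));
    }
}
```

The word length does not need to be stored, since it can be derived from each key when printing.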