How do I count word frequency without using collections?

Time: 2014-02-17 22:47:10

Tags: java

Result

0: Bear
1: Car
2: Bear
3: Cat
4: Car
5: Dog
6: Bear
---Frequency---
Bear : 1
Car : 1
null : 1
Cat : 1
null : 1
Dog : 1
null : 1

Code

import java.util.Arrays;
import java.util.StringTokenizer;

public class WordCount {

    public static void main(String[] args) {

        String text = "Bear Car Bear Cat Car Dog Bear";
        StringTokenizer str = new StringTokenizer(text);
        String word[] = new String[10];
        String unique[] = new String[10];
        String w;

        int count = -1;

        while (str.hasMoreTokens()) {
            count++;
            w = str.nextToken();
            word[count] = w;

            System.out.println(count + ": " + word[count]);

        }

        System.out.println("---Frequency---");

        // create unique words
        for (int i = 0; i < 7; i++) {

            if ((!Arrays.asList(unique).contains(word[i]))) {
                unique[i] = word[i];
            }
        }

        // measuring frequency
        int[] measure = new int[10];

        for (int z = 0; z < 7; z++) {
            if (Arrays.asList(unique).contains(word[z])) {
                measure[z] += 1;
                System.out.println(unique[z] + " : " + measure[z]);
            }
        }
    }
}

3 answers:

Answer 0 (score: 2)

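Your current code has several problems:

  1. You are explicitly using the Collection interface. This happens every time you call Arrays#asList(...), so you are not meeting your main requirement.
  2. You should maintain a counter for the number of elements in each of your arrays. In this case, you can use the same variable to hold the size of both the unique and measure arrays.
  3. Your algorithm for filling unique is wrong. You should add each word only once (since it must be unique).
  4. You should increment the counter in the measure array once for every occurrence in word of each String in unique.

This code takes all of these recommendations into account.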

    import java.util.StringTokenizer;
    
    public class ThirteenthMain {
    
        public static void main(String[] args) {
    
            String text = "Bear Car Bear Cat Car Dog Bear";
            StringTokenizer str = new StringTokenizer(text);
            String word[] = new String[10];
            String unique[] = new String[10];
            // reading the words to analyze
            int wordSize = 0;
            while (str.hasMoreTokens()) {
                String w = str.nextToken();
                word[wordSize] = w;
                System.out.println(wordSize + ": " + word[wordSize]);
                wordSize++;
            }
            System.out.println("---Frequency---");
            // create unique words
            int uniqueWordSize = 0;
            for (int i = 0; i < wordSize; i++) {
                boolean found = false;
                for (int j = 0; j < uniqueWordSize; j++) {
                    if (word[i].equals(unique[j])) {
                        found = true;
                        break;
                    }
                }
                if (!found) {
                    unique[uniqueWordSize++] = word[i];
                }
            }
            // measuring frequency
            int[] measure = new int[10];
            for (int i = 0; i < uniqueWordSize; i++) {
                for (int j = 0; j < wordSize; j++) {
                    if (unique[i].equals(word[j])) {
                        measure[i]++;
                    }
                }
            }
            //printing results
            for (int i = 0; i < uniqueWordSize; i++) {
                System.out.println(unique[i] + " : " + measure[i]);
            }
        }
    }
    

    Prints:

    0: Bear
    1: Car
    2: Bear
    3: Cat
    4: Car
    5: Dog
    6: Bear
    ---Frequency---
    Bear : 3
    Car : 2
    Cat : 1
    Dog : 1
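
The first and third points can also be handled with a small linear-scan helper in place of Arrays#asList(...).contains(...). The following is only a minimal sketch: the class name ArrayContains and the method contains are hypothetical names chosen for illustration and do not appear in the code above.

```java
public class ArrayContains {

    // Hypothetical helper: linear scan over the first `size` slots of a plain array.
    // Only the filled slots are inspected, so null entries are never dereferenced.
    static boolean contains(String[] items, int size, String target) {
        for (int i = 0; i < size; i++) {
            if (items[i].equals(target)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] word = {"Bear", "Car", "Bear", "Cat", "Car", "Dog", "Bear"};
        String[] unique = new String[word.length];
        int uniqueSize = 0;

        for (String w : word) {
            if (!contains(unique, uniqueSize, w)) {
                // Append at the next free slot instead of at index i,
                // so the unique array has no null gaps.
                unique[uniqueSize++] = w;
            }
        }

        for (int i = 0; i < uniqueSize; i++) {
            System.out.println(i + ": " + unique[i]);
        }
    }
}
```

Writing to unique[uniqueSize++] rather than unique[i] is what removes the "null : 1" lines seen in the question's output.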
    

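Answer 1 (score: 1)

Thanks to Luiggi for the inspiration. Here is my solution, written after realizing I had missed something very important: the nested loop. It is only a few lines changed in my existing code. I hope you will all take a look at Luiggi's code too, since it is more verbose (hehe).

Result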

0: Bear
1: Car
2: Bear
3: Cat
4: Car
5: Dog
6: Bear
---Frequency---
Bear : 3
Car : 2
Cat : 1
Dog : 1

import java.util.Arrays;
import java.util.StringTokenizer;

public class WordCount {

    public static void main(String[] args) {

        String text = "Bear Car Bear Cat Car Dog Bear";
        StringTokenizer str = new StringTokenizer(text);
        String word[] = new String[10];
        String unique[] = new String[10];
        String w;

        int count = -1;

        while (str.hasMoreTokens()) {
            count++;
            w = str.nextToken();
            word[count] = w;

            System.out.println(count + ": " + word[count]);

        }

        System.out.println("---Frequency---");

        // create unique words
        for (int i = 0; i < 7; i++) {

            if ((!Arrays.asList(unique).contains(word[i]))) {

                unique[i] = word[i];
            }

        }

        // measuring frequency
        int[] measure = new int[10];

        for (int z = 0; z < 7; z++) {

            if (unique[z] != null) {

                for (int j = 0; j < 7; j++) {

                    if (unique[z].equals(word[j])) {

                        measure[z] += 1;

                    }
                }
                System.out.println(unique[z] + " : " + measure[z]);
            }

        }

    }

}

Answer 2 (score: 1)

Another solution:

String text = "Bear Car Bear Cat Car Dog Bear";
String[] allWords = text.split(" ");
String[] foundWords = new String[allWords.length];
int[] foundCount = new int[allWords.length];
int foundIndex= 0;

for (String aWord : allWords) {
    int j = 0;
    for (; j < foundIndex; j++) {
        if (foundWords[j].equals(aWord)) { //found
            foundCount[j]++;
            break;
        }
    }
    if (j == foundIndex) { // word not found in foundWords
        foundWords[foundIndex] = aWord;
        foundCount[foundIndex] = 1;
        foundIndex++;
    }
}

// Print result
for (int i = 0; i < foundIndex; i++) {
    System.out.println(foundWords[i] + " : " + foundCount[i]);
}

The result is:

Bear : 3
Car : 2
Cat : 1
Dog : 1
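
The snippet above is a fragment, so here is one way it might be wrapped into a runnable class. This is a sketch under a few assumptions: the class name WordFrequency and the method frequencies are hypothetical, the input is split on any run of whitespace (split("\\s+") rather than split(" ")), and blank input is treated as containing no words.

```java
public class WordFrequency {

    // Hypothetical helper: returns one "word : count" line per distinct word,
    // in order of first appearance, using only arrays (no collections).
    static String[] frequencies(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new String[0]; // assumption: blank input has no words
        }

        String[] allWords = trimmed.split("\\s+");
        String[] foundWords = new String[allWords.length];
        int[] foundCount = new int[allWords.length];
        int foundIndex = 0; // number of distinct words seen so far

        for (String aWord : allWords) {
            int j = 0;
            for (; j < foundIndex; j++) {
                if (foundWords[j].equals(aWord)) { // seen before: bump its count
                    foundCount[j]++;
                    break;
                }
            }
            if (j == foundIndex) { // not seen before: record it
                foundWords[foundIndex] = aWord;
                foundCount[foundIndex] = 1;
                foundIndex++;
            }
        }

        String[] lines = new String[foundIndex];
        for (int i = 0; i < foundIndex; i++) {
            lines[i] = foundWords[i] + " : " + foundCount[i];
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : frequencies("Bear Car Bear Cat Car Dog Bear")) {
            System.out.println(line);
        }
    }
}
```

Sizing foundWords and foundCount to allWords.length covers the worst case where every word is distinct, so no fixed capacity such as 10 is needed.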