Question

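I want to count the most frequently used words in a text. I would like to do it the following way and just need a little help fixing the TreeMap. This is what it looks like right now: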

    TreeMap<Integer, List<String>> Word = new TreeMap<Integer, List<String>>();
    List<String> TheList = new ArrayList<String>();

    // ... some more code that sets up the reading ...

    while (scanner.hasNext()) {
        String NewWord = scanner.next().toLowerCase();

        if (Word.containsKey(NewWord)) {
            Word.put(HERE I NEED HELP);
        } else {
            Word.put(HERE I NEED HELP);
        }

    }

So what I want to do is: if NewWord is already in the map, add one to its Integer (the key); if it is not, add the word to the next list.

Answer 1

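The other answers all store the counts in the map correctly, but unfortunately none of them sort by count, which is also part of your requirement.

Don't use a TreeMap; use a HashMap to build up the counts instead.

Once you have the complete set of counts, you can dump the entrySet of the HashMap into a new ArrayList and sort that list by Entry<String,Integer>.getValue().

Or, to be neater, create a new "Count" object that holds both the word and its count, and use that.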
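The entrySet approach described above could be sketched roughly as follows. The class name WordFrequency, the method name sortedByCount and the sample sentence are made up for illustration, and the sketch assumes Java 8 or later (for Map.merge and Map.Entry.comparingByValue).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;

class WordFrequency {
    // Counts the lower-cased words from the scanner, then returns the
    // entries sorted by count, highest count first.
    static List<Map.Entry<String, Integer>> sortedByCount(Scanner scanner) {
        Map<String, Integer> counts = new HashMap<>();
        while (scanner.hasNext()) {
            // merge() stores 1 for a new word, otherwise adds 1 to the old count
            counts.merge(scanner.next().toLowerCase(), 1, Integer::sum);
        }
        // Copy the entries into a list so that they can be sorted
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort(Map.Entry.<String, Integer>comparingByValue().reversed());
        return entries;
    }

    public static void main(String[] args) {
        // Hypothetical input; a Scanner over a file or System.in works the same way
        Scanner scanner = new Scanner("the cat and the dog and the bird");
        for (Map.Entry<String, Integer> e : sortedByCount(scanner)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Words with equal counts come out in no particular order, because the HashMap does not define one.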

Answer 2

Don't use...

  TreeMap<Integer, List<String>>

Instead, use

 TreeMap<String, Integer>   // String represents the word... Integer represents the count

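The reason is that with your original type the key (the count) can be the same for several words, whereas the words themselves are unique. So do it the other way around: keep reading words and check whether your map already contains the word. If it does, increment its count; otherwise add the word with count = 1.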
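A minimal sketch of that loop, assuming Java 8 or later. The class name WordCountByWord and the sample input are made up for illustration. Map.merge performs the "contains? increment : insert 1" check in a single call.

```java
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class WordCountByWord {
    // Word is the key, count is the value. A TreeMap keeps the words
    // (not the counts) in alphabetical order.
    static TreeMap<String, Integer> count(Scanner scanner) {
        TreeMap<String, Integer> counts = new TreeMap<>();
        while (scanner.hasNext()) {
            String word = scanner.next().toLowerCase();
            // Inserts 1 if the word is new, otherwise adds 1 to the stored count
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = count(new Scanner("b a B c a b"));
        System.out.println(counts); // prints {a=2, b=3, c=1}
    }
}
```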

Answer 3

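Your types appear to be completely wrong.

...if you want a frequency count

You want the word as the key and the count as the value. A sorted collection adds little here (it would sort by word, not by count) and it is considerably slower, so I would use a HashMap.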

Map<String, Integer> frequencyCount = new HashMap<>();
while (scanner.hasNext()) {
    String word = scanner.next().toLowerCase();
    Integer count = frequencyCount.get(word);
    if (count == null)
        frequencyCount.put(word, 1);
    else
        frequencyCount.put(word, 1 + count);
}

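...if you want to key by length. I would use a List<Set<String>>. This works because word lengths are positive and bounded, and because you want to ignore duplicate words, which is exactly what a Set is for.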

 List<Set<String>> wordsByLength = new ArrayList<Set<String>>();
 while (scanner.hasNext()) {
    String word = scanner.next().toLowerCase();
    // grow the array list as required.
    while (wordsByLength.size() <= word.length())
         wordsByLength.add(new HashSet<String>());
    // add the word ignoring duplicates.
    wordsByLength.get(word.length()).add(word);
 }

Answer 4

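The time-efficient way to solve this is to use two maps: one from word to count, and another from count to words. You can build them in separate passes. The first pass builds the map from word to count: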

Map<String, Integer> wordCount = new HashMap<String,Integer>();
while (scanner.hasNext()) {
    String word = scanner.next().toLowerCase();
    wordCount.put(word, wordCount.containsKey(word) ? wordCount.get(word) + 1 : 1);
}

The second pass inverts the map so that you can read off the top entries:

// Biggest values first!
Map<Integer,List<String>> wordsByFreq = new TreeMap<Integer,List<String>>(new Comparator<Integer>(){
    public int compare(Integer a, Integer b) {
        // Reversed comparison so that larger counts sort first
        return b.compareTo(a);
    }
});
// Iterate over the entries; a Map itself is not Iterable
for (Map.Entry<String,Integer> e : wordCount.entrySet()) {
    List<String> current = wordsByFreq.get(e.getValue());
    if (current == null)
        wordsByFreq.put(e.getValue(), current = new ArrayList<String>());
    current.add(e.getKey());
}

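Note that the first pass uses a HashMap because we don't need any ordering there, just fast access. The second pass needs a TreeMap with a non-standard (descending) comparator, so that the first value read out is the list of most frequent words (allowing for two or more words to be tied for most frequent).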
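For reference, here is a self-contained sketch that puts both passes together and reads out the top list. The class name MostFrequentWords, the method name mostFrequent and the sample input are made up for illustration, and the sketch assumes Java 8 or later. It declares the second map as a TreeMap (rather than a plain Map) so that firstEntry() is available.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

class MostFrequentWords {
    // Returns the words tied for the highest count, in alphabetical order.
    static List<String> mostFrequent(Scanner scanner) {
        // Pass 1: word -> count (no ordering needed, so a HashMap)
        Map<String, Integer> wordCount = new HashMap<String, Integer>();
        while (scanner.hasNext()) {
            String word = scanner.next().toLowerCase();
            wordCount.put(word, wordCount.containsKey(word) ? wordCount.get(word) + 1 : 1);
        }

        // Pass 2: count -> words, with the largest count first
        TreeMap<Integer, List<String>> wordsByFreq =
                new TreeMap<Integer, List<String>>(Comparator.reverseOrder());
        for (Map.Entry<String, Integer> e : wordCount.entrySet()) {
            wordsByFreq.computeIfAbsent(e.getValue(), k -> new ArrayList<String>())
                       .add(e.getKey());
        }

        if (wordsByFreq.isEmpty()) {
            return new ArrayList<String>(); // no input words at all
        }
        // With the descending comparator, the first entry holds the top count
        List<String> top = wordsByFreq.firstEntry().getValue();
        Collections.sort(top); // HashMap iteration order is arbitrary
        return top;
    }

    public static void main(String[] args) {
        System.out.println(mostFrequent(new Scanner("a b a c b"))); // prints [a, b]
    }
}
```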

Answer 5

Try this:

TreeMap<String, Integer> Word = new TreeMap<String,Integer>();

while (scanner.hasNext()) {
    String NewWord = scanner.next().toLowerCase();

    if (Word.containsKey(NewWord)) {
        Word.put(NewWord,Word.get(NewWord)+1);
    } else {
        Word.put(NewWord,1);
    }

}

Answer 6

Try this:

        TreeMap<String, Integer> map = new TreeMap<String, Integer>();
        // Assumption: words are read from standard input; use your own Scanner here
        Scanner scanner = new Scanner(System.in);
        while (scanner.hasNext()) {
            String NewWord = scanner.next().toLowerCase();

            if (map.containsKey(NewWord)) {
                Integer count = map.get(NewWord);
                // Add the element back along with the incremented count
                // (count++ would store the old value, so use count + 1)
                map.put(NewWord, count + 1);
            } else {
                map.put(NewWord, 1); // Add a new entry
            }

        }
