Counting word occurrences in Java

Time: 2015-11-23 06:21:37

Tags: java tokenize words

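How can I get the pairs of adjacent words in a given string, for example

the quick, quick brown, brown fox, fox jumps, jumps over, etc.

and then count how many times each pair occurs?

The code below can only count single words.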

    import java.util.HashMap;
    import java.util.Map;
    import java.util.StringTokenizer;

    public class Tokenizer

    {
        public static void main(String[] args)
        {
            int index = 0; int tokenCount; int i =0;
            Map<String,Integer> wordCount = new HashMap<String,Integer>();
            Map<Integer,Integer> letterCount = new HashMap<Integer,Integer>();
            String message="The Quick brown fox jumps over the lazy brown dog the quick";

            StringTokenizer string = new StringTokenizer(message);


            tokenCount = string.countTokens();
            System.out.println("Number of tokens = " + tokenCount);
            while (string.hasMoreTokens()) {
                String word = string.nextToken().toLowerCase();
                // count is null the first time a word is seen
                Integer count = wordCount.get(word);

                if(count == null) {
                    wordCount.put(word, 1);
                }
                else {
                    wordCount.put(word, count + 1);
                }
            }
            for (String words : wordCount.keySet())
            {
                System.out.println("Word : " + words + " has count :" + wordCount.get(words));
            }
            int first ,second;
            first = second = Integer.MIN_VALUE;
            String firstword ="";
            String secondword="";


            for(Map.Entry<String, Integer> entry : wordCount.entrySet())
            {

                int count = entry.getValue();
                String word = entry.getKey();
                if(count>first){
                    second = first;
                    secondword = firstword;
                    first = count;
                    firstword = word;

                }
                else if(count>second){ // runner-up: beats the current second place but not the first
                    second = count;
                    secondword = word;
                }
            }
            System.out.println(firstword + " " + first);
            System.out.println(secondword + " " + second);

            for(i = 0; i < message.length(); i++){
                char c = message.charAt(i);
                if (c != ' ') {

                    int value = letterCount.getOrDefault((int) c, 0);
                    letterCount.put((int) c, value + 1);
                }
            }

            for(int key : letterCount.keySet()) {
                System.out.println((char) key + ": " + letterCount.get(key));
            }
        }

    }

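3 Answers:

Answer 0: (score: 2)

OK, so from what I understand of the question, you need to count how many times each pair of adjacent words occurs in the whole string. I looked at your code and felt it was more complicated than it needs to be. See the following snippet.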

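  1. Split the source string using a space as the delimiter.
  2. Concatenate each pair of adjacent words, separated by a space.
  3. Search for the concatenated string in the source string.
  4. If the pair is not yet in the map, add it with the word pair as the key and 1 as the value.
  5. If the pair is already in the map, get its value, increment it, and put it back.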

    String message = "The Quick brown fox jumps over the lazy brown dog the quick";
    // Lower-case once so that "The Quick" and "the quick" count as the same pair.
    String lower = message.toLowerCase();
    String[] split = lower.split(" ");
    Map<String, Integer> map = new HashMap<>();
    for (int i = 0; i < split.length - 1; i++) {
        // Build the pair from two adjacent words.
        String temp = split[i] + " " + split[i + 1];
        // This check always passes, because temp is built from adjacent
        // pieces of the same string.
        if (lower.contains(temp)) {
            if (map.containsKey(temp))
                map.put(temp, map.get(temp) + 1);
            else
                map.put(temp, 1);
        }
    }
    System.out.println(map);
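
The steps above can also be sketched as a small self-contained class. This is only one possible way to write it, not the answerer's exact code: the class name `WordPairCounter` and the method `countPairs` are made up for illustration, the text is split on any run of whitespace rather than a single space, and `Map.merge` replaces the explicit containsKey/put branches.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper class, named here only for illustration.
public class WordPairCounter {

    // Counts every pair of adjacent words, ignoring case and sliding one word at a time.
    public static Map<String, Integer> countPairs(String text) {
        // LinkedHashMap keeps the pairs in first-seen order, which makes the output easier to read.
        Map<String, Integer> pairs = new LinkedHashMap<>();
        // trim() avoids an empty leading token when the text starts with whitespace.
        String[] words = text.trim().toLowerCase().split("\\s+");
        for (int i = 0; i < words.length - 1; i++) {
            String pair = words[i] + " " + words[i + 1];
            // merge() stores 1 for a new pair, otherwise adds 1 to the existing count.
            pairs.merge(pair, 1, Integer::sum);
        }
        return pairs;
    }

    public static void main(String[] args) {
        String message = "The Quick brown fox jumps over the lazy brown dog the quick";
        countPairs(message).forEach((pair, count) -> System.out.println(pair + " : " + count));
    }
}
```

For the sample sentence this yields 10 distinct pairs, with "the quick" counted twice and every other pair counted once.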
    

Answer 1: (score: 0)

Here is the complete main method code. Let me know if you have any questions.

    // Counts single words only; needs java.util.HashMap, java.util.Map
    // and java.util.StringTokenizer to be imported.
    public static void main(String[] args)
    {
        int tokenCount;
        Map<String,Integer> wordCount = new HashMap<String,Integer>();
        String message="The Quick brown fox jumps over the lazy brown dog the quick";

        StringTokenizer string = new StringTokenizer(message);

        tokenCount = string.countTokens();
        System.out.println("Number of tokens = " + tokenCount);

        while (string.hasMoreTokens()) {
            String word = string.nextToken().toLowerCase();
            // count is null the first time a word is seen
            Integer count = wordCount.get(word);
            System.out.println("Count : " + count);
            if(count == null) {
                wordCount.put(word, 1);
            }
            else {
                wordCount.put(word, count + 1);
            }
        }
        for (String words : wordCount.keySet())
        {
            System.out.println("Word : " + words + " has count :" + wordCount.get(words));
        }
    }

Answer 2: (score: -1)

    // Remember the previous word so that every word also starts the next pair.
    String previous = null;
    while (string.hasMoreTokens()) {
        String current = string.nextToken().toLowerCase();

        if (previous != null) {
            String word = previous + " " + current;
            Integer count = wordCount.get(word);

            if(count == null) {
                wordCount.put(word, 1);
            }
            else {
                wordCount.put(word, count + 1);
            }
        }
        previous = current;
    }
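
With a StringTokenizer, the pairs you get depend on whether each token is read once and remembered for the next pair, or whether two fresh tokens are read for every pair. The sketch below contrasts the two; the class name `PairingModes` and both method names are hypothetical and chosen only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical demo class, named here only for illustration.
public class PairingModes {

    // Reads two fresh tokens per pair, so no word is shared between pairs.
    // A trailing word without a partner is dropped in this sketch.
    public static List<String> disjointPairs(String text) {
        List<String> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text.toLowerCase());
        while (tokens.hasMoreTokens()) {
            String first = tokens.nextToken();
            if (tokens.hasMoreTokens()) {
                pairs.add(first + " " + tokens.nextToken());
            }
        }
        return pairs;
    }

    // Remembers the previous token, so each word also starts the next pair.
    public static List<String> overlappingPairs(String text) {
        List<String> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(text.toLowerCase());
        String previous = null;
        while (tokens.hasMoreTokens()) {
            String current = tokens.nextToken();
            if (previous != null) {
                pairs.add(previous + " " + current);
            }
            previous = current;
        }
        return pairs;
    }

    public static void main(String[] args) {
        String message = "The quick brown fox jumps";
        System.out.println(disjointPairs(message));    // [the quick, brown fox]
        System.out.println(overlappingPairs(message)); // [the quick, quick brown, brown fox, fox jumps]
    }
}
```

Only the overlapping version produces pairs such as "quick brown" and "fox jumps", which is what the question asks for.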