Listing word frequencies across multiple files

Time: 2017-04-27 19:16:45

Tags: java multithreading frequency

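I have written a program that reads a text file in a given directory and lists the words in that file along with how often each one occurs.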

For example, if my text file contains this:

hello my name is john hello my

the output shows:

hello 2
my 2
name 1
is 1
john 1

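Now I want my program to search multiple text files in the directory and list every word that occurs across all of them.

This is my program, which lists the words in a single file.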

import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Scanner;

public class WordCountstackquestion implements Runnable {

    private String filename;

    public WordCountstackquestion(String filename) {
        this.filename = filename;
    }

    public void run() {
        int count = 0;
        try {
            HashMap<String, Integer> map = new HashMap<String, Integer>();
            Scanner in = new Scanner(new File(filename));

            while (in.hasNext()) {
                String word = in.next();

                if (map.containsKey(word))
                    map.put(word, map.get(word) + 1);
                else {
                    map.put(word, 1);
                }
                count++;

            }
            System.out.println(filename + " : " + count);

            for (String word : map.keySet()) {
                System.out.println(word + " " + map.get(word));

            }
        } catch (FileNotFoundException e) {
            System.out.println(filename + " was not found.");
        }
    }

}

My main class.

public class Mainstackquestion {

    public static void main(String args[]) {
        if (args.length > 0) {
            for (String filename : args) {
                CheckFile(filename);
            }
        } else {
            CheckFile("C:\\Users\\User\\Desktop\\files\\1.txt");
        }
    }

    private static void CheckFile(String file) {
        Runnable tester = new WordCountstackquestion(file);
        Thread t = new Thread(tester);
        t.start();
    }
}

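I tried to use some online resources to write a method that can look at multiple files, but I am struggling and cannot seem to implement it correctly in my program.

The idea is to have one worker class instance per file.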

 int count;

   @Override
   public void run()
   {
      count = 0;
      /* Count the words... */
      ...
      ++count;
      ...
   }

Then this method uses them.

public static void main(String args[]) throws InterruptedException
   {
      WordCount[] counters = new WordCount[args.length];
      for (int idx = 0; idx < args.length; ++idx) {
         counters[idx] = new WordCount(args[idx]);
         counters[idx].start();
      }
      int total = 0;
      for (WordCount counter : counters) {
        counter.join();
        total += counter.count;
      }
      System.out.println("Total: " + total);
   }

1 answer:

Answer 0 (score: 0)

I am going to assume that all of these files are located in the same directory. You can do it like this:

public void run() {
    // Replace this path with your own folder (e.g. your filename variable)
    File f = new File("link/to/folder/here");
    // Check that f is a directory (always do this if you are going to use listFiles())
    if (f.isDirectory()) {
        File[] files = f.listFiles();
        if (files == null) {
            return; // listFiles() returns null if an I/O error occurs
        }
        // You could also use an indexed for loop, but I prefer enhanced for loops
        for (File file : files) {
            // Everything here is your old code, reading from "file" instead of "filename"
            int count = 0;
            // try-with-resources closes the Scanner after each file
            // (I didn't see a close() in your code, so including it now)
            try (Scanner in = new Scanner(file)) {
                HashMap<String, Integer> map = new HashMap<String, Integer>();

                while (in.hasNext()) {
                    String word = in.next();

                    if (map.containsKey(word))
                        map.put(word, map.get(word) + 1);
                    else {
                        map.put(word, 1);
                    }
                    count++;
                }
                System.out.println(file + " : " + count);

                for (String word : map.keySet()) {
                    System.out.println(word + " " + map.get(word));
                }
            } catch (FileNotFoundException e) {
                System.out.println(file + " was not found.");
            }
        }
    }
}
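
The question asks for one list covering all files, whereas the snippet above prints a separate list per file. A minimal sketch of the combined version follows, assuming words are whitespace-separated tokens as in the original code. The class name `DirectoryWordCount`, the method `countWords`, and the fallback to the current directory are hypothetical choices made for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Map;
import java.util.Scanner;
import java.util.TreeMap;

public class DirectoryWordCount {

    // Counts every whitespace-separated token in all regular files directly inside dir.
    static Map<String, Integer> countWords(File dir) {
        Map<String, Integer> totals = new TreeMap<>(); // TreeMap keeps the output sorted by word
        File[] files = dir.listFiles();
        if (files == null) {
            return totals; // dir is not a directory, or an I/O error occurred
        }
        for (File file : files) {
            if (!file.isFile()) {
                continue; // skip subdirectories
            }
            try (Scanner in = new Scanner(file)) {
                while (in.hasNext()) {
                    // Insert 1 for a new word, otherwise add 1 to the existing count
                    totals.merge(in.next(), 1, Integer::sum);
                }
            } catch (FileNotFoundException e) {
                System.out.println(file + " was not found.");
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Hypothetical default: use the current directory when no argument is given
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (Map.Entry<String, Integer> e : countWords(dir).entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

Because a single map is shared across the loop, a word that appears in several files ends up with one combined count. This version is single-threaded; sharing a plain `TreeMap` or `HashMap` between threads would not be safe.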

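If you would rather use a plain for loop instead of an enhanced for loop (for compatibility purposes), see the link shared in the comments.

Otherwise, you can keep scanning user input, put it all into an ArrayList (or some other kind of list, whatever you need), then loop over the list and move the "File f" variable inside the loop, like so: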

for(String s : arraylist){
    File f = new File(s);
}
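
The loop above only constructs each `File`. As a rough sketch of how it could drive the one-thread-per-file design from the question, the version below starts a thread for every path, waits for all of them with `join()`, and collects the counts in a shared `ConcurrentHashMap`. The class name `ThreadedWordCount` and the methods `countAll` and `countFile` are hypothetical, and taking the paths from `args` is an assumption.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Scanner;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadedWordCount {

    // Starts one thread per path, waits for all of them, and returns the combined counts.
    static Map<String, Integer> countAll(List<String> paths) throws InterruptedException {
        Map<String, Integer> totals = new ConcurrentHashMap<>();
        List<Thread> threads = new ArrayList<>();
        for (String s : paths) {
            File f = new File(s);
            Thread t = new Thread(() -> countFile(f, totals));
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            t.join(); // wait until this file has been fully counted
        }
        return totals;
    }

    // Adds the word counts of one file to the shared map.
    // ConcurrentHashMap.merge is atomic, so concurrent updates are not lost.
    static void countFile(File f, Map<String, Integer> totals) {
        try (Scanner in = new Scanner(f)) {
            while (in.hasNext()) {
                totals.merge(in.next(), 1, Integer::sum);
            }
        } catch (FileNotFoundException e) {
            System.out.println(f + " was not found.");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumption: the file paths are passed as command-line arguments
        List<String> arraylist = new ArrayList<>(Arrays.asList(args));
        countAll(arraylist).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Printing only after every `join()` has returned keeps output from different threads from interleaving, which can happen when each thread prints its own results as in the original program.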