Question

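I wrote a program that counts the words in a single file, but how can I modify it so that it gives the total number of words across all files (as a single value)?

My code looks like this: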

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class WordCount implements Runnable
{
   public WordCount(String filename)
   {
      this.filename = filename;
   }

   public void run()
   {
      int count = 0;
      try
      {
         Scanner in = new Scanner(new File(filename));

         while (in.hasNext())
         {
            in.next();
            count++;
         }
         System.out.println(filename + ": " + count);
      }
      catch (FileNotFoundException e)
      {
         System.out.println(filename + " blev ikke fundet.");
      }
   }
   private String filename;
}

With the main class:

public class Main
{

   public static void main(String args[])
   {
      for (String filename : args)
      {
         Runnable tester = new WordCount(filename);

         Thread t = new Thread(tester);
         t.start();
      }
   }
}

How do I avoid race conditions? Thanks for your help.

Answer 1

The worker thread:

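class WordCount extends Thread
{

   private final String filename; // the file this worker counts
   int count; // read by Main only after join() has returned

   // Main calls new WordCount(args[idx]), so a constructor taking the file name is needed
   WordCount(String filename)
   {
      this.filename = filename;
   }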

   @Override
   public void run()
   {
      count = 0;
      /* Count the words... */
      ...
      ++count;
      ...
   }

}

The class that uses them:

class Main
{

   public static void main(String args[]) throws InterruptedException
   {
      WordCount[] counters = new WordCount[args.length];
      for (int idx = 0; idx < args.length; ++idx) {
         counters[idx] = new WordCount(args[idx]);
         counters[idx].start();
      }
      int total = 0;
      for (WordCount counter : counters) {
        counter.join();
        total += counter.count;
      }
      System.out.println("Total: " + total);
   }

}

Many hard drives do not do a great job of reading multiple files concurrently. Locality of reference has a big impact on performance.

Answer 2

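You can use a Future to get each count and add all the counts up at the end, or use a static variable and increment it in a synchronized way, i.e. either use synchronized explicitly or use an atomic increment.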
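A minimal sketch of the Future variant, assuming a fixed pool of four threads and the same Scanner-based counting as in the question. The class name FutureWordCount and its methods are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical class name; not part of the question's code.
class FutureWordCount
{
   // Counts whitespace-separated tokens in one file, like the Scanner loop in the question.
   static int countWords(Path file) throws IOException
   {
      int count = 0;
      try (Scanner in = new Scanner(file))
      {
         while (in.hasNext())
         {
            in.next();
            count++;
         }
      }
      return count;
   }

   // Submits one task per file and sums the results on the calling thread.
   static long totalWords(List<Path> files) throws InterruptedException, ExecutionException
   {
      ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
      try
      {
         List<Future<Integer>> futures = new ArrayList<>();
         for (Path file : files)
         {
            Callable<Integer> task = () -> countWords(file);
            futures.add(pool.submit(task));
         }
         long total = 0;
         for (Future<Integer> future : futures)
         {
            total += future.get(); // blocks until that file has been counted
         }
         return total;
      }
      finally
      {
         pool.shutdown();
      }
   }

   public static void main(String[] args) throws Exception
   {
      List<Path> files = new ArrayList<>();
      for (String name : args)
      {
         files.add(Paths.get(name));
      }
      System.out.println("Total: " + totalWords(files));
   }
}
```

No shared mutable state is involved here: each task returns its own count and only the calling thread adds them up, so there is nothing to race on.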

Answer 3

What if your Runnable took two arguments?

a BlockingQueue<String> or BlockingQueue<File> of input files
an AtomicLong

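In its loop, you would take the next String/File from the queue, count its words, and increment the AtomicLong by that amount. Whether the loop is while(!queue.isEmpty()) or while(!done) depends on how you feed files into the queue: if you know all the files from the start, you can use the isEmpty version, but if you're streaming them in from somewhere, you want the !done version (with done being a volatile boolean or an AtomicBoolean for memory visibility).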

Then you feed these Runnables to an executor, and you should be good to go.
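Here is a rough sketch of that idea for the case where all files are known up front. The names QueueWordCounter and totalWords are invented for the example, and it uses poll() rather than an isEmpty() check followed by take(), because another worker could empty the queue between those two calls.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical worker; the class and field names are illustrative.
class QueueWordCounter implements Runnable
{
   private final BlockingQueue<File> queue;
   private final AtomicLong total;

   QueueWordCounter(BlockingQueue<File> queue, AtomicLong total)
   {
      this.queue = queue;
      this.total = total;
   }

   public void run()
   {
      File file;
      // poll() returns null once the queue is empty, so the worker simply stops.
      while ((file = queue.poll()) != null)
      {
         long count = 0;
         try (Scanner in = new Scanner(file))
         {
            while (in.hasNext())
            {
               in.next();
               count++;
            }
         }
         catch (FileNotFoundException e)
         {
            System.out.println(file + " was not found.");
         }
         total.addAndGet(count); // one atomic update per file
      }
   }

   // Assumes every file is queued before the workers start.
   static long totalWords(List<File> files, int workers) throws InterruptedException
   {
      BlockingQueue<File> queue = new LinkedBlockingQueue<>(files);
      AtomicLong total = new AtomicLong();
      ExecutorService executor = Executors.newFixedThreadPool(workers);
      for (int i = 0; i < workers; i++)
      {
         executor.execute(new QueueWordCounter(queue, total));
      }
      executor.shutdown();
      executor.awaitTermination(1, TimeUnit.MINUTES); // timeout is an arbitrary choice
      return total.get();
   }
}
```

For the streaming case you would block on take() or a timed poll() instead and stop on a done flag, as described above.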

Answer 4

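You can make count a static AtomicInteger so that all threads can increment it (volatile alone would not be enough, because count++ is not an atomic operation).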

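import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;
import java.util.concurrent.atomic.AtomicInteger;

public class WordCount implements Runnable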
{
   private static AtomicInteger count = new AtomicInteger(0); // <-- now all threads increment the same count

   private String filename;

   public WordCount(String filename)
   {
      this.filename = filename;
   }

   public static int getCount()
   {
       return count.get();
   }

   public void run()
   {
      try
      {
         Scanner in = new Scanner(new File(filename));

         int words = 0; // per-file count; the shared count is a running total over all files

         while (in.hasNext())
         {
            in.next();
            words++;
            count.incrementAndGet();
         }
         System.out.println(filename + ": " + words);
      }
      catch (FileNotFoundException e)
      {
         System.out.println(filename + " blev ikke fundet.");
      }
   }
}

Update: I haven't done Java in a while, but the point about making it a private static field still stands... just make it an AtomicInteger.

Answer 5

You can create a listener to get feedback from the threads.

   public interface ResultListener {
       // synchronized is not allowed on an interface method; the implementation synchronizes instead
       void result(int words);
   }

   public class WordCount implements Runnable
   {
      private String filename;
      private ResultListener listener;

      public void run()
      {
         int count = 0;
         try
         {
            Scanner in = new Scanner(new File(filename));

            while (in.hasNext())
            {
               in.next();
               count++;
            }
            listener.result(count); // report this file's count exactly once
         }
         catch (FileNotFoundException e)
         {
            System.out.println(filename + " blev ikke fundet.");
         }
      }
   }

You can add a constructor parameter for the listener, just like the one for your filename.

  public class Main
  {
     private static int totalCount = 0;
     private static ResultListener listener = new ResultListener(){
         public synchronized void result(int words){
            totalCount += words;
         }
     };
     public static void main(String args[])
     {
        for (String filename : args)
        {
           Runnable tester = new WordCount(filename, listener);

           Thread t = new Thread(tester);
           t.start();
        }
     }
  }

Answer 6

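You can create a thread pool with a synchronized task queue that holds all the files whose words you want to count.

When your thread pool workers come online, they can ask the task queue for a file to count. After a worker finishes its job, it can notify the main thread of its final count.

The main thread would have a synchronized notify method that adds up the results from all the worker threads.

Hope this helps.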
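One possible way to sketch this, assuming an ExecutorService stands in for the hand-built pool (its internal work queue plays the part of the synchronized task queue). PooledWordCount, addResult and countAll are names invented for the example.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.List;
import java.util.Scanner;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical collector playing the role of the "main thread" described above.
class PooledWordCount
{
   private long total = 0;

   // Workers call this when they finish a file; synchronized makes the update safe.
   synchronized void addResult(long words)
   {
      total += words;
   }

   synchronized long getTotal()
   {
      return total;
   }

   // Queues one task per file and waits for the pool to drain.
   long countAll(List<File> files) throws InterruptedException
   {
      ExecutorService pool = Executors.newFixedThreadPool(4); // pool size is an arbitrary choice
      for (File file : files)
      {
         pool.execute(() -> addResult(countWords(file)));
      }
      pool.shutdown();
      pool.awaitTermination(1, TimeUnit.MINUTES); // timeout is an arbitrary choice
      return getTotal();
   }

   // Same Scanner loop as in the question; a missing file counts as zero words.
   static long countWords(File file)
   {
      long count = 0;
      try (Scanner in = new Scanner(file))
      {
         while (in.hasNext())
         {
            in.next();
            count++;
         }
      }
      catch (FileNotFoundException e)
      {
         System.out.println(file + " was not found.");
      }
      return count;
   }
}
```

Each worker touches the shared total only once per file, so contention on the lock stays low.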

Answer 7

Or you could have all the threads update a single word-count variable. count++ is atomic if count is word-sized (an int should suffice).

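Edit: it turns out count++ is not atomic in Java. It is a read, an add, and a write, and another thread can slip in between those steps. Anyway, look at AtomicInteger and its incrementAndGet method. That one is atomic, and you don't need any other synchronization mechanism: just store your count in an AtomicInteger.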
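A small sketch to show the AtomicInteger behaviour, assuming plain threads that each increment a shared counter a fixed number of times. AtomicCounterDemo and countWithThreads are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; not taken from any of the answers.
class AtomicCounterDemo
{
   // Starts the given number of threads, each incrementing the shared counter.
   static int countWithThreads(int threads, int incrementsPerThread) throws InterruptedException
   {
      AtomicInteger count = new AtomicInteger(0);
      Thread[] workers = new Thread[threads];
      for (int i = 0; i < threads; i++)
      {
         workers[i] = new Thread(() -> {
            for (int j = 0; j < incrementsPerThread; j++)
            {
               count.incrementAndGet(); // atomic read-modify-write
            }
         });
         workers[i].start();
      }
      for (Thread worker : workers)
      {
         worker.join(); // after join(), that worker's increments are visible here
      }
      return count.get();
   }

   public static void main(String[] args) throws InterruptedException
   {
      System.out.println(countWithThreads(8, 100000)); // prints 800000
   }
}
```

With a plain int and count++ in place of the AtomicInteger, the same program could print less than 800000, because concurrent increments can overwrite each other.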

Answer 8

Given the Java 8 concurrency package, here is a solution that uses Executors and Future for the multithreading.

First, a Callable class created to process a single file:

import java.nio.file.Files;
import java.nio.file.Path;
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class WordCounter implements Callable<Map<String, Long>>
{
   Path bookPath;

   public WordCounter(Path bookPath)
   {
      this.bookPath = bookPath;
   }

   @Override
   public Map<String, Long> call() throws Exception
   {
      // Files.lines keeps the file open, so close the stream when done
      try (Stream<String> lines = Files.lines(bookPath))
      {
         return lines.flatMap(line -> Arrays.stream(line.trim().split(" ")))
               .map(word -> word.replaceAll("[^a-zA-Z]", "").toLowerCase().trim())
               .filter(word -> word.length() > 0)
               .map(word -> new SimpleEntry<>(word, 1))
               .collect(Collectors.groupingBy(SimpleEntry::getKey, Collectors.counting()));
      }
   }
}

Now we create multiple future tasks to process each file, as below:

// Needs imports from java.nio.file, java.util and java.util.concurrent
ExecutorService exes = Executors.newCachedThreadPool();
Map<String, Long> result = new HashMap<>();
Path[] books = new Path[2];
books[0] = Paths.get("C:\\Users\\Documents\\book1.txt");
books[1] = Paths.get("C:\\Users\\Documents\\book2.txt");
List<FutureTask<Map<String, Long>>> tasks = new ArrayList<>();
for (Path book : books)
{
   FutureTask<Map<String, Long>> task = new FutureTask<>(new WordCounter(book));
   tasks.add(task);
   exes.submit(task);
}
for (FutureTask<Map<String, Long>> task : tasks)
{
   try
   {
      Map<String, Long> wordCount = task.get();
      // add each file's count for the word (the original added 1 per file instead)
      wordCount.forEach((k, v) -> result.merge(k, v, Long::sum));
   }
   catch (InterruptedException e)
   {
      e.printStackTrace();
   }
   catch (ExecutionException e)
   {
      e.printStackTrace();
   }
}
exes.shutdown();

Further, the map could be shared between threads so that they update the word counts concurrently. Note that the volatile keyword alone would not make a HashMap safe for that; a ConcurrentHashMap would be the safer choice.

Final result: result should give the expected output.

Multithreading - counting the total number of words in multiple files

8 answers: