Question

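I have a Java program that takes a list, performs some independent processing on each list item (including retrieving text from some HTTP resources and inserting it into separate HashMaps), and finally computes some numbers over those HashMaps. The main snippet looks like this: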

    for (int i = 0; i < mylist.size(); i++) {
        long startepoch = getTime(mylist.get(i).time);
        MyItem m = mylist.get(i);
        String index = (i + 1) + "";

        process1(index, m.name, startepoch, m.duration);
        // adds to hashmap1

        if (m.name.equals("TEST")) {
            process2(index, m.name, startepoch, m.duration);
            // adds to hashmap2
        } else {
            process3(index, m.name, startepoch, m.duration);
            // adds to hashmap3
            process4(index, m.name, startepoch, m.duration);
            // adds to hashmap4
            process5(index, m.name, startepoch, m.duration);
            // adds to hashmap5
            process6(index, m.name, startepoch, m.duration);
            // adds to hashmap6
        }
    }

    // then start calculation on all hashmaps
    calculate_all();

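Since this snippet currently runs sequentially, it can take around 30 minutes for a list of 500 items. How can I multithread my code to make it faster, and do so in a thread-safe way?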

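I tried using ExecutorService executorService = Executors.newFixedThreadPool(10); and then submitting each process to the executorService wrapped in a task, but the problem was that I couldn't tell when they finish, so I didn't know when to call calculate_all(). So I did not continue.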

Is there a better way to make this work?

Answer 1

"but the problem was that I couldn't tell when they finish"

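When you submit something to an executor, you get back a Future for its result (if any).

You can then call Future::get from the main thread to wait for those results (or, in your case, just for completion).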

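// tasks is assumed to be a List<Callable<Void>>; invokeAll blocks until every task has finished
List<Future<Void>> completions = executor.invokeAll(tasks);

// later, when you need the outcome: get() rethrows any task failure as an ExecutionException
for (Future<Void> c : completions) c.get();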

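The other thing you need to watch is how you store the results. If you plan to have the tasks write into some shared data structure, make sure that structure is thread-safe. It may be easier to change from Runnable to Callable so that each task can return its result (the results can then be merged on the main thread, in a single-threaded way).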
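As a rough illustration of the Callable idea, here is a minimal sketch. The class name CallableMergeSketch, the fetch helper (a stand-in for the HTTP retrieval), and processAll are hypothetical names, not part of the question's code; Map.entry and List.of assume Java 9 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class CallableMergeSketch {

    // Hypothetical stand-in for the HTTP retrieval done by the real process methods.
    static String fetch(String name) {
        return "text-for-" + name;
    }

    // Runs one task per name on a pool and merges the results on the calling thread.
    static Map<String, String> processAll(List<String> names)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(10);
        try {
            List<Callable<Map.Entry<String, String>>> tasks = new ArrayList<>();
            for (int i = 0; i < names.size(); i++) {
                final String index = String.valueOf(i + 1);
                final String name = names.get(i);
                // Each task returns its result instead of writing to a shared map.
                tasks.add(() -> Map.entry(index, fetch(name)));
            }

            // invokeAll blocks until every task has completed.
            List<Future<Map.Entry<String, String>>> futures = executor.invokeAll(tasks);

            // Single-threaded merge, so a plain HashMap is sufficient here.
            Map<String, String> hashmap1 = new HashMap<>();
            for (Future<Map.Entry<String, String>> f : futures) {
                Map.Entry<String, String> e = f.get(); // rethrows task failures
                hashmap1.put(e.getKey(), e.getValue());
            }
            return hashmap1;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> result = processAll(List.of("TEST", "a", "b"));
        System.out.println(result.size() + " entries merged");
    }
}
```

Because only the main thread touches hashmap1, no synchronization is needed for the merge; the trade-off is that results are held in the Future objects until they are collected.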

Answer 2

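Note that multithreading does not necessarily make things faster. It is mainly used to reduce idle CPU cycles, for example by avoiding unnecessary sleeps and waits.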

There isn't much I can help with given what you've provided, but I think you can start by doing the following:

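1. Use thread-safe data structures. This is a must. If you skip this step, your software will eventually crash, and you will have a hard time finding the cause. (For example, if you have an ArrayList, use a thread-safe alternative.)
2. You can start trying multithreading by removing the for loop and using one thread per execution. If your for loop size is larger than the number of threads, you will have to enqueue the work.
3. Your final calculation needs all the other threads to have finished. You can use CountDownLatch or wait()/notifyAll(), depending on your implementation.
4. Perform your final calculation.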
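The four steps above can be sketched roughly as follows. This is only an illustration under assumptions: LatchSketch, process, and runAll are hypothetical names, the per-item work is a placeholder, and ConcurrentHashMap is one possible thread-safe structure among several.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class LatchSketch {

    // Hypothetical stand-in for process1(...): stores one value per item.
    static void process(int i, Map<String, Integer> hashmap1) {
        hashmap1.put(String.valueOf(i + 1), i * 2);
    }

    static int runAll(int itemCount) throws InterruptedException {
        // Step 1: a thread-safe map shared by all tasks.
        Map<String, Integer> hashmap1 = new ConcurrentHashMap<>();
        ExecutorService executor = Executors.newFixedThreadPool(10);
        CountDownLatch latch = new CountDownLatch(itemCount);

        // Step 2: one task per item; tasks beyond the pool size wait in the executor's queue.
        for (int i = 0; i < itemCount; i++) {
            final int item = i;
            executor.submit(() -> {
                try {
                    process(item, hashmap1);
                } finally {
                    latch.countDown(); // count down even if the task fails
                }
            });
        }

        // Step 3: block until every task has counted down.
        latch.await();
        executor.shutdown();

        // Step 4: the final calculation, here just a sum over the map.
        int sum = 0;
        for (int v : hashmap1.values()) {
            sum += v;
        }
        return sum;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("sum = " + runAll(500));
    }
}
```

Counting down in a finally block matters: if a task threw before reaching countDown(), latch.await() would never return.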

Edit

Regarding (2):

Your current execution looks like this:

for (int i = 0; i < mylist.size(); i++) {
    some_processes();
}

// then start calculation on all hashmaps
calculate_all();

Now, to remove the "for" loop, you can first start by increasing the number of "for" loops. For example:

// Assuming mylist.size() is around 500 and you want, say, 5 hardcoded threads
Thread_1:
for (int i = 0; i < 100; i++) {
    some_processes();
}
Thread_2:
for (int i = 100; i < 200; i++) {
    some_processes();
}
Thread_3:
for (int i = 200; i < 300; i++) {
    some_processes();
}
Thread_4:
for (int i = 300; i < 400; i++) {
    some_processes();
}
Thread_5:
for (int i = 400; i < mylist.size(); i++) {
    some_processes();
}
// Now you can use these threads as such:
CountDownLatch latch = new CountDownLatch(5);
ExecutorService executor = Executors.newFixedThreadPool(5);
executor.submit(new Thread1(latch));
executor.submit(new Thread2(latch));
executor.submit(new Thread3(latch));
executor.submit(new Thread4(latch));
executor.submit(new Thread5(latch));
try {
    latch.await();  // wait until latch counted down to 0
} catch (InterruptedException e) {
    e.printStackTrace();
}
// then start calculation on all hashmaps
calculate_all();
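The pseudocode above could be made runnable along these lines. This is a sketch under assumptions: ChunkSketch, chunk, and runChunks are hypothetical names, a counter stands in for some_processes(), and the ranges are clamped to the list size so that a shorter list does not run past the end.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class ChunkSketch {

    // Builds a task that processes items [from, to) and then signals the latch.
    static Runnable chunk(int from, int to, AtomicInteger processed, CountDownLatch latch) {
        return () -> {
            try {
                for (int i = from; i < to; i++) {
                    processed.incrementAndGet(); // placeholder for some_processes()
                }
            } finally {
                latch.countDown();
            }
        };
    }

    static int runChunks(int listSize) throws InterruptedException {
        final int threads = 5;
        final int chunkSize = 100; // hardcoded, as in the pseudocode above
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch latch = new CountDownLatch(threads);
        ExecutorService executor = Executors.newFixedThreadPool(threads);

        for (int t = 0; t < threads; t++) {
            int from = Math.min(t * chunkSize, listSize);
            // The last chunk runs to the end of the list; earlier ones are clamped.
            int to = (t == threads - 1) ? listSize : Math.min(from + chunkSize, listSize);
            executor.submit(chunk(from, to, processed, latch));
        }

        latch.await(); // wait until all five chunks are done
        executor.shutdown();
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(runChunks(500) + " items processed");
    }
}
```

With a list of 380 items the fifth range is empty (380 to 380), so that thread does no work.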

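As you can see, this approach has two drawbacks. First, what if the list size becomes 380? Then you have an idle thread. Second, what if you want more than 5 threads?

So, at this point, you can keep increasing the number of "for" loops by making each one iterate over fewer and fewer items. At the limit, "for loop count" == "thread count", which effectively removes your for loop. So, technically, you need "mylist.size()" threads. You can implement it like this: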

// Allow a maximum amount of threads, say mylist.size(). I used LinkedBlockingDeque here because you might choose something lower than mylist.size().
BlockingQueue<String> queue = new LinkedBlockingDeque<>(mylist.size());
CountDownLatch latch = new CountDownLatch(mylist.size());

new Thread(new add_some_processes_w_single_loop_for_loop_to_queue(queue, latch)).start();
new Thread(new take_finished_processes_from_queue(queue)).start();
try {
    latch.await();  // wait until latch counted down to 0
} catch (InterruptedException e) {
    e.printStackTrace();
}
// then start calculation on all hashmaps
calculate_all();

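Note that with this arrangement we have removed the initial "for" loop and instead created a loop that only submits new work when there is room in the queue. You can look at BlockingQueue examples in producer and consumer applications. For example, see: BlockingQueue examples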
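For readers who want something concrete, here is one possible producer/consumer arrangement with a BlockingQueue. It is a sketch, not the answer's actual classes: QueueSketch, run, and the POISON sentinel are hypothetical, the queue holds item indexes rather than Strings, a counter stands in for the real processing, and at least one consumer is assumed.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingDeque;
import java.util.concurrent.atomic.AtomicInteger;

class QueueSketch {

    static final int POISON = -1; // sentinel telling a consumer to stop

    static int run(int itemCount, int consumers) throws InterruptedException {
        // Bounded queue: the producer blocks in put() while it is full.
        BlockingQueue<Integer> queue = new LinkedBlockingDeque<>(consumers);
        CountDownLatch latch = new CountDownLatch(itemCount);
        AtomicInteger processed = new AtomicInteger();

        // Consumers: take an index, process it, count down.
        for (int c = 0; c < consumers; c++) {
            new Thread(() -> {
                try {
                    while (true) {
                        int i = queue.take();
                        if (i == POISON) {
                            return;
                        }
                        try {
                            processed.incrementAndGet(); // placeholder for the per-item work
                        } finally {
                            latch.countDown();
                        }
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }).start();
        }

        // Producer: enqueue every index, then one sentinel per consumer.
        new Thread(() -> {
            try {
                for (int i = 0; i < itemCount; i++) {
                    queue.put(i);
                }
                for (int c = 0; c < consumers; c++) {
                    queue.put(POISON);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }).start();

        latch.await(); // all items processed; calculate_all() could run here
        return processed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(500, 5) + " items processed");
    }
}
```

The sentinel values let the consumer threads exit cleanly once the work is done, so the JVM is not kept alive by threads blocked in take().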

Edit 2

A simple implementation using Future looks like this:

ExecutorService executorService = Executors.newCachedThreadPool();
Future<?> future1, future2, future3, future4, future5, future6;

for (int i = 0; i < mylist.size(); i++) {
    long startepoch = getTime(mylist.get(i).time);
    MyItem m = mylist.get(i);
    String index = (i + 1) + "";

    future1 = executorService.submit(new Callable<Void>() {...});
    // adds to hashmap1

    future1.get(); // Add this if you need to wait for process1 to finish before moving on to others. Also, add a try{}catch{} block as shown below.

    if (m.name.equals("TEST")) {
        future2 = executorService.submit(new Callable<Void>() {...});
        // adds to hashmap2

        future2.get(); // Add this if you need to wait for process2 to finish before moving on to others. Also, add a try{}catch{} block as shown below.

    } else {
        future3 = executorService.submit(new Callable<Void>() {...});
        // adds to hashmap3
        future4 = executorService.submit(new Callable<Void>() {...});
        // adds to hashmap4
        future5 = executorService.submit(new Callable<Void>() {...});
        // adds to hashmap5
        future6 = executorService.submit(new Callable<Void>() {...});
        // adds to hashmap6

        // Add extra future.get here as above...
    }
}

// then start calculation on all hashmaps
calculate_all();

Don't forget to add the try-catch blocks; otherwise you may not be able to recover from exceptions and crashes.

// Example try-catch block surrounding a Future.get().
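try {
    Object result = future.get();
} catch (ExecutionException e) {
    // The task itself threw; e.getCause() holds the original exception
} catch (InterruptedException e) {
    // The waiting thread was interrupted; restore the interrupt flag
    Thread.currentThread().interrupt();
}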
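To show the try-catch in context, the sketch below submits several tasks, keeps every Future in a list, and handles failures per task. FutureGetSketch and countFailures are hypothetical names, and the task body (doubling a number, failing on negatives) is only a placeholder for the real processing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FutureGetSketch {

    // Submits one task per item and returns how many of them failed.
    static int countFailures(List<Integer> items) {
        ExecutorService executorService = Executors.newCachedThreadPool();
        List<Future<Integer>> futures = new ArrayList<>();

        for (int item : items) {
            // Placeholder task: fails for negative input, otherwise doubles it.
            futures.add(executorService.submit(() -> {
                if (item < 0) {
                    throw new IllegalArgumentException("bad item " + item);
                }
                return item * 2;
            }));
        }

        int failures = 0;
        for (Future<Integer> f : futures) {
            try {
                f.get(); // waits for this task to finish
            } catch (ExecutionException e) {
                failures++; // the task threw; e.getCause() is the original exception
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // restore the flag and stop waiting
                break;
            }
        }

        executorService.shutdown();
        return failures;
    }

    public static void main(String[] args) {
        System.out.println(countFailures(List.of(1, -2, 3, -4)) + " tasks failed");
    }
}
```

Collecting the futures first and calling get() afterwards lets all tasks run concurrently; calling get() immediately after each submit would serialize them.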

However, you can have a more sophisticated version, as shown here. That link also explains Thilo's answer.

How to multithread my sequential Java code
