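This may be a very simple question, but since I have never worked with threads before, I thought it best to ask rather than try to work out the best solution entirely on my own.
I have a huge for loop that runs several billion times. On each iteration, the program computes a final result, a number, from the current index. I am only interested in storing the top result (or the top x results) and its corresponding index.
My question is simple: what is the right way to run this loop in threads so that it uses all available CPUs/cores?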
int topResultIndex = -1;
double topResult = 0;
for (int i = 1; i < 1000000000; ++i) {
    double result = /* some complicated calculation based on the current index */ 0;
    if (result > topResult) {
        topResult = result;
        topResultIndex = i;
    }
}
The calculation for each index is completely independent and shares no resources. Each thread will obviously need to access topResultIndex and topResult.
* Update: Both Giulio's and rolfl's solutions are good and also very similar. I can only accept one of them as my answer.
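Answer 0 (score: 5)
Assume the result is computed by a calculateResult(long) method that is private and static and does not access any static fields (it could also be non-static, but it must still be thread-safe and safe to execute concurrently, ideally thread-confined).
Then I think this will do the dirty work: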
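// At the top of the file:
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Inside the enclosing class:
public static class Response {
    long index;    // long, because the loop index is a long
    double result;
}

private static class MyTask implements Callable<Response> {
    private final long from;
    private final long to;

    public MyTask(long fromIndexInclusive, long toIndexExclusive) {
        this.from = fromIndexInclusive;
        this.to = toIndexExclusive;
    }

    @Override
    public Response call() {
        // Chunk-level maximum; -1 means no index in this chunk beat the initial 0
        long topResultIndex = -1;
        double topResult = 0;
        for (long i = from; i < to; ++i) {
            double result = calculateResult(i);
            if (result > topResult) {
                topResult = result;
                topResultIndex = i;
            }
        }
        Response res = new Response();
        res.index = topResultIndex;
        res.result = topResult;
        return res;
    }
}

private static double calculateResult(long index) { ... }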
public Response interfaceMethod() throws InterruptedException, ExecutionException {
    // You might want to make this static/shared/global
    ExecutorService svc = Executors.newCachedThreadPool();
    try {
        int chunks = Runtime.getRuntime().availableProcessors();
        long iterations = 1000000000;
        MyTask[] tasks = new MyTask[chunks];
        for (int i = 0; i < chunks; ++i) {
            // Multiply before dividing so the chunk bounds cover every index,
            // even when iterations is not a multiple of chunks
            tasks[i] = new MyTask(iterations * i / chunks, iterations * (i + 1) / chunks);
        }
        List<Future<Response>> resp = svc.invokeAll(Arrays.asList(tasks));
        Iterator<Future<Response>> respIt = resp.iterator();
        // get() rethrows a task failure as ExecutionException; handle it here or let it propagate
        Response bestResponse = respIt.next().get();
        while (respIt.hasNext()) {
            Response r = respIt.next().get();
            if (r.result > bestResponse.result) {
                bestResponse = r;
            }
        }
        return bestResponse;
    } finally {
        // Release the pool's threads; drop this if the pool is shared
        svc.shutdown();
    }
}
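In my experience, partitioning the work into chunks like this is much faster than submitting one task per index, especially when the computational load per index is small, as it probably is here (by small I mean less than half a second). It is a bit harder to code, though, because you need a two-step maximization: first at the chunk level, then at the global level. With this approach, if the computation is purely CPU-bound (it does not put much pressure on RAM), you should get a speedup of close to 80% of the number of physical cores.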
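The two-step maximization described above also extends to the "top x results" case from the question. The following is only a minimal sketch under stated assumptions: the TopXChunks class, the Entry type, the stand-in calculateResult, and the choice of a fixed thread pool are all hypothetical and not part of the answer's code. Each chunk keeps its best x entries in a small min-heap, and the per-chunk lists are then merged and sorted.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TopXChunks {
    // One candidate: an index and the value computed for it.
    static final class Entry {
        final long index;
        final double result;
        Entry(long index, double result) {
            this.index = index;
            this.result = result;
        }
    }

    // Hypothetical stand-in for the real, expensive per-index calculation.
    static double calculateResult(long index) {
        return (double) ((index * 7919L) % 10007L);
    }

    // Step 1: scan [from, to) and keep the best x entries of this chunk.
    static List<Entry> topOfChunk(long from, long to, int x) {
        // Min-heap on result: the head is the weakest entry kept so far.
        PriorityQueue<Entry> heap =
                new PriorityQueue<>(Comparator.comparingDouble((Entry e) -> e.result));
        for (long i = from; i < to; i++) {
            double r = calculateResult(i);
            if (heap.size() < x) {
                heap.add(new Entry(i, r));
            } else if (r > heap.peek().result) {
                heap.poll();
                heap.add(new Entry(i, r));
            }
        }
        return new ArrayList<>(heap);
    }

    // Step 2: merge the per-chunk winners and keep the global best x, best first.
    static List<Entry> findTop(long iterations, int x)
            throws InterruptedException, ExecutionException {
        if (x < 1) {
            throw new IllegalArgumentException("x must be at least 1");
        }
        int chunks = Runtime.getRuntime().availableProcessors();
        ExecutorService pool = Executors.newFixedThreadPool(chunks);
        try {
            List<Callable<List<Entry>>> tasks = new ArrayList<>();
            for (int c = 0; c < chunks; c++) {
                // Multiply before dividing so the chunks cover every index.
                final long from = iterations * c / chunks;
                final long to = iterations * (c + 1) / chunks;
                tasks.add(() -> topOfChunk(from, to, x));
            }
            List<Entry> merged = new ArrayList<>();
            for (Future<List<Entry>> f : pool.invokeAll(tasks)) {
                merged.addAll(f.get());
            }
            merged.sort(Comparator.comparingDouble((Entry e) -> e.result).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(x, merged.size())));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        for (Entry e : findTop(10007, 3)) {
            System.out.printf("index %d -> %.1f%n", e.index, e.result);
        }
    }
}
```

Because each chunk returns at most x entries, the merge step handles only x times the number of chunks, so its cost is negligible next to the scan itself.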
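Answer 1 (score: 2)
Setting aside the observation that a C program using OpenMP or some other parallel computing extension would be a better idea, the Java way is to create Future tasks that each compute a subset of the problem: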
// Needs java.util.ArrayList and java.util.concurrent.*; awaitTermination() and get()
// throw checked exceptions that the enclosing method must handle or declare
private static final class Result {
    final int index;
    final double result;
    public Result(int index, double result) {
        this.result = result;
        this.index = index;
    }
}

// Calculate 10,000 values in each task
int steps = 10000;
int cpucount = Runtime.getRuntime().availableProcessors();
ExecutorService service = Executors.newFixedThreadPool(cpucount);
ArrayList<Future<Result>> results = new ArrayList<>();
for (int i = 0; i < 1000000000; i += steps) {
    final int from = i;
    final int to = from + steps;
    results.add(service.submit(new Callable<Result>() {
        public Result call() {
            int topResultIndex = -1;
            double topResult = 0;
            for (int j = from; j < to; j++) {
                // do complicated things with 'j'
                double result = /* some complicated calculation based on the current index */ 0;
                if (result > topResult) {
                    topResult = result;
                    topResultIndex = j;
                }
            }
            return new Result(topResultIndex, topResult);
        }
    }));
}
service.shutdown();
while (!service.isTerminated()) {
    System.out.println("Waiting for threads to complete");
    service.awaitTermination(10, TimeUnit.SECONDS);
}
Result best = null;
for (Future<Result> fut : results) {
    // All tasks have finished, so get() returns immediately;
    // it throws ExecutionException if the task failed
    Result candidate = fut.get();
    if (best == null || candidate.result > best.result) {
        best = candidate;
    }
}
System.out.printf("Best result is %f at index %d\n", best.result, best.index);
Answer 2 (score: 1)
The simplest way is to use an ExecutorService and submit your tasks as Runnable or Callable. You can use Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors()) to create an ExecutorService that uses as many threads as there are processors.
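As a rough illustration of the above, here is a minimal sketch that submits one Callable and one Runnable to a fixed thread pool. The FixedPoolDemo class and the sumRange task are hypothetical placeholders for the real work; only the ExecutorService calls come from the standard library.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class FixedPoolDemo {
    // Hypothetical task: sums the integers in [from, to).
    static long sumRange(long from, long to) {
        long sum = 0;
        for (long i = from; i < to; i++) {
            sum += i;
        }
        return sum;
    }

    static long runOnPool() throws InterruptedException, ExecutionException {
        // One worker thread per available processor.
        int threads = Runtime.getRuntime().availableProcessors();
        ExecutorService service = Executors.newFixedThreadPool(threads);
        try {
            // A Callable returns a value, which is retrieved through a Future.
            Callable<Long> task = () -> sumRange(0, 1_000_000);
            Future<Long> future = service.submit(task);

            // A Runnable returns nothing; it is submitted the same way.
            Runnable note = () ->
                    System.out.println("running on " + Thread.currentThread().getName());
            service.submit(note);

            // get() blocks until the Callable has finished.
            return future.get();
        } finally {
            // Already submitted tasks still run; no new ones are accepted.
            service.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("sum = " + runOnPool());
    }
}
```

Use Callable when the task has to hand a result back, as in the top-result search from the question, and Runnable when it only has side effects.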