I have N longs which are IDs. For each ID I need to execute a Runnable (i.e. I don't care about its return value) and wait until all of them have finished. Each Runnable can take from a few seconds to a few minutes, and it is safe to run about 100 threads in parallel.
In our current solution, we use Executors.newFixedThreadPool(), call submit() for each ID, and then call get() on each returned Future.
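For reference, that approach boils down to something like the sketch below, where processId(long) is a placeholder for the real per-ID work:

// Roughly the approach described above: one queued Runnable and one Future per ID.
static void runWithFutures(long[] ids) throws Exception {
    ExecutorService executor = Executors.newFixedThreadPool(100);
    List<Future<?>> futures = new ArrayList<>();
    for (long id : ids) {
        futures.add(executor.submit(() -> processId(id))); // processId is a stand-in
    }
    for (Future<?> future : futures) {
        future.get();   // wait for every task; the Future itself is otherwise unused
    }
    executor.shutdown();
}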
The code works well and is very simple, since I don't have to deal with threads, complicated waiting logic, etc. It has one downside: the memory footprint.
Each still-enqueued Runnable consumes memory (much more than the 8 bytes a long would need: these are my Java classes with some internal state), and all N Future instances consume memory too (Java classes with state that I use only for waiting, not for an actual result). From a heap dump I estimate that a bit over 1 GiB would be needed for N = 10 million, whereas 10 million longs in an array would consume only 76 MiB.
Is there a way to solve this while keeping only the IDs in memory, ideally without resorting to low-level concurrent programming?
Answer 0 (score: 1)
Yes: you could have a shared queue of longs. You submit n Runnables to the executor, where n is the number of threads the executor has, and at the end of the run method you take the next long from the queue and re-submit a new Runnable.
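A rough sketch of that idea follows. It swaps the explicit queue for an atomic index into the ID array (which avoids boxing the longs), and processId(long) is a stand-in for the real work:

// Each worker processes one ID and then re-submits itself, so only about
// `threads` Runnables ever exist at the same time.
static void runWithResubmission(long[] ids) throws InterruptedException {
    int threads = 100;
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    AtomicInteger next = new AtomicInteger();          // index of the next ID to claim
    CountDownLatch done = new CountDownLatch(threads); // one count per worker chain
    Runnable worker = new Runnable() {
        @Override public void run() {
            int i = next.getAndIncrement();
            if (i >= ids.length) {                     // nothing left: end this chain
                done.countDown();
                return;
            }
            try {
                processId(ids[i]);                     // stand-in for the real work
            } finally {
                executor.submit(this);                 // hand the thread its next unit of work
            }
        }
    };
    for (int k = 0; k < threads; k++) {
        executor.submit(worker);
    }
    done.await();
    executor.shutdown();
}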
Answer 1 (score: 1)
Instead of creating millions of Runnables, create a dedicated thread pool that takes the longs themselves as the tasks to be done. And instead of waiting for the tasks to complete with Future.get(), use a CountDownLatch.
That thread pool could be implemented like this:
int N = 1000000;  // number of tasks
int T = 100;      // number of threads
CountDownLatch latch = new CountDownLatch(N);
ArrayBlockingQueue<Long> queue = new ArrayBlockingQueue<>(N);
for (int k = 0; k < N; k++) {
    queue.put(createNumber(k));
}
for (int k = 0; k < T; k++) {
    new WorkingThread().start();
}
latch.await();

// queue and latch are assumed to be visible to the threads (e.g. fields of an enclosing class)
class WorkingThread extends Thread {
    public void run() {
        // poll() returns null once the pre-filled queue is drained, so the
        // thread exits instead of blocking forever on an empty queue
        Long id;
        while ((id = queue.poll()) != null) {
            processNumber(id);
            latch.countDown();
        }
    }
}
Answer 2 (score: 1)
How about using an ExecutorCompletionService? Something like the following (it may contain errors, I haven't tested it):
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorCompletionService;
import java.util.function.LongFunction;

public class Foo {

    private final ExecutorCompletionService<Void> completionService;
    private final LongFunction<Runnable> taskCreator;
    private final long maxRunning; // max tasks running or queued

    public Foo(Executor executor, LongFunction<Runnable> taskCreator, long maxRunning) {
        this.completionService = new ExecutorCompletionService<>(executor);
        this.taskCreator = taskCreator;
        this.maxRunning = maxRunning;
    }

    public synchronized void processIds(long[] ids) throws InterruptedException {
        int completed = 0;
        int running = 0;
        for (long id : ids) {
            if (running >= maxRunning) {
                // wait for one task to finish before submitting the next,
                // so at most maxRunning tasks are running or queued
                completionService.take();
                running--;
                completed++;
            }
            completionService.submit(taskCreator.apply(id), null);
            running++;
        }
        while (completed < ids.length) {
            completionService.take();
            completed++;
        }
    }
}
Another version of the above could use a Semaphore and a CountDownLatch instead of a CompletionService.
public static void processIds(long[] ids, Executor executor,
        int max, LongFunction<Runnable> taskSup) throws InterruptedException {
    CountDownLatch latch = new CountDownLatch(ids.length);
    Semaphore semaphore = new Semaphore(max);
    for (long id : ids) {
        semaphore.acquire();
        Runnable task = taskSup.apply(id);
        executor.execute(() -> {
            try {
                task.run();
            } finally {
                semaphore.release();
                latch.countDown();
            }
        });
    }
    latch.await();
}
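A hypothetical call of the static variant above (the pool size, the 1,000-task window and the printing task are invented for illustration):

public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newFixedThreadPool(100);
    long[] ids = LongStream.range(0, 10_000_000L).toArray();
    // Allow at most 1,000 tasks queued or running; the printing task is only a placeholder
    processIds(ids, executor, 1_000, id -> () -> System.out.println("processed " + id));
    executor.shutdown();
}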
Answer 3 (score: 1)
This is the kind of thing I would usually handle with a Producer/Consumer pattern and a BlockingQueue coordinating the two, or, if the project has them available, with Akka actors.
But here I'd suggest relying on the behavior of Java Streams.
The intuition is that the lazy execution of the stream will throttle the creation of the work units, the futures and their results.
public static void main(String[] args) {
    // So we have a list of ids, I stream it
    // (note: if we only have an iterator, we could group it into batches of, say, 100,
    // and then flat map each batch)
    LongStream ids = LongStream.range(0, 10_000_000L);
    // This is where the actual tasks will be dispatched
    ExecutorService executor = Executors.newFixedThreadPool(4);
    // For each id to compute, create a runnable, which I call "WorkUnit"
    Optional<Exception> error = ids.mapToObj(WorkUnit::new)
            // create a parallel stream
            // this allows the stream engine to launch the next instructions concurrently
            .parallel()
            // We dispatch the work units (in parallel) to a thread and have them execute
            .map(workUnit -> CompletableFuture.runAsync(workUnit, executor))
            // And then we wait for the unit of work to complete
            .map(future -> {
                try {
                    future.get();
                } catch (Exception e) {
                    // we do care about exceptions
                    return e;
                } finally {
                    System.out.println("Done with a work unit");
                }
                // we do not care for the result
                return null;
            })
            // Keep exceptions on the stream
            .filter(Objects::nonNull)
            // Stop as soon as one is found
            .findFirst();
    executor.shutdown();
    System.out.println(error.isPresent());
}
To be honest, I can't say the spec guarantees this behavior, but in my experience it works: each of the parallel "chunks" grabs a few ids and feeds them through the pipeline (map to a work unit, dispatch to the thread pool, wait for the result, filter exceptions), which means an equilibrium is reached fairly quickly, balancing the number of active work units against the capacity of the underlying executor.
If you want to fine-tune the number of parallel "chunks", follow up here: Custom thread pool in Java 8 parallel stream
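For illustration, that fine-tuning usually relies on starting the terminal operation from inside a custom ForkJoinPool, so the parallel stream runs on that pool's workers rather than the common pool; this is observed JDK behavior rather than a documented guarantee. A sketch reusing the pipeline above (WorkUnit and executor are the same assumed names from that snippet):

ForkJoinPool streamPool = new ForkJoinPool(8);   // 8 parallel "chunks" feeding the executor
Optional<Exception> error = streamPool.submit(() ->
        LongStream.range(0, 10_000_000L)
                .mapToObj(WorkUnit::new)
                .parallel()
                .map(workUnit -> CompletableFuture.runAsync(workUnit, executor))
                .map(future -> {
                    try {
                        future.get();
                        return null;             // we only care about failures
                    } catch (Exception e) {
                        return e;
                    }
                })
                .filter(Objects::nonNull)
                .findFirst()
).join();
streamPool.shutdown();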