Executing millions of Runnables with a small memory footprint

Date: 2019-02-04 10:08:03

Tags: java multithreading

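I have N IDs of type long. For each ID, I need to execute a Runnable (i.e., I don't care about a return value) and wait until all of them have finished. Each Runnable can take from a few seconds to a few minutes, and it is safe to run about 100 threads in parallel.

In our current solution, we use Executors.newFixedThreadPool(), call submit() for each ID, and then call get() on each returned Future.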
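For reference, the approach described above might look roughly like the following sketch. The class name, the `runAll` method, and the `taskFactory` parameter are assumptions made for illustration; the question does not show the actual code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.LongFunction;

final class SubmitAllBaseline {

    // taskFactory is a hypothetical stand-in for however the real Runnables are built from an ID.
    static void runAll(long[] ids, int threads, LongFunction<Runnable> taskFactory)
            throws InterruptedException, ExecutionException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        try {
            // One Future per ID: this list, plus the queued tasks, is what grows with N.
            List<Future<?>> futures = new ArrayList<>(ids.length);
            for (long id : ids) {
                futures.add(executor.submit(taskFactory.apply(id)));
            }
            // Wait for every task; get() rethrows a task failure as ExecutionException.
            for (Future<?> future : futures) {
                future.get();
            }
        } finally {
            executor.shutdown();
        }
    }
}
```

Both the executor's internal queue and the `futures` list hold one object per ID until the work is done, which is where the memory goes.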

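The code works well and is very simple, since I don't have to deal with threads, complicated waiting logic, etc. It has one downside: memory footprint.

All the still-queued Runnables consume memory (much more than the 8 bytes a long would need: these are Java classes of mine with some internal state), and all N Future instances consume memory too (these are also Java classes with state, which I only use for waiting; I don't need the actual results). I looked at a heap dump, and I estimate that N = 10 million will take a bit more than 1 GiB of memory. 10 million longs in an array would only consume 76 MiB.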
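As a quick check of the 76 MiB figure: a long takes 8 bytes, so 10 million of them need 80,000,000 bytes, which is about 76.3 MiB. The small helper below is a hypothetical illustration of that arithmetic; it ignores the few bytes of array header.

```java
final class FootprintEstimate {

    // Payload size of a long[] with the given number of elements, in MiB.
    static double longArrayMiB(long count) {
        return count * (double) Long.BYTES / (1024 * 1024);
    }
}
```

Calling `FootprintEstimate.longArrayMiB(10_000_000)` gives roughly 76.3.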

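Is there a way to solve this problem while keeping only the IDs in memory, preferably without resorting to low-level concurrent programming?

4 answers:

Answer 0: (score: 1)

Yes: you can have a shared queue of longs. You submit n Runnables to the executor, where n is the number of threads in the executor. At the end of its run method, each Runnable takes the next long from the queue and resubmits a new Runnable.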
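A minimal sketch of this idea follows. It makes a few assumptions that are not in the answer: the "shared queue" is a plain `long[]` with an `AtomicInteger` cursor (so no boxed `Long` objects are created), a `CountDownLatch` with one count per chain of tasks signals completion, and the names `SelfResubmittingTasks`, `runAll`, and `work` are invented for illustration.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.LongConsumer;

final class SelfResubmittingTasks {

    static void runAll(long[] ids, int threads, LongConsumer work) throws InterruptedException {
        ExecutorService executor = Executors.newFixedThreadPool(threads);
        AtomicInteger next = new AtomicInteger();                 // cursor into ids, acts as the shared queue
        CountDownLatch chainsDone = new CountDownLatch(threads);  // one count per chain of tasks
        try {
            // Only `threads` tasks are ever queued or running at the same time.
            for (int i = 0; i < threads; i++) {
                executor.execute(new Step(ids, next, work, executor, chainsDone));
            }
            chainsDone.await();
        } finally {
            executor.shutdown();
        }
    }

    private static final class Step implements Runnable {
        private final long[] ids;
        private final AtomicInteger next;
        private final LongConsumer work;
        private final ExecutorService executor;
        private final CountDownLatch chainsDone;

        Step(long[] ids, AtomicInteger next, LongConsumer work,
             ExecutorService executor, CountDownLatch chainsDone) {
            this.ids = ids;
            this.next = next;
            this.work = work;
            this.executor = executor;
            this.chainsDone = chainsDone;
        }

        @Override
        public void run() {
            int index = next.getAndIncrement();
            if (index >= ids.length) {
                chainsDone.countDown(); // no IDs left: this chain ends
                return;
            }
            try {
                work.accept(ids[index]);
            } finally {
                // Resubmit so the chain continues even if this ID's work failed.
                // The task holds no per-ID state, so the same instance is reused.
                executor.execute(this);
            }
        }
    }
}
```

With this shape, memory beyond the ID array is proportional to the number of threads rather than to N.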

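Answer 1: (score: 1)

Instead of creating millions of Runnables, create a dedicated thread pool that takes longs as its tasks. Instead of waiting for the tasks to finish with Future.get(), use a CountDownLatch.

Such a thread pool can be implemented like this: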

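import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;

// createNumber(k) and processNumber(id) are the application's own methods
int N = 1000000; // number of tasks
int T = 100;     // number of threads
CountDownLatch latch = new CountDownLatch(N);
// ArrayBlockingQueue needs a capacity; N lets every ID be queued before the workers start
ArrayBlockingQueue<Long> queue = new ArrayBlockingQueue<>(N);

for (int k = 0; k < N; k++) {
   queue.put(createNumber(k));
}
for (int k = 0; k < T; k++) {
  new WorkingThread().start();
}
latch.await();

class WorkingThread extends Thread {
  public void run() {
      Long id;
      // poll() returns null once the queue is drained, so no worker blocks forever in take()
      while ((id = queue.poll()) != null) {
          try {
              processNumber(id);
          } finally {
              // count down even if processing this ID threw
              latch.countDown();
          }
      }
  }
}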

Answer 2: (score: 1)

How about using an ExecutorCompletionService? Something like the following (it may contain bugs; I haven't tested it):

import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorCompletionService;
import java.util.function.LongFunction;

public class Foo {

  private final ExecutorCompletionService<Void> completionService;
  private final LongFunction<Runnable> taskCreator;
  private final long maxRunning; // max tasks running or queued

  public Foo(Executor executor, LongFunction<Runnable> taskCreator, long maxRunning) {
    this.completionService = new ExecutorCompletionService<>(executor);
    this.taskCreator = taskCreator;
    this.maxRunning = maxRunning;
  }

  public synchronized void processIds(long[] ids) throws InterruptedException {
    int completed = 0;

    int running = 0;
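    for (long id : ids) {
      if (running >= maxRunning) {
        // at the limit: wait for one task to finish before submitting the next
        completionService.take();
        running--;
        completed++;
      }
      // always submit the current id, so that none is skipped
      completionService.submit(taskCreator.apply(id), null);
      running++;
    }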

    while (completed < ids.length) {
      completionService.take();
      completed++;
    }

  }

}

Another version of the above could use a Semaphore and a CountDownLatch instead of a CompletionService:

public static void processIds(long[] ids, Executor executor,
                              int max, LongFunction<Runnable> taskSup) throws InterruptedException {
  CountDownLatch latch = new CountDownLatch(ids.length);
  Semaphore semaphore = new Semaphore(max);

  for (long id : ids) {
    semaphore.acquire();

    Runnable task = taskSup.apply(id);
    executor.execute(() -> {
      try {
        task.run();
      } finally {
        semaphore.release();
        latch.countDown();
      }
    });

  }

  latch.await();
}

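Answer 3: (score: 1)

This is the kind of thing I would usually do with a Producer/Consumer pattern and a BlockingQueue coordinating the two, or with Akka actors if I have them on the project.

But I figured I'd suggest something a bit different, relying on the behavior of Java's Streams.

The intuition is that the lazy execution of streams can be used to throttle the creation of work units, futures and their results.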

public static void main(String[] args) {
    // So we have a list of ids, I stream it
    // (note : if we have an iterator, you could group it by a batch of, say 100,
    // and then flat map each batch)
    LongStream ids = LongStream.range(0, 10_000_000L);
    // This is where the actual tasks will be dispatched
    ExecutorService executor = Executors.newFixedThreadPool(4);
    // For each id to compute, create a runnable, which I call "WorkUnit"
    Optional<Exception> error = ids.mapToObj(WorkUnit::new)
            // create a parallel stream
            // this allows the stream engine to launch the next instructions concurrently
            .parallel()
            // We dispatch ("parallely") the work units to a thread and have them execute
            .map(workUnit -> CompletableFuture.runAsync(workUnit, executor))
            // And then we wait for the unit of work to complete
            .map(future -> {
                try {
                    future.get();
                } catch (Exception e) {
                    // we do care about exceptions
                    return e;
                } finally {
                    System.out.println("Done with a work unit ");
                }
                // we do not care for the result
                return null;
            })
            // Keep exceptions on the stream
            .filter(Objects::nonNull)
            // Stop as soon as one is found
            .findFirst();
    executor.shutdown();
    System.out.println(error.isPresent());
}

To be quite honest, I'm not sure this behavior is guaranteed by the specification, but in my experience it works. Each of the parallel "chunks" grabs a few IDs and feeds them to the pipeline (map to a work unit, dispatch to the thread pool, wait for the result, filter for exceptions), which means that an equilibrium is reached fairly quickly, balancing the number of active work units against the capacity of the executor.

If you want to fine-tune the number of parallel "chunks", follow up here: Custom thread pool in Java 8 parallel stream
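The technique usually discussed under that title is to start the stream's terminal operation from inside a dedicated ForkJoinPool. The sketch below assumes that approach; the class and method names are invented for illustration. That the stream's tasks then run in this pool rather than in the common pool is observed behavior of current JDK implementations, not something the stream documentation promises.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ForkJoinPool;
import java.util.function.LongConsumer;
import java.util.stream.LongStream;

final class CustomPoolStream {

    // parallelism is the target number of stream worker threads.
    static void forEachInPool(long[] ids, int parallelism, LongConsumer work)
            throws InterruptedException, ExecutionException {
        ForkJoinPool pool = new ForkJoinPool(parallelism);
        try {
            // Running the terminal operation inside the pool makes the parallel stream
            // use this pool's workers (an implementation detail, not a documented guarantee).
            pool.submit(() -> LongStream.of(ids).parallel().forEach(work)).get();
        } finally {
            pool.shutdown();
        }
    }
}
```

If the work blocks inside the pool's threads, the pool may start extra threads to compensate, so the parallelism value is a target rather than a hard cap.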