Question

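I've been playing around with different strategies for thread pooling using ThreadPoolExecutor with JDK6. I have a priority queue working, but I wasn't sure I liked how the pool didn't resize after keepAliveTime (which is what you get with an unbounded queue). So I'm looking at a ThreadPoolExecutor using a LinkedBlockingQueue and the CallerRuns policy.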

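The issue I'm having now is that the pool ramps up, as the docs explain it should, but after the tasks complete and keepAliveTime comes into play, getPoolSize shows the pool being reduced to zero. The example code below should let you see the basis for my question: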

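import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.logging.Logger;

public class ThreadPoolingDemo {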
    private final static Logger LOGGER =
         Logger.getLogger(ThreadPoolingDemo.class.getName());

    public static void main(String[] args) throws Exception {
        LOGGER.info("MAIN THREAD:starting");
        runCallerTestPlain();   
    }

    private static void runCallerTestPlain() throws InterruptedException {
        //10 core threads, 
        //50 max pool size, 
        //100 tasks in queue, 
        //at max pool and full queue - caller runs task
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(10, 50,
            5L, TimeUnit.SECONDS, new LinkedBlockingQueue<Runnable>(100),
            new ThreadPoolExecutor.CallerRunsPolicy());

        //dump 5000 tasks on the queue
        for (int i = 0; i < 5000; i++) {
            tpe.submit(new Runnable() {
                @Override
                public void run() {
                    //just to eat some time and give a little feedback
                    for (int j = 0; j < 20; j++) {
                        LOGGER.info("First-batch Task, looping:" + j + "["
                               + Thread.currentThread().getId() + "]");
                    }
                }
            }, null);
        }
        LOGGER.info("MAIN THREAD:!!Done queueing!!");

        //check tpe statistics forever
        while (true) {
            LOGGER.info("Active count: " + tpe.getActiveCount() + " Pool size: "
                 + tpe.getPoolSize() + " Largest Pool: " + tpe.getLargestPoolSize());
            Thread.sleep(1000);
        }
    }
}

I found an old bug that seems to be this issue, but it was closed: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6458662. Could this still be present in 1.6, or am I missing something?

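Looks like I rubber-ducked this one (http://www.codinghorror.com/blog/2012/03/rubber-duck-problem-solving.html). The bug I linked above is related to this one: http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6576792, where the issue seems to be resolved in 1.7 (I loaded up 1.7 and verified that it is fixed). I guess my main complaint is that a bug this fundamental persisted for almost a decade. I spent too much time writing this up not to post it now; I hope it helps someone.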

Answer 1

... after the tasks complete and keepAliveTime comes into play, getPoolSize shows the pool being reduced to zero.

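So this looks to be a race condition in ThreadPoolExecutor. I guess it is working according to design, albeit not as expected. In the getTask() method, which the worker threads loop through to get tasks from the blocking queue, you see this code: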

if (state == SHUTDOWN)  // Help drain queue
    r = workQueue.poll();
else if (poolSize > corePoolSize || allowCoreThreadTimeOut)
    r = workQueue.poll(keepAliveTime, TimeUnit.NANOSECONDS);
else
    r = workQueue.take();
if (r != null)
    return r;
if (workerCanExit()) {
    if (runState >= SHUTDOWN) // Wake up others
        interruptIdleWorkers();
    return null;
}

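If poolSize grows above corePoolSize and the poll then times out after keepAliveTime, the code falls through to workerCanExit(), since r is null. All of the threads can get true back from that method, since it is just testing the state of poolSize: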

    mainLock.lock();
    boolean canExit;
    try {
        canExit = runState >= STOP ||
            workQueue.isEmpty() ||
            (allowCoreThreadTimeOut &&
             poolSize > Math.max(1, corePoolSize)); // << test poolSize here
    } finally {
        mainLock.unlock();                         // << race to workerDone() begins
    }

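Once that returns true, the worker thread exits, and only then is poolSize decremented. If all of the worker threads run that test at the same time, they will all exit, because of the race between the test of poolSize and the stopping of the worker, which is when --poolSize occurs.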
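To make the check-then-act gap concrete, here is a minimal sketch of the pattern described above. It is a simplified model, not the JDK source: the class and method names (ExitRaceSketch, racyExit, atomicExit) are made up for illustration, and a latch is used to force the worst case where every worker tests the count before anyone decrements it. The second method shows one common way to close such a gap, which is to make the test and the decrement a single compare-and-set step.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the exit race; not code from ThreadPoolExecutor.
final class ExitRaceSketch {

    /** Test first, decrement later: every worker decides using the same stale count. */
    static int racyExit(int workers, final int core) {
        final AtomicInteger poolSize = new AtomicInteger(workers);
        final CountDownLatch allChecked = new CountDownLatch(workers);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    boolean canExit = poolSize.get() > core; // the test
                    allChecked.countDown();
                    awaitQuietly(allChecked); // worst case: nobody decrements until all have tested
                    if (canExit) {
                        poolSize.decrementAndGet(); // the act, too late to re-check
                    }
                }
            });
            threads[i].start();
        }
        joinAll(threads);
        return poolSize.get();
    }

    /** Test and decrement in one atomic step: a worker exits only if it wins the CAS. */
    static int atomicExit(int workers, final int core) {
        final AtomicInteger poolSize = new AtomicInteger(workers);
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (;;) {
                        int current = poolSize.get();
                        if (current <= core) {
                            return; // at or below core: this worker stays
                        }
                        if (poolSize.compareAndSet(current, current - 1)) {
                            return; // claimed one exit slot
                        }
                        // lost the CAS to another worker: re-read and re-test
                    }
                }
            });
            threads[i].start();
        }
        joinAll(threads);
        return poolSize.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    private static void joinAll(Thread[] threads) {
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("racy   -> " + racyExit(5, 2));
        System.out.println("atomic -> " + atomicExit(5, 2));
    }
}
```

With 5 workers and a core size of 2, the racy variant ends with a count of 0 because all five saw "5 > 2" before any decrement happened, while the compare-and-set variant stops at 2.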

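What surprises me is how consistent this race condition is. If you add some randomization to the sleep() inside run() in the test below, you can get some core threads not to quit, but I would have thought the race condition would be harder to hit.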

You can see this behavior in the following test:

@Test
public void test() throws Exception {
    int before = Thread.activeCount();
    int core = 10;
    int max = 50;
    int queueSize = 100;
    ThreadPoolExecutor tpe =
            new ThreadPoolExecutor(core, max, 1L, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>(queueSize),
                    new ThreadPoolExecutor.CallerRunsPolicy());
    tpe.allowCoreThreadTimeOut(false);
    assertEquals(0, tpe.getActiveCount());
    // if we start 1 more than can go into core or queue, poolSize goes to 0
    int startN = core + queueSize + 1;
    // if we only start jobs the core can take care of, then it won't go to 0
    // int startN = core + queueSize;
    for (int i = 0; i < startN; i++) {
        tpe.submit(new Runnable() {
            @Override
            public void run() {
                try {
                    Thread.sleep(100);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                }
            }
        });
    }
    while (true) {
        System.out.println("active = " + tpe.getActiveCount() + ", poolSize = " + tpe.getPoolSize()
                + ", largest = " + tpe.getLargestPoolSize() + ", threads = " + (Thread.activeCount() - before));
        Thread.sleep(1000);
    }
}

If you change the sleep line inside the run() method to something like this:

private final Random random = new Random();
...
    Thread.sleep(100 + random.nextInt(100));

This makes the race condition harder to hit, so most likely some core threads will still be around.
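If you want a check that finishes instead of looping forever, something along these lines could be used to see what a given JVM does. This is only a sketch: PoolSettleCheck and settledPoolSize are hypothetical names, and the sizes and timings are arbitrary small values chosen to keep the run short. Going by the discussion above, a JDK 6 runtime affected by the race would be expected to report fewer threads than the core size, while 1.7 and later should settle at the core size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Future;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the names and timings are illustrative assumptions.
final class PoolSettleCheck {

    /**
     * Submits one task more than core + queue can hold so the pool grows past
     * core, waits for the work and several keep-alive periods to pass, and
     * returns the pool size that remains.
     */
    static int settledPoolSize(int core, int max, int queueSize, long keepAliveMillis) {
        ThreadPoolExecutor tpe = new ThreadPoolExecutor(core, max,
                keepAliveMillis, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>(queueSize),
                new ThreadPoolExecutor.CallerRunsPolicy());
        try {
            List<Future<?>> futures = new ArrayList<Future<?>>();
            int startN = core + queueSize + 1;
            for (int i = 0; i < startN; i++) {
                futures.add(tpe.submit(new Runnable() {
                    @Override
                    public void run() {
                        try {
                            Thread.sleep(50);
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    }
                }));
            }
            // Wait until every task has finished.
            for (Future<?> f : futures) {
                f.get();
            }
            // Give idle workers several keep-alive periods to time out.
            Thread.sleep(keepAliveMillis * 5);
            return tpe.getPoolSize();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            // Stop the remaining workers so the JVM can exit.
            tpe.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("settled pool size = " + settledPoolSize(2, 4, 2, 100L));
    }
}
```

Comparing the returned value against the core size you passed in gives a quick yes/no answer for the runtime you are on.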

Why does ThreadPoolExecutor reduce threads below corePoolSize after keepAliveTime?
