Question

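After a network outage, one of our applications was left with hundreds of thread pool threads sitting idle with the following call stack: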

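BTW, before the outage there were only a few dozen threads. After the outage the thread count climbed to about 400 within a very short time and stayed at that number for a long time, until we restarted the server.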

00000020`a864fc58 00007fff`d4ea1118 ntdll!NtWaitForSingleObject+0xa
00000020`a864fc60 00007fff`ce50ce66 KERNELBASE!WaitForSingleObjectEx+0x94
00000020`a864fd00 00007fff`ce50d247 clr!CLRSemaphore::Wait+0x8a
00000020`a864fdc0 00007fff`ce50d330 clr!ThreadpoolMgr::UnfairSemaphore::Wait+0x109
00000020`a864fe00 00007fff`ce5de8b6 clr!ThreadpoolMgr::WorkerThreadStart+0x1b9
00000020`a864fea0 00007fff`d60613d2 clr!Thread::intermediateThreadProc+0x7d
00000020`a864fee0 00007fff`d7be5454 kernel32!BaseThreadInitThunk+0x22
00000020`a864ff10 00000000`00000000 ntdll!RtlUserThreadStart+0x34

1. Why did the thread count grow to several hundred after the network failure? What are the possible causes?

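Also, as I understand it (and as the source code below suggests, see WorkerSemaphore->Wait ...), a worker thread should exit after 20 seconds if no work is assigned to it. Judging from the call stack, these threads are simply waiting for work and have nothing assigned, yet they are never destroyed. After checking the CoreCLR source, I noticed that a thread is not destroyed even after the 20-second timeout if it has I/O pending.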

2. How can I use WinDbg on a user-mode dump to check whether these threads have I/O pending?

https://github.com/dotnet/coreclr/blob/master/src/vm/win32threadpool.cpp

RetryWaitForWork:
if (!WorkerSemaphore->Wait(AppX::IsAppXProcess() ? WorkerTimeoutAppX : WorkerTimeout))
{
    if (!IsIoPending())
    {      
        DangerousNonHostedSpinLockHolder tal(&ThreadAdjustmentLock);

        counts = WorkerCounter.GetCleanCounts();
        while (true)
        {
            if (counts.NumActive == counts.NumWorking)
            {
                goto RetryWaitForWork;
            }

            newCounts = counts;
            newCounts.NumActive--;

            // if we timed out while active, then Hill Climbing needs to be told that we need fewer threads
            newCounts.MaxWorking = max(MinLimitTotalWorkerThreads, min(newCounts.NumActive, newCounts.MaxWorking));

            oldCounts = WorkerCounter.CompareExchangeCounts(newCounts, counts);

            if (oldCounts == counts)
            {
                HillClimbingInstance.ForceChange(newCounts.MaxWorking, ThreadTimedOut);
                goto Exit;
            }

            counts = oldCounts;
        }
    }
    else
    {
        goto RetryWaitForWork;
    }
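To make the branch I am asking about easier to follow, here is a minimal sketch of the decision the excerpt above appears to make when the semaphore wait times out. It is a simplified model and not the CoreCLR implementation: there is no lock, no compare-exchange loop and no hill climbing, and the names `Counts`, `TimeoutAction` and `on_wait_timeout` are hypothetical. The `io_pending` parameter stands in for whatever `IsIoPending()` returns for the thread.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the thread pool counters.
struct Counts {
    int num_active;   // worker threads that currently exist (working or waiting)
    int num_working;  // worker threads counted as executing work
};

// What a worker does after its wait for work times out.
enum class TimeoutAction {
    KeepWaiting,  // models "goto RetryWaitForWork"
    Retire        // models "goto Exit"
};

// Models only the timeout branch of the excerpt.
// io_pending is an assumption standing in for IsIoPending().
TimeoutAction on_wait_timeout(Counts& counts, bool io_pending) {
    assert(counts.num_active >= counts.num_working);

    // A thread with I/O pending never retires; it waits again.
    if (io_pending) {
        return TimeoutAction::KeepWaiting;
    }

    // The excerpt also retries when every active thread is counted as working.
    if (counts.num_active == counts.num_working) {
        return TimeoutAction::KeepWaiting;
    }

    // Otherwise the thread removes itself from the active count and exits.
    counts.num_active--;
    return TimeoutAction::Retire;
}
```

In this model, a thread with `io_pending == true` returns `KeepWaiting` on every timeout, no matter how idle the pool is. That would match about 400 threads staying parked in `UnfairSemaphore::Wait`, which is why I suspect pending I/O.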

By the way, these threads were all created when the network failure occurred, and according to !runaway 0x4 they have existed for more than 10 hours.

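I just need a way to figure out why these threads are never destroyed even though no work has been assigned to them within 20 seconds. The only reason I can come up with is that these threads have I/O pending, but how can I prove it?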

How can I prove that a thread pool worker thread has I/O pending?

0 answers: