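I have been facing a huge problem for a few weeks. I have an ASP.NET application hosted under IIS7 (W2008 SP1), and every few hours it starts consuming 50% of the CPU, possibly with no users connected. Some background load would be understandable, since we use Quartz.net to run scheduled work inside the application, but we have not been able to reproduce the problem.
Here is a trace made with JetBrains dotTrace 3.1 while the CPU was high: http://mycenter.info/tmp/DotTraceSnapshot.zip
Usually the process burning CPU is w3wp.exe, but over the past few days sqlserver (2008) and memcached (1.2.1, updated to 1.2.4 beta on Monday) have also been killing the CPU. The strange part is that memcached sometimes starts consuming 100% while its stats show it is idle, yet it still responds correctly to requests.
Here is a crash dump (or stack trace dump) of w3wp, taken with WinDbg (following this guide: http://blogs.technet.com/marcelofartura/archive/2006/09/15/troubleshooting-iis-100-cpu-issues-step-by-step-intermediary.aspx):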
0:000> ~
. 0 Id: 1be4.1d3c Suspend: 1 Teb: 7ffdf000 Unfrozen
1 Id: 1be4.b1c Suspend: 1 Teb: 7ffde000 Unfrozen
2 Id: 1be4.12a0 Suspend: 1 Teb: 7ffdd000 Unfrozen
3 Id: 1be4.19d0 Suspend: 1 Teb: 7ffdc000 Unfrozen
4 Id: 1be4.1714 Suspend: 1 Teb: 7ffd7000 Unfrozen
5 Id: 1be4.1a18 Suspend: 1 Teb: 7ffd6000 Unfrozen
6 Id: 1be4.12ac Suspend: 1 Teb: 7ffd5000 Unfrozen
7 Id: 1be4.dec Suspend: 1 Teb: 7ffd4000 Unfrozen
8 Id: 1be4.1e48 Suspend: 1 Teb: 7ffd8000 Unfrozen
9 Id: 1be4.1ca8 Suspend: 1 Teb: 7ffd3000 Unfrozen
10 Id: 1be4.1508 Suspend: 1 Teb: 7ffaf000 Unfrozen
11 Id: 1be4.1bc0 Suspend: 1 Teb: 7ffae000 Unfrozen
12 Id: 1be4.1f48 Suspend: 1 Teb: 7ffad000 Unfrozen
13 Id: 1be4.1994 Suspend: 1 Teb: 7ffac000 Unfrozen
14 Id: 1be4.1a48 Suspend: 1 Teb: 7ffab000 Unfrozen
15 Id: 1be4.12c8 Suspend: 1 Teb: 7ffa8000 Unfrozen
16 Id: 1be4.e44 Suspend: 1 Teb: 7ffa7000 Unfrozen
17 Id: 1be4.19e0 Suspend: 1 Teb: 7ffa6000 Unfrozen
18 Id: 1be4.19b0 Suspend: 1 Teb: 7ffa2000 Unfrozen
19 Id: 1be4.1b30 Suspend: 1 Teb: 7ffd9000 Unfrozen
20 Id: 1be4.1bfc Suspend: 1 Teb: 7ffa3000 Unfrozen
21 Id: 1be4.1be8 Suspend: 1 Teb: 7ffa1000 Unfrozen
22 Id: 1be4.1a54 Suspend: 1 Teb: 7ffa5000 Unfrozen
23 Id: 1be4.b74 Suspend: 1 Teb: 7ff3d000 Unfrozen
24 Id: 1be4.19b4 Suspend: 1 Teb: 7ff3c000 Unfrozen
25 Id: 1be4.1460 Suspend: 1 Teb: 7ffdb000 Unfrozen
26 Id: 1be4.1eac Suspend: 1 Teb: 7ffaa000 Unfrozen
27 Id: 1be4.1b90 Suspend: 1 Teb: 7ffa4000 Unfrozen
0:023> #23s
Search address set to 77dc9a94
*** WARNING: Unable to verify checksum for SMDiagnostics.ni.dll
*** WARNING: Unable to verify checksum for System.Data.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for Microsoft.Web.Services3.DLL
*** WARNING: Unable to verify checksum for System.Windows.Forms.ni.dll
*** WARNING: Unable to verify checksum for System.Web.ni.dll
*** WARNING: Unable to verify checksum for Ademy.UI.Web.DLL
*** ERROR: Module load completed but symbols could not be loaded for AjaxControlToolkit.DLL
*** ERROR: Module load completed but symbols could not be loaded for 7zSharp.DLL
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for Iesi.Collections.DLL
*** WARNING: Unable to verify checksum for System.Design.ni.dll
*** WARNING: Unable to verify checksum for System.Core.ni.dll
*** WARNING: Unable to verify checksum for Ademy.Event.DLL
*** WARNING: Unable to verify checksum for System.ServiceModel.ni.dll
*** ERROR: Module load completed but symbols could not be loaded for System.ServiceModel.ni.dll
*** WARNING: Unable to verify checksum for App_Theme_Ocean.wgubmrqt.dll
*** WARNING: Unable to verify checksum for NHibernate.Burrow.AppBlock.DLL
*** ERROR: Module load completed but symbols could not be loaded for NHibernate.Burrow.AppBlock.DLL
*** WARNING: Unable to verify checksum for NHibernate.Caches.SysCache2.DLL
*** ERROR: Module load completed but symbols could not be loaded for NHibernate.Caches.SysCache2.DLL
*** WARNING: Unable to verify checksum for Ademy.UI.Web.Controls.DLL
*** WARNING: Unable to verify checksum for Microsoft.JScript.ni.dll
*** WARNING: Unable to verify checksum for System.Web.Mobile.ni.dll
*** WARNING: Unable to verify checksum for System.Runtime.Serialization.ni.dll
^ Memory access error in '#23s'
0:023> kb
ChildEBP RetAddr Args to Child
11c6ede4 77dc8ed4 766bc622 0000038c 00000000 ntdll!KiFastSystemCallRet
11c6ede8 766bc622 0000038c 00000000 11c6ee20 ntdll!NtSetEvent+0xc
11c6edf8 011011ef 0000038c 7f52be6e 0fda4888 kernel32!SetEvent+0x10
WARNING: Frame IP not in any known module. Following frames may be wrong.
11c6ee20 71b26ffe 060c5f9c 010039b0 010628a0 0x11011ef
*** WARNING: Unable to verify checksum for System.ni.dll
11c6ee4c 712c4b14 02528958 060c5f9c 11c6ee94 mscorlib_ni+0x216ffe
11c6ee5c 712c4abe 060c5fb0 02528958 060c600c System_ni+0x144b14
11c6ee94 71679260 060c5d24 7167926d 060c5d24 System_ni+0x144abe
11c6eec8 717d8373 060c5d24 11c6f3e8 712c4ce4 System_ni+0x4f9260
11c6ef14 712c4ce4 00000000 02528930 11c6ef74 System_ni+0x658373
11c6ef54 7129dbcb 098b6ac4 11c6efec 72f7eff8 System_ni+0x144ce4
11c6efa4 71b26d66 02df349c 11c6efc0 71b45681 System_ni+0x11dbcb
11c6efb0 71b45681 00000000 0dcfd2d8 11c6efd0 mscorlib_ni+0x216d66
11c6efc0 72f11b4c 766b45f1 00000000 11c6f050 mscorlib_ni+0x235681
11c6efd0 72f221f9 11c6f0a0 00000000 11c6f070 mscorwks!CallDescrWorker+0x33
11c6f050 72f36571 11c6f0a0 00000000 11c6f070 mscorwks!CallDescrWorkerWithHandler+0xa3
11c6f194 72f365a4 71a91ff0 11c6f2c8 11c6f1e8 mscorwks!MethodDesc::CallDescr+0x19c
11c6f1b0 72f365c2 71a91ff0 11c6f2c8 11c6f1e8 mscorwks!MethodDesc::CallTargetWorker+0x1f
11c6f1c8 7302a471 11c6f1e8 68e9b644 0dcfd2d8 mscorwks!MethodDescCallSite::CallWithValueTypes+0x1a
11c6f394 7302a5c6 11c6f424 68e9b194 02df34e4 mscorwks!ExecuteCodeWithGuaranteedCleanupHelper+0x9f
11c6f444 71b45577 11c6f3e8 02df17d0 01c177f8 mscorwks!ReflectionInvocation::ExecuteCodeWithGuaranteedCleanup+0x10f
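Thanks in advance for any tips!!
Update
Here is the managed stack of the hung thread. I think it looks like the memcached provider, but I am still not sure what to do about it.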
0:023> !clrstack
OS Thread Id: 0xb74 (23)
ESP EIP
11c6ee38 77dc9a94 [NDirectMethodFrameStandaloneCleanup: 11c6ee38] Microsoft.Win32.Win32Native.SetEvent(Microsoft.Win32.SafeHandles.SafeWaitHandle)
11c6ee48 71b26ffe System.Threading.EventWaitHandle.Set()
11c6ee54 712c4b14 System.Net.TimerThread.Prod()
11c6ee64 712c4abe System.Net.TimerThread+TimerQueue.CreateTimer(Callback, System.Object)
11c6eea0 71679260 System.Net.ConnectionPool.CleanupCallbackWrapper(Timer, Int32, System.Object)
11c6eed4 717d8373 System.Net.TimerThread+TimerNode.Fire()
11c6ef1c 712c4ce4 System.Net.TimerThread+TimerQueue.Fire(Int32 ByRef)
11c6ef5c 7129dbcb System.Net.TimerThread.ThreadProc()
11c6efac 71b26d66 System.Threading.ThreadHelper.ThreadStart_Context(System.Object)
11c6efb8 71b45681 System.Threading.ExecutionContext.runTryCode(System.Object)
11c6f3e8 72f11b4c [HelperMethodFrame_PROTECTOBJ: 11c6f3e8] System.Runtime.CompilerServices.RuntimeHelpers.ExecuteCodeWithGuaranteedCleanup(TryCode, CleanupCode, System.Object)
11c6f450 71b45577 System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
11c6f46c 71b301c5 System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
11c6f484 71b26ce4 System.Threading.ThreadHelper.ThreadStart()
11c6f6b0 72f11b4c [GCFrame: 11c6f6b0]
11c6f9a0 72f11b4c [ContextTransitionFrame: 11c6f9a0]
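Solution:
This was caused by a bug in memcached 1.2.1 for Win32 when running on Windows 2008. I updated to v1.2.6 and everything works fine. I think I was seeing the load in the w3wp process because the library I use to connect to memcached has a loop that hangs even while memcached is still responding.
Solution 2 found:
If the first solution does not work, read this post. I think the memcached fix only hid the real problem, which is a bug in SmtpClient.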
Answer 0 (score: 2)
In WinDbg, issue:
~*e !clrstack
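This dumps the managed stack of every thread and should give you an idea of what is going on in the process.
Also try !runaway, which shows how much CPU time each thread has consumed. Focus on the stacks of the threads at the top of that list, since those have been running the longest.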
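Answer 1 (score: 0)
Could this be caused by a caching problem? For example, have you set up a cached dataset to reload automatically from the database when it expires?
We had this happen once. We had a large dataset that we wanted to be always available. The data did not change often, so we gave it a 1-hour expiration in the cache and handled the removal in our global.asax (as described here). We did not notice that this reloaded the dataset into the cache every time the hour elapsed, which caused high CPU and high database usage every hour.
Edit - added
Needless to say, we spotted this quickly and learned from our mistake.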
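The pattern described in this answer can be sketched with a small simulation. This is only an illustration in Python, not the ASP.NET cache API: `TtlCache`, `on_removed`, `sweep`, `get_or_load`, and `count_loads` are hypothetical names made up for the example, the clock is simulated, and the "database query" is just a counter. It compares reloading inside the removal callback with reloading lazily on the next request, during hours with no users at all.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TtlCache:
    """Tiny in-memory cache with per-entry expiry and a removal callback (hypothetical)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: Dict[str, Tuple[Any, float]] = {}
        self.on_removed: Optional[Callable[[str], None]] = None

    def set(self, key: str, value: Any, ttl: float) -> None:
        self._items[key] = (value, self._clock() + ttl)

    def sweep(self) -> None:
        """Drop expired entries and fire the removal callback for each one."""
        now = self._clock()
        expired = [k for k, (_, deadline) in self._items.items() if deadline <= now]
        for key in expired:
            del self._items[key]
            if self.on_removed is not None:
                self.on_removed(key)

    def get_or_load(self, key: str, loader: Callable[[], Any], ttl: float) -> Any:
        """Lazy pattern: load only when a caller actually asks for the value."""
        self.sweep()
        if key in self._items:
            return self._items[key][0]
        value = loader()
        self.set(key, value, ttl)
        return value


def count_loads(eager: bool, idle_hours: int) -> int:
    """Simulate idle_hours hours without requests; return how often the dataset was loaded."""
    now = [0.0]
    loads = [0]
    cache = TtlCache(clock=lambda: now[0])

    def load_dataset() -> str:
        loads[0] += 1  # stands in for the expensive database query
        return "dataset"

    if eager:
        # The mistake from the answer: reload inside the removal handler.
        cache.on_removed = lambda key: cache.set(key, load_dataset(), ttl=3600)

    cache.get_or_load("dataset", load_dataset, ttl=3600)  # one real request warms the cache
    for _ in range(idle_hours):
        now[0] += 3600
        cache.sweep()  # expiry processing still runs with no users connected
    return loads[0]


if __name__ == "__main__":
    print("eager reload:", count_loads(eager=True, idle_hours=5))
    print("lazy reload: ", count_loads(eager=False, idle_hours=5))
```

With the reload in the removal callback, the expensive load runs once per expiry even when nobody is using the site; with the lazy variant it runs only when a request finds the entry missing. Whether that trade-off is acceptable depends on how costly a cold first request is for your users.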