GC似乎导致java服务器应用程序无响应

时间:2010-10-01 13:20:21

标签: java garbage-collection

我们在12核IBM机器上的RedHat Linux 5.2上运行了JBoss 4.2.3中的webapp。最近我们看到了很长的响应时间,这似乎是GC活动造成的。模式或多或少如下: 所有请求似乎都以正常速率发送,但是在突然所有响应都被“刷新”之前不会发送任何字节。 同时gc.log具有以下模式:

// Here the unresponsiveness starts
36788.309: [GC [1 CMS-initial-mark: 1919297K(4456448K)] 1969797K(5111808K), 0.1473460 secs] [Times: user=0.15 sys=0.00, real=0.15 secs]
Total time for which application threads were stopped: 0.1530790 seconds
36788.457: [CMS-concurrent-mark-start]
36793.886: [CMS-concurrent-mark: 5.429/5.429 secs] (CMS-concurrent-mark yielded 46 times)
 [Times: user=6.69 sys=0.07, real=5.43 secs]
36793.886: [CMS-concurrent-preclean-start]
 (cardTable: 15956 cards, re-scanned 15956 cards, 1 iterations)
36793.988: [CMS-concurrent-preclean: 0.089/0.101 secs] (CMS-concurrent-preclean yielded 9 times)
 [Times: user=0.12 sys=0.01, real=0.10 secs]
36793.988: [CMS-concurrent-abortable-preclean-start]
 (cardTable: 75 cards, re-scanned 75 cards, 1 iterations)
Application time: 5.6923460 seconds
Total time for which application threads were stopped: 0.0170000 seconds
 (cardTable: 96 cards, re-scanned 96 cards, 1 iterations)
 (cardTable: 207 cards, re-scanned 207 cards, 1 iterations)
 (cardTable: 45 cards, re-scanned 45 cards, 1 iterations)
 (cardTable: 157 cards, re-scanned 157 cards, 1 iterations)
 (cardTable: 88 cards, re-scanned 88 cards, 1 iterations)
 (cardTable: 213 cards, re-scanned 213 cards, 1 iterations)
 (cardTable: 84 cards, re-scanned 84 cards, 1 iterations)
 (cardTable: 143 cards, re-scanned 143 cards, 1 iterations)
 (cardTable: 40 cards, re-scanned 40 cards, 1 iterations)
 (cardTable: 109 cards, re-scanned 109 cards, 1 iterations)
 (cardTable: 77 cards, re-scanned 77 cards, 1 iterations)
 (cardTable: 156 cards, re-scanned 156 cards, 1 iterations)
 (cardTable: 132 cards, re-scanned 132 cards, 1 iterations)
 (cardTable: 119 cards, re-scanned 119 cards, 1 iterations)
 (cardTable: 115 cards, re-scanned 115 cards, 1 iterations)
 (cardTable: 195 cards, re-scanned 195 cards, 1 iterations)
 (cardTable: 163 cards, re-scanned 163 cards, 1 iterations)
 (cardTable: 166 cards, re-scanned 166 cards, 1 iterations)
 (cardTable: 149 cards, re-scanned 149 cards, 1 iterations)
 (cardTable: 218 cards, re-scanned 218 cards, 1 iterations)
 (cardTable: 46 cards, re-scanned 46 cards, 1 iterations)
 (cardTable: 57 cards, re-scanned 57 cards, 1 iterations)
 (cardTable: 213 cards, re-scanned 213 cards, 1 iterations)
 (cardTable: 189 cards, re-scanned 189 cards, 1 iterations)
 (cardTable: 104 cards, re-scanned 104 cards, 1 iterations)
 (cardTable: 60 cards, re-scanned 60 cards, 1 iterations)
 (cardTable: 368 cards, re-scanned 368 cards, 1 iterations)
 (cardTable: 118 cards, re-scanned 118 cards, 1 iterations)
 (cardTable: 129 cards, re-scanned 129 cards, 1 iterations)
 (cardTable: 291 cards, re-scanned 291 cards, 1 iterations)
 (cardTable: 105 cards, re-scanned 105 cards, 1 iterations)
 (cardTable: 63 cards, re-scanned 63 cards, 1 iterations)
 (cardTable: 73 cards, re-scanned 73 cards, 1 iterations)
 (cardTable: 165 cards, re-scanned 165 cards, 1 iterations)
 (cardTable: 69 cards, re-scanned 69 cards, 1 iterations)
 (cardTable: 157 cards, re-scanned 157 cards, 1 iterations)
 CMS: abort preclean due to time  [37 iterations, 13 waits, 4954 cards)] 36799.161: [CMS-concurrent-abortable-preclean: 2.965/5.173 secs] (CMS-concurrent-abortable-preclean yielded 166 times)
 [Times: user=5.12 sys=0.10, real=5.17 secs]
Application time: 4.9945030 seconds
36799.172: [GC[YG occupancy: 572074 K (655360 K)]36799.172: [GC 36799.173: [ParNew
Desired survivor size 67108864 bytes, new threshold 2 (max 15)
- age   1:   59654872 bytes,   59654872 total
- age   2:   26827232 bytes,   86482104 total
- age   3:    2613360 bytes,   89095464 total
- age   4:     929600 bytes,   90025064 total
- age   5:     893176 bytes,   90918240 total
- age   6:     795736 bytes,   91713976 total
- age   7:     809240 bytes,   92523216 total
- age   8:     936216 bytes,   93459432 total
- age   9:     971376 bytes,   94430808 total
- age  10:     624472 bytes,   95055280 total
- age  11:     883616 bytes,   95938896 total
- age  12:     817296 bytes,   96756192 total
- age  13:     977992 bytes,   97734184 total
- age  14:     733816 bytes,   98468000 total
- age  15:     654744 bytes,   99122744 total
: 572074K->110988K(655360K), 0.1855140 secs] 2491372K->2031059K(5111808K) icms_dc=0 , 0.1866380 secs] [Times: user=0.72 sys=0.00, real=0.19 secs]
36799.359: [Rescan (parallel)  (Survivor:12chunks) Finished young gen rescan work in 3th thread: 0.032 sec
Finished young gen rescan work in 2th thread: 0.035 sec
Finished young gen rescan work in 0th thread: 0.035 sec
Finished young gen rescan work in 1th thread: 0.035 sec
Finished remaining root rescan work in 1th thread: 0.030 sec
Finished remaining root rescan work in 3th thread: 0.033 sec
Finished remaining root rescan work in 2th thread: 0.032 sec
Finished remaining root rescan work in 0th thread: 0.042 sec
Finished dirty card rescan work in 2th thread: 0.029 sec
Finished dirty card rescan work in 3th thread: 0.031 sec
Finished dirty card rescan work in 0th thread: 0.019 sec
Finished dirty card rescan work in 1th thread: 0.031 sec
Finished work stealing in 3th thread: 0.000 sec
Finished work stealing in 0th thread: 0.000 sec
Finished work stealing in 2th thread: 0.000 sec
Finished work stealing in 1th thread: 0.000 sec
, 0.0959640 secs]36799.455: [weak refs processing, 0.0121220 secs] [1 CMS-remark: 1920070K(4456448K)] 2031059K(5111808K), 0.2952350 secs] [Times: user=1.11 sys=0.00, real=0.30 secs]
Total time for which application threads were stopped: 0.3066430 seconds
36799.468: [CMS-concurrent-sweep-start]
36805.824: [CMS-concurrent-sweep: 6.355/6.356 secs] (CMS-concurrent-sweep yielded 515 times)
 [Times: user=8.76 sys=0.24, real=6.35 secs]
36805.824: [CMS-concurrent-reset-start]
36805.903: [CMS-concurrent-reset: 0.079/0.079 secs] (CMS-concurrent-reset yielded 0 times)
 [Times: user=0.64 sys=0.01, real=0.07 secs]
Application time: 6.5590510 seconds
// About here the server starts sending out responses
36806.044: [GC 36806.045: [ParNew
Desired survivor size 67108864 bytes, new threshold 2 (max 15)
- age   1:   55456640 bytes,   55456640 total
- age   2:   54278984 bytes,  109735624 total
: 635276K->131072K(655360K), 0.3230430 secs] 2554507K->2089990K(5111808K) icms_dc=0 , 0.3242750 secs] [Times: user=1.02 sys=0.02, real=0.33 secs]
Total time for which application threads were stopped: 0.3422430 seconds
Application time: 0.7628480 seconds

我们尝试了各种GC设置,但我们总是或多或少地这样做。 我们目前的参数:

-Xms4096m -Xmx5120m -XX:MaxPermSize=384m
-XX:+UseConcMarkSweepGC -XX:+ExplicitGCInvokesConcurrent 
-XX:+PrintClassHistogram -XX:+PrintGCApplicationConcurrentTime 
-XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails 
-XX:ReservedCodeCacheSize=64m -XX:NewSize=768m -XX:SurvivorRatio=4 
-XX:+CMSParallelRemarkEnabled -XX:+CMSScavengeBeforeRemark -XX:+CMSIncrementalMode 
-XX:+CMSIncrementalPacing -XX:CMSIncrementalDutyCycleMin=0 -XX:CMSIncrementalDutyCycle=10 
-XX:+UseTLAB -XX:ParallelGCThreads=4 
-XX:+UseParNewGC -XX:MaxGCPauseMillis=1000 -XX:GCTimeRatio=10 
-XX:+CMSPermGenSweepingEnabled 
-XX:+PrintCMSInitiationStatistics -XX:PrintCMSStatistics=1 -XX:+PrintGCTaskTimeStamps 
-XX:+PrintGCTimeStamps 
-verbose:gc -XX:+TraceClassLoading -XX:+TraceClassUnloading 
-XX:+PrintTenuringDistribution 

我们怎样才能避免无反应?

2 个答案:

答案 0 :(得分:1)

GC日志包括以下行:

  

应用程序线程停止的总时间:...秒

除非我弄错了,否则它们加起来不到1秒钟。如果您的网络应用程序在任何相当长的时间内没有响应,则不会这样做。

我会建议原因可能是其他原因。一种可能性是您正在推动机器的限制,并且运行GC的额外负载导致请求积压。例如,由于您已经配置了4个并行GC线程,因此在GC运行时(16秒?),理论上最大请求率会下降33%。这可能会导致重要的请求积压。

第二种可能性是你的机器物理内存不足,GC正在将其推入虚拟内存抖动的温和情况。

答案 1 :(得分:0)

添加问题作为答案,因为我还没有代表添加评论。你在运行什么JRE?