Java GC CMS暂停时间随时间增加

时间:2018-07-20 21:22:52

标签: java garbage-collection

我们有一个问题,大约在8小时的间隔内..我们的节点的GC暂停时间增加了,最终它获得了最长的暂停时间〜0.8-1.2秒(完整的gc)..然后又下降了大约需要0.1秒,或者恢复到8小时。..

我们在2个核心盒和4个核心盒上始终观察到的大约8小时的波动模式,这4个核心盒的平均停顿趋势通常较小,但峰值相同,即对我们的2个核心同样严重盒。

很明显,当我们的GC正在运行时,每次运行都做得不够,因为我们继续看到堆/暂停时间超过8小时。.所以我的问题是我们如何才能强制GC周期更有效。而不是等到它需要很长的整个gc周期。 (我想将堆切一半。.将迫使它运行更多..希望导致“一半”的暂停时间增加。.但这并不理想)

(PS,我们正在Java 8,4GB盒子上运行这些jvm)。 创建的对象通常是短暂的(系统本质上表示队列中有很多对象进入,很多对象离开。通常在几秒钟之内。它只是恒定的对象流,CPU使用率很低)。

我们的GC参数

-XX:+CMSClassUnloadingEnabled
-XX:+UseThreadPriorities
-XX:ThreadPriorityPolicy=42
-Xms3000M
-Xmx3000M
-Xmn1500M
-XX:+HeapDumpOnOutOfMemoryError
-Xss256k
-XX:StringTableSize=1000003
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:+CMSParallelRemarkEnabled
-XX:SurvivorRatio=8
-XX:MaxTenuringThreshold=1
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+UseTLAB
-XX:+PerfDisableSharedMem
-XX:CompileCommandFile=./../conf/hotspot_compiler
-XX:CMSWaitDuration=10000
-XX:+CMSParallelInitialMarkEnabled
-XX:+CMSEdenChunksRecordAlways
-XX:CMSWaitDuration=10000
-XX:+UseCondCardMark
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintHeapAtGC
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintPromotionFailure
-Xloggc:/logs/gc.log
-XX:+UseGCLogFileRotation
-XX:NumberOfGCLogFiles=5
-XX:GCLogFileSize=100M

2个事件的GC日志片段显示了较长的gc(以及cms-remark上的较长的擦洗字符串表),第二个事件我们也看到了我认为非常高的类卸载时间..(我们不做任何事情动态加载类)。

2018-07-17T06:20:43.738+0000: 33759.362: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1154831K(1536000K)] 1165177K(2918400K), 0.0048510 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.743+0000: 33759.367: Total time for which application threads were stopped: 0.0082435 seconds
2018-07-17T06:20:43.743+0000: 33759.367: [CMS-concurrent-mark-start]
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-mark: 0.093/0.093 secs] [Times: user=0.11 sys=0.00, real=0.09 secs] 
2018-07-17T06:20:43.836+0000: 33759.460: [CMS-concurrent-preclean-start]
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-17T06:20:43.844+0000: 33759.468: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2018-07-17T06:20:48.887+0000: 33764.511: [CMS-concurrent-abortable-preclean: 4.699/5.043 secs] [Times: user=5.58 sys=0.02, real=5.04 secs] 

 2018-07-17T06:20:48.892+0000: 33764.517: [GC (CMS Final Remark) 
    [YG occupancy: 89219 K (1382400 K)]
    33764.517: [Rescan (parallel) , 0.0256303 secs]
    33764.543: [weak refs processing, 0.0124976 secs]
    33764.555: [class unloading, 0.0593605 secs]
    33764.615: [scrub symbol table, 0.0028103 secs]
    33764.617: [scrub string table, 0.3960129 secs]
    [1 CMS-remark: 1154831K(1536000K)] 
    1244051K(2918400K), 0.5055637 secs] 
    [Times: user=0.58 sys=0.00, real=0.50 secs] 
2018-07-17T06:20:49.399+0000: 33765.023: Total time for which application threads were stopped: 0.5111839 seconds

2018-07-17T06:20:49.399+0000: 33765.023: [CMS-concurrent-sweep-start]
2018-07-17T06:20:49.408+0000: 33765.032: Total time for which application threads were stopped: 0.0034313 seconds
2018-07-17T06:20:49.411+0000: 33765.035: Total time for which application threads were stopped: 0.0022739 seconds
2018-07-17T06:20:49.415+0000: 33765.039: Total time for which application threads were stopped: 0.0022818 seconds
2018-07-17T06:20:49.418+0000: 33765.042: Total time for which application threads were stopped: 0.0020566 seconds
2018-07-17T06:20:49.420+0000: 33765.044: Total time for which application threads were stopped: 0.0017496 seconds
2018-07-17T06:20:49.423+0000: 33765.047: Total time for which application threads were stopped: 0.0022639 seconds
2018-07-17T06:20:49.425+0000: 33765.049: Total time for which application threads were stopped: 0.0016516 seconds
2018-07-17T06:20:49.427+0000: 33765.051: Total time for which application threads were stopped: 0.0018342 seconds
2018-07-17T06:20:49.429+0000: 33765.053: Total time for which application threads were stopped: 0.0015327 seconds
2018-07-17T06:20:49.430+0000: 33765.054: Total time for which application threads were stopped: 0.0015723 seconds
2018-07-17T06:20:49.432+0000: 33765.056: Total time for which application threads were stopped: 0.0015356 seconds
2018-07-17T06:20:49.433+0000: 33765.058: Total time for which application threads were stopped: 0.0014560 seconds
2018-07-17T06:20:49.436+0000: 33765.060: Total time for which application threads were stopped: 0.0019394 seconds
2018-07-17T06:20:49.437+0000: 33765.061: Total time for which application threads were stopped: 0.0015594 seconds
2018-07-17T06:20:49.460+0000: 33765.084: Total time for which application threads were stopped: 0.0028257 seconds
2018-07-17T06:20:50.464+0000: 33766.088: Total time for which application threads were stopped: 0.0031651 seconds
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-sweep: 1.269/1.303 secs] [Times: user=1.58 sys=0.00, real=1.31 secs] 
2018-07-17T06:20:50.702+0000: 33766.326: [CMS-concurrent-reset-start]
2018-07-17T06:20:50.706+0000: 33766.330: [CMS-concurrent-reset: 0.004/0.004 secs] [Times: user=0.01 sys=0.00, real=0.00 secs] 
2018-07-17T06:20:51.469+0000: 33767.093: Total time for which application threads were stopped: 0.0046566 seconds



2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
     [YG occupancy: 56180 K (1382400 K)]
        513596.243: [Rescan (parallel) , 0.0230426 secs]
        513596.266: [weak refs processing, 0.0272667 secs]
        513596.293: [class unloading, 0.1770894 secs]
        513596.471: [scrub symbol table, 0.0101295 secs]
        513596.481: [scrub string table, 0.3882333 secs]
        [1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 
2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds


2018-07-22T19:37:55.241+0000: 513590.865: [GC (CMS Initial Mark) [1 CMS-initial-mark: 1152442K(1536000K)] 1173594K(2918400K), 0.0309287 secs] [Times: user=0.02 sys=0.00, real=0.03 secs] 
2018-07-22T19:37:55.272+0000: 513590.896: Total time for which application threads were stopped: 0.0344867 seconds
2018-07-22T19:37:55.272+0000: 513590.896: [CMS-concurrent-mark-start]
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-mark: 0.170/0.170 secs] [Times: user=0.16 sys=0.00, real=0.17 secs] 
2018-07-22T19:37:55.443+0000: 513591.067: [CMS-concurrent-preclean-start]
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-preclean: 0.007/0.008 secs] [Times: user=0.01 sys=0.00, real=0.01 secs] 
2018-07-22T19:37:55.451+0000: 513591.075: [CMS-concurrent-abortable-preclean-start]
 CMS: abort preclean due to time 2018-07-22T19:38:00.615+0000: 513596.239: [CMS-concurrent-abortable-preclean: 3.434/5.164 secs] [Times: user=4.10 sys=0.11, real=5.16 secs] 

2018-07-22T19:38:00.619+0000: 513596.243: [GC (CMS Final Remark) 
     [YG occupancy: 56180 K (1382400 K)]
        513596.243: [Rescan (parallel) , 0.0230426 secs]
        513596.266: [weak refs processing, 0.0272667 secs]
        513596.293: [class unloading, 0.1770894 secs]
        513596.471: [scrub symbol table, 0.0101295 secs]
        513596.481: [scrub string table, 0.3882333 secs]
        [1 CMS-remark: 1152442K(1536000K)] 1208623K(2918400K), 0.6355867 secs] [Times: user=0.56 sys=0.00, real=0.63 secs] 

2018-07-22T19:38:01.255+0000: 513596.879: Total time for which application threads were stopped: 0.6396559 seconds
2018-07-22T19:38:01.255+0000: 513596.880: [CMS-concurrent-sweep-start]
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-sweep: 0.953/0.953 secs] [Times: user=1.10 sys=0.00, real=0.96 secs] 
2018-07-22T19:38:02.209+0000: 513597.833: [CMS-concurrent-reset-start]
2018-07-22T19:38:02.213+0000: 513597.837: [CMS-concurrent-reset: 0.003/0.003 secs] [Times: user=0.00 sys=0.00, real=0.00 secs] 
2018-07-22T19:38:02.260+0000: 513597.884: Total time for which application threads were stopped: 0.0045430 seconds

1 个答案:

答案 0 :(得分:0)

您似乎在实习太多字符串。为什么需要一个100万的内部字符串表大小?不宜过早死亡的弦可以被年轻的收藏家清洗。只有最有可能在旧版本集合中生存下来的String才应该被插入。还是需要节省这么多字符串的内存?