突然长时间GC暂停的原因

时间:2018-05-23 07:31:10

标签: java garbage-collection

我正在使用测试批处理循环运行我的应用程序一天。 一个测试批次运行4个小时,我运行相同的测试批次30个循环并最终平稳运行一天,突然在一天结束时有一个尖峰,持续20秒。

我已经调整了GC并查看附带的图片,

Sudden spike in the GC pause

在一天结束的时候突然间,我看到GC停顿时出现了飙升,

1)为什么GC出现突然飙升?注意:我在循环中运行相同的测试,每个测试运行4个小时,突然在第10个循环中出现峰值?
2)它可能是什么以及如何跟踪它?

GC日志的相关部分:

{Heap before GC invocations=5451 (full 0):
 garbage-first heap   total 8388608K, used 4872081K [0x00000005c0000000, 0x00000005c0404000, 0x00000007c0000000)
  region size 4096K, 614 young (2514944K), 19 survivors (77824K)
 Metaspace       used 38562K, capacity 39526K, committed 39888K, reserved 1085440K
  class space    used 4096K, capacity 4277K, committed 4352K, reserved 1048576K
2018-05-22T16:36:06.323+0000: 85842.817: [GC pause (G1 Evacuation Pause) (young)
Desired survivor size 161480704 bytes, new threshold 15 (max 15)
- age   1:    4507344 bytes,    4507344 total
- age   2:   16799960 bytes,   21307304 total
- age   3:   15901408 bytes,   37208712 total
- age   4:   16061376 bytes,   53270088 total
, 20.3599018 secs]
   [Parallel Time: 16841.9 ms, GC Workers: 8]
      [GC Worker Start (ms): Min: 85846284.8, Avg: 85846414.4, Max: 85846458.0, Diff: 173.2]
      [Ext Root Scanning (ms): Min: 787.2, Avg: 1628.6, Max: 2662.7, Diff: 1875.5, Sum: 13029.0]
      [Update RS (ms): Min: 13521.3, Avg: 14190.0, Max: 14460.0, Diff: 938.7, Sum: 113520.2]
         [Processed Buffers: Min: 11, Avg: 27.8, Max: 50, Diff: 39, Sum: 222]
      [Scan RS (ms): Min: 165.8, Avg: 291.7, Max: 401.5, Diff: 235.7, Sum: 2333.4]
      [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.7]
      [Object Copy (ms): Min: 78.1, Avg: 279.5, Max: 1353.4, Diff: 1275.3, Sum: 2235.6]
      [Termination (ms): Min: 0.0, Avg: 3.8, Max: 5.0, Diff: 5.0, Sum: 30.6]
         [Termination Attempts: Min: 1, Avg: 5.1, Max: 15, Diff: 14, Sum: 41]
      [GC Worker Other (ms): Min: 0.0, Avg: 317.6, Max: 1258.0, Diff: 1258.0, Sum: 2541.0]
      [GC Worker Total (ms): Min: 16667.6, Avg: 16711.3, Max: 16841.3, Diff: 173.7, Sum: 133690.5]
      [GC Worker End (ms): Min: 85863125.6, Avg: 85863125.8, Max: 85863126.2, Diff: 0.6]
   [Code Root Fixup: 1.2 ms]
   [Code Root Purge: 0.0 ms]
   [String Dedup Fixup: 5.9 ms, GC Workers: 8]
      [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
      [Table Fixup (ms): Min: 0.0, Avg: 1.6, Max: 5.9, Diff: 5.9, Sum: 12.6]
   [Clear CT: 1.1 ms]
   [Other: 3509.7 ms]
      [Choose CSet: 0.0 ms]
      [Ref Proc: 30.7 ms]
      [Ref Enq: 2.4 ms]
      [Redirty Cards: 4.0 ms]
      [Humongous Register: 2339.6 ms]
      [Humongous Reclaim: 0.0 ms]
      [Free CSet: 3.5 ms]
   [Eden: 2380.0M(2380.0M)->0.0B(720.0M) Survivors: 76.0M->96.0M Heap: 4757.9M(8192.0M)->2395.9M(8192.0M)]
Heap after GC invocations=5452 (full 0):
 garbage-first heap   total 8388608K, used 2453393K [0x00000005c0000000, 0x00000005c0404000, 0x00000007c0000000)
  region size 4096K, 24 young (98304K), 24 survivors (98304K)
 Metaspace       used 38562K, capacity 39526K, committed 39888K, reserved 1085440K
  class space    used 4096K, capacity 4277K, committed 4352K, reserved 1048576K
}
 [Times: user=33.60 sys=5.40, real=20.36 secs] 
2018-05-22T16:36:26.684+0000: 85863.177: Total time for which application threads were stopped: 23.3452204 seconds, Stopping threads took: 1.5114697 seconds
2018-05-22T16:36:26.685+0000: 85863.179: [GC concurrent-string-deduplication, 6912.0B->64.0B(6848.0B), avg 88.1%, 0.0000411 secs]

完整的GC记录here

GC配置:

-XX:+UseG1GC
-XX:ParallelGCThreads=8
-XX:+ExplicitGCInvokesConcurrent
-XX:+ParallelRefProcEnabled
-XX:+UseStringDeduplication
-XX:+UnlockExperimentalVMOptions
-XX:G1NewSizePercent=10
-XX:G1MaxNewSizePercent=30
-XX:MaxGCPauseMillis=400

Java版本 - 8

1 个答案:

答案 0 :(得分:1)

我有一个小问题,你在这台机器上有多少核心或虚拟CPU?

对不起,我没有意识到你有8个内核我缺少的是Xms和Xmx的值,但我想你使用相同的值来避免调整大小。

根据我的经验,如果我是你,我宁愿使用并行垃圾收集器。你有什么特别的理由使用GC1吗?

我的意思是,如果我的机器具有比你更多的功能,我会使用GC1我在谈论核心数量和你应该分配的堆。根据{{​​3}}:

  

Garbage-First(G1)垃圾收集器是一种服务器式垃圾收集器,适用于具有大内存的多处理器机器。

更重要的是,我看不到并发阶段使用的并发线程数的配置,所以我可以根据Oracle的这些信息想象你正在使用2个线程的那些阶段。不幸的是,我找不到Java 8的默认值。

  

用于并发工作的最大线程数。默认情况下,   此值为-XX:ParallelGCThreads除以4

因此,我认为只有2个线程用于并发阶段,有更多的浮动垃圾,应该在更新记忆集(更新RS)期间进行处理,这或许可以解释为什么这个阶段需要花费很多时间。

根据机器拥有的内核数量以及JVM使用的内存,我建议您尝试使用这些标志的并行收集器来查看是否可以避免这种长时间停顿

-XX:+UseParallelGC -Xms8192M -Xmx8192M -Xmn3072M -XX:SurvivorRatio=3 -Xss1024K -XX:MetaspaceSize=1024M -XX:MaxMetaspaceSize=1024M -XX:TargetSurvivorRatio=90 -XX:ParallelGCThreads=8 -XX:+AlwaysPreTouch -XX:+ParallelRefProcEnabled -XX:-UseAdaptiveSizePolicy -XX:+DisableExplicitGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/some/path/ -verbose:gc -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:/some/path/gc.log -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=10M

最后但并非最不重要的是,请记住,对于G1GC,您应该尽可能地避免使用Full GC,因为它们只使用一个线程执行,您可以看到Java 9

  

您可能会遇到的第三种类型的集合,以及我们所拥有的集合   工作要避免的是Full GC。在G1中,Full GC是单个GC   螺纹停止世界(STW)暂停,将撤离和紧凑   所有地区。您可以有三个重要的信息   从Full GC日志中获取

我认为这会在here

中得到改善

致以最诚挚的问候,

RCC