我不是将G1GC垃圾收集器用于presto引擎,使用244Gb RAM节点作为slave,并使用以下gc相关配置。
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintGCCause
-XX:+PrintGCDateStamps
-XX:+PrintGCTimeStamps
-XX:+PrintGCDetails
-XX:+PrintClassHistogramAfterFullGC
-XX:+PrintClassHistogramBeforeFullGC
-XX:PrintFLSStatistics=2
-XX:+PrintAdaptiveSizePolicy
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
-Xloggc:/media/ephemeral0/presto/var/log/gc.log
-XX:+UseG1GC
-XX:+ExplicitGCInvokesConcurrent
-XX:+CMSClassUnloadingEnabled
-XX:+AggressiveOpts
-XX:PermSize=150M
-XX:MaxPermSize=150M
-XX:ReservedCodeCacheSize=150M
-Xmx210gb
我知道这些属性适用于G1GC(CMS相关的)。有了这个配置,当堆大小增加到70 gb时,我有时会看到GC疏散阶段(年轻)暂停7-9秒。 我做了一些阅读并试图通过配置属性,添加-XX:+ DisableExplicitGC没有帮助。 有关调整GC以减少这些超时的任何建议都将非常感激。 一个这样的实例的GC日志片段:
2017-08-09T18:01:43.529+0000: 6125.293: [GC pause (G1 Evacuation Pause) (young) 6125.295: [G1Ergonomics (CSet Construction) start choosing CSet, _pending_cards: 69969, predicted base time: 813.09 ms, remaining time: 0.00 ms, target pause time: 200.00 ms]
6125.295: [G1Ergonomics (CSet Construction) add young regions to CSet, eden: 297 regions, survivors: 39 regions, predicted young region time: 5339.39 ms]
6125.328: [G1Ergonomics (CSet Construction) finish choosing CSet, eden: 297 regions, survivors: 39 regions, old: 0 regions, predicted pause time: 6152.47 ms, target pause time: 200.00 ms]
, 6.7213894 secs]
[Parallel Time: 6255.9 ms, GC Workers: 23]
[GC Worker Start (ms): Min: 6125350.5, Avg: 6125355.7, Max: 6125361.1, Diff: 10.6]
[Ext Root Scanning (ms): Min: 11.1, Avg: 304.8, Max: 501.2, Diff: 490.1, Sum: 7011.5]
[Update RS (ms): Min: 0.0, Avg: 69.1, Max: 465.7, Diff: 465.7, Sum: 1589.2]
[Processed Buffers: Min: 0, Avg: 33.4, Max: 376, Diff: 376, Sum: 768]
[Scan RS (ms): Min: 173.8, Avg: 211.5, Max: 231.5, Diff: 57.7, Sum: 4865.2]
[Code Root Scanning (ms): Min: 0.0, Avg: 3.1, Max: 51.1, Diff: 51.1, Sum: 70.6]
[Object Copy (ms): Min: 5522.2, Avg: 5651.9, Max: 5992.8, Diff: 470.7, Sum: 129993.5]
[Termination (ms): Min: 0.0, Avg: 5.8, Max: 6.9, Diff: 6.9, Sum: 133.9]
[Termination Attempts: Min: 1, Avg: 31.7, Max: 44, Diff: 43, Sum: 730]
[GC Worker Other (ms): Min: 0.9, Avg: 2.0, Max: 3.4, Diff: 2.5, Sum: 47.1]
[GC Worker Total (ms): Min: 6243.1, Avg: 6248.3, Max: 6253.2, Diff: 10.1, Sum: 143710.9]
[GC Worker End (ms): Min: 6131603.1, Avg: 6131604.0, Max: 6131604.9, Diff: 1.8]
[Code Root Fixup: 1.5 ms]
[Code Root Purge: 0.3 ms]
[Clear CT: 302.2 ms]
[Other: 161.6 ms]
[Choose CSet: 32.7 ms]
[Ref Proc: 25.8 ms]
[Ref Enq: 0.8 ms]
[Redirty Cards: 24.6 ms]
[Humongous Register: 11.8 ms]
[Humongous Reclaim: 3.8 ms]
[Free CSet: 26.5 ms]
[Eden: 9504.0M(9504.0M)->0.0B(9920.0M) Survivors: 1248.0M->832.0M Heap: 68.9G(210.0G)->60.2G(210.0G)]
[Times: user=134.90 sys=6.83, real=6.72 secs]
2017-08-09T18:01:50.261+0000: 6132.025: Total time for which application threadspacket_write_wait: Connection to 54.225.198.1
我也研究了类似的问题: Java garbage collector G1GC taking long for 'Object Copy' (Evacuation Pause) Java garbage collector G1GC taking long for 'Object Copy' (Evacuation Pause) 虽然这些提供了见解,但我没有找到任何关于Xmx大约200gb的建议。