JVM flags -XX:+UseDynamicNumberOfGCThreads and -XX:+TraceDynamicGCThreads enabled to see the number of GC threads during GC. Can someone explain the output log?

Time: 2016-10-06 08:33:36

Tags: java performance jvm jvm-hotspot

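We have an application running on WildFly application servers in cluster mode (6 nodes). When GC is triggered, we sometimes see the JVM freeze for 16 seconds. The application is time-sensitive: if a node does not answer a heartbeat within 15 seconds, the other nodes in the cluster consider it dead (because of the JVM pause). These JVM freezes therefore make the application unstable. To understand what happens during GC, we enabled HotSpot GC and safepoint logging, and we see the trace below when a GC pause occurs.

Can anyone explain what the following parameters mean?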

1.) active_workers(): 13  
2.) new_acitve_workers: 13  
3.) prev_active_workers: 13
4.) active_workers_by_JT: 3556  
5.) active_workers_by_heap_size: 146

Environment details:

Linux 64-bit RHEL 7
OpenJDK 1.8
Heap size: 12GB (young: 4GB, tenured: 8GB)
CPU cores: 16
VMware ESX 5.1

JVM arguments:

-XX:ThreadStackSize=512 
-Xmx12288m 
-XX:+UseParallelGC 
-XX:+UseParallelOldGC 
-XX:MaxPermSize=1024m 
-XX:+DisableExplicitGC 
-XX:NewSize=4096m 
-XX:MaxNewSize=4096m 
-XX:ReservedCodeCacheSize=256m 
-XX:+UseCodeCacheFlushing
-XX:+UseDynamicNumberOfGCThreads

Any suggestions for tuning these JVM parameters to reduce the GC pause time?

GC log:

GCTaskManager::calc_default_active_workers() : active_workers(): 13  new_acitve_workers: 13  prev_active_workers: 13
 active_workers_by_JT: 3556  active_workers_by_heap_size: 146
GCTaskManager::set_active_gang(): all_workers_active()  1  workers 13  active  13  ParallelGCThreads 13
JT: 1778  workers 13  active  13  idle 0  more 0
2016-10-06T07:38:47.281+0530: 48313.522: [Full GC (Ergonomics) DrainStacksCompactionTask::do_it which = 3 which_stack_index = 3/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 7 which_stack_index = 7/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 2 which_stack_index = 2/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 0 which_stack_index = 0/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 11 which_stack_index = 11/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 6 which_stack_index = 6/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 1 which_stack_index = 1/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 12 which_stack_index = 12/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 4 which_stack_index = 4/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 5 which_stack_index = 5/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 9 which_stack_index = 9/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 8 which_stack_index = 8/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 10 which_stack_index = 10/empty(0) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 3 region_stack = 0x780be610  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 5 region_stack = 0x780be730  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 7 region_stack = 0x780be850  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 11 region_stack = 0x780bea90  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 1 region_stack = 0x780be4f0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 10 region_stack = 0x780bea00  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 8 region_stack = 0x780be8e0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 4 region_stack = 0x780be6a0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 0 region_stack = 0x780be460  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 2 region_stack = 0x780be580  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 6 region_stack = 0x780be7c0  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 12 region_stack = 0x780beb20  empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 9 region_stack = 0x780be970  empty (1) use all workers 1
[PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs] 
[Times: user=180.57 sys=2.46, real=16.09 secs]
2016-10-06T07:39:03.373+0530: 48329.615: Total time for which application threads were stopped: 16.2510644 seconds, Stopping threads took: 0.0036805 seconds

Safepoint log:

48313.363: ParallelGCFailedAllocation       [    2384          0              2    ]      [     0     0     3    35 16210    ]  0

Thanks in advance for your help.

1 Answer:

Answer 0: (score: 0)

Judging by ParallelGCFailedAllocation and [PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs], we have the following situation:

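  1. YoungGen is almost empty (only about 63M of 4G is occupied)
  2. OldGen is almost full (only about 42M free out of 8G)
  3. The JVM tries to move surviving objects out of YoungGen, or to place them in a Survivor space, and ends up deciding to move them to OldGen
  4. OldGen does not have enough room either (only about 42M free, as noted above), so a Full GC is triggered
  5. The Full GC reclaims roughly 4.5G from OldGen (8346270K->3657870K)
  6. Even with 13 GC threads running in parallel, collecting that much takes 16 seconds. Since you only have 16 cores, adding more threads leaves little room for a speedup.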
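The figures in the list above can be recomputed from the Full GC line in the question. The following is a minimal sketch; the class and method names are made up for illustration, and only the constants come from the log. It treats (user + sys) / real as a rough estimate of how many GC threads were busy on average, which is an approximation rather than an exact measurement.

```java
// Hypothetical helper; only the constants are taken from the question's GC log.
final class GcPauseMath {
    // [ParOldGen: 8346270K->3657870K(8388608K)]
    static final long OLD_BEFORE_KB = 8_346_270L;
    static final long OLD_AFTER_KB = 3_657_870L;
    static final long OLD_CAPACITY_KB = 8_388_608L;

    // [Times: user=180.57 sys=2.46, real=16.09 secs]
    static final double USER_SECS = 180.57;
    static final double SYS_SECS = 2.46;
    static final double REAL_SECS = 16.09;

    /** Free space left in the old generation just before the Full GC. */
    static long freeBeforeKb() {
        return OLD_CAPACITY_KB - OLD_BEFORE_KB;
    }

    /** Space reclaimed from the old generation by the Full GC. */
    static long reclaimedKb() {
        return OLD_BEFORE_KB - OLD_AFTER_KB;
    }

    /** CPU seconds per wall-clock second: a rough count of busy GC threads. */
    static double effectiveParallelism() {
        return (USER_SECS + SYS_SECS) / REAL_SECS;
    }

    public static void main(String[] args) {
        System.out.printf("Old gen free before GC: %d K%n", freeBeforeKb());
        System.out.printf("Old gen reclaimed:      %d K (%.2f GiB)%n",
                reclaimedKb(), reclaimedKb() / (1024.0 * 1024.0));
        System.out.printf("Busy GC threads (avg):  %.1f%n", effectiveParallelism());
    }
}
```

An average above 11 busy threads out of 13 suggests the collector really was running in parallel for most of the pause, so the 16 seconds look like genuine collection work rather than threads waiting on each other.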

    Either of the following may be happening here:

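    • Your objects live too long for YoungGen, so you need to switch to CMS / G1 so that OldGen is collected more often, with less time spent in total. You will need to tune InitiatingHeapOccupancyPercent to your needs (this is the G1 flag; CMS uses CMSInitiatingOccupancyFraction). Judging by the current output, you should start collecting at around 4G of occupancy. That said, it is questionable whether you really need the 12G heap, since a heap that size will be prone to fragmentation.
    • Your objects are short-lived but too large to fit in the Survivor spaces, so you need to tune the SurvivorRatio parameter to make those spaces larger. Something like SurvivorRatio=4 (which in this case would make each Survivor space roughly 683M, since the 4G young generation is split into eden plus two Survivor spaces).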
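To put numbers on the two suggestions above, here is a small sketch. It assumes the usual HotSpot definition of SurvivorRatio (eden size divided by the size of one survivor space, with the young generation made of eden plus two survivor spaces) and assumes the occupancy threshold is expressed as a percentage of the whole heap, as with G1's InitiatingHeapOccupancyPercent. The class and method names are hypothetical.

```java
// Hypothetical sizing helper; the formulas are assumptions stated in the lead-in.
final class HeapSizingSketch {

    /**
     * Size of ONE survivor space in MB.
     * young = eden + 2 * survivor and eden = survivorRatio * survivor,
     * so survivor = young / (survivorRatio + 2).
     */
    static long survivorSizeMb(long youngGenMb, int survivorRatio) {
        return youngGenMb / (survivorRatio + 2);
    }

    /** Occupancy threshold as a percentage of the whole heap, rounded down. */
    static int occupancyPercent(long thresholdMb, long heapMb) {
        return (int) (thresholdMb * 100 / heapMb);
    }

    public static void main(String[] args) {
        long youngMb = 4096;  // -XX:NewSize=4096m from the question
        long heapMb = 12288;  // -Xmx12288m from the question

        for (int ratio : new int[] {8, 4, 2}) {
            System.out.println("SurvivorRatio=" + ratio + " -> "
                    + survivorSizeMb(youngMb, ratio) + " MB per survivor space");
        }
        System.out.println("Start old-gen collection at 4G -> "
                + occupancyPercent(4096, heapMb) + "% of the heap");
    }
}
```

With the 4G young generation and 12G heap from the question, this gives 682 MB per survivor space for SurvivorRatio=4 and a threshold of 33 percent for a 4G starting point. The Parallel collector's adaptive sizing may still resize the survivor spaces at runtime, so treat these as starting values to verify in the GC log.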

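    So it really depends on your object allocation pattern. The best approach is to try the change somewhere else before applying it to production.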

    The JVM GC parameters can be found here
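As for the TraceDynamicGCThreads fields the question asks about: the numbers in the log are consistent with the heuristic below, which is how I understand the JDK 8 HotSpot code to behave. Treat it as a simplified sketch, not the actual implementation. The constants (2 GC workers per Java thread, about 83 MB of heap per GC thread on 64-bit, and the 5/8 rule for the default ParallelGCThreads) are assumptions about the defaults, the class and method names are made up, and minWorkers = 1 is only a placeholder.

```java
// Simplified sketch of the dynamic GC worker heuristic; constants are assumed defaults.
final class GcWorkerHeuristic {
    static final long GC_WORKERS_PER_JAVA_THREAD = 2L;        // assumed default
    static final long HEAP_SIZE_PER_GC_THREAD = 87_241_520L;  // assumed: 64M scaled by 1.3 on 64-bit

    /** Assumed default for ParallelGCThreads: all CPUs up to 8, then 5/8 of the rest. */
    static long defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    /** active_workers_by_JT: scales with the number of Java threads (JT in the log). */
    static long byJavaThreads(long javaThreads, long minWorkers) {
        return Math.max(GC_WORKERS_PER_JAVA_THREAD * javaThreads, minWorkers);
    }

    /** active_workers_by_heap_size: scales with the heap capacity. */
    static long byHeapSize(long heapCapacityBytes) {
        return Math.max(2L, heapCapacityBytes / HEAP_SIZE_PER_GC_THREAD);
    }

    /** new active workers: the larger estimate, capped at the threads that exist. */
    static long activeWorkers(long totalWorkers, long minWorkers, long prevActive,
                              long javaThreads, long heapCapacityBytes) {
        long wanted = Math.max(byJavaThreads(javaThreads, minWorkers),
                               byHeapSize(heapCapacityBytes));
        long next = Math.min(wanted, totalWorkers);
        if (next < prevActive) {
            // Shrink gradually instead of dropping straight to the new value.
            next = Math.max(minWorkers, (prevActive + next) / 2);
        }
        return next;
    }

    public static void main(String[] args) {
        long heapBytes = 12_470_784L * 1024;  // 12470784K total heap from the log
        long total = defaultParallelGcThreads(16);
        System.out.println("ParallelGCThreads:           " + total);
        System.out.println("active_workers_by_JT:        " + byJavaThreads(1778, 1));
        System.out.println("active_workers_by_heap_size: " + byHeapSize(heapBytes));
        System.out.println("new active workers:          "
                + activeWorkers(total, 1, 13, 1778, heapBytes));
    }
}
```

Plugging in the log values (JT: 1778, heap capacity 12470784K, 16 cores) reproduces active_workers_by_JT = 3556, active_workers_by_heap_size = 146 and ParallelGCThreads = 13. Both estimates are far above the 13 threads that exist, so the result is capped at 13 and every worker is used. In other words, UseDynamicNumberOfGCThreads changes nothing for this workload, and the fields in the trace do not point to the cause of the 16-second pause.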