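We have an application running on the Wildfly application server in cluster mode (6 nodes). When GC kicks in, we sometimes see the JVM freeze for 16 seconds. The application is time sensitive: if a node does not answer a heartbeat within 15 seconds, the other nodes in the cluster consider it dead (because of the JVM pause). As a result, the JVM freezes make the application unstable. To understand what happens during GC, we enabled HotSpot and safepoint logging and see the trace below whenever a GC pause occurs.
Can anyone explain the meaning of the following parameters?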
1.) active_workers(): 13
2.) new_acitve_workers: 13
3.) prev_active_workers: 13
4.) active_workers_by_JT: 3556
5.) active_workers_by_heap_size: 146
Environment details: Linux 64-bit RHEL 7, OpenJDK 1.8, heap size: 12 GB (young: 4 GB, tenured: 8 GB), CPU cores: 16, VMware ESX 5.1
JVM arguments:
-XX:ThreadStackSize=512
-Xmx12288m
-XX:+UseParallelGC
-XX:+UseParallelOldGC
-XX:MaxPermSize=1024m
-XX:+DisableExplicitGC
-XX:NewSize=4096m
-XX:MaxNewSize=4096m
-XX:ReservedCodeCacheSize=256m
-XX:+UseCodeCacheFlushing
-XX:+UseDynamicNumberOfGCThreads
Any suggestions for tuning these JVM arguments to reduce the GC pause time?
GC log:
GCTaskManager::calc_default_active_workers() : active_workers(): 13 new_acitve_workers: 13 prev_active_workers: 13
active_workers_by_JT: 3556 active_workers_by_heap_size: 146
GCTaskManager::set_active_gang(): all_workers_active() 1 workers 13 active 13 ParallelGCThreads 13
JT: 1778 workers 13 active 13 idle 0 more 0
2016-10-06T07:38:47.281+0530: 48313.522: [Full GC (Ergonomics) DrainStacksCompactionTask::do_it which = 3 which_stack_index = 3/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 7 which_stack_index = 7/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 2 which_stack_index = 2/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 0 which_stack_index = 0/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 11 which_stack_index = 11/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 6 which_stack_index = 6/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 1 which_stack_index = 1/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 12 which_stack_index = 12/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 4 which_stack_index = 4/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 5 which_stack_index = 5/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 9 which_stack_index = 9/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 8 which_stack_index = 8/empty(0) use all workers 1
DrainStacksCompactionTask::do_it which = 10 which_stack_index = 10/empty(0) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 3 region_stack = 0x780be610 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 5 region_stack = 0x780be730 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 7 region_stack = 0x780be850 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 11 region_stack = 0x780bea90 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 1 region_stack = 0x780be4f0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 10 region_stack = 0x780bea00 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 8 region_stack = 0x780be8e0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 4 region_stack = 0x780be6a0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 0 region_stack = 0x780be460 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 2 region_stack = 0x780be580 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 6 region_stack = 0x780be7c0 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 12 region_stack = 0x780beb20 empty (1) use all workers 1
StealRegionCompactionTask::do_it region_stack_index 9 region_stack = 0x780be970 empty (1) use all workers 1
[PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs]
[Times: user=180.57 sys=2.46, real=16.09 secs]
2016-10-06T07:39:03.373+0530: 48329.615: Total time for which application threads were stopped: 16.2510644 seconds, Stopping threads took: 0.0036805 seconds
48313.363: ParallelGCFailedAllocation [ 2384 0 2 ] [ 0 0 3 35 16210 ] 0
Thanks in advance for your help.
Answer 0 (score: 0)
Judging by ParallelGCFailedAllocation
and [PSYoungGen: 63998K->0K(4082176K)] [ParOldGen: 8346270K->3657870K(8388608K)] 8410268K->3657870K(12470784K), [Metaspace: 465864K->465775K(1495040K)], 16.0898939 secs]
we have the following conditions:
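Even with 13 GC threads running in parallel, collecting those roughly 5G takes 16 seconds. Since you have only 16 cores, there is not much room for a speed-up from adding more threads.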
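The numbers in the log can be reproduced with a little arithmetic, which also addresses what the `active_workers_*` values mean. The following is a minimal sketch based on my reading of the HotSpot 8 heuristics, not a copy of the JVM source. The class and method names are hypothetical, and the constants (2 GC workers per Java thread, about 64M of heap per GC thread scaled by 13/10 on 64-bit, and the `8 + (cpus - 8) * 5 / 8` default for `ParallelGCThreads`) are assumptions that happen to match the values printed in the question.

```java
public class GcWorkerEstimate {

    // Assumed default for ParallelGCThreads: all CPUs up to 8,
    // then 5/8 of every additional CPU.
    static int defaultParallelGcThreads(int cpus) {
        return cpus <= 8 ? cpus : 8 + (cpus - 8) * 5 / 8;
    }

    // active_workers_by_JT: workers wanted based on the number of Java threads ("JT" in the log).
    static long activeWorkersByJavaThreads(long javaThreads, long workersPerJavaThread) {
        return workersPerJavaThread * javaThreads;
    }

    // active_workers_by_heap_size: workers wanted based on heap capacity (assumed floor of 2).
    static long activeWorkersByHeapSize(long heapCapacityBytes, long heapSizePerGcThreadBytes) {
        return Math.max(2, heapCapacityBytes / heapSizePerGcThreadBytes);
    }

    // The larger wish wins, but it is capped by the number of GC threads that exist.
    static long newActiveWorkers(long byJavaThreads, long byHeapSize, long totalWorkers) {
        return Math.min(Math.max(byJavaThreads, byHeapSize), totalWorkers);
    }

    public static void main(String[] args) {
        int totalWorkers = defaultParallelGcThreads(16);        // 16 cores from the question
        long byJt = activeWorkersByJavaThreads(1778, 2);        // "JT: 1778" from the log
        long heapBytes = 12470784L * 1024;                      // 12470784K total heap from the log
        long perThread = 64L * 1024 * 1024 * 13 / 10;           // assumed per-thread heap share
        long byHeap = activeWorkersByHeapSize(heapBytes, perThread);

        System.out.println("ParallelGCThreads           = " + totalWorkers);  // 13
        System.out.println("active_workers_by_JT        = " + byJt);          // 3556
        System.out.println("active_workers_by_heap_size = " + byHeap);        // 146
        System.out.println("new_active_workers          = "
                + newActiveWorkers(byJt, byHeap, totalWorkers));              // 13

        // user CPU time divided by wall-clock time from the [Times: ...] line:
        // roughly how many GC threads were busy on average.
        System.out.printf("effective parallelism       = %.1f%n", 180.57 / 16.09);
    }
}
```

Under these assumptions both "wishes" (3556 and 146) are far above the 13 available GC threads, so `-XX:+UseDynamicNumberOfGCThreads` does not reduce the worker count here. The user/real ratio of about 11 suggests the 13 threads were already busy for most of the pause.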
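The following might be going on here:
InitiatingHeapOccupancyPercent: judging by the current output, you should start at around 4G. It is questionable, though, whether you really need the full 12G heap, since it will be subject to heap fragmentation problems.
SurvivorRatio: tune this parameter to make the survivor spaces bigger. Something like SurvivorRatio=4 (with a 4G young generation that gives each survivor space roughly 680M, because the young generation is eden plus two survivor spaces). So it really depends on your object allocation pattern. The best way is to try it somewhere else before applying it to production.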
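To make the two sizing suggestions concrete, here is a small sketch of the arithmetic. It assumes the documented meaning of `SurvivorRatio` (eden size divided by the size of one survivor space, with the young generation made up of eden plus two survivor spaces) and that `InitiatingHeapOccupancyPercent` is a percentage of the whole heap, as it is for G1 in JDK 8. The class and method names are hypothetical, and with `-XX:+UseParallelGC` the adaptive size policy may resize the survivor spaces at runtime, so treat the results as starting values only.

```java
public class HeapSizingSketch {

    // young = eden + 2 * survivor and eden = ratio * survivor,
    // so one survivor space is young / (ratio + 2).
    static long survivorSpaceMb(long youngGenMb, int survivorRatio) {
        return youngGenMb / (survivorRatio + 2);
    }

    // Expresses an absolute occupancy threshold as a percentage of the total heap.
    static long occupancyPercent(long thresholdMb, long heapMb) {
        return Math.round(100.0 * thresholdMb / heapMb);
    }

    public static void main(String[] args) {
        long youngMb = 4096;   // -XX:NewSize / -XX:MaxNewSize from the question
        long heapMb = 12288;   // -Xmx12288m from the question

        System.out.println("SurvivorRatio=4 -> " + survivorSpaceMb(youngMb, 4) + " MB per survivor"); // 682
        System.out.println("SurvivorRatio=2 -> " + survivorSpaceMb(youngMb, 2) + " MB per survivor"); // 1024
        System.out.println("4G of a 12G heap -> " + occupancyPercent(4096, heapMb) + "%");            // 33
    }
}
```

By this definition `SurvivorRatio=4` yields about 682M per survivor space, and a ratio of 2 would be needed to reach 1G each. Starting concurrent collection at around 4G of a 12G heap corresponds to a threshold of roughly 33%.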