我似乎遇到了一个场景,JVM在几小时后试图达到安全点时无限期地停留。但是,如果我使用-F选项执行jstack,它似乎就会退出等待并继续执行。
jdk1.8.0_45 / bin / jstack -F 39924> a.out
我在Centos上使用jdk1.8.0_45
我的问题是:
i)当从jstack发送中断时,似乎JVM可以从安全点无限期等待。如果没有jstack它怎么会出来。是否有一些jvm选项可以用来避免无限期等待。
ii)我是否可以获得导致问题的线程更明确的线程转储。安全点日志的输出似乎不精确。
我使用的选项是:。
-server
-XX:+AggressiveOpts
-XX:+UseG1GC
-XX:+UnlockExperimentalVMOptions
-XX:G1MixedGCLiveThresholdPercent=85
-XX:InitiatingHeapOccupancyPercent=30
-XX:G1HeapWastePercent=5
-XX:MaxGCPauseMillis=1000
-XX:G1HeapRegionSize=4M
-XX:+PrintGC
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-XX:+PrintGCDateStamps
-XX:+UnlockExperimentalVMOptions
-XX:G1LogLevel=finest
-Xmx6000m
-Xdebug
-Xrunjdwp:transport=dt_socket,server=y,suspend=n,address=999
-XX:+SafepointTimeout
-XX:+UnlockDiagnosticVMOptions
-XX:SafepointTimeoutDelay=20000
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
安全点日志
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17771.115: G1IncCollectionPause [ 170 0 0 ] [ 0 0 0 0 8 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17771.125: RevokeBias [ 170 1 2 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17771.127: RevokeBias [ 170 1 1 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17771.131: RevokeBias [ 170 1 2 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17771.955: RevokeBias [ 169 0 2 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17772.160: BulkRevokeBias [ 171 0 2 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17772.352: RevokeBias [ 170 1 3 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17773.596: RevokeBias [ 169 0 1 ] [ 0 0 0 0 0 ] 0
# SafepointSynchronize::begin: Timeout detected:
# SafepointSynchronize::begin: Timed out while spinning to reach a safepoint.
# SafepointSynchronize::begin: Threads which did not reach the safepoint:
# "Thread-14" #115 prio=5 os_prio=0 tid=0x00007f20c8029000 nid=0x9cd0 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE
# SafepointSynchronize::begin: (End of list)
在jstack中断之后,这是我从安全点日志
中看到的vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
17779.826: G1IncCollectionPause [ 169 1 1 ] [3315603 03315603 0 8 ] 1
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
21095.439: RevokeBias [ 169 2 13 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
21095.439: RevokeBias [ 169 1 2 ] [ 0 0 0 0 0 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
21095.441: RevokeBias [ 184 3 4 ] [ 0 0 3 0 1 ] 0
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
21095.447: RevokeBias [ 190 0 2 ] [ 0 0 4 0 2 ] 0
答案 0 :(得分:3)
由于您可以通过中断虚拟机来解决问题,而您在CentOS上的问题让我想起了this kernel bug。
该线程列出了以下受影响的版本(假设标准内核):
- RHEL 6(以及CentOS 6和SL 6):6.0-6.5是好的。 6.6是坏的。 6.6.z 很好。
- RHEL 7(和CentOS 7和SL 7):7.1是BAD。截至昨天。 似乎还没有7.x修复。
- RHEL 5(和CentOS 5,和 SL 5):所有版本都很好(包括5.11)。