更新:问题至少通过组合-XX:+UseCountedLoopSafepoints
和JVM版本等于或大于JDK 1.8.0_92(使用_161测试)来解决。
ORIG: 包含了JDK 1.8.0示例,但它也发生在JDK 9.0.4上。
我一直在研究一个奇怪的错误,它似乎表现为JVM以与安全点暂停非常一致的方式停止所有线程。 VM操作有时被列为“无VM操作”。正如-XX
选项输出AND jvisualVM
所确认的那样,GC没有参与减速。
这一切都在下面的100个线路中。我已经严格配对代码,以便根据简单的boolean
设置显示问题。查找MAKE_THE_JVM_PAUSE_UNEXPECTEDLY
boolean
。
我正在创建一个5000x5000的boolean
数组,并在第66列随机访问前10行。真正的代码导致了同样的问题。
这个例子的算法本身是荒谬的。这只是一个高度配对示例。
用于运行此选项的选项位于标题注释中。
package conway;
/**
* Run this with the following java VM options:
*
* -verbosegc
* -XX:+PrintGCApplicationStoppedTime
* -XX:+PrintSafepointStatistics
* -XX:PrintSafepointStatisticsCount=1
* -XX:+PrintGCDetails
* -XX:+PerfDisableSharedMem
*/
public class Conway
{
public static void main(String[] args)
{
//-----------------------[HEARTBEAT]-----------------------
new Thread(() ->
{
long startTime = System.currentTimeMillis();
long lastTime = startTime;
for (int count=0;; count++)
{
// Time stamp....
long now = System.currentTimeMillis();
System.out.printf("%d-->Delta time from Start[%dms], from last[%dms]\n",
count, now-startTime, now-lastTime);
System.out.flush();
lastTime = now;
// Waste time without sleeping...
for (long start = System.currentTimeMillis();
System.currentTimeMillis() - start < 200;);
}
}).start();
//-----------------------[COMPUTATION]-----------------------
one = new boolean[5000][5000];
boolean MAKE_THE_JVM_PAUSE_UNEXPECTEDLY = true;
for (;;)
{
if (MAKE_THE_JVM_PAUSE_UNEXPECTEDLY)
{
/**
* CALLING sweep() will cause all threads to periodically pause...
*/
sweep();
}
else
{
/**
* THIS ONE RUNS FINE. Exact same contents as in sweep() below...
*/
for (int y=0; y<5000; y++)
{
for (int x=0; x<5000; x++)
{
int living=0;
for (int yy=y-1; yy <= y+1; yy++)
for (int xx=x-1; xx <= x+1; xx++)
{
if (one[randomInt(0, 10)][66])
living++;
}
} // for(x...)
} // for(y...)
}
} // for(;;)
}
public static void sweep()
{
for (int y=0; y<5000; y++)
{
for (int x=0; x<5000; x++)
{
int living=0;
for (int yy=y-1; yy <= y+1; yy++)
for (int xx=x-1; xx <= x+1; xx++)
{
if (one[randomInt(0,10)][66])
living++;
}
} // for(x...)
} // for(y...)
}
public static int randomInt(int first, int last)
{
return (int)(((last-first+1)*Math.random())+first);
}
public static boolean[][] one;
}
现在当我将MAKE_THE_JVM_PAUSE_UNEXPECTEDLY
boolean
设置为true
来运行此事时,会导致奇怪的减速。
请注意,问题首先出现在心跳49和50之间.Heartbeats来自代码中的第一个(匿名启动)线程,算法本身使用主类线程,并由看起来类似于这样:
11-->Delta time from Start[2210ms], from last[200ms]
我自己终止了跑步。
[GC (Allocation Failure) [PSYoungGen: 512K->384K(1024K)] 512K->384K(523776K), 0.0004807 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.053: ParallelGCFailedAllocation [ 5 0 0 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0005755 seconds
[GC (Allocation Failure) [PSYoungGen: 896K->480K(1024K)] 896K->488K(523776K), 0.0003818 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.063: ParallelGCFailedAllocation [ 5 0 0 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0004564 seconds
[GC (Allocation Failure) [PSYoungGen: 992K->496K(1024K)] 1000K->536K(523776K), 0.0004566 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.072: ParallelGCFailedAllocation [ 10 1 1 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0007704 seconds
[GC (Allocation Failure) [PSYoungGen: 1008K->480K(1536K)] 1048K->660K(524288K), 0.0009025 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.082: ParallelGCFailedAllocation [ 10 0 0 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0009905 seconds
[GC (Allocation Failure) [PSYoungGen: 1499K->499K(1536K)] 1679K->1006K(524288K), 0.0006638 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.102: ParallelGCFailedAllocation [ 11 1 1 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0009292 seconds
[GC (Allocation Failure) [PSYoungGen: 1518K->992K(2560K)] 2025K->2018K(525312K), 0.0007172 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.103: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 0 0 0 ] 0
Total time for which application threads were stopped: 0.0007980 seconds
[GC (Allocation Failure) [PSYoungGen: 2526K->1507K(3072K)] 3552K->3515K(525824K), 0.0009654 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.107: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 0 0 1 ] 0
Total time for which application threads were stopped: 0.0010672 seconds
[GC (Allocation Failure) [PSYoungGen: 3041K->2547K(5120K)] 5049K->5128K(527872K), 0.0010321 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.108: ParallelGCFailedAllocation [ 11 0 2 ] [ 0 0 0 0 1 ] 0
Total time for which application threads were stopped: 0.0011153 seconds
[GC (Allocation Failure) [PSYoungGen: 5102K->3072K(5632K)] 7684K->7718K(528384K), 0.0012587 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.109: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 0 0 1 ] 0
Total time for which application threads were stopped: 0.0013348 seconds
[GC (Allocation Failure) [PSYoungGen: 5630K->4608K(7680K)] 10276K->10382K(530432K), 0.0016113 secs] [Times: user=0.03 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.111: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 0 0 1 ] 0
Total time for which application threads were stopped: 0.0016921 seconds
0-->Delta time from Start[0ms], from last[0ms]
[GC (Allocation Failure) [PSYoungGen: 7678K->3328K(8704K)] 13452K->13341K(531456K), 0.0013899 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.113: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 1 0 1 ] 0
Total time for which application threads were stopped: 0.0024590 seconds
[GC (Allocation Failure) [PSYoungGen: 6399K->3360K(11264K)] 16412K->16419K(534016K), 0.0011903 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.115: ParallelGCFailedAllocation [ 11 1 2 ] [ 0 0 0 0 1 ] 0
Total time for which application threads were stopped: 0.0013046 seconds
[GC (Allocation Failure) [PSYoungGen: 8477K->5536K(11264K)] 21537K->21642K(534016K), 0.0020710 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
0.117: ParallelGCFailedAllocation [ 11 0 1 ] [ 0 0 0 0 2 ] 0
Total time for which application threads were stopped: 0.0021520 seconds
1-->Delta time from Start[210ms], from last[210ms]
2-->Delta time from Start[410ms], from last[200ms]
3-->Delta time from Start[610ms], from last[200ms]
4-->Delta time from Start[810ms], from last[200ms]
5-->Delta time from Start[1010ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
1.119: no vm operation [ 11 2 2 ] [ 0 0 0 0 0 ] 2
Total time for which application threads were stopped: 0.0001547 seconds
6-->Delta time from Start[1210ms], from last[200ms]
7-->Delta time from Start[1410ms], from last[200ms]
8-->Delta time from Start[1610ms], from last[200ms]
9-->Delta time from Start[1810ms], from last[200ms]
10-->Delta time from Start[2010ms], from last[200ms]
11-->Delta time from Start[2210ms], from last[200ms]
12-->Delta time from Start[2410ms], from last[200ms]
13-->Delta time from Start[2610ms], from last[200ms]
14-->Delta time from Start[2810ms], from last[200ms]
15-->Delta time from Start[3010ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
3.120: no vm operation [ 11 2 2 ] [ 0 0 0 0 0 ] 2
Total time for which application threads were stopped: 0.0000670 seconds
16-->Delta time from Start[3210ms], from last[200ms]
17-->Delta time from Start[3410ms], from last[200ms]
18-->Delta time from Start[3610ms], from last[200ms]
19-->Delta time from Start[3810ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
4.084: EnableBiasedLocking [ 11 2 2 ] [ 0 0 0 0 0 ] 2
Total time for which application threads were stopped: 0.0000737 seconds
20-->Delta time from Start[4010ms], from last[200ms]
21-->Delta time from Start[4210ms], from last[200ms]
22-->Delta time from Start[4410ms], from last[200ms]
23-->Delta time from Start[4610ms], from last[200ms]
24-->Delta time from Start[4810ms], from last[200ms]
25-->Delta time from Start[5010ms], from last[200ms]
26-->Delta time from Start[5210ms], from last[200ms]
27-->Delta time from Start[5410ms], from last[200ms]
28-->Delta time from Start[5610ms], from last[200ms]
29-->Delta time from Start[5810ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
6.084: no vm operation [ 11 2 2 ] [ 0 0 0 0 0 ] 2
Total time for which application threads were stopped: 0.0000621 seconds
30-->Delta time from Start[6010ms], from last[200ms]
31-->Delta time from Start[6210ms], from last[200ms]
32-->Delta time from Start[6410ms], from last[200ms]
33-->Delta time from Start[6610ms], from last[200ms]
34-->Delta time from Start[6810ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
7.085: no vm operation [ 11 2 2 ] [ 0 0 0 0 0 ] 2
Total time for which application threads were stopped: 0.0000605 seconds
35-->Delta time from Start[7010ms], from last[200ms]
36-->Delta time from Start[7210ms], from last[200ms]
37-->Delta time from Start[7410ms], from last[200ms]
38-->Delta time from Start[7610ms], from last[200ms]
39-->Delta time from Start[7810ms], from last[200ms]
40-->Delta time from Start[8010ms], from last[200ms]
41-->Delta time from Start[8210ms], from last[200ms]
42-->Delta time from Start[8410ms], from last[200ms]
43-->Delta time from Start[8610ms], from last[200ms]
44-->Delta time from Start[8810ms], from last[200ms]
45-->Delta time from Start[9010ms], from last[200ms]
46-->Delta time from Start[9210ms], from last[200ms]
47-->Delta time from Start[9410ms], from last[200ms]
48-->Delta time from Start[9610ms], from last[200ms]
49-->Delta time from Start[9810ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
10.087: no vm operation [ 11 2 2 ] [ 3009 0 3009 0 0 ] 2
Total time for which application threads were stopped: 3.0091028 seconds
50-->Delta time from Start[12994ms], from last[3184ms]
51-->Delta time from Start[13194ms], from last[200ms]
52-->Delta time from Start[13394ms], from last[200ms]
53-->Delta time from Start[13594ms], from last[200ms]
54-->Delta time from Start[13794ms], from last[200ms]
55-->Delta time from Start[13994ms], from last[200ms]
56-->Delta time from Start[14194ms], from last[200ms]
57-->Delta time from Start[14394ms], from last[200ms]
58-->Delta time from Start[14594ms], from last[200ms]
59-->Delta time from Start[14794ms], from last[200ms]
60-->Delta time from Start[14994ms], from last[200ms]
61-->Delta time from Start[15194ms], from last[200ms]
62-->Delta time from Start[15394ms], from last[200ms]
63-->Delta time from Start[15594ms], from last[200ms]
64-->Delta time from Start[15794ms], from last[200ms]
65-->Delta time from Start[15994ms], from last[200ms]
vmop [threads: total initially_running wait_to_block] [time: spin block sync cleanup vmop] page_trap_count
16.098: no vm operation [ 11 2 2 ] [ 1221 0 1221 0 0 ] 2
Total time for which application threads were stopped: 1.2213130 seconds
66-->Delta time from Start[17217ms], from last[1223ms]
67-->Delta time from Start[17417ms], from last[200ms]
68-->Delta time from Start[17617ms], from last[200ms]
69-->Delta time from Start[17817ms], from last[200ms]
70-->Delta time from Start[18017ms], from last[200ms]
71-->Delta time from Start[18217ms], from last[200ms]
72-->Delta time from Start[18417ms], from last[200ms]
73-->Delta time from Start[18617ms], from last[200ms]
74-->Delta time from Start[18817ms], from last[200ms]
75-->Delta time from Start[19017ms], from last[200ms]
76-->Delta time from Start[19217ms], from last[200ms]
77-->Delta time from Start[19417ms], from last[200ms]
78-->Delta time from Start[19617ms], from last[200ms]
79-->Delta time from Start[19817ms], from last[200ms]
80-->Delta time from Start[20017ms], from last[200ms]
Process finished with exit code 1
现在当我将MAKE_THE_JVM_PAUSE_UNEXPECTEDLY
boolean
设置为false
来运行此事件时,它会导致任何安全点都没有减速。
为什么在代码中放置此代码会导致此问题?
答案 0 :(得分:1)
由于安全点大约每秒发生一次,因此可能是GuaranteedSafepointInterval
正在运行,即它是由计时器触发的。
我还没有测试过您的代码,但是一旦触发了安全点,很可能需要很长时间才能到达安全点,因为有一个优化可以避免JIT知道的循环中的安全点轮询在有限的时间内终止或者调用其他方法,这些方法将在他们的身体中进行安全点轮询。问题是&#34;有限的时间&#34;对于CPU密集的纯计算循环,它仍然可能很大,从而导致安全点停顿。在java 8和9中,可以使用-XX:+UseCountedLoopSafepoints
以轻微的性能损失禁用此功能。 Java 10将引入优化以平衡有限循环中的安全点延迟和每循环轮询(JDK-8186027)。