生成更少的垃圾导致更长的GC暂停

时间:2016-07-28 21:41:45

标签: java performance garbage-collection chronicle

我们最近对应用程序的持久性机制进行了更改,通过反射替换了慢速序列化,以创建通过OpenHFT Chronicle进行序列化对MySQL数据库执行的SQL语句字符串,其中包含硬编码的附加内容。各种对象的字段到摘录。正如预期的那样,这大大减少了持续交易所花费的时间并减少了我们的延迟。但是我们注意到在变化之前和之后有关垃圾收集模式的一些令人费解的事情。虽然我们生成的垃圾更少,但每次垃圾收集所花费的时间也在增加,而且我们无法理解原因。

以下是两次运行的示例,在更改之前和之后,在 RedHat 4.4.7 CentOS 6.7上使用Java 1.8.0_74运行相同的VM args。 "之后"运行似乎消耗较少的年轻代,并且每个GC开始时的总量较少,在GC结束时大致相同的堆,但实际运行GC需要更多的时间,并且运行频率低得多(大约300秒之间) GC而不是70-80s。

我们正在努力弄清楚这是怎么回事,以及当我们生成垃圾的速度快得多时,我们能做些什么来让GC暂停回到原来的位置。有没有关于Chronicle及其使用堆外内存映射文件导致这种情况的方法?这似乎不太可能,但它似乎有牵连,除非它真的只是关于生成垃圾的模式是什么导致它。 FWIW,作为一项实验我们尝试了G1和CMS,至少在默认参数下,它们显示相同的模式,如果有更糟糕的时间。

VM args:

-Xms75776M
-Xmx75776M
-XX:+PrintGCDetails
-XX:+PrintGCTimeStamps
-Dsun.rmi.dgc.server.gcInterval=54000000
-Dsun.rmi.dgc.client.gcinterval=54000000
-XX:+DisableExplicitGC
-XX:MaxNewSize=19000M
-XX:+PrintGCApplicationStoppedTime
-XX:InitialSurvivorRatio=3

在:

15009.601: [GC (Allocation Failure) [PSYoungGen: 19226640K->142320K(19273216K)] 22796282K->3814635K(77411840K), 0.0596127 secs] [Times: user=0.53 sys=0.05, real=0.06 secs]
15080.410: [GC (Allocation Failure) [PSYoungGen: 19235824K->137920K(19282432K)] 22908139K->3913367K(77421056K), 0.0570436 secs] [Times: user=0.50 sys=0.06, real=0.06 secs]
15154.485: [GC (Allocation Failure) [PSYoungGen: 19243712K->129840K(19279360K)] 23019159K->4007451K(77417984K), 0.0605860 secs] [Times: user=0.55 sys=0.05, real=0.06 secs]
.
.
bunch of lines omitted to get to approximately the same end-point
.
16070.663: [GC (Allocation Failure) [PSYoungGen: 19195664K->146464K(19262976K)] 24408408K->5460932K(77401600K), 0.0616580 secs] [Times: user=0.55 sys=0.06, real=0.06 secs]

后:

15313.256: [GC (Allocation Failure) [PSYoungGen: 15092194K->2230066K(15050752K)] 16440428K->3981360K(73189376K), 0.3563350 secs] [Times: user=3.40 sys=0.15, real=0.36 secs]
15770.413: [GC (Allocation Failure) [PSYoungGen: 15050546K->2922738K(15321600K)] 16801840K->4674033K(73460224K), 0.3584541 secs] [Times: user=3.56 sys=0.00, real=0.36 secs]
16090.016: [GC (Allocation Failure) [PSYoungGen: 14896370K->3571346K(15244288K)] 16647665K->5322641K(73382912K), 0.4276084 secs] [Times: user=4.25 sys=0.00, real=0.42 secs]

更新 所以我根据下面的一些建议进行了以下实验(这是在UAT机器上,而不是PROD机器,所以它有点不同......例如linux版本是CentOS 7.2而不是6.7)。我们将堆的大小限制为32G,并将显式GC暂停目标设置为200ms,并打开更多诊断信息。如果有什么结果甚至更糟。它开始时非常合理,但建议的暂停时间最长可达1.94秒。

java -XX:MaxGCPauseMillis=200 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintHeapAtGC -XX:+PrintTenuringDistribution -Dsun.rmi.dgc.server.gcInterval=54000000 -Dsun.rmi.dgc.client.gcinterval=54000000 -XX:+DisableExplicitGC -Xms32000M -Xmx32000M


{Heap before GC invocations=1 (full 0):
 PSYoungGen      total 9557504K, used 4096257K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 50% used  [0x00007f58fe580000,0x00007f59f85c0458,0x00007f5af2600000)
  from space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
  to   space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
 ParOldGen       total 21845504K, used 0K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9000000,0x00007f58fe580000)
 Metaspace       used 20877K, capacity 21118K, committed 21296K, reserved 1069056K
  class space    used 2499K, capacity 2599K, committed 2608K, reserved 1048576K
2.420: [GC (Metadata GC Threshold) 
Desired survivor size 1397751808 bytes, new threshold 7 (max 15)
[PSYoungGen: 4096257K->13270K(9557504K)] 4096257K->13366K(31403008K), 0.0181755 secs] [Times: user=0.11 sys=0.05, real=0.02 secs] 
Heap after GC invocations=1 (full 0):
 PSYoungGen      total 9557504K, used 13270K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af32f59a8,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 96K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9018010,0x00007f58fe580000)
 Metaspace       used 20877K, capacity 21118K, committed 21296K, reserved 1069056K
  class space    used 2499K, capacity 2599K, committed 2608K, reserved 1048576K
}
{Heap before GC invocations=2 (full 1):
 PSYoungGen      total 9557504K, used 13270K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af32f59a8,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 96K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9018010,0x00007f58fe580000)
 Metaspace       used 20877K, capacity 21118K, committed 21296K, reserved 1069056K
  class space    used 2499K, capacity 2599K, committed 2608K, reserved 1048576K
2.438: [Full GC (Metadata GC Threshold) [PSYoungGen: 13270K->0K(9557504K)] [ParOldGen: 96K->12747K(21845504K)] 13366K->12747K(31403008K), [Metaspace: 20877K->20877K(1069056K)], 0.0530698 secs] [Times: user=0.27 sys=0.05, real=0.05 secs] 
Heap after GC invocations=2 (full 1):
 PSYoungGen      total 9557504K, used 0K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 12747K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c72e90,0x00007f58fe580000)
 Metaspace       used 20877K, capacity 21118K, committed 21296K, reserved 1069056K
  class space    used 2499K, capacity 2599K, committed 2608K, reserved 1048576K
}
{Heap before GC invocations=3 (full 1):
 PSYoungGen      total 9557504K, used 8192512K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 100% used [0x00007f58fe580000,0x00007f5af2600000,0x00007f5af2600000)
  from space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 12747K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c72e90,0x00007f58fe580000)
 Metaspace       used 22891K, capacity 23188K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
1332.241: [GC (Allocation Failure) 
Desired survivor size 1397751808 bytes, new threshold 7 (max 15)
[PSYoungGen: 8192512K->291885K(9557504K)] 8205259K->304649K(31403008K), 0.2443282 secs] [Times: user=0.48 sys=3.83, real=0.25 secs] 
Heap after GC invocations=3 (full 1):
 PSYoungGen      total 9557504K, used 291885K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 21% used [0x00007f5b45b00000,0x00007f5b5780b650,0x00007f5b99000000)
  to   space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
 ParOldGen       total 21845504K, used 12763K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c76e90,0x00007f58fe580000)
 Metaspace       used 22891K, capacity 23188K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
}
{Heap before GC invocations=4 (full 1):
 PSYoungGen      total 9557504K, used 8484397K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 100% used [0x00007f58fe580000,0x00007f5af2600000,0x00007f5af2600000)
  from space 1364992K, 21% used [0x00007f5b45b00000,0x00007f5b5780b650,0x00007f5b99000000)
  to   space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
 ParOldGen       total 21845504K, used 12763K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c76e90,0x00007f58fe580000)
 Metaspace       used 22939K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
2858.320: [GC (Allocation Failure) 
Desired survivor size 1397751808 bytes, new threshold 7 (max 15)
[PSYoungGen: 8484397K->677978K(9557504K)] 8497161K->690750K(31403008K), 0.5290410 secs] [Times: user=0.74 sys=8.70, real=0.53 secs] 
Heap after GC invocations=4 (full 1):
 PSYoungGen      total 9557504K, used 677978K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 49% used [0x00007f5af2600000,0x00007f5b1bc16a80,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 12771K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c78e90,0x00007f58fe580000)
 Metaspace       used 22939K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
}
{Heap before GC invocations=5 (full 1):
 PSYoungGen      total 9557504K, used 8870490K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 100% used [0x00007f58fe580000,0x00007f5af2600000,0x00007f5af2600000)
  from space 1364992K, 49% used [0x00007f5af2600000,0x00007f5b1bc16a80,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 12771K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c78e90,0x00007f58fe580000)
 Metaspace       used 22946K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
4467.385: [GC (Allocation Failure) 
Desired survivor size 1397751808 bytes, new threshold 7 (max 15)
[PSYoungGen: 8870490K->1096779K(9557504K)] 8883262K->1109558K(31403008K), 0.9701847 secs] [Times: user=1.96 sys=15.40, real=0.97 secs] 
Heap after GC invocations=5 (full 1):
 PSYoungGen      total 9557504K, used 1096779K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 80% used [0x00007f5b45b00000,0x00007f5b88a12d00,0x00007f5b99000000)
  to   space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
 ParOldGen       total 21845504K, used 12779K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c7ae90,0x00007f58fe580000)
 Metaspace       used 22946K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
}
{Heap before GC invocations=6 (full 1):
 PSYoungGen      total 9557504K, used 9289291K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 100% used [0x00007f58fe580000,0x00007f5af2600000,0x00007f5af2600000)
  from space 1364992K, 80% used [0x00007f5b45b00000,0x00007f5b88a12d00,0x00007f5b99000000)
  to   space 1364992K, 0% used [0x00007f5af2600000,0x00007f5af2600000,0x00007f5b45b00000)
 ParOldGen       total 21845504K, used 12779K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 0% used [0x00007f53c9000000,0x00007f53c9c7ae90,0x00007f58fe580000)
 Metaspace       used 22951K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
6153.256: [GC (Allocation Failure) 
Desired survivor size 2375548928 bytes, new threshold 6 (max 15)
[PSYoungGen: 9289291K->1364987K(9557504K)] 9302070K->2238887K(31403008K), 1.9590450 secs] [Times: user=5.38 sys=29.68, real=1.96 secs] 
Heap after GC invocations=6 (full 1):
 PSYoungGen      total 9557504K, used 1364987K [0x00007f58fe580000, 0x00007f5b99000000, 0x00007f5b99000000)
  eden space 8192512K, 0% used [0x00007f58fe580000,0x00007f58fe580000,0x00007f5af2600000)
  from space 1364992K, 99% used [0x00007f5af2600000,0x00007f5b45afed70,0x00007f5b45b00000)
  to   space 1364992K, 0% used [0x00007f5b45b00000,0x00007f5b45b00000,0x00007f5b99000000)
 ParOldGen       total 21845504K, used 873899K [0x00007f53c9000000, 0x00007f58fe580000, 0x00007f58fe580000)
  object space 21845504K, 4% used [0x00007f53c9000000,0x00007f53fe56ae90,0x00007f58fe580000)
 Metaspace       used 22951K, capacity 23252K, committed 23472K, reserved 1071104K
  class space    used 2602K, capacity 2688K, committed 2736K, reserved 1048576K
}

3 个答案:

答案 0 :(得分:1)

您需要一些更详细的统计数据,考虑使用visualgc来观察负载开发时堆的不同部分发生的情况。像-XX:+PrintTenuringDistribution这样的标志用于深入研究与伊甸相关的空间中发生的事情。

第二种情况下总堆大小上升和上升的事实表明你正在逐渐流入终身所以增加的时间可能是幸存者在来自&之间ping的成本。幸存者空间(然后最终晋升为终身)。请注意,这是基于可用数据的假设,我们需要更详细的统计数据才能真正了解正在发生的事情(或理解编年史的行为,而我个人并不具备这种情况)。

这可能意味着你有一个内存泄漏或者eden现在太小了(无论是maxtenuringthreshold还是仅仅是eden的大小,一般来说,可能需要调整一些eden来适应你的新堆使用模式)。

答案 1 :(得分:1)

您关注暂停时间,但您没有设置暂停时间目标(MaxGCPauseMillis)。默认收集器(ParallelGC)没有默认的暂停时间目标,因此您必须明确指定一个。

当设置时,GC将尝试调整年轻的基因大小和期限阈值以实现这些目标。

  

FWIW,作为一项实验我们尝试了G1和CMS,至少在默认参数下,它们显示相同的模式,如果有更糟糕的时间。

G1的默认暂停目标为200毫秒,因此进入该范围的暂停时间不会出乎意料。

答案 2 :(得分:1)

我会考虑将堆保持在32 GB以下,以便您可以使用压缩的oops。

通过拥有更多收藏品,您可能会随着时间推移工作。我也会尝试缩小新空间。