I am running a Java application in a container (on k8s) and noticed a long stop-the-world (STW) GC pause:
2019-07-10T16:45:31.081+0800: 1620992.943: [GC (Allocation Failure) 2019-07-10T16:45:31.082+0800: 1620992.944: [ParNew: 1232340K->105476K(1258304K), 0.0558525 secs] 1412255K->290236K(4054528K), 0.0571538 secs] [Times: user=0.23 sys=0.20, real=0.06 secs]
2019-07-10T16:46:08.906+0800: 1621030.767: [GC (Allocation Failure) 2019-07-10T16:46:08.907+0800: 1621030.768: [ParNew: 1224004K->97149K(1258304K), 5.4008859 secs] 1408764K->286575K(4054528K), 5.4022113 secs] [Times: user=37.65 sys=0.00, real=5.41 secs]
2019-07-10T16:46:48.426+0800: 1621070.287: [GC (Allocation Failure) 2019-07-10T16:46:48.426+0800: 1621070.288: [ParNew: 1215677K->106022K(1258304K), 0.0545431 secs] 1405103K->300294K(4054528K), 0.0557196 secs] [Times: user=0.41 sys=0.00, real=0.06 secs]
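The second GC reclaimed almost the same amount of memory as the one before and the one after it (about 1.1 GB), yet it took far longer (5.4 s). The pause goes together with a very long user time in the ParNew GC (user=37.65).
I searched Google, but most blog posts and Stack Overflow answers deal with high sys time or high real time, which does not match my problem.
My Java version: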
$ java -version
java version "1.8.0_102"
Java(TM) SE Runtime Environment (build 1.8.0_102-b14)
Java HotSpot(TM) 64-Bit Server VM (build 25.102-b14, mixed mode)
The GC threads reported by jstack are:
"Concurrent Mark-Sweep GC Thread" os_prio=0 tid=0x00007f3be809b000 nid=0xb5 runnable
"Gang worker#0 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be801d000 nid=0xab runnable
"Gang worker#1 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be801f000 nid=0xac runnable
"Gang worker#2 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be8020800 nid=0xad runnable
"Gang worker#3 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be8022800 nid=0xae runnable
"Gang worker#4 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be8024800 nid=0xaf runnable
"Gang worker#5 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be8026000 nid=0xb0 runnable
"Gang worker#6 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be8028000 nid=0xb1 runnable
"Gang worker#7 (Parallel GC Threads)" os_prio=0 tid=0x00007f3be802a000 nid=0xb2 runnable
"Surrogate Locker Thread (Concurrent GC)" #4 daemon prio=9 os_prio=0 tid=0x00007f3be812d800 nid=0xb9 waiting on condition [0x0000000000000000]
"Gang worker#0 (Parallel CMS Threads)" os_prio=0 tid=0x00007f3be8097000 nid=0xb3 runnable
"Gang worker#1 (Parallel CMS Threads)" os_prio=0 tid=0x00007f3be8099000 nid=0xb4 runnable
Answer 0 (score: 0)
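real time is wall-clock time, i.e. how much time actually passed. user time is the CPU time accumulated across all cores while running in user space. Since you have 8 parallel GC threads, user=37.65 against real=5.41 just means that most of the cores were busy doing collection work for most of those 5 seconds.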
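As a rough check of that reading, the sketch below parses the [Times: ...] section of a GC log line and divides CPU time by wall time. The class name GcTimes and the method busyCores are made up for illustration; the only input format assumed is the one shown in the log lines of the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcTimes {
    // Matches the trailing "[Times: user=... sys=..., real=... secs]" part of a GC log line.
    private static final Pattern TIMES = Pattern.compile(
            "\\[Times: user=([0-9.]+) sys=([0-9.]+), real=([0-9.]+) secs\\]");

    /** Returns (user + sys) / real: the average number of cores kept busy during the pause. */
    static double busyCores(String logLine) {
        Matcher m = TIMES.matcher(logLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no [Times: ...] section in: " + logLine);
        }
        double user = Double.parseDouble(m.group(1));
        double sys = Double.parseDouble(m.group(2));
        double real = Double.parseDouble(m.group(3));
        return (user + sys) / real;
    }

    public static void main(String[] args) {
        // Times section of the slow collection from the question.
        String slow = "[Times: user=37.65 sys=0.00, real=5.41 secs]";
        System.out.println("average busy cores: " + busyCores(slow));
    }
}
```

For the slow collection this comes out at roughly 7, which is close to the 8 parallel GC threads: the workers were running on CPU for nearly the whole pause rather than waiting.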
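That in itself doesn't tell us why the collection took so long. As far as I know, there is not much you can do to get more information out of ParNew; it is a very simple collector. You could switch to G1, which provides much more detailed logging.
Alexey Ragozin's article "Understanding GC pauses in JVM, HotSpot's minor GC" breaks down the components that make up a ParNew pause.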
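You may also want to monitor CPU/IO/swap contention on the system (e.g. by logging /proc/pressure/* with timestamps on recent Linux kernels), since that could be slowing down the garbage collector.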
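A minimal sketch of such a logger, kept Java 8 compatible to match the JVM in the question. It assumes the kernel exposes the pressure stall files cpu, io and memory under /proc/pressure; the class name PressureLogger, the method snapshot and the one-second interval are arbitrary choices for illustration. Files that are missing or unreadable are skipped.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

public class PressureLogger {
    private static final String[] RESOURCES = {"cpu", "io", "memory"};

    /** Reads each pressure file under dir and prefixes every line with a timestamp and the resource name. */
    static List<String> snapshot(Path dir, Instant now) throws IOException {
        List<String> out = new ArrayList<>();
        for (String name : RESOURCES) {
            Path file = dir.resolve(name);
            if (!Files.isReadable(file)) {
                continue; // pressure file not available on this system
            }
            for (String line : Files.readAllLines(file)) {
                out.add(now + " " + name + " " + line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/proc/pressure");
        while (true) {
            for (String line : snapshot(dir, Instant.now())) {
                System.out.println(line);
            }
            Thread.sleep(1000); // assumed sampling interval
        }
    }
}
```

The timestamps are printed in UTC, while the GC log above uses +0800, so the two need to be aligned before checking whether pressure rose around the 16:46:08 pause.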