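For the past few weeks I have been testing different JVM settings for a GlassFish server. The main heap (and other) settings are: -Xms512m, -Xmx512m, -XX:NewRatio=2. I have tried different GC settings, but a few days after starting the server I still run into long pauses. I have noticed the following:
1. -XX:+UseParallelGC -XX:+UseParallelOldGC - minor GCs occur every minute and major GCs every 18 hours. I have no problems with minor GCs, but after 5 days the major GCs became a problem. At first the major GC pauses lasted about 100-200 ms, but the last pause lasted 70 seconds.
2. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC - almost the same as above. Minor GCs are fine, but the major GC (not full) pauses become very long. I noticed a problem with class unloading in the GC (CMS Final Remark) phase, which is a stop-the-world phase.
3. -XX:+UseConcMarkSweepGC -XX:+UseParNewGC and -XX:MaxGCPauseMillis=5000. I only tested this for one day, because the second major GC (not full) already lasted about 20 seconds, so I assumed something else was wrong.
4. -XX:+UseG1GC, -XX:MaxGCPauseMillis=5000, -XX:+UseStringDeduplication, without the -XX:NewRatio=2 option - major GCs (not full) occur every 12 hours, and I have already noticed some problems: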
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
[Times: user=0.14 sys=0.22, real=15.73 secs]
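The GC remark phase takes 15 seconds, which is unacceptable for me. You can see that unloading takes most of that time. This also happened earlier with the other GCs, so I think there must be a problem with class unloading.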
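To check that reading of the log, here is a minimal sketch that splits the remark line into its sub-phases and shows each one's share of the total. It assumes the JDK 8 G1 log format shown above; `remark_breakdown` and the two regular expressions are hypothetical helpers written for this illustration, not part of any JVM tooling.

```python
import re

# Remark line copied from the log excerpt above.
REMARK = (
    "2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: "
    "[Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] "
    "83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]"
)

# Matches sub-phases such as "[Unloading, 14.5180500 secs]".
PHASE = re.compile(r"\[([A-Za-z][A-Za-z -]*), (\d+\.\d+) secs\]")
# Matches the total at the end of the line: "], 15.7320854 secs]".
TOTAL = re.compile(r"\], (\d+\.\d+) secs\]\s*$")


def remark_breakdown(line):
    """Return (phases, total): seconds per sub-phase and the total remark time."""
    phases = {name: float(secs) for name, secs in PHASE.findall(line)}
    match = TOTAL.search(line)
    # Fall back to the sum of the phases if the total is missing (assumption).
    total = float(match.group(1)) if match else sum(phases.values())
    return phases, total


phases, total = remark_breakdown(REMARK)
# Print the phases from slowest to fastest with their share of the pause.
for name, secs in sorted(phases.items(), key=lambda kv: -kv[1]):
    print(f"{name:17s} {secs:10.4f} s  {100 * secs / total:5.1f}%")
```

On this line, Unloading accounts for roughly 92% of the remark pause, while reference processing and Finalize Marking make up the rest.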
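To summarize: every GC runs fine for a while, but after a few days problems start to appear and the pauses become very long. I do not know why it works well for the first few days and then suddenly performs so badly. I noticed that the longer pauses are caused by class unloading, so I would like to know whether there are settings that give better results. I would also like to know which GC you would recommend. I have an internal web application running on a GlassFish server on a PC with 8 GB of RAM, an i7 processor and Windows 8. At most 10 clients are connected at the same time, but it must have a long uptime and must not have long pauses (5 seconds at most). Please tell me how to shorten the pause times.
One more question: in my case, what are the disadvantages of using G1GC instead of CMS or ParallelGC? Is the heap too small for G1GC?
Edit: G1GC log before and after the long GC remark pause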
2015-05-31T18:25:25.004+0200: 83383.755: [GC pause (G1 Evacuation Pause) (young) (initial-mark), 0.1280453 secs]
[Parallel Time: 116.2 ms, GC Workers: 4]
[GC Worker Start (ms): Min: 83383757.6, Avg: 83383757.7, Max: 83383757.7, Diff: 0.0]
[Ext Root Scanning (ms): Min: 97.8, Avg: 98.3, Max: 98.5, Diff: 0.7, Sum: 393.1]
[Update RS (ms): Min: 0.2, Avg: 4.0, Max: 14.8, Diff: 14.6, Sum: 16.1]
[Processed Buffers: Min: 1, Avg: 6.0, Max: 16, Diff: 15, Sum: 24]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.1]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 0.2, Avg: 2.5, Max: 3.7, Diff: 3.5, Sum: 10.2]
[Termination (ms): Min: 0.0, Avg: 8.5, Max: 11.4, Diff: 11.4, Sum: 34.2]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[GC Worker Total (ms): Min: 113.4, Avg: 113.4, Max: 113.5, Diff: 0.0, Sum: 453.8]
[GC Worker End (ms): Min: 83383871.1, Avg: 83383871.1, Max: 83383871.1, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[String Dedup Fixup: 2.2 ms, GC Workers: 4]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Table Fixup (ms): Min: 2.0, Avg: 2.1, Max: 2.1, Diff: 0.1, Sum: 8.3]
[Clear CT: 0.1 ms]
[Other: 9.5 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 8.8 ms]
[Ref Enq: 0.1 ms]
[Redirty Cards: 0.3 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.1 ms]
[Eden: 215.0M(215.0M)->0.0B(215.0M) Survivors: 7168.0K->7168.0K Heap: 451.5M(512.0M)->236.6M(512.0M)]
[Times: user=0.08 sys=0.02, real=0.13 secs]
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-root-region-scan-start]
2015-05-31T18:25:25.129+0200: 83383.883: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0000070 secs]
[Last Exec: 0.0000070 secs, Idle: 23.1834927 secs, Blocked: 0/0.0000000 secs]
[Inspected: 3]
[Skipped: 0( 0.0%)]
[Hashed: 3(100.0%)]
[Known: 0( 0.0%)]
[New: 3(100.0%) 160.0B]
[Deduplicated: 3(100.0%) 160.0B(100.0%)]
[Young: 3(100.0%) 160.0B(100.0%)]
[Old: 0( 0.0%) 0.0B( 0.0%)]
[Total Exec: 2868/0.1946124 secs, Idle: 2868/83382.9701762 secs, Blocked: 13/0.0032760 secs]
[Inspected: 304493]
[Skipped: 0( 0.0%)]
[Hashed: 163708( 53.8%)]
[Known: 44808( 14.7%)]
[New: 259685( 85.3%) 21.9M]
[Deduplicated: 160467( 61.8%) 10.6M( 48.3%)]
[Young: 83546( 52.1%) 6270.6K( 57.8%)]
[Old: 76921( 47.9%) 4571.3K( 42.2%)]
[Table]
[Memory Usage: 4291.8K]
[Size: 131072, Min: 1024, Max: 16777216]
[Entries: 133319, Load: 101.7%, Cached: 6107, Added: 142389, Removed: 9070]
[Resize Count: 7, Shrink Threshold: 87381(66.7%), Grow Threshold: 262144(200.0%)]
[Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
[Age Threshold: 3]
[Queue]
[Dropped: 0]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-root-region-scan-end, 0.0140467 secs]
2015-05-31T18:25:25.145+0200: 83383.897: [GC concurrent-mark-start]
2015-05-31T18:25:35.563+0200: 83394.312: [GC concurrent-mark-end, 10.4145795 secs]
2015-05-31T18:25:35.563+0200: 83394.312: [GC remark 83394.312: [Finalize Marking, 0.0002939 secs] 83394.312: [GC ref-proc, 1.2128584 secs] 83395.525: [Unloading, 14.5180500 secs], 15.7320854 secs]
[Times: user=0.14 sys=0.22, real=15.73 secs]
2015-05-31T18:25:51.288+0200: 83410.045: [GC cleanup 334M->326M(512M), 0.2836092 secs]
[Times: user=0.00 sys=0.00, real=0.28 secs]
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-start]
2015-05-31T18:25:51.570+0200: 83410.328: [GC concurrent-cleanup-end, 0.0000669 secs]
2015-05-31T18:26:03.732+0200: 83422.482: [GC pause (G1 Evacuation Pause) (young), 0.1031257 secs]
[Parallel Time: 91.6 ms, GC Workers: 4]
[GC Worker Start (ms): Min: 83422481.7, Avg: 83422481.7, Max: 83422481.8, Diff: 0.0]
[Ext Root Scanning (ms): Min: 1.3, Avg: 1.7, Max: 2.7, Diff: 1.4, Sum: 6.9]
[Update RS (ms): Min: 0.0, Avg: 22.7, Max: 89.8, Diff: 89.8, Sum: 90.8]
[Processed Buffers: Min: 0, Avg: 7.3, Max: 15, Diff: 15, Sum: 29]
[Scan RS (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Object Copy (ms): Min: 0.5, Avg: 2.4, Max: 3.4, Diff: 2.9, Sum: 9.5]
[Termination (ms): Min: 0.0, Avg: 64.7, Max: 86.3, Diff: 86.3, Sum: 258.9]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 91.6, Avg: 91.6, Max: 91.6, Diff: 0.0, Sum: 366.3]
[GC Worker End (ms): Min: 83422573.3, Avg: 83422573.3, Max: 83422573.3, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[String Dedup Fixup: 2.1 ms, GC Workers: 4]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Table Fixup (ms): Min: 1.9, Avg: 1.9, Max: 1.9, Diff: 0.1, Sum: 7.7]
[Clear CT: 0.1 ms]
[Other: 9.3 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 8.8 ms]
[Ref Enq: 0.1 ms]
[Redirty Cards: 0.0 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.1 ms]
[Eden: 215.0M(215.0M)->0.0B(19.0M) Survivors: 7168.0K->6144.0K Heap: 443.6M(512.0M)->228.2M(512.0M)]
[Times: user=0.30 sys=0.01, real=0.10 secs]
2015-05-31T18:26:03.848+0200: 83422.597: [GC concurrent-string-deduplication, 160.0B->0.0B(160.0B), avg 48.3%, 0.0123951 secs]
[Last Exec: 0.0123951 secs, Idle: 38.7017788 secs, Blocked: 0/0.0000000 secs]
[Inspected: 3]
[Skipped: 0( 0.0%)]
[Hashed: 3(100.0%)]
[Known: 0( 0.0%)]
[New: 3(100.0%) 160.0B]
[Deduplicated: 3(100.0%) 160.0B(100.0%)]
[Young: 3(100.0%) 160.0B(100.0%)]
[Old: 0( 0.0%) 0.0B( 0.0%)]
[Total Exec: 2869/0.2070075 secs, Idle: 2869/83421.6719550 secs, Blocked: 13/0.0032760 secs]
[Inspected: 304496]
[Skipped: 0( 0.0%)]
[Hashed: 163711( 53.8%)]
[Known: 44808( 14.7%)]
[New: 259688( 85.3%) 21.9M]
[Deduplicated: 160470( 61.8%) 10.6M( 48.3%)]
[Young: 83549( 52.1%) 6270.8K( 57.8%)]
[Old: 76921( 47.9%) 4571.3K( 42.2%)]
[Table]
[Memory Usage: 2565.5K]
[Size: 65536, Min: 1024, Max: 16777216]
[Entries: 81061, Load: 123.7%, Cached: 6553, Added: 142396, Removed: 61335]
[Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
[Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
[Age Threshold: 3]
[Queue]
[Dropped: 0]
2015-05-31T18:26:05.769+0200: 83424.518: [GC pause (G1 Evacuation Pause) (mixed), 0.2232916 secs]
[Parallel Time: 216.7 ms, GC Workers: 4]
[GC Worker Start (ms): Min: 83424518.3, Avg: 83424518.3, Max: 83424518.3, Diff: 0.0]
[Ext Root Scanning (ms): Min: 1.2, Avg: 1.6, Max: 2.6, Diff: 1.4, Sum: 6.5]
[Update RS (ms): Min: 0.0, Avg: 0.3, Max: 0.4, Diff: 0.4, Sum: 1.2]
[Processed Buffers: Min: 0, Avg: 4.3, Max: 7, Diff: 7, Sum: 17]
[Scan RS (ms): Min: 56.1, Avg: 102.3, Max: 144.4, Diff: 88.3, Sum: 409.2]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.3]
[Object Copy (ms): Min: 50.4, Avg: 97.6, Max: 157.7, Diff: 107.2, Sum: 390.2]
[Termination (ms): Min: 0.0, Avg: 14.8, Max: 19.8, Diff: 19.8, Sum: 59.1]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 216.6, Avg: 216.6, Max: 216.6, Diff: 0.0, Sum: 866.5]
[GC Worker End (ms): Min: 83424734.9, Avg: 83424734.9, Max: 83424734.9, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[String Dedup Fixup: 1.5 ms, GC Workers: 4]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Table Fixup (ms): Min: 1.4, Avg: 1.4, Max: 1.4, Diff: 0.0, Sum: 5.6]
[Clear CT: 0.2 ms]
[Other: 4.8 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 0.9 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.2 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.2 ms]
[Eden: 19.0M(19.0M)->0.0B(21.0M) Survivors: 6144.0K->4096.0K Heap: 247.2M(512.0M)->175.2M(512.0M)]
[Times: user=0.09 sys=0.00, real=0.22 secs]
2015-05-31T18:26:05.992+0200: 83424.742: [GC concurrent-string-deduplication, 640.0B->152.0B(488.0B), avg 48.3%, 0.0000246 secs]
[Last Exec: 0.0000246 secs, Idle: 2.1442834 secs, Blocked: 0/0.0000000 secs]
[Inspected: 6]
[Skipped: 0( 0.0%)]
[Hashed: 5( 83.3%)]
[Known: 0( 0.0%)]
[New: 6(100.0%) 640.0B]
[Deduplicated: 5( 83.3%) 488.0B( 76.3%)]
[Young: 5(100.0%) 488.0B(100.0%)]
[Old: 0( 0.0%) 0.0B( 0.0%)]
[Total Exec: 2870/0.2070321 secs, Idle: 2870/83423.8162384 secs, Blocked: 13/0.0032760 secs]
[Inspected: 304502]
[Skipped: 0( 0.0%)]
[Hashed: 163716( 53.8%)]
[Known: 44808( 14.7%)]
[New: 259694( 85.3%) 21.9M]
[Deduplicated: 160475( 61.8%) 10.6M( 48.3%)]
[Young: 83554( 52.1%) 6271.2K( 57.8%)]
[Old: 76921( 47.9%) 4571.3K( 42.2%)]
[Table]
[Memory Usage: 2564.6K]
[Size: 65536, Min: 1024, Max: 16777216]
[Entries: 81026, Load: 123.6%, Cached: 6553, Added: 142397, Removed: 61371]
[Resize Count: 8, Shrink Threshold: 43690(66.7%), Grow Threshold: 131072(200.0%)]
[Rehash Count: 0, Rehash Threshold: 120, Hash Seed: 0x0]
[Age Threshold: 3]
[Queue]
[Dropped: 0]
2015-05-31T18:26:08.157+0200: 83426.906: [GC pause (G1 Evacuation Pause) (mixed), 0.6216666 secs]
[Parallel Time: 618.5 ms, GC Workers: 4]
[GC Worker Start (ms): Min: 83426906.5, Avg: 83426906.5, Max: 83426906.5, Diff: 0.0]
[Ext Root Scanning (ms): Min: 0.3, Avg: 8.0, Max: 15.7, Diff: 15.3, Sum: 31.9]
[Update RS (ms): Min: 0.0, Avg: 4.5, Max: 8.5, Diff: 8.5, Sum: 17.9]
[Processed Buffers: Min: 0, Avg: 7.0, Max: 18, Diff: 18, Sum: 28]
[Scan RS (ms): Min: 13.4, Avg: 28.4, Max: 65.2, Diff: 51.8, Sum: 113.7]
[Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.2]
[Object Copy (ms): Min: 532.6, Avg: 577.3, Max: 604.5, Diff: 71.9, Sum: 2309.1]
[Termination (ms): Min: 0.0, Avg: 0.2, Max: 0.3, Diff: 0.3, Sum: 0.7]
[GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
[GC Worker Total (ms): Min: 618.4, Avg: 618.4, Max: 618.4, Diff: 0.0, Sum: 2473.6]
[GC Worker End (ms): Min: 83427524.9, Avg: 83427524.9, Max: 83427524.9, Diff: 0.0]
[Code Root Fixup: 0.1 ms]
[Code Root Purge: 0.0 ms]
[String Dedup Fixup: 1.3 ms, GC Workers: 4]
[Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
[Table Fixup (ms): Min: 1.2, Avg: 1.2, Max: 1.3, Diff: 0.1, Sum: 4.9]
[Clear CT: 0.1 ms]
[Other: 1.6 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 1.0 ms]
[Ref Enq: 0.0 ms]
[Redirty Cards: 0.0 ms]
[Humongous Reclaim: 0.0 ms]
[Free CSet: 0.2 ms]
[Eden: 21.0M(21.0M)->0.0B(21.0M) Survivors: 4096.0K->4096.0K Heap: 196.2M(512.0M)->129.4M(512.0M)]
[Times: user=0.08 sys=0.02, real=0.62 secs]
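Edit: results after a few hours:
There are a lot of Pages/sec and Pages Input/sec, as well as page faults. Is that normal? Where can I set up monitoring of Pages/sec and Pages Input/sec for the JVM only (I have only found Page Faults)?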
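Answer 0 (score: 3)
I guess you are barking up the wrong tree - I suspect garbage collection is not your problem...
You are only running a 512 MiB heap - to me, a long pause for a heap of that size would be 1 or 2 seconds. Major pauses on huge (32 GiB) heaps can be kept down to a few milliseconds.
I expect the problem actually lies with your server - either the other applications you mention are using enough memory to push your Java process (which will be about 50% bigger than the heap) into swap/virtual memory, or you are running the application in a virtualized environment (where there can be memory over-commit or memory ballooning issues).
As a very rough indicator, any GC algorithm should be able to churn through 100 MiB per second - so if you are seeing anything worse than that, look elsewhere for the cause of the problem.
In this case I think GC is the symptom, not the problem.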
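The rule of thumb above can be turned into a quick sanity check. This is only a sketch under one assumption of mine: that the collector has to process the entire heap at 100 MiB per second, which gives a generous upper bound. `expected_pause_secs` is a hypothetical helper name.

```python
def expected_pause_secs(heap_mib, rate_mib_per_sec=100.0):
    """Upper-bound estimate: time to process the whole heap at the given rate."""
    return heap_mib / rate_mib_per_sec


heap_mib = 512         # -Xmx512m from the question
observed_secs = 15.73  # real time of the remark pause in the question's log

budget = expected_pause_secs(heap_mib)
print(f"rule-of-thumb budget: {budget:.2f} s, observed: {observed_secs:.2f} s")
if observed_secs > budget:
    # The pause is longer than even a full pass over the heap should take.
    print("pause exceeds the budget; look outside the collector (paging, host load)")
```

Even with this generous bound, a 512 MiB heap gets a budget of about 5 seconds, so a 15-second remark pause points away from the collector itself.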
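Answer 1 (score: 2)
[Times: user=0.14 sys=0.22, real=15.73 secs]
This means the collector spent far more wall-clock time on the GC than actual CPU time. All the GCs you have tried are multi-threaded and purely CPU-bound, which means they should normally consume more CPU time than wall-clock time.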
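The comparison above can be automated for a whole log. The following is a minimal sketch that computes the ratio of CPU time (user + sys) to wall-clock time for a `[Times: ...]` line; a ratio far below 1 means the GC threads were mostly waiting rather than working. `cpu_to_wall_ratio` is a hypothetical helper, and the regular expression assumes the JDK 8 log format from the question.

```python
import re

TIMES = re.compile(
    r"\[Times: user=(\d+\.\d+) sys=(\d+\.\d+), real=(\d+\.\d+) secs\]"
)


def cpu_to_wall_ratio(line):
    """Return (user + sys) / real for a '[Times: ...]' line, or None."""
    match = TIMES.search(line)
    if match is None:
        return None
    user, sys_time, real = map(float, match.groups())
    return (user + sys_time) / real if real > 0 else None


# Both lines are taken from the log in the question.
samples = {
    "remark": "[Times: user=0.14 sys=0.22, real=15.73 secs]",
    "young": "[Times: user=0.30 sys=0.01, real=0.10 secs]",
}
for label, line in samples.items():
    print(f"{label}: CPU/wall = {cpu_to_wall_ratio(line):.2f}")
```

The young pause uses about three times more CPU time than wall-clock time, as expected for four GC workers, while the remark pause uses about 2% of its wall-clock time on the CPU.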
There are two possible causes that can change that:
-XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1
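Also, since this happens during the remark pause, using -XX:-ClassUnloadingWithConcurrentMark might fix it, but I guess that would just shift the problem to the regular GCs.
Tracing with -XX:+TraceClassUnloading to see how many classes it actually tries to unload might also be useful. Application containers like GlassFish may do odd things that cause a lot of classes to pile up.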
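Once class-unloading tracing is enabled, a small script can show where the unloaded classes come from. This is a sketch only: it assumes the JDK 8 trace line format `[Unloading class <name> 0x<address>]`, the sample lines are made up for illustration, and `unloaded_by_package` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumed JDK 8 line format: "[Unloading class <name> 0x<address>]".
UNLOAD = re.compile(r"\[Unloading class (\S+)(?: 0x[0-9a-fA-F]+)?\]")


def unloaded_by_package(lines, depth=2):
    """Count unloaded classes grouped by their first `depth` package segments."""
    counts = Counter()
    for line in lines:
        for name in UNLOAD.findall(line):
            parts = name.split(".")
            # Keep short names whole; otherwise truncate to the package prefix.
            package = ".".join(parts[:depth]) if len(parts) > depth else name
            counts[package] += 1
    return counts


# Hypothetical sample lines, for illustration only.
sample = [
    "[Unloading class sun.reflect.GeneratedMethodAccessor12 0x00000007c0060828]",
    "[Unloading class sun.reflect.GeneratedMethodAccessor13 0x00000007c0060a28]",
    "[Unloading class com.example.app.Widget 0x00000007c0061028]",
]
for package, count in unloaded_by_package(sample).most_common():
    print(f"{count:6d}  {package}")
```

Grouping by package prefix makes it easier to spot whether generated classes (reflection accessors, proxies, compiled pages) dominate the unloading work.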
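Edit: For monitoring, you mainly want to watch free physical memory (minus caches), CPU load, and page-ins/page-outs/page faults. Ideally, monitor paging at the process level, since it does not matter to the JVM whether some other process is waiting on the disk.
As for CMS vs. G1: that is probably irrelevant to your problem.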
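Answer 2 (score: 1)
Take a look at this article. Can you tell which of the main categories of long GC pauses your case falls into (the article has steps on how to determine this)? Other OS activity, perhaps?
It could also be due to a JVM bug. Check your JVM version.
Link: https://blogs.oracle.com/poonam/entry/troubleshooting_long_gc_pauses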