Thread dump and CPU utilization analysis

Date: 2015-10-05 10:49:41

Tags: multithreading garbage-collection jvm cpu thread-dump

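We have a problem in our application.

After the application has been running for a day, CPU utilization reaches 100% at some point, which makes the application respond slowly.

After going through the usual links, I captured per-thread CPU usage with top (Shift+H) and also took a thread dump.

I converted the id of the busiest thread (shown in the PID column of top's thread view) to hex and searched for it as the nid in the thread dump.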
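The conversion step can be sketched in Java as follows. This is only an illustration: the class and method names (`NidConverter`, `toNid`, `fromNid`) are hypothetical, and the sample values are taken from the nid fields in the thread dump in this question.

```java
// Hypothetical helper: converts between the decimal thread id reported by
// top's thread view and the hexadecimal "nid" used in HotSpot thread dumps.
class NidConverter {

    // Decimal native thread id -> "0x..." string as it appears after "nid=".
    static String toNid(long nativeThreadId) {
        return "0x" + Long.toHexString(nativeThreadId);
    }

    // Reverse direction: a dump entry such as "0x10d1" -> decimal id for top.
    static long fromNid(String nid) {
        String hex = nid.startsWith("0x") ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        // 4305 is the decimal form of nid=0x10d1 (the CMS thread in the dump).
        System.out.println(toNid(4305));
        // 0x10cd is "Gang worker#0" in the dump.
        System.out.println(fromNid("0x10cd"));
    }
}
```

If the busiest ids in top map to the CMS and "Gang worker" threads, the CPU time is being spent in garbage collection rather than in application code.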

I found the following details in the thread dump:


"Concurrent Mark-Sweep GC Thread" prio=10 tid=0x0000000045cce000 nid=0x10d1 runnable
"Gang worker#0 (Parallel CMS Threads)" prio=10 tid=0x0000000045cc6000 nid=0x10cd runnable

"Gang worker#1 (Parallel CMS Threads)" prio=10 tid=0x0000000045cc8000 nid=0x10ce runnable

"Gang worker#2 (Parallel CMS Threads)" prio=10 tid=0x0000000045cc9800 nid=0x10cf runnable

"Gang worker#3 (Parallel CMS Threads)" prio=10 tid=0x0000000045ccb800 nid=0x10d0 runnable

"VM Periodic Task Thread" prio=10 tid=0x00002b8c48012000 nid=0x10da waiting on condition

JNI global references: 1231

Heap
 par new generation   total 436928K, used 93221K [0x00000006b0000000, 0x00000006d0000000, 0x00000006d0000000)
  eden space 349568K,  21% used [0x00000006b0000000, 0x00000006b484f438, 0x00000006c5560000)
  from space 87360K,  21% used [0x00000006c5560000, 0x00000006c681a020, 0x00000006caab0000)
  to   space 87360K,   0% used [0x00000006caab0000, 0x00000006caab0000, 0x00000006d0000000)
 concurrent mark-sweep generation total 4718592K, used 3590048K [0x00000006d0000000, 0x00000007f0000000, 0x00000007f0000000)
 concurrent-mark-sweep perm gen total 262144K, used 217453K [0x00000007f0000000, 0x0000000800000000, 0x0000000800000000)

 CMS: abort preclean due to time 2015-09-24T14:16:14.752+0200: 4505865.908: [CMS-concurrent-abortable-preclean: 4.332/5.134 secs] [Times: user=5.22 sys=0.08, real=5.14 secs]
2015-09-24T14:16:14.756+0200: 4505865.912: [GC[YG occupancy: 127725 K (436928 K)]4505865.912: [Rescan (parallel) , 0.0602290 secs]4505865.973: [weak refs processing, 0.0000220 secs] [1 CMS-remark: 3590048K(4718592K)] 3717774K(5155520K), 0.0604150 secs] [Times: user=0.64 sys=0.00, real=0.06 secs]
2015-09-24T14:16:14.817+0200: 4505865.973: [CMS-concurrent-sweep-start]
2015-09-24T14:16:18.048+0200: 4505869.204: [CMS-concurrent-sweep: 3.227/3.231 secs] [Times: user=3.37 sys=0.03, real=3.23 secs]
2015-09-24T14:16:18.048+0200: 4505869.204: [CMS-concurrent-reset-start]
2015-09-24T14:16:18.058+0200: 4505869.214: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-09-24T14:16:18.312+0200: 4505869.468: [GC [1 CMS-initial-mark: 3590044K(4718592K)] 3788126K(5155520K), 0.2487070 secs] [Times: user=0.25 sys=0.00, real=0.25 secs]
2015-09-24T14:16:18.561+0200: 4505869.717: [CMS-concurrent-mark-start]
2015-09-24T14:16:23.202+0200: 4505874.358: [CMS-concurrent-mark: 4.626/4.641 secs] [Times: user=17.89 sys=0.39, real=4.64 secs]
2015-09-24T14:16:23.202+0200: 4505874.358: [CMS-concurrent-preclean-start]
2015-09-24T14:16:24.094+0200: 4505875.250: [CMS-concurrent-preclean: 0.891/0.891 secs] [Times: user=0.95 sys=0.01, real=0.90 secs]
2015-09-24T14:16:24.094+0200: 4505875.250: [CMS-concurrent-abortable-preclean-start]
2015-09-24T14:16:25.347+0200: 4505876.503: [GC 4505876.503: [ParNew: 368744K->19384K(436928K), 0.0492700 secs] 3958788K->3609428K(5155520K), 0.0494530 secs] [Times: user=0.52 sys=0.00, real=0.05 secs]
 CMS: abort preclean due to time 2015-09-24T14:16:29.105+0200: 4505880.261: [CMS-concurrent-abortable-preclean: 3.972/5.012 secs] [Times: user=4.87 sys=0.08, real=5.01 secs]
2015-09-24T14:16:29.109+0200: 4505880.265: [GC[YG occupancy: 123643 K (436928 K)]4505880.266: [Rescan (parallel) , 0.0643880 secs]4505880.330: [weak refs processing, 0.0000180 secs] [1 CMS-remark: 3590044K(4718592K)] 3713687K(5155520K), 0.0645660 secs] [Times: user=0.68 sys=0.00, real=0.06 secs]
2015-09-24T14:16:29.175+0200: 4505880.331: [CMS-concurrent-sweep-start]
2015-09-24T14:16:32.406+0200: 4505883.562: [CMS-concurrent-sweep: 3.227/3.231 secs] [Times: user=3.35 sys=0.03, real=3.23 secs]
2015-09-24T14:16:32.406+0200: 4505883.562: [CMS-concurrent-reset-start]
2015-09-24T14:16:32.416+0200: 4505883.572: [CMS-concurrent-reset: 0.010/0.010 secs] [Times: user=0.01 sys=0.00, real=0.01 secs]
2015-09-24T14:16:34.047+0200: 4505885.203: [GC [1 CMS-initial-mark: 3590040K(4718592K)] 3814265K(5155520K), 0.2704050 secs] [Times: user=0.27 sys=0.00, real=0.27 secs]

2015-09-16T23:18:46.554+0200: 3847217.710: [CMS-concurrent-mark-start]
2015-09-16T23:18:46.926+0200: 3847218.083: [Full GC 3847218.083: [CMS2015-09-16T23:18:50.249+0200: 3847221.405: [CMS-concurrent-mark: 3.688/3.695 secs] [Times: user=13.96 sys=0.31, real=3.70 secs]
 (concurrent mode failure): 3073996K->3011216K(4718592K), 20.7183280 secs] 3348996K->3011216K(5155520K), [CMS Perm : 262143K->40538K(262144K)], 20.7185010 secs] [Times: user=29.87 sys=0.31, real=20.71 secs]


I am using Java 1.6. Could CMS be causing this high CPU problem?

JVM parameters of the application:

-server
-d64
-Xms5000M
-Xmx5000M
-XX:+DisableExplicitGC
-XX:NewSize=512M
-XX:MaxNewSize=512M
-XX:SurvivorRatio=4
-XX:PermSize=256M
-XX:MaxPermSize=256M
-XX:+UseParNewGC
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=65
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSPermGenSweepingEnabled
-XX:MaxTenuringThreshold=30

I can't figure out a solution to this problem. Do I need to change any of the JVM parameters?


 (concurrent mode failure): 3073996K->3011216K(4718592K), 20.7183280 secs] 3348996K->3011216K(5155520K), [CMS Perm : 262143K->40538K(262144K)], 20.7185010 secs] [Times: user=29.87 sys=0.31, real=20.71 secs]
 (concurrent mode failure): 3258153K->3197547K(4718592K), 17.8924530 secs] 3644714K->3197547K(5155520K), [CMS Perm : 262143K->40572K(262144K)], 17.8926620 secs] [Times: user=17.89 sys=0.01, real=17.89 secs]
 (concurrent mode failure): 3439590K->3370903K(4718592K), 18.0448510 secs] 3548868K->3370903K(5155520K), [CMS Perm : 262143K->40526K(262144K)], 18.0450480 secs] [Times: user=17.94 sys=0.01, real=18.04 secs]

1 answer:

Answer 0 (score: 0)

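I suggest you add the java tag, because you are facing a Java-specific problem with Full GC pauses. It always sounds like this:

"After the application has been running for a day, CPU utilization reaches 100% at some point, which makes the application respond slowly."

I hope this article can help you. But first, you need to add some logging to the java parameters: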

-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Printing heap statistics around each Full GC would also be helpful:

 -XX:+PrintClassHistogramAfterFullGC -XX:+PrintClassHistogramBeforeFullGC
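As a complement to the flags, and not a replacement for the GC log, cumulative GC counts and times can also be read from inside the running JVM through the standard `java.lang.management` API, which is available in Java 6. The sketch below is an assumption about how one might sample it; the class name `GcStats` and the output format are hypothetical.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper: summarizes cumulative GC activity of the current JVM.
class GcStats {

    // Builds one line per collector with its collection count and total time.
    static String summarize() {
        StringBuilder sb = new StringBuilder();
        List<GarbageCollectorMXBean> beans =
                ManagementFactory.getGarbageCollectorMXBeans();
        for (GarbageCollectorMXBean gc : beans) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Print once; sampling this periodically shows how fast the
        // counters grow while CPU utilization is high.
        System.out.print(summarize());
    }
}
```

Collector names depend on the JVM and the selected collector, so the output is not shown here.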

It could also be a memory leak :)
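One way to check that hypothesis against the log in the question is to compare old-generation occupancy right after a CMS cycle with the configured `-XX:CMSInitiatingOccupancyFraction=65`. The sketch below only redoes the arithmetic with values copied from the question's log (`CMS-initial-mark: 3590044K(4718592K)`); the class and method names are hypothetical, and the interpretation is a reading of the log, not a confirmed diagnosis.

```java
// Sketch: compare old-generation occupancy after a CMS cycle with the
// configured initiating threshold. Values are copied from the GC log
// in the question.
class CmsOccupancyCheck {

    // Occupancy of a generation as a percentage of its capacity.
    static double occupancyPercent(long usedKb, long capacityKb) {
        return 100.0 * usedKb / capacityKb;
    }

    // True if occupancy is above the initiating fraction, i.e. CMS would
    // want to start another concurrent cycle right away.
    static boolean exceedsThreshold(long usedKb, long capacityKb, int fractionPercent) {
        return occupancyPercent(usedKb, capacityKb) > fractionPercent;
    }

    public static void main(String[] args) {
        long usedKb = 3590044L;      // old gen used at CMS-initial-mark
        long capacityKb = 4718592L;  // old gen capacity
        int fraction = 65;           // -XX:CMSInitiatingOccupancyFraction=65

        System.out.println("old gen occupancy %: "
                + occupancyPercent(usedKb, capacityKb));
        System.out.println("above threshold: "
                + exceedsThreshold(usedKb, capacityKb, fraction));
    }
}
```

The occupancy comes out at roughly 76%, which is above the 65% threshold even immediately after a sweep. That would be consistent with the log, where a new CMS-initial-mark follows each CMS-concurrent-reset within a couple of seconds, so the CMS threads stay busy almost continuously. If the live set keeps growing (a leak) or simply does not fit under the threshold, a larger heap or a higher initiating fraction are the obvious things to evaluate after looking at the class histograms.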