Linux OOM Killer and Java Process

Date: 2014-10-16 11:43:15

Tags: java linux tomcat memory linux-kernel

In production, my Tomcat process is frequently killed by the Linux OOM killer.

Checking /var/log/messages shows "java invoked oom-killer" and "comm: java Not tainted".

This is on a 32 GB box, with these heap settings:

-Xms20480m -Xmx20480m

I also see the crash below.

Is the OOM what causes this crash? How can I debug this issue?

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f4c3230aad7, pid=16248, tid=139964439296320
#
# JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (24.45-b08 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x674ad7]  JVM_Clone+0x97
#
# Failed to write core dump. Core dumps have been disabled. To enable core dumping, try "ulimit -c unlimited" before starting Java again
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  T H R E A D  ---------------

Current thread (0x0000000001313800):  JavaThread "http-bio-14080-exec-17" daemon [_thread_in_native, id=18943, stack(0x00007f4c029f7000,0x00007f4c02af8000)]

siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), si_addr=0x0000000000000000

Registers:
RAX=0x0000000000010000, RBX=0x80000001000004db, RCX=0x0000000302bd6fb0, RDX=0x00007f4c32acda50
RSP=0x00007f4c02af2c88, RBP=0x00007f4c02af2d70, RSI=0x00007f4c02af2d80, RDI=0x00000000013139e8
R8 =0x0000000000000001, R9 =0x00000002f8191228, R10=0x00007f4c2e231658, R11=0x00007f4c2e231638
R12=0x0000000302bd6fb0, R13=0x00000007e0002420, R14=0x0000000000000000, R15=0x0000000001313800
RIP=0x00007f4c3230aad7, EFLAGS=0x0000000000010202, CSGSFS=0x0000000000000033, ERR=0x0000000000000000
  TRAPNO=0x000000000000000d

Top of Stack: (sp=0x00007f4c02af2c88)
0x00007f4c02af2c88:   00007f4c3230aa63 00000000129196d1
0x00007f4c02af2c98:   00000002f817b620 00000000013139e8
0x00007f4c02af2ca8:   00007f4c02af2d18 00000002f8191080
0x00007f4c02af2cb8:   0000000000000009 0000000001313800
0x00007f4c02af2cc8:   0000000000000000 0000000001313800
0x00007f4c02af2cd8:   00000000138f5509 0000000000000000
0x00007f4c02af2ce8:   00007f4c2e382ae8 0000000000000000
0x00007f4c02af2cf8:   00007f4c02af2cf8 00000002f817b620
0x00007f4c02af2d08:   00000002f8190fd8 00000007e01ab278
0x00007f4c02af2d18:   00000002f8190ec0 00000007e0002420
0x00007f4c02af2d28:   0000000000000000 00000007e0002420
0x00007f4c02af2d38:   0000000000000000 0000000000000000
0x00007f4c02af2d48:   00000007e0002420 0000000000000000
0x00007f4c02af2d58:   00007f4c02af2de0 0000000000000000
0x00007f4c02af2d68:   0000000001313800 00007f4c02af2dc0
0x00007f4c02af2d78:   00007f4c2e2316c6 0000000302bd6fb0
0x00007f4c02af2d88:   0000000000000000 00007f4c02af2e08
0x00007f4c02af2d98:   00007f4c2e1bc8e1 00007f4c02af2db8
0x00007f4c02af2da8:   00007f4c2e1bc8e1 00000002f8190fd8
0x00007f4c02af2db8:   00000002f817b620 00007f4c02af2e28
0x00007f4c02af2dc8:   00007f4c2e1bc233 00000007e1ec4e9f
0x00007f4c02af2dd8:   00007f4c2e1bc233 0000000302bd6fb0
0x00007f4c02af2de8:   00007f4c02af2de8 00000007e1b7a5db
0x00007f4c02af2df8:   00007f4c02af2e30 00000007e37575a8
0x00007f4c02af2e08:   0000000000000000 00000007e1b7a5e8
0x00007f4c02af2e18:   00007f4c02af2de0 00007f4c02af2e40
0x00007f4c02af2e28:   00007f4c02af2ed0 00007f4c2edaaf34
0x00007f4c02af2e38:   00007f4c2edaaf34 00007f4c0000000a
0x00007f4c02af2e48:   00000007e416ec20 0000000000000000
0x00007f4c02af2e58:   00000007e416e2b8 00007f4c02af2ed0
0x00007f4c02af2e68:   00007f4c2e1bc233 00007f4c02af2ed0
0x00007f4c02af2e78:   00007f4c2e1bc233 000000000000000a 

Instructions: (pc=0x00007f4c3230aad7)
0x00007f4c3230aab7:   85 60 02 00 00 06 00 00 00 4c 89 6d b0 4c 8b 23
0x00007f4c3230aac7:   4d 85 e4 0f 84 68 02 00 00 49 8b 9d 20 01 00 00
0x00007f4c3230aad7:   48 83 7b 10 f7 0f 87 6e 02 00 00 48 8b 43 10 48
0x00007f4c3230aae7:   8d 50 08 48 3b 53 18 0f 87 9c 02 00 00 48 89 53 

Register to memory mapping:

RAX=0x0000000000010000 is an unknown value
RBX=0x80000001000004db is an unknown value
RCX=0x0000000302bd6fb0 is an oop
[Lcom.mycompany.MyClass$Type; 
 - klass: 'com/mycompany/MyClass$Type'[]
 - length: 16
RDX=0x00007f4c32acda50: <offset 0xe37a50> in /usr/java/jre64-1.7.0_45/jre/lib/amd64/server/libjvm.so at 0x00007f4c31c96000
RSP=0x00007f4c02af2c88 is pointing into the stack for thread: 0x0000000001313800
RBP=0x00007f4c02af2d70 is pointing into the stack for thread: 0x0000000001313800
RSI=0x00007f4c02af2d80 is pointing into the stack for thread: 0x0000000001313800
RDI=0x00000000013139e8 is an unknown value
R8 =0x0000000000000001 is an unknown value
R9 =
[error occurred during error reporting (printing register info), id 0xb]

Stack: [0x00007f4c029f7000,0x00007f4c02af8000],  sp=0x00007f4c02af2c88,  free space=1007k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
V  [libjvm.so+0x674ad7]  JVM_Clone+0x97
J  java.lang.Object.clone()Ljava/lang/Object;
j  com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J  java.lang.Object.clone()Ljava/lang/Object;
j  com.mycompany.MyClass$Type.values()[Lcom/mycompany/MyClass$Type;+3

/var/log/messages output:

myhostname kernel: java invoked oom-killer: gfp_mask=0x201da, order=0, oom_adj=0, oom_score_adj=0
myhostname kernel: java cpuset=/ mems_allowed=0
myhostname kernel: Pid: 32307, comm: java Not tainted 2.6.39-400.209.1.el5uek #1
myhostname kernel: Call Trace:
myhostname kernel:  [<ffffffff811136b4>] dump_header+0x94/0xe0
myhostname kernel:  [<ffffffff811137fd>] oom_kill_process+0x6d/0x160
myhostname kernel:  [<ffffffff811139ec>] out_of_memory+0xfc/0x210
myhostname kernel:  [<ffffffff811187ec>] __alloc_pages_slowpath+0x64c/0x660
myhostname kernel:  [<ffffffff811189b4>] __alloc_pages_nodemask+0x1b4/0x200
myhostname kernel:  [<ffffffff8111b140>] ? __do_page_cache_readahead+0xe0/0x170
myhostname kernel:  [<ffffffff81150893>] alloc_pages_current+0xb3/0x120
myhostname kernel:  [<ffffffff811100da>] __page_cache_alloc+0x9a/0xb0
myhostname kernel:  [<ffffffff8111097f>] page_cache_read+0x4f/0xb0
myhostname kernel:  [<ffffffff81111a54>] filemap_fault+0x174/0x270
myhostname kernel:  [<ffffffff81137a2c>] __do_fault+0x5c/0x550
myhostname kernel:  [<ffffffff81137fc6>] do_linear_fault+0x36/0x40
myhostname kernel:  [<ffffffff81510b6e>] ? call_function_interrupt+0xe/0x20
myhostname kernel:  [<ffffffff81138044>] handle_pte_fault+0x74/0x190
myhostname kernel:  [<ffffffff815106ae>] ? apic_timer_interrupt+0xe/0x20
myhostname kernel:  [<ffffffff8113828f>] handle_mm_fault+0x12f/0x1b0
myhostname kernel:  [<ffffffff8150b1cd>] do_page_fault+0x17d/0x4b0
myhostname kernel:  [<ffffffff8117b821>] ? user_path_at+0x11/0x20
myhostname kernel:  [<ffffffff81170516>] ? vfs_fstatat+0x56/0x90
myhostname kernel:  [<ffffffff8117067b>] ? vfs_stat+0x1b/0x20
myhostname kernel:  [<ffffffff81507cd5>] page_fault+0x25/0x30
myhostname kernel: Mem-Info:
myhostname kernel: Node 0 DMA per-cpu:
myhostname kernel: CPU    0: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    1: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    2: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    3: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    4: hi:    0, btch:   1 usd:   0
myhostname kernel: CPU    5: hi:    0, btch:   1 usd:   0
myhostname kernel: Node 0 DMA32 per-cpu:
myhostname kernel: CPU    0: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    1: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    2: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    3: hi:  186, btch:  31 usd:  11
myhostname kernel: CPU    4: hi:  186, btch:  31 usd:  30
myhostname kernel: CPU    5: hi:  186, btch:  31 usd:   0
myhostname kernel: Node 0 Normal per-cpu:
myhostname kernel: CPU    0: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    1: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    2: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    3: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    4: hi:  186, btch:  31 usd:   0
myhostname kernel: CPU    5: hi:  186, btch:  31 usd:   0
myhostname kernel: active_anon:4372468 inactive_anon:275213 isolated_anon:0
myhostname kernel:  active_file:17 inactive_file:21 isolated_file:0
myhostname kernel:  unevictable:6002 dirty:24 writeback:4 unstable:0
myhostname kernel:  free:38369 slab_reclaimable:3708 slab_unreclaimable:9265
myhostname kernel:  mapped:1253 shmem:67 pagetables:10880 bounce:0
myhostname kernel: Node 0 DMA free:15880kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 3000 32290 32290
myhostname kernel: Node 0 DMA32 free:119288kB min:2136kB low:2668kB high:3204kB active_anon:30224kB inactive_anon:9152kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3072096kB mlocked:0kB dirty:16kB writeback:4kB mapped:36kB shmem:0kB slab_reclaimable:40kB slab_unreclaimable:232kB kernel_stack:24kB pagetables:48kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:67 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 0 29290 29290
myhostname kernel: Node 0 Normal free:17564kB min:20856kB low:26068kB high:31284kB active_anon:17459648kB inactive_anon:1091700kB active_file:120kB inactive_file:88kB unevictable:24008kB isolated(anon):0kB isolated(file):0kB present:29992960kB mlocked:24008kB dirty:80kB writeback:12kB mapped:4976kB shmem:268kB slab_reclaimable:14792kB slab_unreclaimable:36828kB kernel_stack:2768kB pagetables:43472kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:14972 all_unreclaimable? yes
myhostname kernel: lowmem_reserve[]: 0 0 0 0
myhostname kernel: Node 0 DMA: 0*4kB 1*8kB 0*16kB 0*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15880kB
myhostname kernel: Node 0 DMA32: 458*4kB 494*8kB 284*16kB 55*32kB 7*64kB 4*128kB 5*256kB 7*512kB 7*1024kB 6*2048kB 20*4096kB = 119288kB
myhostname kernel: Node 0 Normal: 3333*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 17428kB
myhostname kernel: 115783 total pagecache pages
myhostname kernel: 114466 pages in swap cache
myhostname kernel: Swap cache stats: add 2183060, delete 2068594, find 1860109/1985849
myhostname kernel: Free swap  = 0kB
myhostname kernel: Total swap = 2097148kB
myhostname kernel: 8388592 pages RAM
myhostname kernel: 134806 pages reserved
myhostname kernel: 6845 pages shared
myhostname kernel: 8210059 pages non-shared

2 answers:

Answer 0: (score: 0)

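The OOM problem is the more likely cause of the crash. A common reason for a Java process to use more memory than the -Xmx setting allows is native memory. This can happen if you use a JNI library that allocates a lot of objects, or if you use memory-mapped files, and so on.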

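You should try adding some log statements to your Java code that print how much memory Java thinks it is using, via Runtime.totalMemory and related methods. Then compare those values with what you see in top to find out whether something else is consuming memory.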
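As a rough illustration of that suggestion, the sketch below prints the heap figures that Runtime exposes. The class name HeapUsageLogger, the output format, and the sampling loop are assumptions made for this example, not part of the original setup.

```java
// Minimal sketch: report what the JVM believes its heap usage is, so the
// numbers can be compared with the process RSS shown by top.
// HeapUsageLogger and the output format are illustrative assumptions.
final class HeapUsageLogger {
    private static final long MB = 1024L * 1024L;

    /** Formats the heap figures reported by Runtime, in megabytes. */
    static String describeHeap(Runtime rt) {
        long total = rt.totalMemory(); // heap currently reserved by the JVM
        long free = rt.freeMemory();   // unused part of that reserved heap
        long max = rt.maxMemory();     // upper bound the heap may grow to
        long used = total - free;
        return "heap used=" + (used / MB) + "MB total=" + (total / MB)
                + "MB max=" + (max / MB) + "MB";
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples one second apart; a real service would
        // write these to its application log on a timer instead.
        for (int i = 0; i < 3; i++) {
            System.out.println(describeHeap(Runtime.getRuntime()));
            Thread.sleep(1000L);
        }
    }
}
```

These figures cover only the Java heap. If the resident size in top is far above the reported max, the difference is memory outside the heap, such as thread stacks, direct buffers, JNI allocations, or mapped files.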

Answer 1: (score: 0)

This is the Linux limit on the number of threads.

The Linux default limit is 1024 and the maximum is 65535.

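You can raise the limit with the ulimit command. Note that ulimit -u controls the per-user process and thread limit, while ulimit -c unlimited only removes the core dump size limit.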
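To check whether a thread limit is actually a factor, one option is to compare the JVM's live thread count with the soft "Max processes" limit the kernel reports for the process. The sketch below assumes a Linux host where /proc/self/limits is readable; the class name ThreadLimitCheck and the helper softMaxProcesses are hypothetical names for this example.

```java
import java.lang.management.ManagementFactory;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Minimal sketch: print the live JVM thread count next to the soft
// "Max processes" limit from /proc/self/limits (Linux only).
// ThreadLimitCheck and softMaxProcesses are illustrative names.
final class ThreadLimitCheck {
    private static final String KEY = "Max processes";

    /**
     * Returns the soft limit column of the "Max processes" row,
     * which may be a number or "unlimited", or null if the row is absent.
     */
    static String softMaxProcesses(List<String> limitsLines) {
        for (String line : limitsLines) {
            if (line.startsWith(KEY)) {
                String[] cols = line.substring(KEY.length()).trim().split("\\s+");
                return cols[0].isEmpty() ? null : cols[0];
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        int live = ManagementFactory.getThreadMXBean().getThreadCount();
        System.out.println("live JVM threads: " + live);

        Path limits = Paths.get("/proc/self/limits");
        if (Files.isReadable(limits)) {
            List<String> lines = Files.readAllLines(limits, StandardCharsets.UTF_8);
            System.out.println("soft max processes: " + softMaxProcesses(lines));
        }
    }
}
```

If the live thread count stays far below the reported limit, the thread limit is unlikely to explain the kills seen in /var/log/messages.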