Understanding strange OOM behavior?

Time: 2015-06-24 18:45:17

Tags: linux memory linux-kernel

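My server triggered the OOM killer and I am trying to understand why. The system has plenty of RAM (128 GB), and actual usage appears to be around 70 GB. From reading earlier questions about OOM, this looks like it could be a case of memory fragmentation. See the syslog output below.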

Jun 23 17:20:10 server1 kernel: [517262.504589] gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Jun 23 17:20:10 server1 kernel: [517262.504593] gmond cpuset=/ mems_allowed=0-1
Jun 23 17:20:10 server1 kernel: [517262.504598] CPU: 4 PID: 1522 Comm: gmond Tainted: P           OE 3.15.1-031501-lowlatency #201406161841
Jun 23 17:20:10 server1 kernel: [517262.504599] Hardware name: Dell Inc. PowerEdge R420/0K29HN, BIOS 2.3.3 07/10/2014
Jun 23 17:20:10 server1 kernel: [517262.504601]  0000000000000000 ffff880fce2ab848 ffffffff817746ec 0000000000000007
Jun 23 17:20:10 server1 kernel: [517262.504603]  ffff880f74691950 ffff880fce2ab898 ffffffff8176a980 ffff880f00000000
Jun 23 17:20:10 server1 kernel: [517262.504605]  000201da81383df8 ffff881470376540 ffff881dcf7ab2a0 0000000000000000
Jun 23 17:20:10 server1 kernel: [517262.504607] Call Trace:
Jun 23 17:20:10 server1 kernel: [517262.504615]  [<ffffffff817746ec>] dump_stack+0x4e/0x71
Jun 23 17:20:10 server1 kernel: [517262.504618]  [<ffffffff8176a980>] dump_header+0x7e/0xbd
Jun 23 17:20:10 server1 kernel: [517262.504620]  [<ffffffff8176aa16>] oom_kill_process.part.6+0x57/0x30a
Jun 23 17:20:10 server1 kernel: [517262.504623]  [<ffffffff811654e7>] oom_kill_process+0x47/0x50
Jun 23 17:20:10 server1 kernel: [517262.504625]  [<ffffffff81165825>] out_of_memory+0x145/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504628]  [<ffffffff8116c1ba>] __alloc_pages_nodemask+0xb1a/0xc40
Jun 23 17:20:10 server1 kernel: [517262.504634]  [<ffffffff811adba3>] alloc_pages_current+0xb3/0x180
Jun 23 17:20:10 server1 kernel: [517262.504636]  [<ffffffff81161737>] __page_cache_alloc+0xb7/0xd0
Jun 23 17:20:10 server1 kernel: [517262.504638]  [<ffffffff81163f80>] filemap_fault+0x280/0x430
Jun 23 17:20:10 server1 kernel: [517262.504642]  [<ffffffff8118a0d9>] __do_fault+0x39/0x90
Jun 23 17:20:10 server1 kernel: [517262.504644]  [<ffffffff8118e31e>] do_read_fault.isra.59+0x10e/0x1d0
Jun 23 17:20:10 server1 kernel: [517262.504646]  [<ffffffff8118e870>] do_linear_fault.isra.61+0x70/0x80
Jun 23 17:20:10 server1 kernel: [517262.504647]  [<ffffffff8118e986>] handle_pte_fault+0x76/0x1b0
Jun 23 17:20:10 server1 kernel: [517262.504652]  [<ffffffff81095fe0>] ? lock_hrtimer_base.isra.25+0x30/0x60
Jun 23 17:20:10 server1 kernel: [517262.504654]  [<ffffffff8118eea4>] __handle_mm_fault+0x1b4/0x360
Jun 23 17:20:10 server1 kernel: [517262.504655]  [<ffffffff8118f101>] handle_mm_fault+0xb1/0x160
Jun 23 17:20:10 server1 kernel: [517262.504658]  [<ffffffff81784667>] ? __do_page_fault+0x2b7/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504660]  [<ffffffff81784522>] __do_page_fault+0x172/0x5a0
Jun 23 17:20:10 server1 kernel: [517262.504664]  [<ffffffff8111fdec>] ? acct_account_cputime+0x1c/0x20
Jun 23 17:20:10 server1 kernel: [517262.504667]  [<ffffffff810a73a9>] ? account_user_time+0x99/0xb0
Jun 23 17:20:10 server1 kernel: [517262.504669]  [<ffffffff810a79dd>] ? vtime_account_user+0x5d/0x70
Jun 23 17:20:10 server1 kernel: [517262.504671]  [<ffffffff8178498e>] do_page_fault+0x3e/0x80
Jun 23 17:20:10 server1 kernel: [517262.504673]  [<ffffffff817811f8>] page_fault+0x28/0x30
Jun 23 17:20:10 server1 kernel: [517262.504674] Mem-Info:
Jun 23 17:20:10 server1 kernel: [517262.504675] Node 0 DMA per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504677] CPU    0: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504678] CPU    1: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504679] CPU    2: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504680] CPU    3: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504681] CPU    4: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504682] CPU    5: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504683] CPU    6: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504684] CPU    7: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504685] CPU    8: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504686] CPU    9: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU   10: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504687] CPU   11: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504688] CPU   12: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504689] CPU   13: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504690] CPU   14: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504691] CPU   15: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504692] CPU   16: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504693] CPU   17: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504694] CPU   18: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504695] CPU   19: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504696] CPU   20: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504697] CPU   21: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU   22: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504698] CPU   23: hi:    0, btch:   1 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504699] Node 0 DMA32 per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504701] CPU    0: hi:  186, btch:  31 usd:  30
Jun 23 17:20:10 server1 kernel: [517262.504702] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504703] CPU    2: hi:  186, btch:  31 usd:  34
Jun 23 17:20:10 server1 kernel: [517262.504704] CPU    3: hi:  186, btch:  31 usd:  27
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU    4: hi:  186, btch:  31 usd:  30
Jun 23 17:20:10 server1 kernel: [517262.504705] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504706] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504707] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504708] CPU    8: hi:  186, btch:  31 usd: 173
Jun 23 17:20:10 server1 kernel: [517262.504709] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504710] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504711] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504712] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504713] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504714] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504715] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504716] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504717] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504718] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504719] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504720] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504721] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504722] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504723] Node 0 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504724] CPU    0: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504725] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504726] CPU    2: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504727] CPU    3: hi:  186, btch:  31 usd:  14
Jun 23 17:20:10 server1 kernel: [517262.504728] CPU    4: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504729] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504730] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504731] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504732] CPU    8: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504733] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504734] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504735] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504736] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504737] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504738] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504739] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504740] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504741] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504742] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504743] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504744] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504745] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504746] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504747] Node 1 Normal per-cpu:
Jun 23 17:20:10 server1 kernel: [517262.504748] CPU    0: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504749] CPU    1: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504750] CPU    2: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504751] CPU    3: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504752] CPU    4: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504753] CPU    5: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504754] CPU    6: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504755] CPU    7: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504756] CPU    8: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504757] CPU    9: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU   10: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504758] CPU   11: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504759] CPU   12: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504760] CPU   13: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504761] CPU   14: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504762] CPU   15: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504763] CPU   16: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504764] CPU   17: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504765] CPU   18: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504766] CPU   19: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504767] CPU   20: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504768] CPU   21: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504769] CPU   22: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504770] CPU   23: hi:  186, btch:  31 usd:   0
Jun 23 17:20:10 server1 kernel: [517262.504773] active_anon:17833290 inactive_anon:2465707 isolated_anon:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  active_file:573 inactive_file:595 isolated_file:36
Jun 23 17:20:10 server1 kernel: [517262.504773]  unevictable:0 dirty:4 writeback:0 unstable:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  free:82698 slab_reclaimable:43224 slab_unreclaimable:11476749
Jun 23 17:20:10 server1 kernel: [517262.504773]  mapped:2465518 shmem:2465767 pagetables:66385 bounce:0
Jun 23 17:20:10 server1 kernel: [517262.504773]  free_cma:0
Jun 23 17:20:10 server1 kernel: [517262.504776] Node 0 DMA free:14804kB min:8kB low:8kB high:12kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15968kB managed:15828kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504779] lowmem_reserve[]: 0 2933 64370 64370
Jun 23 17:20:10 server1 kernel: [517262.504782] Node 0 DMA32 free:247776kB min:2048kB low:2560kB high:3072kB active_anon:1774744kB inactive_anon:607052kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3083200kB managed:3003592kB mlocked:0kB dirty:16kB writeback:0kB mapped:607068kB shmem:607068kB slab_reclaimable:25524kB slab_unreclaimable:302060kB kernel_stack:4928kB pagetables:3100kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2660 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504785] lowmem_reserve[]: 0 0 61436 61436
Jun 23 17:20:10 server1 kernel: [517262.504787] Node 0 Normal free:34728kB min:42952kB low:53688kB high:64428kB active_anon:30286072kB inactive_anon:9255576kB active_file:236kB inactive_file:640kB unevictable:0kB isolated(anon):0kB isolated(file):16kB present:63963136kB managed:62911420kB mlocked:0kB dirty:0kB writeback:0kB mapped:9255000kB shmem:9255724kB slab_reclaimable:86416kB slab_unreclaimable:22165372kB kernel_stack:21072kB pagetables:121112kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:13936 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504791] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504793] Node 1 Normal free:33484kB min:45096kB low:56368kB high:67644kB active_anon:39272344kB inactive_anon:200kB active_file:2112kB inactive_file:1752kB unevictable:0kB isolated(anon):0kB isolated(file):128kB present:67108864kB managed:66056916kB mlocked:0kB dirty:0kB writeback:0kB mapped:4kB shmem:276kB slab_reclaimable:60956kB slab_unreclaimable:23439564kB kernel_stack:13536kB pagetables:141328kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:18448 all_unreclaimable? yes
Jun 23 17:20:10 server1 kernel: [517262.504797] lowmem_reserve[]: 0 0 0 0
Jun 23 17:20:10 server1 kernel: [517262.504799] Node 0 DMA: 1*4kB (U) 0*8kB 1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 0*1024kB 1*2048kB (R) 3*4096kB (M) = 14804kB
Jun 23 17:20:10 server1 kernel: [517262.504807] Node 0 DMA32: 4660*4kB (UEM) 2172*8kB (EM) 1739*16kB (EM) 1046*32kB (UEM) 629*64kB (EM) 344*128kB (UEM) 155*256kB (E) 46*512kB (UE) 3*1024kB (E) 0*2048kB 0*4096kB = 247904kB
Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB
Jun 23 17:20:10 server1 kernel: [517262.504829] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504830] Node 1 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Jun 23 17:20:10 server1 kernel: [517262.504831] 2467056 total pagecache pages
Jun 23 17:20:10 server1 kernel: [517262.504832] 0 pages in swap cache
Jun 23 17:20:10 server1 kernel: [517262.504833] Swap cache stats: add 0, delete 0, find 0/0
Jun 23 17:20:10 server1 kernel: [517262.504834] Free swap  = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504834] Total swap = 0kB
Jun 23 17:20:10 server1 kernel: [517262.504835] 33542792 pages RAM
Jun 23 17:20:10 server1 kernel: [517262.504836] 0 pages HighMem/MovableOnly
Jun 23 17:20:10 server1 kernel: [517262.504837] 262987 pages reserved
Jun 23 17:20:10 server1 kernel: [517262.504838] 0 pages hwpoisoned
Jun 23 17:20:10 server1 kernel: [517262.504839] [ pid ]   uid  tgid total_vm      rss nr_ptes swapents oom_score_adj name
Jun 23 17:20:10 server1 kernel: [517262.504866] [  569]     0   569     4997      144      13        0             0 upstart-udev-br
Jun 23 17:20:10 server1 kernel: [517262.504868] [  578]     0   578    12891      187      29        0         -1000 systemd-udevd
Jun 23 17:20:10 server1 kernel: [517262.504873] [  692]   101   692    80659     2295      59        0             0 rsyslogd
Jun 23 17:20:10 server1 kernel: [517262.504875] [  750]     0   750     4084      331      13        0             0 upstart-file-br
Jun 23 17:20:10 server1 kernel: [517262.504877] [  792]     0   792     3815       53      13        0             0 upstart-socket-
Jun 23 17:20:10 server1 kernel: [517262.504879] [  842]   111   842    27001      275      53        0             0 dbus-daemon
Jun 23 17:20:10 server1 kernel: [517262.504880] [  851]     0   851     8834      101      22        0             0 systemd-logind
Jun 23 17:20:10 server1 kernel: [517262.504886] [ 1232]     0  1232     2558      572       8        0             0 dhclient
Jun 23 17:20:10 server1 kernel: [517262.504888] [ 1342]   104  1342    24484      281      49        0             0 ntpd
Jun 23 17:20:10 server1 kernel: [517262.504890] [ 1440]     0  1440     3955       41      12        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504891] [ 1443]     0  1443     3955       41      12        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504893] [ 1448]     0  1448     3955       39      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504895] [ 1450]     0  1450     3955       41      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504896] [ 1452]     0  1452     3955       42      13        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504898] [ 1469]     0  1469     4785       40      13        0             0 atd
Jun 23 17:20:10 server1 kernel: [517262.504900] [ 1470]     0  1470    15341      168      32        0         -1000 sshd
Jun 23 17:20:10 server1 kernel: [517262.504902] [ 1472]     0  1472     5914       65      17        0             0 cron
Jun 23 17:20:10 server1 kernel: [517262.504904] [ 1478]   999  1478    16020     3710      31        0             0 gmond
Jun 23 17:20:10 server1 kernel: [517262.504905] [ 1486]     0  1486     4821       65      14        0             0 irqbalance
Jun 23 17:20:10 server1 kernel: [517262.504907] [ 1500]     0  1500   343627     1730      85        0             0 nscd
Jun 23 17:20:10 server1 kernel: [517262.504909] [ 1559]     0  1559     1092       37       8        0             0 acpid
Jun 23 17:20:10 server1 kernel: [517262.504911] [ 1641]     0  1641     4978       71      13        0             0 master
Jun 23 17:20:10 server1 kernel: [517262.504913] [ 1650]   103  1650     5427       72      14        0             0 qmgr
Jun 23 17:20:10 server1 kernel: [517262.504917] [ 1895]     0  1895     1900       30       9        0             0 getty
Jun 23 17:20:10 server1 kernel: [517262.504919] [ 1906]  1000  1906  2854329     2610    2594        0             0 thttpd
Jun 23 17:20:10 server1 kernel: [517262.504927] [ 3163]  1000  3163     2432       39      10        0             0 searchd
Jun 23 17:20:10 server1 kernel: [517262.504928] [ 3167]  1000  3167  2727221  2467025    4863        0             0 sphinx-daemon
Jun 23 17:20:10 server1 kernel: [517262.504931] [47622]  1000 47622 17834794 17329575   33989        0             0 MyExec

<.................Trimmed bunch of processes with low mem usage.......................................>


Jun 23 17:20:10 server1 kernel: [517262.508350] Out of memory: Kill process 47622 (MyExec) score 526 or sacrifice child
Jun 23 17:20:10 server1 kernel: [517262.508375] Killed process 47622 (MyExec) total-vm:71339176kB, anon-rss:69318300kB, file-rss:0kB

Looking at the following lines, the problem appears to be fragmentation.

Jun 23 17:20:10 server1 kernel: [517262.504816] Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB
Jun 23 17:20:10 server1 kernel: [517262.504822] Node 1 Normal: 9055*4kB (UM) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36220kB

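I don't know why the system would be so badly fragmented; it had only been up for 5 days when this happened. Also, looking at the process that invoked the OOM killer (gmond invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0), it was only requesting a single 4 KB block (order=0), and plenty of those were available.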
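To make the free-list lines above easier to read, here is a minimal Python sketch that parses one of them and recomputes the total. The function name `parse_buddy_line` is made up for this illustration; the input string is copied from the log.

```python
import re

def parse_buddy_line(line):
    """Return ({block_size_kB: free_block_count}, reported_total_kB) for one zone line."""
    counts = {int(size): int(n) for n, size in re.findall(r"(\d+)\*(\d+)kB", line)}
    reported = int(re.search(r"=\s*(\d+)kB", line).group(1))
    return counts, reported

line = ("Node 0 Normal: 9038*4kB (M) 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB "
        "0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 36152kB")

counts, reported = parse_buddy_line(line)
computed = sum(size * n for size, n in counts.items())
# Largest block size that still has at least one free block
largest = max((size for size, n in counts.items() if n), default=0)

print("free 4kB blocks:", counts[4])
print("computed total kB:", computed, "reported total kB:", reported)
print("largest free block kB:", largest)
```

For this line the only non-empty free list is the 4 kB one, which is exactly the size an order=0 request needs.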

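  1. Is my understanding of fragmentation correct in this case?
  2. How can I find out why memory is so fragmented?
  3. What can I do to avoid getting into this situation?
  4. One thing you may notice is that I have turned swap off completely and set swappiness to 0. The reason is that my system has more than enough RAM and should never need to swap. I plan to enable swap and set swappiness to 10, but I am not sure whether that will help in this case.

    Thanks for your input.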

3 Answers:

Answer 0: (score: 1)

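From the last lines of your log you can see that the kernel reports a total-vm of 71339176kB (about 68 GiB) for the killed process; total-vm covers the process's whole virtual address space, whether it is backed by physical memory or by swap. Your log also shows about 66 GiB of resident memory (anon-rss:69318300kB).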
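As a quick check of the units, the figures on the "Killed process" line can be converted as in the sketch below. It assumes that the kernel's "kB" means 1024 bytes and that the page size is 4 KiB (consistent with the 4kB free lists in the log); the helper name `kib_to_gib` is illustrative.

```python
KIB_PER_GIB = 1024 * 1024
PAGE_KIB = 4  # assumption: 4 KiB pages

def kib_to_gib(kib):
    """Convert a kernel 'kB' figure (1024-byte units) to GiB."""
    return kib / KIB_PER_GIB

total_vm_kib = 71339176   # total-vm from the "Killed process" line
anon_rss_kib = 69318300   # anon-rss from the same line
rss_pages = 17329575      # rss column for PID 47622 in the process table

print(f"total-vm: {kib_to_gib(total_vm_kib):.1f} GiB")
print(f"anon-rss: {kib_to_gib(anon_rss_kib):.1f} GiB")
# The rss column of the process table is in pages, not kB
print("rss column matches anon-rss:", rss_pages * PAGE_KIB == anon_rss_kib)
```

The last line confirms that the rss column in the process table is counted in pages, so it has to be multiplied by the page size before comparing it with the kB figures.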

  

Is my understanding of fragmentation correct in this case?

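If you can capture diagnostics (for example an sosreport) while the problem is occurring, check /proc/buddyinfo for signs of memory fragmentation. If you plan to reproduce the issue, it is best to write a script that saves this information periodically.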
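A minimal sketch of such a script in Python follows. It assumes the usual /proc/buddyinfo layout (`Node <n>, zone <name>` followed by one free-block count per order) and a 4 KiB page size; the function names are made up for this example, and how often to run it and where to store the output are left to you.

```python
from pathlib import Path

def parse_buddyinfo(text):
    """Map (node, zone) -> list of free block counts, where the list index is the order."""
    zones = {}
    for line in text.splitlines():
        parts = line.replace(",", " ").split()
        # Expected shape: Node <n> zone <name> <count order 0> <count order 1> ...
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue
        zones[(int(parts[1]), parts[3])] = [int(x) for x in parts[4:]]
    return zones

def free_kib(counts, page_kib=4):
    """Total free memory in KiB; a block of a given order spans 2**order pages."""
    return sum(n * page_kib * (2 ** order) for order, n in enumerate(counts))

path = Path("/proc/buddyinfo")
try:
    text = path.read_text() if path.exists() else ""
except OSError:
    text = ""  # not readable on this system; nothing to report

for (node, zone), counts in parse_buddyinfo(text).items():
    highest = max((order for order, n in enumerate(counts) if n), default=None)
    print(f"node {node} {zone}: free={free_kib(counts)} KiB, highest free order={highest}")
```

Run from cron or a loop with a timestamp added, this gives a history of how the free lists evolve before an OOM event.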

  

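How can I find out why memory is so fragmented? What can I do to avoid getting into this situation?

Sometimes applications overcommit memory beyond what the system can honor, which can lead to an OOM. You may want to review the other kernel tunables, or try disabling memory overcommit; use sysctl -a to read the current values.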

vm.overcommit_memory=2
vm.overcommit_ratio=80

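Note: after adding the lines above to /etc/sysctl.conf, it is best to reboot the system (running sysctl -p also loads the file without a reboot).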

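vm.overcommit_memory: some applications ask for more virtual memory than is available on the system. vm.overcommit_memory takes the following values:
0 - use the heuristic overcommit algorithm
1 - always overcommit, whether or not memory is available
2 - do not overcommit: the kernel lets applications commit at most all of swap plus a percentage of RAM, and that percentage is set by vm.overcommit_ratio (80% in the example above). Requests beyond that limit are refused.
Servers are most likely set to 0 or 1.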
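To get a feel for what mode 2 allows, the sketch below computes the commit limit from the formula in the kernel's overcommit documentation: (RAM minus huge pages) × overcommit_ratio / 100 + swap. The function name and the round 128 GiB figure are assumptions for illustration; on a live system the kernel's own value is shown as CommitLimit in /proc/meminfo.

```python
def commit_limit_kib(ram_kib, swap_kib, overcommit_ratio, hugetlb_kib=0):
    """Approximate commit limit in KiB when vm.overcommit_memory=2."""
    return (ram_kib - hugetlb_kib) * overcommit_ratio // 100 + swap_kib

ram_kib = 128 * 1024 * 1024   # assumption: a round 128 GiB of RAM
swap_kib = 0                  # swap is disabled on the machine in the question

limit = commit_limit_kib(ram_kib, swap_kib, overcommit_ratio=80)
print(f"commit limit: {limit} KiB ({limit / (1024 * 1024):.1f} GiB)")
```

With no swap and a ratio of 80, roughly a fifth of RAM can never be committed by user processes, so the ratio may need raising on a machine that is expected to run one very large process.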

Answer 1: (score: 1)

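Your understanding about fragmentation is not correct. The OOM was triggered because the memory watermarks were breached: in both Normal zones, free is below the min watermark. Look at this: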

Node 0 Normal free:34728kB min:42952kB low:53688kB
Node 1 Normal free:33484kB min:45096kB low:56368kB
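The comparison can be scripted against the per-zone lines of an OOM report. This is a minimal sketch; the regular expression and function names are written for this example and only cover the `free`, `min` and `low` fields shown above.

```python
import re

ZONE_RE = re.compile(r"Node (\d+) (\w+) free:(\d+)kB min:(\d+)kB low:(\d+)kB")

def zone_watermarks(line):
    """Extract node, zone and the free/min/low figures (kB) from one zone line."""
    m = ZONE_RE.search(line)
    if m is None:
        return None
    node, zone, free, wmin, low = m.groups()
    return {"node": int(node), "zone": zone,
            "free": int(free), "min": int(wmin), "low": int(low)}

def below_min(z):
    """True when the zone's free memory is under its min watermark."""
    return z["free"] < z["min"]

lines = [
    "Node 0 Normal free:34728kB min:42952kB low:53688kB",
    "Node 1 Normal free:33484kB min:45096kB low:56368kB",
]

for line in lines:
    z = zone_watermarks(line)
    state = "below min" if below_min(z) else "ok"
    print(f"node {z['node']} {z['zone']}: free={z['free']}kB min={z['min']}kB -> {state}")
```

Both zones come out below min, which matches the answer's point: the allocation failed on watermarks, not on a lack of contiguous blocks.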

Answer 2: (score: 0)

Update with slabinfo. This was taken after the node was rebooted.

# name            <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kvm_async_pf           0      0    136   30    1 : tunables    0    0    0 : slabdata      0      0      0
kvm_vcpu               0      0  16256    2    8 : tunables    0    0    0 : slabdata      0      0      0
kvm_mmu_page_header      0      0    168   48    2 : tunables    0    0    0 : slabdata      0      0      0
fusion_ioctx        5005   5005    296   55    4 : tunables    0    0    0 : slabdata     91     91      0
fusion_user_ll_request      0      0   3960    8    8 : tunables    0    0    0 : slabdata      0      0      0
ext4_groupinfo_4k 131670 131670    136   30    1 : tunables    0    0    0 : slabdata   4389   4389      0
ip6_dst_cache       1260   1260    384   42    4 : tunables    0    0    0 : slabdata     30     30      0
UDPLITEv6              0      0   1088   30    8 : tunables    0    0    0 : slabdata      0      0      0
UDPv6                330    330   1088   30    8 : tunables    0    0    0 : slabdata     11     11      0
tw_sock_TCPv6        128    128    256   32    2 : tunables    0    0    0 : slabdata      4      4      0
TCPv6                288    288   1984   16    8 : tunables    0    0    0 : slabdata     18     18      0
kcopyd_job             0      0   3312    9    8 : tunables    0    0    0 : slabdata      0      0      0
dm_uevent              0      0   2632   12    8 : tunables    0    0    0 : slabdata      0      0      0
cfq_queue              0      0    232   35    2 : tunables    0    0    0 : slabdata      0      0      0
bsg_cmd                0      0    312   52    4 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache     36     36    896   36    8 : tunables    0    0    0 : slabdata      1      1      0
fuse_request           0      0    416   39    4 : tunables    0    0    0 : slabdata      0      0      0
fuse_inode             0      0    768   42    8 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_key_record_cache      0      0    576   28    4 : tunables    0    0    0 : slabdata      0      0      0
ecryptfs_inode_cache      0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
fat_inode_cache        0      0    712   46    8 : tunables    0    0    0 : slabdata      0      0      0
fat_cache              0      0     40  102    1 : tunables    0    0    0 : slabdata      0      0      0
hugetlbfs_inode_cache     54     54    600   54    8 : tunables    0    0    0 : slabdata      1      1      0
jbd2_journal_handle   2040   2040     48   85    1 : tunables    0    0    0 : slabdata     24     24      0
jbd2_journal_head   5071   5364    112   36    1 : tunables    0    0    0 : slabdata    149    149      0
jbd2_revoke_table_s   1792   1792     16  256    1 : tunables    0    0    0 : slabdata      7      7      0
jbd2_revoke_record_s   1536   1536     32  128    1 : tunables    0    0    0 : slabdata     12     12      0
ext4_inode_cache   75129  78771    984   33    8 : tunables    0    0    0 : slabdata   2387   2387      0
ext4_free_data      5952   6656     64   64    1 : tunables    0    0    0 : slabdata    104    104      0
ext4_allocation_context    768    768    128   32    1 : tunables    0    0    0 : slabdata     24     24      0
ext4_io_end         1344   1344     72   56    1 : tunables    0    0    0 : slabdata     24     24      0
ext4_extent_status  37921  38352     40  102    1 : tunables    0    0    0 : slabdata    376    376      0
dquot                768    768    256   32    2 : tunables    0    0    0 : slabdata     24     24      0
dnotify_mark         782    782    120   34    1 : tunables    0    0    0 : slabdata     23     23      0
pid_namespace          0      0   2192   14    8 : tunables    0    0    0 : slabdata      0      0      0
posix_timers_cache      0      0    248   33    2 : tunables    0    0    0 : slabdata      0      0      0
UDP-Lite               0      0    896   36    8 : tunables    0    0    0 : slabdata      0      0      0
xfrm_dst_cache         0      0    448   36    4 : tunables    0    0    0 : slabdata      0      0      0
ip_fib_trie          146    146     56   73    1 : tunables    0    0    0 : slabdata      2      2      0
UDP                  828    828    896   36    8 : tunables    0    0    0 : slabdata     23     23      0
tw_sock_TCP          992   1152    256   32    2 : tunables    0    0    0 : slabdata     36     36      0
TCP                  450    450   1792   18    8 : tunables    0    0    0 : slabdata     25     25      0
blkdev_queue         120    136   1896   17    8 : tunables    0    0    0 : slabdata      8      8      0
blkdev_requests     3358   3569    376   43    4 : tunables    0    0    0 : slabdata     83     83      0
blkdev_ioc           964   1287    104   39    1 : tunables    0    0    0 : slabdata     33     33      0
user_namespace         0      0    264   31    2 : tunables    0    0    0 : slabdata      0      0      0
sock_inode_cache    1377   1377    640   51    8 : tunables    0    0    0 : slabdata     27     27      0
net_namespace          0      0   4736    6    8 : tunables    0    0    0 : slabdata      0      0      0
shmem_inode_cache   2112   2112    672   48    8 : tunables    0    0    0 : slabdata     44     44      0
ftrace_event_file   1196   1196     88   46    1 : tunables    0    0    0 : slabdata     26     26      0
taskstats            196    196    328   49    4 : tunables    0    0    0 : slabdata      4      4      0
proc_inode_cache   63037  63250    648   50    8 : tunables    0    0    0 : slabdata   1265   1265      0
sigqueue            1224   1224    160   51    2 : tunables    0    0    0 : slabdata     24     24      0
bdev_cache           819    819    832   39    8 : tunables    0    0    0 : slabdata     21     21      0
kernfs_node_cache  54360  54360    112   36    1 : tunables    0    0    0 : slabdata   1510   1510      0
mnt_cache            510    510    320   51    4 : tunables    0    0    0 : slabdata     10     10      0
inode_cache        16813  19712    584   28    4 : tunables    0    0    0 : slabdata    704    704      0
dentry            144206 144606    192   42    2 : tunables    0    0    0 : slabdata   3443   3443      0
iint_cache             0      0     72   56    1 : tunables    0    0    0 : slabdata      0      0      0
buffer_head       6905641 6922305    104   39    1 : tunables    0    0    0 : slabdata 177495 177495      0
vm_area_struct     16764  16764    184   44    2 : tunables    0    0    0 : slabdata    381    381      0
mm_struct           1008   1008    896   36    8 : tunables    0    0    0 : slabdata     28     28      0
files_cache         1377   1377    640   51    8 : tunables    0    0    0 : slabdata     27     27      0
signal_cache        1380   1380   1088   30    8 : tunables    0    0    0 : slabdata     46     46      0
sighand_cache       1020   1020   2112   15    8 : tunables    0    0    0 : slabdata     68     68      0
task_xstate         1638   1638    832   39    8 : tunables    0    0    0 : slabdata     42     42      0
task_struct          837    855   6480    5    8 : tunables    0    0    0 : slabdata    171    171      0
Acpi-ParseExt       2968   2968     72   56    1 : tunables    0    0    0 : slabdata     53     53      0
Acpi-State           561    561     80   51    1 : tunables    0    0    0 : slabdata     11     11      0
Acpi-Namespace      3162   3162     40  102    1 : tunables    0    0    0 : slabdata     31     31      0
anon_vma           19313  19584     64   64    1 : tunables    0    0    0 : slabdata    306    306      0
shared_policy_node   7735   7735     48   85    1 : tunables    0    0    0 : slabdata     91     91      0
numa_policy          170    170     24  170    1 : tunables    0    0    0 : slabdata      1      1      0
radix_tree_node   2870899 2871624    584   28    4 : tunables    0    0    0 : slabdata 102558 102558      0
idr_layer_cache      555    555   2112   15    8 : tunables    0    0    0 : slabdata     37     37      0
dma-kmalloc-8192       0      0   8192    4    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-4096       0      0   4096    8    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-2048       0      0   2048   16    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-1024       0      0   1024   32    8 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-512        0      0    512   32    4 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-256        0      0    256   32    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-128        0      0    128   32    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-64         0      0     64   64    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-32         0      0     32  128    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-16         0      0     16  256    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-8          0      0      8  512    1 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-192        0      0    192   42    2 : tunables    0    0    0 : slabdata      0      0      0
dma-kmalloc-96         0      0     96   42    1 : tunables    0    0    0 : slabdata      0      0      0
kmalloc-8192         180    180   8192    4    8 : tunables    0    0    0 : slabdata     45     45      0
kmalloc-4096         636    720   4096    8    8 : tunables    0    0    0 : slabdata     90     90      0
kmalloc-2048        6498   6688   2048   16    8 : tunables    0    0    0 : slabdata    418    418      0
kmalloc-1024        4677   4800   1024   32    8 : tunables    0    0    0 : slabdata    150    150      0
kmalloc-512         9029   9056    512   32    4 : tunables    0    0    0 : slabdata    283    283      0
kmalloc-256        31542  31840    256   32    2 : tunables    0    0    0 : slabdata    995    995      0
kmalloc-192        16548  16548    192   42    2 : tunables    0    0    0 : slabdata    394    394      0
kmalloc-128         8449   8544    128   32    1 : tunables    0    0    0 : slabdata    267    267      0
kmalloc-96         20607  21462     96   42    1 : tunables    0    0    0 : slabdata    511    511      0
kmalloc-64         71408  75968     64   64    1 : tunables    0    0    0 : slabdata   1187   1187      0
kmalloc-32          5760   5760     32  128    1 : tunables    0    0    0 : slabdata     45     45      0
kmalloc-16         13824  13824     16  256    1 : tunables    0    0    0 : slabdata     54     54      0
kmalloc-8          45056  45056      8  512    1 : tunables    0    0    0 : slabdata     88     88      0
kmem_cache_node      551    576     64   64    1 : tunables    0    0    0 : slabdata      9      9      0
kmem_cache           256    256    256   32    2 : tunables    0    0    0 : slabdata      8      8      0
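A listing like this is easier to read once it is ranked by memory footprint. The sketch below is one way to do that, assuming the slabinfo 2.x column layout shown in the header line and a 4 KiB page size; the function name is illustrative and the three sample rows are copied from the listing above. Note that this listing was taken after a reboot, so it does not show the state at OOM time, when the Mem-Info block reported slab_unreclaimable:11476749 pages (about 43.8 GiB at 4 KiB per page).

```python
def slab_usage_kib(text, page_kib=4):
    """Return [(cache_name, KiB)] sorted by size, computed as num_slabs * pagesperslab * page size."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 16 whitespace-separated fields with ':' in position 6
        if line.startswith("#") or len(fields) != 16 or fields[6] != ":":
            continue
        name = fields[0]
        pages_per_slab = int(fields[5])
        num_slabs = int(fields[14])
        rows.append((name, num_slabs * pages_per_slab * page_kib))
    return sorted(rows, key=lambda r: r[1], reverse=True)

sample = """\
buffer_head       6905641 6922305    104   39    1 : tunables    0    0    0 : slabdata 177495 177495      0
radix_tree_node   2870899 2871624    584   28    4 : tunables    0    0    0 : slabdata 102558 102558      0
kmem_cache           256    256    256   32    2 : tunables    0    0    0 : slabdata      8      8      0
"""

for name, kib in slab_usage_kib(sample):
    print(f"{name:20s} {kib:>10d} KiB")
```

Feeding it the full contents of /proc/slabinfo (which normally requires root) at intervals while memory usage grows would show which cache accounts for the unreclaimable slab.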