C++ application fails to allocate more huge pages than a certain limit

Date: 2018-04-11 15:12:32

Tags: c++ memory-management heap-memory huge-pages

Overview

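I have a C++ application that reads a large amount of data (~1T). I run it using huge pages (614400 pages of 2M each), and this works, until it hits 128G.

To test this, I wrote a simple C++ application that allocates 2M chunks until it can't.

I run the application with the following command: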

LD_PRELOAD=/usr/lib64/libhugetlbfs.so HUGETLB_MORECORE=yes ./a.out

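While it runs, I monitor the number of free huge pages (from /proc/meminfo). I can see that it consumes huge pages at the expected rate.

However, at 128G allocated (or 65536 pages), the application crashes with a std::bad_alloc exception.

If I run two or more instances at the same time, each of them crashes at 128G.

If I lower the cgroup limit to something small, say 16G, the application crashes at that point with a "bus error", as expected.

Am I missing something trivial? Please see the details below.

I'm running out of ideas...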

Details

Machine, OS and software:

CPU    :  Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Memory :  1.5T
Kernel :  3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
OS     :  CentOS Linux release 7.4.1708 (Core)

hugetlbfs : 2.16-12.el7
gcc       : 7.2.1 20170829

The simple test code I use (it allocates 2M chunks until the number of free huge pages drops below a limit):

#include <iostream>
#include <fstream>
#include <vector>
#include <array>
#include <string>


// parenthesized so the macros expand safely inside larger expressions
#define MEM512K (512*1024ul)
#define MEM2M   (4*MEM512K)

// data block
template <size_t N>
struct DataBlock {
  char data[N];
};

// Hugepage info
struct HugePageInfo {
  size_t memfree;
  size_t total;
  size_t free;
  size_t size;
  size_t used;
  double used_size;
};

// dump hugepage info
void dumpHPI(const HugePageInfo & hpi) {
  std::cout << "HugePages total : " << hpi.total << std::endl;
  std::cout << "HugePages free  : " << hpi.free << std::endl;
  std::cout << "HugePages size  : " << hpi.size << std::endl;
}

// dump hugepage info in one line
void dumpHPIline(const size_t i, const HugePageInfo & hpi) {
  std::cout << i << " "
            << hpi.memfree << " "
            << hpi.total-hpi.free << " "
            << hpi.free << " "
            << hpi.used_size
            << std::endl;
}

// get hugepage info from /proc/meminfo
void getHugePageInfo( HugePageInfo & hpi ) {
  std::ifstream fmeminfo;

  fmeminfo.open("/proc/meminfo",std::ifstream::in);
  std::string line;
  // read line by line; the loop ends as soon as getline fails
  while (std::getline(fmeminfo,line)) {
    const size_t sep = line.find_first_of(':');
    if (sep==std::string::npos) continue;

    const std::string lblstr = line.substr(0,sep);
    const size_t      endpos = line.find(" kB");
    const std::string trmstr = line.substr(sep+1,(endpos==std::string::npos ? line.size() : endpos-sep-1));
    const size_t      startpos = trmstr.find_first_not_of(' ');
    const std::string valstr = (startpos==std::string::npos ? trmstr : trmstr.substr(startpos) );
    // std::stoul rather than std::stoi: kB values can exceed the range of int
    if (lblstr=="HugePages_Total") {
      hpi.total = std::stoul(valstr);
    } else if (lblstr=="HugePages_Free") {
      hpi.free = std::stoul(valstr);
    } else if (lblstr=="Hugepagesize") {
      hpi.size = std::stoul(valstr);
    } else if (lblstr=="MemFree") {
      hpi.memfree = std::stoul(valstr);
    }
  }
  hpi.used = hpi.total - hpi.free;
  hpi.used_size = double(hpi.used*hpi.size)/1024.0/1024.0;
}
// allocate data
void test_rnd_data() {
  typedef DataBlock<MEM2M> elem_t;
  HugePageInfo hpi{};  // zero-initialize in case a field is missing from /proc/meminfo
  getHugePageInfo(hpi);
  dumpHPIline(0,hpi);
  std::array<elem_t *,MEM512K> memmap;
  for (size_t i=0; i<memmap.size(); i++) memmap[i]=nullptr;

  for (size_t i=0; i<memmap.size(); i++) {
    // allocate a new 2M block
    memmap[i] = new elem_t();

    // output progress
    if (i%1000==0) {
      getHugePageInfo(hpi); 
      dumpHPIline(i,hpi);
      if (hpi.free<1000) break;
    }
  }
  std::cout << "Cleaning up...." << std::endl;
  for (size_t i=0; i<memmap.size(); i++) {
    if (memmap[i]==nullptr) continue;
    delete memmap[i];
  }
}

int main(int argc, const char** argv) {
  test_rnd_data();
}

Huge pages are set up at boot time: 614400 pages of 2M each.

From /proc/meminfo:

MemTotal:       1584978368 kB
MemFree:        311062332 kB
MemAvailable:   309934096 kB
Buffers:            3220 kB
Cached:           613396 kB
SwapCached:            0 kB
Active:           556884 kB
Inactive:         281648 kB
Active(anon):     224604 kB
Inactive(anon):    15660 kB
Active(file):     332280 kB
Inactive(file):   265988 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:       2097148 kB
SwapFree:        2097148 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:        222280 kB
Mapped:            89784 kB
Shmem:             18348 kB
Slab:             482556 kB
SReclaimable:     189720 kB
SUnreclaim:       292836 kB
KernelStack:       11248 kB
PageTables:        14628 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:    165440732 kB
Committed_AS:    1636296 kB
VmallocTotal:   34359738367 kB
VmallocUsed:     7789100 kB
VmallocChunk:   33546287092 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
HugePages_Total:   614400
HugePages_Free:    614400
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:      341900 kB
DirectMap2M:    59328512 kB
DirectMap1G:    1552941056 kB

Limits from ulimit:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 6191203
max locked memory       (kbytes, -l) 1258291200
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

cgroup limit:

> cat /sys/fs/cgroup/hugetlb/hugetlb.2MB.limit_in_bytes
9223372036854771712

Test

Output when running the test code with HUGETLB_DEBUG=1:

...
libhugetlbfs [abc:185885]: INFO: Attempting to map 2097152 bytes
libhugetlbfs [abc:185885]: INFO: ... = 0x1ffb200000
libhugetlbfs [abc:185885]: INFO: hugetlbfs_morecore(2097152) = ...
libhugetlbfs [abc:185885]: INFO: heapbase = 0xa00000, heaptop = 0x1ffb400000, mapsize = 1ffaa00000, delta=2097152
libhugetlbfs [abc:185885]: INFO: Attempting to map 2097152 bytes
libhugetlbfs [abc:185885]: WARNING: New heap segment map at 0x1ffb400000 failed: Cannot allocate memory
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
Aborted (core dumped)

Using strace:

...
mmap(0x1ffb400000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa200000) = 0x1ffb400000
mmap(0x1ffb600000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa400000) = 0x1ffb600000
mmap(0x1ffb800000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa600000) = 0x1ffb800000
mmap(0x1ffba00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffa800000) = 0x1ffba00000
mmap(0x1ffbc00000, 2097152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_HUGETLB, -1, 0x1ffaa00000) = -1 ENOMEM (Cannot allocate memory)
write(2, "libhugetlbfs", 12)            = 12
write(2, ": WARNING: New heap segment map "..., 79) = 79
mmap(NULL, 3149824, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 134217728, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 67108864, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_NORESERVE, -1, 0) = -1 ENOMEM (Cannot allocate memory)
mmap(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
write(2, "terminate called after throwing "..., 48) = 48
write(2, "std::bad_alloc", 14)          = 14
write(2, "'\n", 2)                      = 2
write(2, "  what():  ", 11)             = 11
write(2, "std::bad_alloc", 14)          = 14
write(2, "\n", 1)                       = 1
rt_sigprocmask(SIG_UNBLOCK, [ABRT], NULL, 8) = 0
gettid()                                = 188617
tgkill(188617, 188617, SIGABRT)         = 0
--- SIGABRT {si_signo=SIGABRT, si_code=SI_TKILL, si_pid=188617, si_uid=1001} ---

Finally, in /proc/pid/numa_maps:

...
1ffb000000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb200000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb400000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb600000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
1ffb800000 default file=/anon_hugepage\040(deleted) huge anon=1 dirty=1 N1=1 kernelpagesize_kB=2048
...

1 answer:

Answer 0 (score: 27)

  

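"However, at 128G allocated (or 65536 pages), the application crashes with a std::bad_alloc exception."

You are allocating too many small segments; there is a limit on how many memory mappings a single process can have.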

sysctl -n vm.max_map_count

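Your test tries to create up to 512 * 1024 == 524288 mappings, one per 2M block (each block is 1024 * 512 * 4 == 2097152 bytes), but the default value of vm.max_map_count is only about 65536 (typically 65530).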

You can change it with:

sysctl -w vm.max_map_count=3000000

Or you can allocate bigger segments in your code.