mlock on a huge page returns ENOMEM in a k8s pod

Asked: 2019-06-04 17:22:02

Tags: kubernetes huge-pages

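I am running a program in a k8s pod container that has the SYS_ADMIN capability. The program allocates a 2 MB huge page, which succeeds. It then calls mlock() on that memory, and that call fails.

I looked at the ENOMEM entries in the man page, and none of the listed causes explains the problem.

I tried running the program on the host, and it works fine.

I tried running the program in a docker container with SYS_ADMIN, using the same image.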

I compared the OCI config.json files for the direct Docker case and the k8s case, and I could not see anything interesting in the differences.

Versions

axe@axe-tester:~$ cat /proc/version
Linux version 4.15.0-29-generic (buildd@lgw01-amd64-057) (gcc version 7.3.0 (Ubuntu 7.3.0-16ubuntu3)) #31-Ubuntu SMP Tue Jul 17 15:39:52 UTC 2018
axe@axe-tester:~$ kubectl version
Client Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:23:09Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"14", GitVersion:"v1.14.2", GitCommit:"66049e3b21efe110454d67df4fa62b08ea79a19b", GitTreeState:"clean", BuildDate:"2019-05-16T16:14:56Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
axe@axe-tester:~$ docker version
Client:
 Version:           18.09.2
 API version:       1.39
 Go version:        go1.10.4
 Git commit:        6247962
 Built:             Tue Feb 26 23:52:23 2019
 OS/Arch:           linux/amd64
 Experimental:      false

Server:
 Engine:
  Version:          18.09.2
  API version:      1.39 (minimum version 1.12)
  Go version:       go1.10.4
  Git commit:       6247962
  Built:            Wed Feb 13 00:24:14 2019
  OS/Arch:          linux/amd64
  Experimental:     false

The following yaml file was used, with the test program located at /tmp/test:

apiVersion: v1
kind: Pod
metadata:
  name: test
  annotations:
    seccomp.security.alpha.kubernetes.io/pod: docker/default
spec:
  restartPolicy: Never
  containers:
    - name: t
      image: amazonlinux:2
      imagePullPolicy: Never
      command: ["sleep", "1200"]
      securityContext:
        capabilities:
          add: ["SYS_ADMIN", "IPC_LOCK"]
      volumeMounts:
      - mountPath: /test
        name: test
      resources:
        limits:
          hugepages-2Mi: 100Mi
          memory: 100Mi
        requests:
          memory: 100Mi
  volumes:
  - name: test
    hostPath:
      path: /tmp/test

Steps to reproduce:

 kubectl create -f /tmp/test.yml
 kubectl exec -it  test  -- /bin/bash
 # in the container..
 bash-4.2# /test 
 Previous limits: soft=16777216; hard=16777216
 mlock failed: Cannot allocate memory
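
One of the ENOMEM causes listed in the mlock(2) man page is a request that exceeds the RLIMIT_MEMLOCK soft limit, and that cause can be ruled out with a small check. The following is a minimal sketch and not part of the original test program. The helper name `fits_memlock_limit` is made up for illustration, and it deliberately ignores memory that is already locked as well as the fact that the limit is not enforced for a process holding CAP_IPC_LOCK.

```c
#include <sys/resource.h>

/* Hypothetical helper (not from the original program): returns 1 if a request
 * of `size` bytes fits under the given RLIMIT_MEMLOCK soft limit, else 0.
 * Assumption: nothing else is locked by the process yet. */
static int fits_memlock_limit(rlim_t soft, unsigned long long size)
{
    if (soft == RLIM_INFINITY)
        return 1;
    return size <= (unsigned long long)soft ? 1 : 0;
}
```

With the soft limit printed in the transcript above (16777216 bytes) and a single 2 MB huge page (2097152 bytes), the helper returns 1, so the rlimit on its own does not appear to explain the failure.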

Test program

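#define _GNU_SOURCE
#include <stdio.h>
#include <sys/mman.h>
#include <sys/resource.h>

/* Older libc headers may not define these; 21 is log2(2 MB). */
#ifndef MAP_HUGE_SHIFT
#define MAP_HUGE_SHIFT 26
#endif
#ifndef MAP_HUGE_2MB
#define MAP_HUGE_2MB (21 << MAP_HUGE_SHIFT)
#endif

#define MMAP_FLAGS (MAP_PRIVATE | MAP_HUGETLB | MAP_HUGE_2MB | MAP_ANONYMOUS)
#define MMAP_MIN_SIZE (2 * 1024 * 1024)

/* Map `size` bytes backed by 2 MB huge pages and pin them.
 * Returns MAP_FAILED on error. */
void *dma_mp_mmap_hugetlb(size_t size)
{
    int err = 0;
    void *va = NULL;

    va = mmap(0, size, PROT_READ | PROT_WRITE, MMAP_FLAGS, -1, 0);
    if (va == MAP_FAILED) {
        perror("mmap failed");
        return MAP_FAILED;
    }

    /* Pin the memory */
    err = mlock(va, size);
    if (err) {
        perror("mlock failed");
        return MAP_FAILED;
    }

    return va;
}

int main(void)
{
    struct rlimit old;

    getrlimit(RLIMIT_MEMLOCK, &old);
    printf("Previous limits: soft=%lld; hard=%lld\n",
           (long long) old.rlim_cur, (long long) old.rlim_max);

    /* The helper reports failure as MAP_FAILED, not NULL, so the original
     * assert(... != NULL) could never fire. Use the exit status instead. */
    return dma_mp_mmap_hugetlb(MMAP_MIN_SIZE) == MAP_FAILED ? 1 : 0;
}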

1 Answer:

Answer 0: (score: 0)

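Check the parent cgroup, for example /sys/fs/cgroup/hugetlb/kubepods.slice/hugetlb.2MB.limit_in_bytes. Its value may be 0 or smaller than what your pod requests, in which case you have to restart the kubelet to refresh the cgroup.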
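
The check described above can also be done from a program. This is only a sketch: the helper names `read_cgroup_value` and `hugetlb_limit_covers` are invented for illustration, and it assumes the cgroup v1 layout in which the limit file holds a single decimal byte count.

```c
#include <stdio.h>

/* Reads one unsigned integer from a cgroup control file such as
 * hugetlb.2MB.limit_in_bytes. Returns 0 on success, -1 on failure. */
static int read_cgroup_value(const char *path, unsigned long long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (f == NULL)
        return -1;
    ok = (fscanf(f, "%llu", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}

/* Hypothetical helper: returns 1 if the limit stored at `path` is at least
 * `need` bytes, 0 if it is smaller, and -1 if the file cannot be read. */
static int hugetlb_limit_covers(const char *path, unsigned long long need)
{
    unsigned long long limit;

    if (read_cgroup_value(path, &limit) != 0)
        return -1;
    return limit >= need ? 1 : 0;
}
```

For the pod in the question, the path given in this answer would be passed together with `need` set to 100 * 1024 * 1024, matching the `hugepages-2Mi: 100Mi` limit. A result of 0 would point at the stale parent cgroup limit the answer describes.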