Question

On Linux, if I call malloc(1024 * 1024 * 1024), what does malloc actually do?

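I'm sure it assigns a virtual address range to the allocation (by walking the free list and creating a new mapping if necessary), but does it actually create 1 GiB worth of swap pages? Or does it mprotect the address range and create the pages only when you actually touch them, as mmap does?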

(I'm specifying Linux because the standard is silent on these kinds of details, but I'd be interested to know what other platforms do as well.)

Answer 1

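Linux does defer page allocation; this is known as 'optimistic memory allocation'. The memory you get back from malloc is not backed by anything, and when you touch it you may actually hit an OOM condition (if there is no swap space for the page you request), in which case a process is unceremoniously terminated.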

See for example http://www.linuxdevcenter.com/pub/a/linux/2006/11/30/linux-out-of-memory.html
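To make "not backed by anything" concrete, here is a minimal sketch that compares the resident set size before and after touching a freshly malloc'ed block. It assumes Linux (it reads /proc/self/statm); the helper names resident_pages, touch_pages and pages_gained_by_touching are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Resident set size of this process in pages, taken from the second
 * field of /proc/self/statm (Linux-specific). Returns -1 on failure. */
long resident_pages(void) {
    long total = 0, resident = -1;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%ld %ld", &total, &resident) != 2)
        resident = -1;
    fclose(f);
    return resident;
}

/* Write one byte into every page of the buffer. The volatile pointer
 * keeps the compiler from discarding the stores. */
void touch_pages(char *buf, size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(buf != NULL && page > 0);
    volatile char *p = buf;
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 1;
}

/* Hypothetical helper: number of pages that became resident after
 * touching a fresh block of `len` bytes, or -1 on error. */
long pages_gained_by_touching(size_t len) {
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    long before = resident_pages();   /* block exists, nothing touched yet */
    touch_pages(buf, len);
    long after = resident_pages();    /* every page has been written once */
    free(buf);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Calling pages_gained_by_touching((size_t)32 << 20) from a main function and printing the result should show the resident size growing only at the touching step; the exact figure depends on the page size and the kernel.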

Answer 2

9. Memory (part of The Linux kernel, Some remarks on the Linux Kernel by Andries Brouwer) is a good document.

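It contains the following programs, which demonstrate how Linux handles memory that has merely been allocated versus memory that is actually used, and it explains the kernel internals involved.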

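Typically, the first demo program will get a very large amount of memory before malloc() returns NULL. The second demo program will get a much smaller amount, because the memory obtained earlier is now actually used. The third program will get the same large amount as the first program, and then it is killed when it tries to use that memory.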

Demo program 1: allocate memory without using it.

#include <stdio.h>
#include <stdlib.h>

int main (void) {
    int n = 0;   /* MiB obtained so far */

    /* Request 1 MiB blocks, never touching them, until malloc fails. */
    while (1) {
        if (malloc(1<<20) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        printf ("got %d MiB\n", ++n);
    }
}

Demo program 2: allocate memory and actually touch it all.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

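int main (void) {
    int n = 0;   /* MiB obtained so far */
    char *p;

    while (1) {
        if ((p = malloc(1<<20)) == NULL) {
            printf("malloc failure after %d MiB\n", n);
            return 0;
        }
        /* Writing to the whole block forces the kernel to back it with pages. */
        memset (p, 0, (1<<20));
        printf ("got %d MiB\n", ++n);
    }
}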

Demo program 3: first allocate, and use later.

#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define N       10000

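int main (void) {
    int i, n = 0;
    char *pp[N];

    /* Phase 1: collect up to N blocks of 1 MiB without touching them. */
    for (n = 0; n < N; n++) {
        pp[n] = malloc(1<<20);
        if (pp[n] == NULL)
            break;
    }
    printf("malloc failure after %d MiB\n", n);

    /* Phase 2: touch every block; this is where the process may be killed. */
    for (i = 0; i < n; i++) {
        memset (pp[i], 0, (1<<20));
        printf("%d\n", i+1);
    }

    return 0;
}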

(On a well-functioning system, like Solaris, the three demo programs obtain the same amount of memory and do not crash, but see malloc() return NULL.)
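How far a given Linux machine is from that behaviour depends on its overcommit policy, which the kernel exposes in /proc/sys/vm/overcommit_memory. As a rough sketch (Linux only; read_overcommit_mode and overcommit_mode_name are names invented for this example), the current setting can be inspected like this:

```c
#include <assert.h>
#include <stdio.h>

/* Returns the value of /proc/sys/vm/overcommit_memory (0, 1 or 2),
 * or -1 if the file cannot be read, e.g. on a non-Linux system. */
int read_overcommit_mode(void) {
    int mode = -1;
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f == NULL)
        return -1;
    if (fscanf(f, "%d", &mode) != 1)
        mode = -1;
    fclose(f);
    return mode;
}

/* Human-readable name for a mode; `mode` must be 0, 1 or 2. */
const char *overcommit_mode_name(int mode) {
    static const char *const names[] = {
        "heuristic overcommit",              /* 0: the usual default */
        "always overcommit",                 /* 1 */
        "strict accounting, no overcommit",  /* 2 */
    };
    assert(mode >= 0 && mode <= 2);
    return names[mode];
}
```

A caller would check for -1 first and only then pass the value to overcommit_mode_name. With mode 2 the demo programs above should see malloc() return NULL much earlier instead of being killed later, though I have not measured that here.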

Answer 3

I gave this answer to a similar post on the same subject:

Are some allocators lazy?

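This starts a little off topic (and then I'll tie it in to your question), but what's happening is similar to what happens when you fork a process in Linux. When forking, there is a mechanism called copy-on-write, which only copies the memory space for the new process when the memory is written to. This way, if the forked process execs a new program right away, you've saved the overhead of copying the original program's memory.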
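A small sketch of the visible effect of copy-on-write, assuming a POSIX system: the child writes to a buffer inherited from the parent, and the parent's copy stays as it was. This only shows the isolation that copy-on-write preserves, not the deferred copying itself; child_write_stays_private is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a write made by the child after fork() stayed invisible
 * to the parent, 0 if the parent saw it, -1 on error. */
int child_write_stays_private(size_t len) {
    assert(len > 0);
    char *buf = malloc(len);
    if (buf == NULL)
        return -1;
    memset(buf, 'A', len);              /* parent fills the pages first */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* Child: the first write to each shared page makes the kernel
         * give the child its own private copy of that page. */
        memset(buf, 'B', len);
        _exit(buf[0] == 'B' ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int unchanged = (buf[0] == 'A' && buf[len - 1] == 'A');
    free(buf);
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return unchanged;
}
```

If the child had called exec right after fork instead of writing, none of those pages would have needed copying at all, which is the saving described above.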

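Getting back to your question, the idea is similar. As others have pointed out, requesting the memory gets you the virtual memory space immediately, but the actual pages are only allocated when they are written to.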

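What's the purpose of this? It basically makes mallocing memory a more or less constant-time operation, O(1), instead of an O(n) operation (similar to the way the Linux scheduler spreads its work out instead of doing it in one big chunk).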

To demonstrate what I mean, I did the following experiment:

rbarnes@rbarnes-desktop:~/test_code$ time ./bigmalloc

real    0m0.005s
user    0m0.000s
sys     0m0.004s
rbarnes@rbarnes-desktop:~/test_code$ time ./deadbeef

real    0m0.558s
user    0m0.000s
sys     0m0.492s
rbarnes@rbarnes-desktop:~/test_code$ time ./justwrites

real    0m0.006s
user    0m0.000s
sys     0m0.008s

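The bigmalloc program allocates 20 million ints, but doesn't do anything with them. deadbeef writes one int to each page, resulting in roughly 19531 writes, and justwrites allocates 19531 ints and zeros them out. As you can see, deadbeef takes about 100 times longer to execute than bigmalloc (0.558 s versus 0.005 s of real time) and roughly 90 times longer than justwrites (0.006 s).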

#include <stdlib.h>

/* bigmalloc: allocate the block but never touch it. */
int main(int argc, char **argv) {

    int *big = malloc(sizeof(int)*20000000); // Allocate 80 million bytes

    return 0;
}

#include <stdlib.h>

/* deadbeef: allocate the block, then write one int into every page. */
int main(int argc, char **argv) {

    unsigned int *big = malloc(sizeof(unsigned int)*20000000); // Allocate 80 million bytes
    if (big == NULL)
        return 1;

    // Immediately write to each page to simulate an all-at-once allocation,
    // assuming 4 KiB pages and 4-byte ints (1024 ints per page).

    for (size_t i = 0; i < 20000000; i += 1024)
        big[i] = 0xDEADBEEF;

    return 0;
}

#include <stdlib.h>

/* justwrites: allocate and zero only as many ints as deadbeef writes. */
int main(int argc, char **argv) {

    int *big = calloc(19531, sizeof(int)); // Number of writes

    return 0;
}

Answer 4

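Malloc allocates memory out of blocks managed by libc. When additional memory is needed, the library goes to the kernel using the brk system call. (For large requests, glibc's malloc typically calls mmap instead; the default threshold is 128 KiB.)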

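The kernel allocates pages of virtual memory to the calling process. The pages are managed as part of the resources owned by the process. Physical pages are not allocated when memory is brk'd. When the process accesses any memory location in one of the brk'd pages, a page fault occurs. The kernel validates that the virtual memory has been allocated and proceeds to map a physical page to the virtual page.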
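The two steps can be separated by hand. The sketch below moves the program break directly with sbrk and only afterwards touches the new region. It assumes glibc on Linux (sbrk is a legacy interface, and some C libraries refuse to grow the break this way); grow_break and touch_region are hypothetical helper names.

```c
#define _DEFAULT_SOURCE   /* exposes sbrk() on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

/* Moves the program break up by `bytes` and returns the start of the
 * new region, or NULL on failure. Only virtual address space is
 * reserved here; no physical page is assigned yet. */
void *grow_break(size_t bytes) {
    void *old = sbrk((intptr_t)bytes);
    return old == (void *)-1 ? NULL : old;
}

/* First write to each page of the region. This is the step at which
 * the page faults occur and the kernel maps physical pages. */
void touch_region(void *start, size_t bytes) {
    long page = sysconf(_SC_PAGESIZE);
    assert(start != NULL && page > 0);
    volatile char *p = start;
    for (size_t off = 0; off < bytes; off += (size_t)page)
        p[off] = 1;
}
```

In a real program you would let malloc manage the break rather than calling sbrk yourself; the direct call is used here only to keep the brk step and the touching step visibly apart.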

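Page allocation is not limited to writes and is quite distinct from copy-on-write. Any access, read or write, results in a page fault and the mapping of a physical page.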
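One way to check the read case is to count minor page faults around a loop that only reads. This is a sketch under the assumption of a Linux or BSD system where getrusage fills in ru_minflt and mmap supports MAP_ANONYMOUS; minor_faults and faults_from_reading are names made up for the example.

```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor page faults (those served without disk I/O) taken by this
 * process so far, or -1 on error. */
long minor_faults(void) {
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/* Maps `len` bytes of fresh anonymous memory, reads one byte from each
 * page, and returns how many minor faults the reads caused (-1 on error). */
long faults_from_reading(size_t len) {
    long page = sysconf(_SC_PAGESIZE);
    assert(len > 0 && page > 0);
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = minor_faults();
    unsigned sum = 0;
    for (size_t off = 0; off < len; off += (size_t)page)
        sum += p[off];                 /* read only, never write */
    long after = minor_faults();

    munmap((void *)p, len);
    /* Untouched anonymous memory reads back as zero. */
    if (before < 0 || after < 0 || sum != 0)
        return -1;
    return after - before;
}
```

On a typical Linux system the returned count should be greater than zero for a mapping of a few MiB. It can be far lower than the number of pages read, for instance when the kernel uses transparent huge pages.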

Note that stack memory is mapped automatically. That is, an explicit brk is not required to map pages to the virtual memory used by the stack.

Answer 5

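On Windows, the pages are committed (that is, the amount of available free memory goes down), but they are not actually allocated until you touch them (either read or write).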

Answer 6

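On most Unix-like systems, malloc manages the brk boundary. The VM system adds pages when the processor touches them. At least Linux and the BSDs do this.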

Does malloc lazily create the backing pages for an allocation on Linux (and other platforms)?
