Question

I want to create a program that simulates an out-of-memory (OOM) situation on a Unix server. I wrote this super-simple memory eater:

#include <stdio.h>
#include <stdlib.h>

unsigned long long memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
void *memory = NULL;

int eat_kilobyte()
{
    memory = realloc(memory, (eaten_memory * 1024) + 1024);
    if (memory == NULL)
    {
        // realloc failed here - we probably can't allocate more memory for whatever reason
        return 1;
    }
    else
    {
        eaten_memory++;
        return 0;
    }
}

int main(int argc, char **argv)
{
    printf("I will try to eat %i kb of ram\n", memory_to_eat);
    int megabyte = 0;
    while (memory_to_eat > 0)
    {
        memory_to_eat--;
        if (eat_kilobyte())
        {
            printf("Failed to allocate more memory! Stucked at %i kb :(\n", eaten_memory);
            return 200;
        }
        if (megabyte++ >= 1024)
        {
            printf("Eaten 1 MB of ram\n");
            megabyte = 0;
        }
    }
    printf("Successfully eaten requested memory!\n");
    free(memory);
    return 0;
}

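It eats as much memory as memory_to_eat specifies, which is currently about 50 GB of RAM. It grows the allocation 1 KB at a time, reports each megabyte, and prints the exact point at which it can no longer allocate, so that I know the maximum it managed to eat.

The problem is that it works. Even on a system with 1 GB of physical memory.

When I check top, I see that the process uses 50 GB of virtual memory but less than 1 MB of resident memory. Is there a way to write a memory eater that really does consume the memory?

System specifications: Linux kernel 3.16 (Debian), most likely with overcommit enabled (I am not sure how to check), with no swap, and virtualized.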

Answer 1

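When your malloc() implementation requests memory from the kernel (via an sbrk() or mmap() system call), the kernel only makes a note that you have requested the memory and where it will be placed in your address space. It does not actually map those pages yet.

When the process later accesses memory within the new region, the hardware raises a page fault and hands control to the kernel. The kernel looks up the page in its own data structures, finds that you should have a zero page there, maps in a zero page (possibly evicting a page from the page cache first), and returns from the fault. Your process does not notice any of this; the kernel's work is completely transparent, apart from the short delay while it runs.

This optimization lets the system call return very quickly and, most importantly, avoids committing any resources to your process at the time the mapping is made. It allows processes to reserve rather large buffers that they never need under normal circumstances, without fear of gobbling up too much memory.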

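So, if you want to write a memory eater, you absolutely have to do something with the memory you allocate. For this, you only need to add a single line to your code: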

int eat_kilobyte()
{
    if (memory == NULL)
        memory = malloc(1024);
    else
        memory = realloc(memory, (eaten_memory * 1024) + 1024);
    if (memory == NULL)
    {
        return 1;
    }
    else
    {
        //Force the kernel to map the containing memory page.
        ((char*)memory)[1024*eaten_memory] = 42;

        eaten_memory++;
        return 0;
    }
}

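Note that writing a single byte within each page (4096 bytes on x86) is perfectly sufficient. That is because the kernel hands memory to a process at page granularity, which in turn is because the hardware does not support paging at smaller granularities.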

Answer 2

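All the virtual pages start out copy-on-write mapped to the same zeroed physical page. To use up physical pages, you can dirty them by writing something to each virtual page.

If running as root, you can use mlock(2) or mlockall(2) to have the kernel wire up the pages when they are allocated, without having to dirty them. (Normal non-root users typically have a ulimit -l of only 64 kiB.)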

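As many others have suggested, it seems that the Linux kernel does not really allocate the memory unless you write to it.

An improved version of the code, which does what the OP wanted:

This also fixes the printf format string mismatches with the types of memory_to_eat and eaten_memory, using the z length modifier to print size_t values. The amount of memory to eat, in kiB, can optionally be given as a command-line argument.

The messy design using global variables, and growing by 1 kiB instead of by 4 kiB pages, is unchanged.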

#include <stdio.h>
#include <stdlib.h>

size_t memory_to_eat = 1024 * 50000;
size_t eaten_memory = 0;
char *memory = NULL;

void write_kilobyte(char *pointer, size_t offset)
{
    int size = 0;
    while (size < 1024)
    {   // writing one byte per page is enough, this is overkill
        pointer[offset + (size_t) size++] = 1;
    }
}

int eat_kilobyte()
{
    if (memory == NULL)
    {
        memory = malloc(1024);
    } else
    {
        memory = realloc(memory, (eaten_memory * 1024) + 1024);
    }
    if (memory == NULL)
    {
        return 1;
    }
    else
    {
        write_kilobyte(memory, eaten_memory * 1024);
        eaten_memory++;
        return 0;
    }
}

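int main(int argc, char **argv)
{
    // Optional first argument: amount of memory to eat, in kiB.
    if (argc >= 2)
        memory_to_eat = strtoull(argv[1], NULL, 10);

    // %zu is the conversion for size_t (an unsigned type).
    printf("I will try to eat %zu kb of ram\n", memory_to_eat);
    int megabyte = 0;   // kiB eaten since the last progress message
    int megabytes = 0;  // whole MiB eaten so far
    while (memory_to_eat-- > 0)
    {
        if (eat_kilobyte())
        {
            printf("Failed to allocate more memory at %zu kb :(\n", eaten_memory);
            return 200;
        }
        // Pre-increment so the message appears after exactly 1024 kiB.
        if (++megabyte >= 1024)
        {
            megabytes++;
            printf("Eaten %i MB of ram\n", megabytes);
            megabyte = 0;
        }
    }
    printf("Successfully eaten requested memory!\n");
    free(memory);
    return 0;
}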

Answer 3

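A sensible optimisation is being made here. The runtime does not actually acquire the memory until you use it.

A simple memcpy will be sufficient to circumvent this optimisation. (You might find that calloc still defers the memory allocation until the point of use.)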

Answer 4

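I am not sure about this, but the only explanation I can think of is that Linux is a copy-on-write operating system. When fork is called, both processes point to the same physical memory. The memory is only copied once one of the processes actually writes to it.

I think that here, the actual physical memory is only allocated when something is written to it. Calling sbrk or mmap may well only update the kernel's memory bookkeeping. The actual RAM may only be allocated when we actually try to access the memory.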

Why doesn't this memory eater really eat memory?
