Question

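I have been wondering what happens if I assign a longer string literal to a character array of smaller size. (I know that when I use a string literal as the initializer, I can omit the size and let the compiler count the characters, or use strlen() + 1 as the size.)

I have the following code: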

#include <stdio.h>

int main() {
    char a[3] = "abc"; // with a[2], the compiler reports: initializer-string for array of chars is too long
    printf("%s\n", a);
    printf("%p\n", a);
}

I expected it to crash, but it actually compiles without warnings and prints output. When I run it under valgrind, however, I get the following error messages.

==19195== Memcheck, a memory error detector
==19195== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==19195== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==19195== Command: ./a.out
==19195== 
==19195== Conditional jump or move depends on uninitialised value(s)
==19195==    at 0x4E88CC0: vfprintf (vfprintf.c:1632)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195== 
==19195== Conditional jump or move depends on uninitialised value(s)
==19195==    at 0x4EB475D: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:850)
==19195==    by 0x4EB56AF: _IO_default_xsputn (genops.c:455)
==19195==    by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195==    by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195== 
==19195== Conditional jump or move depends on uninitialised value(s)
==19195==    at 0x4EB478A: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:858)
==19195==    by 0x4EB56AF: _IO_default_xsputn (genops.c:455)
==19195==    by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195==    by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195== 
==19195== Conditional jump or move depends on uninitialised value(s)
==19195==    at 0x4EB56B3: _IO_default_xsputn (genops.c:455)
==19195==    by 0x4EB32C6: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1352)
==19195==    by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195== 
==19195== Syscall param write(buf) points to uninitialised byte(s)
==19195==    at 0x4F306E0: __write_nocancel (syscall-template.S:84)
==19195==    by 0x4EB2BFE: _IO_file_write@@GLIBC_2.2.5 (fileops.c:1263)
==19195==    by 0x4EB4408: new_do_write (fileops.c:518)
==19195==    by 0x4EB4408: _IO_do_write@@GLIBC_2.2.5 (fileops.c:494)
==19195==    by 0x4EB347C: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1331)
==19195==    by 0x4E8792C: vfprintf (vfprintf.c:1663)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195==  Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd
==19195==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==19195==    by 0x4EA71D4: _IO_file_doallocate (filedoalloc.c:127)
==19195==    by 0x4EB5593: _IO_doallocbuf (genops.c:398)
==19195==    by 0x4EB48F7: _IO_file_overflow@@GLIBC_2.2.5 (fileops.c:820)
==19195==    by 0x4EB328C: _IO_file_xsputn@@GLIBC_2.2.5 (fileops.c:1331)
==19195==    by 0x4E8850A: vfprintf (vfprintf.c:1632)
==19195==    by 0x4E8F898: printf (printf.c:33)
==19195==    by 0x4005CC: main (main.c:5)
==19195== 
abc?
0xfff0003f0
==19195== 
==19195== HEAP SUMMARY:
==19195==     in use at exit: 0 bytes in 0 blocks
==19195==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==19195== 
==19195== All heap blocks were freed -- no leaks are possible
==19195== 
==19195== For counts of detected and suppressed errors, rerun with: -v
==19195== Use --track-origins=yes to see where uninitialised values come from
==19195== ERROR SUMMARY: 10 errors from 5 contexts (suppressed: 0 from 0)

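I think the uninitialised value/byte part makes sense: no memory was reserved for the terminating character '\0', so when I print the array, the last character is a garbage value.

But the last error message is unfamiliar to me.

Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd

I know that the buffer size is defined as 1024. I am not sure whether this error is there because of inefficient use of memory.

I would also like to know where the heap allocation and free come from. Is that the string literal?

Thanks for your help!!

(The previous title of this question may have been confusing. I have changed it.)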

A similar question, but in C++

Answer 1

Assigning the string literal "abc" to an array of size 3 causes a valgrind error

The assignment does not cause the valgrind error. char a[3] = "abc" is fine. C allows a character array to be initialized sans the null character.

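Successive bytes of the string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array. C11 §6.7.9 14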

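printf("%s", ...) expects a pointer to a null-terminated array. a is not one, because it has no null character. The code attempts to read beyond a[], which is undefined behavior, and the errors come from that. It is not "inefficient use of memory"; it is a read past the end of the array into uninitialised memory.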

Instead, use one of the following, which prints until a null character is found or 3 characters have been printed.

printf("%.3s\n", a);
// or 
printf("%.*s\n", (int) sizeof a, a);

Answer 2

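Here is my interpretation of what is happening:

You are writing to stdout, which is buffered by default. So all data first goes into an internal buffer and is then written ("flushed") to the actual underlying file descriptor.

Your a array is not a valid string because it has no terminating NUL byte. The first few messages come from inside printf, which tries to work out the length of the argument string by looking for the terminator and copies the contents into stdout's buffer. Since there is no terminator in a, the code runs out of bounds and reads uninitialised memory.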

At this point the output buffer looks like this:

char *buf = malloc(1024), contents:
a b c ? ? ? ?
^^^^^ ^^^^^^^

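The first part (abc) was legitimately copied from a. The next part is random garbage (uninitialised bytes located after a, copied into the buffer). This continues until a NUL byte happens to turn up somewhere after a, which is then treated as the end of the string (this is where copying from a stops).

Finally, the '\n' from the format string is also added to the buffer: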

char *buf = malloc(1024), contents:
a b c ? ? ? ? \n
^^^^^ ^^^^^^^ ^^

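Then (because we encountered '\n' and stdout is line buffered) we flush the buffer, calling write(STDOUT_FILENO, buf, N), where N is however many bytes of the output buffer are in use (at least 4, but the exact number depends on how many garbage bytes were copied before a '\0' was found after a).

Now, the errors: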

==19195== Syscall param write(buf) points to uninitialised byte(s)

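This is saying that there are uninitialised bytes in the second argument of write (the buffer).

Apparently valgrind considers parts of the output buffer uninitialised because the source data was uninitialised. Copying garbage from A to B just means B is garbage too.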

==19195==  Address 0x5203043 is 3 bytes inside a block of size 1,024 alloc'd

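So it says there is a dynamically allocated buffer (of size 1024), and that the uninitialised byte(s) from the previous error were found at offset 3. That makes sense, because offsets 0, 1, 2 contain "abc", which is perfectly valid data. But after that is where the trouble starts.

It also says that the block came from malloc, which was called (indirectly) from printf. That is because stdout's output buffer is created on demand, the first time you write to it. Here, that is the first printf call in your main.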

C: Trying to assign the string literal "abc" to an array of size 3, valgrind detects errors
