Question

I have the following code:

#include <iostream>

char ch[] = "abcd";

int main() {
    std::cout << (long)(int*)(ch+0) << ' '
         << (long)(int*)(ch+1) << ' '
         << (long)(int*)(ch+2) << ' '
         << (long)(int*)(ch+3) << std::endl;

    std::cout << *(int*)(ch+0) << ' '
         << *(int*)(ch+1) << ' '
         << *(int*)(ch+2) << ' '
         << *(int*)(ch+3) << std::endl;
    std::cout << int('abcd') << ' '
         << int('bcd') << ' '
         << int('cd') << ' '
         << int('d') << std::endl;
}

My question is why dereferencing the pointer to 'd' gives 100. I thought it should be:

int('d') << 24; //plus some trash on stack after ch

I would also like to know why the second and third lines of the standard output differ:

6295640 6295641 6295642 6295643

1684234849 6579042 25699 100

1633837924 6447972 25444 100

Thanks.

Answer 1

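int('d') is the character 'd' converted to int, and its decimal value is 100. You can check an ASCII table.

Beyond that, your pointer arithmetic is incorrect: ch holds five bytes (the four letters plus the terminating '\0'), so with a 4-byte int every read through ch + x with x > 1 runs past the end of the array.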

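So why is the last number on the second line 100? It should be 100 << 24 plus some garbage

You probably read the bytes 100, 0, 0, 0 (although the 1st, 2nd and 3rd positions after the 100 could hold any garbage), and because of endianness that is read as 100. The same goes for "the third entry is: (int)('d'*256 + 'c') = 25699" rather than 'c'*256 + 'd'.

In case anyone is interested in why *(int*)(ch + 2) = (int)('d'*256 + 'c') = 25699 while int('cd') is a different number: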

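C++ standard, n3337 § 2.14.3/1

(...) An ordinary character literal that contains a single c-char has type char, with value equal to the numerical value of the encoding of the c-char in the execution character set. An ordinary character literal that contains more than one c-char is a multicharacter literal. A multicharacter literal has type int and implementation-defined value. (...)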

Answer 2

The code does not compile without warnings:

warning: multi-character character constant [-Wmultichar]

The output is:

6296232 6296233 6296234 6296235
1684234849 6579042 25699 100
1633837924 6447972 25444 100

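Explanation: for line 1, suppose the pointer ch has the value 6296232; the line then prints the pointer values of ch, ch+1, ch+2 and ch+3.

For line 2, assuming an int is 4 bytes on a 32-bit, little-endian machine (and that the bytes just past the array happen to be 0),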

1st entry is : (int)('d'*256*256*256 + 'c'*256*256 + 'b'*256 + 'a') = 1684234849 
2nd entry is : (int)('d'*256*256 + 'c'*256 + 'b') = 6579042 
3rd entry is : (int)('d'*256 + 'c') = 25699 
4th entry is : (int)('d') = 100 (ASCII value of 'd')
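
The four entries above can be reproduced without reading past the array. The following is only a sketch: le32_window is a hypothetical helper name, and treating the bytes beyond the buffer as 0 is an assumption that mirrors what this particular run happened to find there, not something the language guarantees.

```cpp
#include <cstddef>
#include <cstdint>

// Assemble four bytes starting at `offset` as a little-endian 32-bit value.
// Bytes beyond the end of the buffer are treated as 0. That is an assumption
// for illustration only: the real program reads whatever is stored there.
std::uint32_t le32_window(const char* buf, std::size_t size, std::size_t offset) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        const std::size_t pos = offset + i;
        const std::uint32_t byte =
            pos < size ? static_cast<unsigned char>(buf[pos]) : 0u;
        value |= byte << (8 * i);  // byte i carries the weight 256^i
    }
    return value;
}
```

Called with the array "abcd" (size 5, including the terminator) and offsets 0 to 3, this yields 1684234849, 6579042, 25699 and 100, matching the second line of the output.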

For line 3, assuming an int is 4 bytes on a 32-bit machine,

1st entry is : (int)('d' + 'c'*256 + 'b'*256*256 + 'a'*256*256*256) = 1633837924
2nd entry is : (int)('d' + 'c'*256 + 'b'*256*256) = 6447972 
3rd entry is : (int)('d' + 'c'*256) = 25444
4th entry is : (int)('d') = 100 (ASCII value of 'd')
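
The line 3 values follow the folding scheme GCC documents for multi-character constants: shift the running value left by one character width, then OR in the next character. The sketch below imitates that scheme with plain arithmetic; pack_multichar is a hypothetical name, and because the standard leaves the value implementation-defined, other compilers may not agree with it.

```cpp
#include <cstdint>
#include <string>

// Fold the characters left to right, so the first character ends up in the
// most significant position. This imitates GCC's documented behaviour for
// multi-character constants; it is not mandated by the C++ standard.
std::uint32_t pack_multichar(const std::string& chars) {
    std::uint32_t value = 0;
    for (unsigned char c : chars) {
        value = (value << 8) | c;
    }
    return value;
}
```

With "abcd", "bcd", "cd" and "d" as input this gives 1633837924, 6447972, 25444 and 100, the same numbers as the third line of the output.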

Answer 3

Well, you are invoking undefined behavior, so what would you expect a sensible answer to be? ;)

The second line invokes undefined behavior:

std::cout << *(int*)(ch+0)

is fine as far as bounds go, because there really are sizeof(int) bytes of data at ch+0, but:

*(int*)(ch+2)

and

*(int*)(ch+3)

involve reading past the end of the array whenever sizeof(int) is 4 bytes or more (and most compilers/platforms use 4 bytes).
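
One way to avoid the out-of-bounds read is to check the range first and copy the bytes with std::memcpy instead of dereferencing a cast pointer. This is a minimal sketch; read_int_at is a hypothetical helper, not something from the question.

```cpp
#include <cstddef>
#include <cstring>

// Copy sizeof(int) bytes starting at buf + offset into `out`, but only when
// the whole range lies inside the buffer. Returns false otherwise and leaves
// `out` untouched.
bool read_int_at(const char* buf, std::size_t size, std::size_t offset, int& out) {
    if (offset > size || size - offset < sizeof(int)) {
        return false;  // the read would run past the end of the array
    }
    // memcpy has no alignment requirement and does not break aliasing rules.
    std::memcpy(&out, buf + offset, sizeof(int));
    return true;
}
```

With a 4-byte int and the five-byte array from the question, the call is rejected for offsets 2 and 3, which are exactly the reads discussed above.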

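So, why did you expect garbage after the array? Why would bytes with the value 0 be unacceptable?

It is undefined behavior, so by definition anything is acceptable. Including 0.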

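So you are reading the integer made of the bytes (100, 0, 0, 0), which is displayed as 100.

Why 100 and not 100 << 24, you ask?

Well, that is a matter of endianness. If your platform is little-endian, (100, 0, 0, 0) is interpreted as 100; if it is big-endian, (100, 0, 0, 0) is interpreted as 100 << 24.

You appear to be on a little-endian platform: all x86 and x86_64 CPUs (Intel, AMD and so on) are little-endian.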
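
The two readings of the same four bytes can be spelled out with shifts, which makes the result independent of the machine running the code. The names below (Bytes, as_little_endian, as_big_endian, host_is_little_endian) are hypothetical and used only for this sketch.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

using Bytes = std::array<unsigned char, 4>;

// Little-endian reading: the first byte is the least significant one.
std::uint32_t as_little_endian(const Bytes& b) {
    return static_cast<std::uint32_t>(b[0]) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           (static_cast<std::uint32_t>(b[2]) << 16) |
           (static_cast<std::uint32_t>(b[3]) << 24);
}

// Big-endian reading: the first byte is the most significant one.
std::uint32_t as_big_endian(const Bytes& b) {
    return (static_cast<std::uint32_t>(b[0]) << 24) |
           (static_cast<std::uint32_t>(b[1]) << 16) |
           (static_cast<std::uint32_t>(b[2]) << 8) |
           static_cast<std::uint32_t>(b[3]);
}

// Reports how the host itself stores a 32-bit integer in memory.
bool host_is_little_endian() {
    const std::uint32_t one = 1;
    unsigned char first = 0;
    std::memcpy(&first, &one, 1);  // inspect the lowest-addressed byte
    return first == 1;
}
```

For the bytes (100, 0, 0, 0), as_little_endian returns 100 and as_big_endian returns 1677721600, which is 100 << 24.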

Note: in std::cout << (long)(int*)(ch+0), the cast to long is unnecessary: an ostream can display a void const*, and there is an implicit conversion from T* to void const*, so you get the address without the long.
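
As a small illustration of that note, the sketch below formats an address through a const void* parameter. address_text is a hypothetical helper; the exact text produced is implementation-defined (commonly hexadecimal), so it will not look like the decimal numbers in the question.

```cpp
#include <sstream>
#include <string>

// Format the address held by any object pointer without casting it to long.
// Taking const void* matters for char*: operator<< would otherwise treat a
// char* as a C string and print the characters instead of the address.
std::string address_text(const void* p) {
    std::ostringstream out;
    out << p;  // output format is implementation-defined
    return out.str();
}
```

For example, std::cout << address_text(ch + 1) prints the address of the second character of the array.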

Pointer referencing and dereferencing
