Question

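I am trying to understand memory byte order in C programming, but I am confused. I checked the output of my app for some values against this website: www.yolinux.com/TUTORIALS/Endian-Byte-Order.html

For the 64-bit value I use in my C program: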

volatile long long ll = (long long)1099511892096;
__mingw_printf("\tlong long, %zu Bytes, %zu bits,\t%lld to %lli, %lli, 0x%016llX\n", sizeof(long long), sizeof(long long)*8, LLONG_MIN, LLONG_MAX , ll, ll);

void printBits(size_t const size, void const * const ptr)
{
    const unsigned char *b = (const unsigned char*) ptr;
    unsigned char byte;
    int i, j;
    printf("\t");
    for (i=size-1;i>=0;i--)
    {
        for (j=7;j>=0;j--)
        {
            byte = b[i] & (1<<j);
            byte >>= j;
            printf("%u", byte);
        }

        printf(" ");
    }
    puts("");
}

Output

long long,                8 Bytes,   64 bits,   -9223372036854775808 to 9223372036854775807, 1099511892096, 0x0000010000040880
80 08 04 00 00 01 00 00  (Little-Endian)
10000000 00001000 00000100 00000000 00000000 00000001 00000000 00000000
00 00 01 00 00 04 08 80  (Big-Endian)
00000000 00000000 00000001 00000000 00000000 00000100 00001000 10000000

Tests

0x8008040000010000, 1000000000001000000001000000000000000000000000010000000000000000 // online website hex2bin conv. 
                    1000000000001000000001000000000000000000000000010000000000000000 // my C app
0x8008040000010000, 1000010000001000000001000000000000000100000000010000000000000000 // yolinux.com


0x0000010000040880, 0000000000000000000000010000000000000000000001000000100010000000      //online website hex2bin conv., 1099511892096  ! OK
                    0000000000000000000000010000000000000000000001000000100010000000      // my C app,  1099511892096 ! OK
[Convert]::ToInt64("0000000000000000000000010000000000000000000001000000100010000000", 2) // using powershell for other verif., 1099511892096 ! OK          
0x0000010000040880, 0000000000000000000000010000010000000000000001000000100010000100      // yolinux.com, 1116691761284 (from powershell bin conv.) ! BAD !

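Questions

The yolinux.com website reports 0x0000010000040880 as BIG ENDIAN! But I think my computer uses LITTLE ENDIAN (Intel proc.), and I get the same value 0x0000010000040880 from my C app and from another website's hex2bin converter. __mingw_printf(... 0x%016llX ..., ... ll) also prints 0x0000010000040880.

To follow the yolinux website, I have for the moment swapped the "(Little-Endian)" and "(Big-Endian)" labels in my output.

Also, the sign bit must be 0 for a positive number. That is the case in my result, but it is also the case in the yolinux result, so it does not help me decide.

If I understand endianness correctly, only the bytes are swapped and not the bits, and my groups of bits seem to be reversed correctly.

Is this simply an error on yolinux.com, or am I missing a step about 64-bit numbers and C programming?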

Answer 1

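When you print a "multi-byte" integer using printf (with the correct format specifier), it does not matter whether the system is little endian or big endian. The result will be the same.

The difference between little endian and big endian is the order in which the bytes of a multi-byte type are stored in memory. Once the data has been read from memory into the processor core, there is no difference.

This code shows how an integer (4 bytes) is placed in memory on my machine.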

#include <stdio.h>

int main()
{
    unsigned int u = 0x12345678;
    printf("size of int is %zu\n", sizeof u);
    printf("DEC: u=%u\n", u);
    printf("HEX: u=0x%x\n", u);
    printf("memory order:\n");
    unsigned char * p = (unsigned char *)&u;
    for(int i=0; i < sizeof u; ++i) printf("address %p holds %x\n", (void*)&p[i], p[i]);
    return 0;
}

Output:

size of int is 4
DEC: u=305419896
HEX: u=0x12345678
memory order:
address 0x7ffddf2c263c holds 78
address 0x7ffddf2c263d holds 56
address 0x7ffddf2c263e holds 34
address 0x7ffddf2c263f holds 12

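So I can see that I am on a little endian machine, because the LSB (least significant byte, i.e. 78) is stored at the lowest address.

Executing the same program on a big endian machine would show (assuming the same addresses):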

size of int is 4
DEC: u=305419896
HEX: u=0x12345678
memory order:
address 0x7ffddf2c263c holds 12 
address 0x7ffddf2c263d holds 34 
address 0x7ffddf2c263e holds 56 
address 0x7ffddf2c263f holds 78

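Now it is the MSB (most significant byte, i.e. 12) that is stored at the lowest address.

The important thing to understand is that this only relates to "how multi-byte types are stored in memory". Once the integer has been read from memory into a register inside the core, the register holds the integer as 0x12345678 on both little endian and big endian machines.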

Answer 2

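There is only one way to represent an integer in decimal, binary or hexadecimal format. For example, the number 43981 is equal to 0xABCD in hexadecimal, or 0b1010101111001101 in binary. Any other value (0xCDAB, 0xDCBA or similar) represents a different number.

As far as the C standard is concerned, the way your compiler and CPU choose to store this value internally is irrelevant; if you are particularly unlucky, the value could be stored as 36-bit one's complement, as long as all operations mandated by the standard have equivalent effects.

You will rarely need to inspect the internal data representation when programming. In practice, the only time you care about endianness is when working with a communication protocol, because the binary format of the data must then be precisely defined. Even then, your code will not differ from one architecture to another: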

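#include <stdint.h>

// input value is big endian, this is defined
// by the communication protocol

uint32_t parse_comm_value(const char * ptr)
{
     // but bit shifts in C have the same
     // meaning regardless of the endianness
     // of your architecture

     // read the bytes as unsigned so that values
     // above 0x7F are not sign-extended
     const unsigned char * p = (const unsigned char *)ptr;

     uint32_t result = 0;
     result |= (uint32_t)(*p++) << 24;
     result |= (uint32_t)(*p++) << 16;
     result |= (uint32_t)(*p++) << 8;
     result |= (uint32_t)(*p++);
     return result;
}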

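Tl;dr: calling a standard function like printf("0x%llx", number); always prints the correct value in the specified format. Inspecting the contents of memory by reading individual bytes gives you the representation of the data on your architecture.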

C, confusion between little endian and big endian
