Question

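I've seen several different code examples that convert big endian to little endian and vice versa, but I came across a piece of code someone wrote that seems to work, and I can't work out why it does.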

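Basically, there is a char buffer that, at some position, contains a 4-byte int stored as big-endian. The code extracts the integer and stores it as native little-endian. Here is a short example: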

char test[8] = { 0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07};
char *ptr = test;
int32_t value = 0;
value =  ((*ptr) & 0xFF)       << 24;
value |= ((*(ptr + 1)) & 0xFF) << 16;
value |= ((*(ptr + 2)) & 0xFF) << 8;
value |= (*(ptr + 3)) & 0xFF;
printf("value: %d\n", value);

value: 66051

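The code above takes the first four bytes, stores them as little endian, and prints the result. Can anyone explain step by step how this works? I'm confused why ((*ptr) & 0xFF) << X doesn't just evaluate to 0 for any X >= 8.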

Answer 1

This code constructs the value one byte at a time.

First it captures the byte at the lowest address, masked to its low 8 bits

 (*ptr) & 0xFF

Then it shifts that byte up to the highest byte position

 ((*ptr) & 0xFF) << 24

Then it assigns the result to value, which was previously initialized to 0.

 value =((*ptr) & 0xFF) << 24

Now the "magic" comes into play. Since ptr is declared as char*, adding one to it advances the pointer by one character.

 (ptr + 1) /* the next character address */
 *(ptr + 1) /* the next character */

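Once you see that they use pointer arithmetic to move the address relative to the start, the remaining operations are the same as the ones already described, except that, to preserve the partially built value, they OR each new byte into the existing value variable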

 value |= ((*(ptr + 1)) & 0xFF) << 16
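The steps described so far can be traced with a small helper. This is only a sketch: trace_be32 is a hypothetical name, and it uses uint32_t rather than the int32_t from the question so that shifting a byte of 0x80 or more into the top position stays well defined.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: rebuilds a big-endian 32-bit value from four
   chars and prints the partial result after every step. */
uint32_t trace_be32(const char *ptr)
{
    uint32_t value = 0;
    int shift = 24;

    for (int i = 0; i < 4; i++, shift -= 8) {
        /* Mask to one byte, then move it into its final position. */
        uint32_t byte = (uint32_t)(*(ptr + i) & 0xFF);
        value |= byte << shift;
        printf("byte %d = 0x%02X, shift %2d -> value = 0x%08X\n",
               i, (unsigned)byte, shift, (unsigned)value);
    }
    return value;
}
```

Calling trace_be32(test) on the buffer from the question returns 0x00010203, which is the 66051 printed there.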

Note that pointer arithmetic is the reason you can do things like

 char* ptr = ... some value ...

 while (*ptr != 0) {
     ... do something ...
     ptr++;
 }

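But it comes at the cost of possibly really messing up your pointer addresses, which greatly increases the risk of a segmentation fault (SEGFAULT). Some languages consider this such a problem that they removed the ability to do pointer arithmetic. A pointer on which you can do almost no pointer arithmetic is usually called a reference.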
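The loop pattern above can be made concrete with a short sketch. count_chars is a hypothetical helper, and it assumes the buffer really does end in a '\0' byte; without that terminator the pointer would run past the end of the buffer, which is exactly the risk described above.

```c
#include <stddef.h>

/* Hypothetical helper: counts characters by advancing a char pointer
   until it reaches the terminating '\0'. */
size_t count_chars(const char *ptr)
{
    size_t n = 0;

    while (*ptr != 0) {   /* stop at the NUL terminator */
        n++;
        ptr++;            /* pointer arithmetic: move one char forward */
    }
    return n;
}
```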

Answer 2

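If you want to convert a little endian representation to big endian, you can use htonl, htons, ntohl, and ntohs. These functions convert values between host and network byte order. Big endian is also used on some ARM-based platforms (ARM cores can run in either byte order, though little endian is the more common configuration). See here: https://linux.die.net/man/3/endian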

Answer 3

The code you are using is probably based on the idea that numbers on the network should be sent in BIG ENDIAN order.

The functions htonl() and htons() convert 32-bit and 16-bit integers to BIG ENDIAN on systems that use LITTLE ENDIAN; on BIG ENDIAN systems they leave the numbers unchanged.

Here is the code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <arpa/inet.h>

int main(void)
{
    uint32_t x,y;
    uint16_t s,z;

    x=0xFF567890;

    y=htonl(x);

    printf("LE=%08X BE=%08X\n",x,y);

    s=0x7891;

    z=htons(s);

    printf("LE=%04X BE=%04X\n",s,z);

    return 0;

}

This code was written to convert from LE to BE on an LE machine.

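You can use the opposite functions, ntohl() and ntohs(), to convert from BE to LE. These functions convert integers from BE to LE on LE machines and perform no conversion on BE machines.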
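Applied to the buffer from the question, ntohl() could be used roughly as follows. This is a sketch assuming a POSIX system that provides <arpa/inet.h>; read_be32 is a hypothetical helper name, not something from the original code.

```c
#include <arpa/inet.h>  /* ntohl(); POSIX */
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reads a big-endian (network order) 32-bit
   integer from a byte buffer and returns it in host byte order. */
uint32_t read_be32(const char *buf)
{
    uint32_t raw;

    memcpy(&raw, buf, sizeof raw);  /* copying avoids unaligned access */
    return ntohl(raw);
}
```

Because ntohl() does nothing on a big-endian host, read_be32(test) should give 66051 for the question's buffer regardless of the machine's byte order.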

Answer 4

I'm confused why ((*ptr) & 0xFF) << X doesn't just evaluate to 0 for any X >= 8.

I think you have misunderstood how the shift works.

value = ((*ptr) & 0xFF) << 24;

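means: mask the value at ptr with 0xff (one byte), then shift it left by 24 BITS (not bytes). That is a shift of 24/8 = 3 bytes, which moves the byte to the highest byte position.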
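The bits-versus-bytes relationship can be written down directly. In this sketch place_byte is a hypothetical helper and lane is an assumed name for the byte position (0 is the lowest byte, 3 the highest).

```c
#include <stdint.h>

/* Hypothetical helper: places one byte in byte lane `lane` of a 32-bit
   word. Each lane is 8 bits wide, so lane 3 corresponds to << 24.
   `lane` must be in the range 0..3. */
uint32_t place_byte(uint8_t byte, unsigned lane)
{
    return (uint32_t)byte << (8u * lane);
}
```

For example, place_byte(0xAB, 3) yields 0xAB000000 and place_byte(0xAB, 1) yields 0x0000AB00.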

Answer 5

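One of the key points in understanding how ((*ptr) & 0xFF) << X is evaluated is integer promotion. *ptr is promoted to int before the & is applied (and 0xFF is already an int), so (*ptr) & 0xff is an int, not a char, by the time it is shifted.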
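A small sketch can contrast the promoted shift with the 8-bit result the question expected. Both function names are hypothetical, and x is assumed to be one of 0, 8, 16, or 24.

```c
#include <stdint.h>

/* The shift is performed on an int-sized operand, so the byte survives. */
uint32_t shift_promoted(char c, unsigned x)
{
    /* c is promoted to int before &, so the mask result is an int. */
    return (uint32_t)(c & 0xFF) << x;
}

/* Same shift, but the result is forced back into 8 bits afterwards.
   This is the behavior the question expected: 0 for any x >= 8. */
uint8_t shift_truncated(char c, unsigned x)
{
    return (uint8_t)((uint32_t)(c & 0xFF) << x);
}
```

The expression sizeof((char)1 & 0xFF) equals sizeof(int), which is another way to see that the operand is no longer 8 bits wide when the shift happens.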

Answer 6

I wrote the code below. It contains two functions, swapmem() and swap64().

swapmem() swaps the bytes of a memory area of arbitrary size.
swap64() swaps the bytes of a 64-bit integer.

At the end of this reply I suggest an idea for solving your problem with the buffer.

Here is the code:

#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <stdlib.h>

void * swapmem(void *x, size_t len, int retnew);
uint64_t swap64(uint64_t k);

/**
    brief swapmem

         This function reverses the order of the bytes in a memory
         buffer.

    param x
         pointer to the buffer to be swapped

    param len
         length of the buffer to be swapped

    param retnew
         If this parameter is 1 the swapped bytes are written to a
         new buffer. The new buffer shall be deallocated by using
         free() when it's no longer useful.

         If this parameter is 0 the buffer is swapped in place, in
         its own memory area.

    return
        The pointer to the memory area where the bytes have been
        swapped or NULL if an error occurs.
*/
void * swapmem(void *x, size_t len, int retnew)
{
    char *b = NULL, app;
    size_t i;

    if (x != NULL) {
        if (retnew) {
            b = malloc(len);
            if (b!=NULL) {
                for(i=0;i<len;i++) {
                    b[i]=*((char *)x+len-1-i);
                }
            }
        } else {
            b=(char *)x;
            for(i=0;i<len/2;i++) {
                app=b[i];
                b[i]=b[len-1-i];
                b[len-1-i]=app;
            }
        }
    }
    return b;
}

uint64_t swap64(uint64_t k)
{
    return ((k << 56) |
            ((k & 0x000000000000FF00) << 40) |
            ((k & 0x0000000000FF0000) << 24) |
            ((k & 0x00000000FF000000) << 8) |
            ((k & 0x000000FF00000000) >> 8) |
            ((k & 0x0000FF0000000000) >> 24)|
            ((k & 0x00FF000000000000) >> 40)|
            (k >> 56)
           );
}

int main(void)
{
    uint32_t x,*y;
    uint16_t s,z;
    uint64_t k,t;

    x=0xFF567890;

    /* Dynamic allocation is used to avoid changing the contents of x */
    y=(uint32_t *)swapmem(&x,sizeof(x),1);
    if (y!=NULL) {
        printf("LE=%08X BE=%08X\n",x,*y);
        free(y);
    }

    /* Dynamic allocation is not used. The contents of z and k will change */
    z=s=0x7891;
    swapmem(&z,sizeof(z),0);
    printf("LE=%04X BE=%04X\n",s,z);

    k=t=0x1120324351657389;
    swapmem(&k,sizeof(k),0);
    printf("LE=%16"PRIX64" BE=%16"PRIX64"\n",t,k);

    /* LE64 to BE64 (or vice versa) using shifts */
    k=swap64(t);
    printf("LE=%16"PRIX64" BE=%16"PRIX64"\n",t,k);

    return 0;
}

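After compiling the program, I was curious to see the assembly code that gcc generated. I found that the function swap64 was compiled to the following.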

00000000004007a0 <swap64>:
  4007a0:       48 89 f8                mov    %rdi,%rax
  4007a3:       48 0f c8                bswap  %rax
  4007a6:       c3                      retq

This result was obtained by compiling the code on a PC with an Intel i3 CPU, using any of the gcc options -Ofast, -O3, -O2, or -Os.

You can solve your problem with something like the swap64() function, for example a function like the following, which I named swap32():

uint32_t swap32(uint32_t k)
{
    return ((k << 24) |
            ((k & 0x0000FF00) << 8) |
            ((k & 0x00FF0000) >> 8) |
            (k >> 24)
           );
}

You can use it as:

uint32_t j=swap32(*(uint32_t *)ptr);
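The cast above assumes that ptr is suitably aligned for a uint32_t and that reading a char buffer through a uint32_t pointer is acceptable to the compiler. A more defensive variant, sketched here with a hypothetical helper load_swapped32, copies the bytes out with memcpy first.

```c
#include <stdint.h>
#include <string.h>

/* Same byte swap as swap32() above, repeated so the sketch compiles alone. */
uint32_t swap32(uint32_t k)
{
    return (k << 24) |
           ((k & 0x0000FF00u) << 8) |
           ((k & 0x00FF0000u) >> 8) |
           (k >> 24);
}

/* Hypothetical helper: copies 4 bytes out of the buffer, then swaps them. */
uint32_t load_swapped32(const char *ptr)
{
    uint32_t raw;

    memcpy(&raw, ptr, sizeof raw);  /* no alignment or aliasing assumptions */
    return swap32(raw);
}
```

Like swap32() itself, this always swaps, so it gives the big-endian to little-endian conversion only on a little-endian host.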

C/C++ code to convert big endian to little endian
