Working with individual bytes using an unsigned char array

Time: 2015-03-30 18:29:28

Tags: c++ arrays bit-manipulation byte

I have searched many websites and can't seem to find anything on this.

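I would like to be able to take the individual bytes of each of the default data types, such as short, unsigned short, int, unsigned int, float, and double, and store each byte's information (its binary part) into successive indices of an unsigned char array. How can this be achieved?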

For example:

int main() {
    short sVal = 1;
    unsigned short usVal = 2;
    int iVal = 3;
    unsigned int uiVal = 4;
    float fVal = 5.0f;
    double dVal = 6.0;

    const unsigned int uiLengthOfShort  = sizeof(short);
    const unsigned int uiLengthOfUShort = sizeof(unsigned short);
    const unsigned int uiLengthOfInt    = sizeof(int);
    const unsigned int uiLengthOfUInt   = sizeof(unsigned int);
    const unsigned int uiLengthOfFloat  = sizeof(float);
    const unsigned int uiLengthOfDouble = sizeof(double);

    unsigned char ucShort[uiLengthOfShort];
    unsigned char ucUShort[uiLengthOfUShort];
    unsigned char ucInt[uiLengthOfInt];
    unsigned char ucUInt[uiLengthOfUInt];
    unsigned char ucFloat[uiLengthOfFloat];
    unsigned char ucDouble[uiLengthOfDouble];

    // Above I declared a variable val for each data type to work with
    // Next I created a const unsigned int of each type's size.
    // Then I created unsigned char[] using each data types size respectively
    // Now I would like to take each individual byte of the above val's
    // and store them into the indexed location of each unsigned char array.

    // For Example: - I'll not use int here since the int is 
    // machine and OS dependent. 
    // I will use a data type that is common across almost all machines. 
    // Here I will use the short as my example

    // We know that a short is 2-bytes or has 16 bits encoded
    // I would like to take the 1st byte of this short:
    // (the first 8 bit sequence) and to store it into the first index of my unsigned char[].
    // Then I would like to take the 2nd byte of this short:
    // (the second 8 bit sequence) and store it into the second index of my unsigned char[]. 

    // How would this be achieved for any of the data types?

    // A Short in memory is 2 bytes here is a bit representation of an 
    // arbitrary short in memory { 0101 1101, 0011 1010 }
    // I would like ucShort[0] = sVal's { 0101 1101 } &
    //              ucShort[1] = sVal's { 0011 1010 } 

    // Desired result (pseudocode, not valid C++):
    // ucShort[0] = sVal's first byte info (8-bit sequence)
    // ucShort[1] = sVal's second byte info (8-bit sequence)

    // ... and so on for each data type.

    return 0;
}

4 Answers:

Answer 0 (score: 1)

OK, first of all: don't do this if you can avoid it. It is dangerous and can be very architecture-dependent.

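The commenter above is right that a union is the safest way to do it. You still have the endianness problem, yes, but at least you don't have the stack alignment problem (I am assuming this is for network code, so stack alignment is another potential architecture issue).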

I have found this to be the most straightforward way:

uint32_t example_int;
char array[4];

//No endian switch
array[0] = ((char*) &example_int)[0];
array[1] = ((char*) &example_int)[1];
array[2] = ((char*) &example_int)[2];
array[3] = ((char*) &example_int)[3];

//Endian switch
array[0] = ((char*) &example_int)[3];
array[1] = ((char*) &example_int)[2];
array[2] = ((char*) &example_int)[1];
array[3] = ((char*) &example_int)[0];

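If you are trying to write cross-architecture code, you will need to deal with endianness one way or another. My suggestion is to build a short endianness test and write functions that "pack" and "unpack" byte arrays based on the approach above. Note that to "unpack" a byte array, you simply reverse the assignment statements above.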
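The endianness test and the pack/unpack pair suggested above could be sketched roughly as follows. This is only an illustration, not the answerer's code: the names is_little_endian, pack_u32_be, and unpack_u32_be are hypothetical, big-endian (network) order is chosen arbitrarily as the packed format, and the host is assumed to be either purely little-endian or purely big-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when the least significant byte of a multi-byte
// integer is stored at the lowest address.
inline bool is_little_endian() {
    const std::uint16_t probe = 0x0001;
    return *reinterpret_cast<const unsigned char*>(&probe) == 0x01;
}

// Hypothetical helper: writes value into out[0..3], most significant byte
// first, regardless of the host byte order.
inline void pack_u32_be(std::uint32_t value, unsigned char* out) {
    assert(out != nullptr);
    // Inspecting an object through an unsigned char pointer is permitted.
    const unsigned char* raw = reinterpret_cast<const unsigned char*>(&value);
    const bool swap = is_little_endian();
    for (std::size_t i = 0; i < sizeof value; ++i) {
        out[i] = swap ? raw[sizeof value - 1 - i] : raw[i];
    }
}

// Hypothetical helper: the reverse of pack_u32_be. Reads four big-endian
// bytes from in[0..3] and rebuilds the value in host byte order.
inline std::uint32_t unpack_u32_be(const unsigned char* in) {
    assert(in != nullptr);
    std::uint32_t value = 0;
    unsigned char* raw = reinterpret_cast<unsigned char*>(&value);
    const bool swap = is_little_endian();
    for (std::size_t i = 0; i < sizeof value; ++i) {
        raw[i] = swap ? in[sizeof value - 1 - i] : in[i];
    }
    return value;
}
```

With these helpers, packing 0x11223344 should give the bytes 0x11, 0x22, 0x33, 0x44 in that order on either kind of host, and unpacking those bytes should return the original value.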

Answer 1 (score: 1)

The simplest correct way is:

// static_assert(sizeof ucShort == sizeof sVal);

memcpy( &ucShort, &sVal, sizeof ucShort);

What you wrote in your comments is incorrect; every type other than the character types has a machine-dependent size.
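Because the sizes are machine-dependent, one way to avoid repeating the memcpy call for every type is to let sizeof(T) size the array. The following is a minimal sketch, assuming C++11 or later; to_bytes and from_bytes are hypothetical names chosen for illustration and are not part of any library.

```cpp
#include <array>
#include <cstring>
#include <type_traits>

// Hypothetical helper: copies the object representation of a trivially
// copyable value into an array of exactly sizeof(T) bytes, in host byte order.
template <typename T>
std::array<unsigned char, sizeof(T)> to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "to_bytes requires a trivially copyable type");
    std::array<unsigned char, sizeof(T)> bytes;
    std::memcpy(bytes.data(), &value, sizeof(T));
    return bytes;
}

// Hypothetical helper: the reverse copy. T must be given explicitly,
// for example from_bytes<double>(bytes).
template <typename T>
T from_bytes(const std::array<unsigned char, sizeof(T)>& bytes) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "from_bytes requires a trivially copyable type");
    T value;
    std::memcpy(&value, bytes.data(), sizeof(T));
    return value;
}
```

For instance, to_bytes(sVal) would yield an array whose element [0] holds the first byte of sVal as it sits in memory. Which byte that is still depends on the machine's endianness.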

Answer 2 (score: 0)

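With help from Raw N, who pointed me to a website, I searched for byte manipulation and found this post: http://www.cplusplus.com/forum/articles/12/. It offers a solution similar to what I was looking for, but I would have to repeat the process for every default data type.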

Answer 3 (score: 0)

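After doing some testing, this is what I have come up with so far. It depends on the machine architecture, but the concept is the same for doing this on other machines.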

typedef struct packed_2bytes {
    unsigned char c0;
    unsigned char c1;
} packed_2bytes;

typedef struct packed_4bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
} packed_4bytes;

typedef struct packed_8bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
    unsigned char c4;
    unsigned char c5;
    unsigned char c6;
    unsigned char c7;
} packed_8bytes;

typedef union {
    short s;
    packed_2bytes bytes;
} packed_short;

typedef union {
    unsigned short us;
    packed_2bytes bytes;
} packed_ushort;

typedef union { // 32bit machine, os, compiler only
    int i;
    packed_4bytes bytes;
} packed_int;

typedef union { // 32 bit machine, os, compiler only
    unsigned int ui;
    packed_4bytes bytes;
} packed_uint;

typedef union { 
    float f;
    packed_4bytes bytes;
} packed_float;

typedef union {
    double d;
    packed_8bytes bytes;
} packed_double;

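There is no implementation here, only the declarations or definitions of these types. I do think they should carry the endianness in use, but anyone using them would have to know that in advance, just as they would have to know the size of each default type on their machine architecture. I am not sure whether signed int causes problems because of one's complement or sign-bit implementations, but that may also need to be considered.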
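The size assumptions mentioned above can be checked at compile time rather than only being known in advance. Below is a rough sketch, assuming C++11 or later, that repeats packed_4bytes so the block stands alone; bytes_of is a hypothetical helper. It fills the struct with memcpy instead of reading through a union, because reading a union member other than the one last written is formally undefined behavior in C++, even though many compilers accept it.

```cpp
#include <cstring>

struct packed_4bytes {
    unsigned char c0;
    unsigned char c1;
    unsigned char c2;
    unsigned char c3;
};

// Fail the build when the size assumptions do not hold on this platform.
static_assert(sizeof(packed_4bytes) == 4, "packed_4bytes contains padding");
static_assert(sizeof(float) == sizeof(packed_4bytes), "float is not 4 bytes here");
static_assert(sizeof(int) == sizeof(packed_4bytes), "int is not 4 bytes here");

// Hypothetical helper: copies the bytes of a float, in host byte order,
// into the four members of packed_4bytes.
inline packed_4bytes bytes_of(float f) {
    packed_4bytes out;
    std::memcpy(&out, &f, sizeof out);
    return out;
}
```

On a platform where int or float is not four bytes wide, the static_assert lines stop compilation with a readable message instead of letting the code silently read the wrong bytes.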