Question

I can read 4 bytes from a file using

ifstream r(filename, ios::binary | ios::in);
uint32_t readHere;
r.read( (char*)&readHere, 4 );

But how can I read 4.5 bytes, that is, 4 bytes and 4 bits?

What I came up with is

ifstream r(filename, ios::binary | ios::in);
uint64_t readHere = 0;
r.read( (char*)&readHere, 5 ); // reading 5 bytes

uint64_t tmp = readHere & 0xff; // extract 5th byte
tmp = tmp >> 4;  // get first half of the bits
readHere = (( readHere >> 8 ) << 8) | tmp;     // remove 5th byte then add 4 bits

But I'm not sure whether it is the first or the last 4 bits of that byte that should be taken. Is there a better way to retrieve the value?

Answer 1

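The smallest unit you can read or write, whether in a file or in memory, is a char (a byte on common systems (*)). You can walk through a longer element byte by byte, and that is where endianness actually matters.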

uint32_t u = 0xaabbccdd;
char *p = reinterpret_cast<char *>(&u);   // static_cast cannot convert uint32_t* to char*
char c = p[0];    // c is 0xdd on a little endian system and 0xaa on a big endian one

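But once you are inside a single byte, all you can do is use bitwise ANDs and shifts to extract the low-order or high-order bits. There is no endianness at this level any more, unless you decide to adopt a convention.

By the way, if you read from a network interface, or even from a serial line where bits are transmitted individually, you get a full byte at a time. There is no way to read 4 bits in one read and the other 4 in the next.

(*) Older systems (CDC machines in the 80s) used to have 6 bits per character, but C++ did not exist at that time and I'm not sure a C compiler did either.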
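As a minimal sketch of that shift-and-mask step, the helpers below split a byte into its two 4-bit halves and join them again. The function names are hypothetical, and calling the upper four bits the "high" nibble is simply the convention chosen for this illustration.

```cpp
#include <cstdint>

// Upper four bits of a byte, moved down into the range 0..15.
inline std::uint8_t high_nibble(std::uint8_t byte)
{
    return static_cast<std::uint8_t>(byte >> 4);
}

// Lower four bits of a byte.
inline std::uint8_t low_nibble(std::uint8_t byte)
{
    return static_cast<std::uint8_t>(byte & 0x0f);
}

// Rebuild a byte from two nibbles; extra upper bits in either argument are discarded.
inline std::uint8_t join_nibbles(std::uint8_t high, std::uint8_t low)
{
    return static_cast<std::uint8_t>(((high & 0x0f) << 4) | (low & 0x0f));
}
```

Which of the two nibbles belongs to the 36-bit value is decided by the file format, not by the hardware.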

Answer 2

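It's not clear whether this is a file format you control or something else. In any case, let's assume you have some integer data type that can hold a 36-bit unsigned value: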

typedef uint64_t u36;

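Now, regardless of whether your system is big-endian or little-endian, you can write the value to a binary stream in a predictable order, one byte at a time. Let's use big-endian, because it is slightly easier to picture the bits being assembled into a value.

You can use naive shifting and masking into a small buffer. The only thing to decide is where the truncated half-byte goes. But if you follow the pattern of shifting each value by another 8 bits, the remainder naturally falls in the high-order bits of the last byte.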

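ostream & write_u36( ostream & s, u36 val )
{
    // Explicit casts: brace-initializing char from a 64-bit expression is a narrowing conversion.
    const char bytes[5] = {
        static_cast<char>( (val >> 28) & 0xff ),
        static_cast<char>( (val >> 20) & 0xff ),
        static_cast<char>( (val >> 12) & 0xff ),
        static_cast<char>( (val >> 4 ) & 0xff ),
        static_cast<char>( (val << 4 ) & 0xf0 )   // low 4 bits go in the high nibble
    };
    return s.write( bytes, 5 );
}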

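But this isn't how you would actually write out these numbers. You have to hold back the 5th byte until you are finished, or until you can pack the next value into it. Or you always write two values at a time: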

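ostream & write_u36_pair( ostream & s, u36 a, u36 b )
{
    // Two 36-bit values fill exactly 9 bytes; byte 4 is shared between them.
    const char bytes[9] = {
        static_cast<char>( (a >> 28) & 0xff ),
        static_cast<char>( (a >> 20) & 0xff ),
        static_cast<char>( (a >> 12) & 0xff ),
        static_cast<char>( (a >> 4 ) & 0xff ),
        static_cast<char>( ((a << 4 ) & 0xf0) | ((b >> 32) & 0x0f) ),
        static_cast<char>( (b >> 24) & 0xff ),
        static_cast<char>( (b >> 16) & 0xff ),
        static_cast<char>( (b >> 8 ) & 0xff ),
        static_cast<char>( b & 0xff )
    };
    return s.write( bytes, 9 );
}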

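Now you can probably see how to go about reading the values and deserializing them into integers. The simplest approach is to read two at a time.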

istream & read_u36_pair( istream & s, u36 & a, u36 & b )
{
    // unsigned char: a plain char may be signed and would sign-extend when widened.
    unsigned char bytes[9];
    if( s.read( reinterpret_cast<char *>(bytes), 9 ) )
    {
        a = (u36)bytes[0] << 28
          | (u36)bytes[1] << 20
          | (u36)bytes[2] << 12
          | (u36)bytes[3] << 4
          | (u36)bytes[4] >> 4;

        b = ((u36)bytes[4] & 0x0f) << 32
          | (u36)bytes[5] << 24
          | (u36)bytes[6] << 16
          | (u36)bytes[7] << 8
          | (u36)bytes[8];
    }
    return s;
}

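If you want to read one at a time, you need to keep track of some state so you know how many bytes to read (5 or 4) and which shift operations to apply. Something naive like this: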

struct u36deser {
    unsigned char bytes[5];   // unsigned, so widening to u36 never sign-extends
    int which = 0;            // 0: next value is byte-aligned, 1: it starts mid-byte
};

istream & read_u36( istream & s, u36deser & state, u36 & val )
{
    if( state.which == 0 && s.read( reinterpret_cast<char *>(state.bytes), 5 ) )
    {
        val = (u36)state.bytes[0] << 28
            | (u36)state.bytes[1] << 20
            | (u36)state.bytes[2] << 12
            | (u36)state.bytes[3] << 4
            | (u36)state.bytes[4] >> 4;
        state.which = 1;
    }
    else if( state.which == 1 && s.read( reinterpret_cast<char *>(state.bytes), 4 ) )
    {
        val = ((u36)state.bytes[4] & 0x0f) << 32  // byte left over from previous call
            | (u36)state.bytes[0] << 24
            | (u36)state.bytes[1] << 16
            | (u36)state.bytes[2] << 8
            | (u36)state.bytes[3];
        state.which = 0;
    }
    return s;
}

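All of this is purely hypothetical, which seems to be the nature of your question anyway. There are many other ways to serialize bits, some of which are not obvious.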
