Question

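I am trying to implement Huffman's encoding algorithm in C++.

My question is: once I have the equivalent binary string for each character, how can I write those zeros and ones to a file as actual bits rather than as the characters '0' and '1'?

Thanks in advance...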

Answer 1

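Obtaining the encoding of each character separately, in different data structures, is a broken solution: you need to juxtapose the encodings of all characters in the resulting binary file, and storing them separately makes that just as hard as storing them contiguously in a vector of bits in the first place.

This consideration suggests using a std::vector<bool> for the task, but that is also a broken solution, because it cannot be treated as a C-style array, which is exactly what you need at output time.

This question asks exactly which alternatives to std::vector<bool> are valid, so I think its answers fit your problem well.

By the way, what I would do is simply wrap a std::vector<uint8_t> in a class that suits your needs, like the attached code: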

#include <iostream>
#include <vector>
#include <cstdint>
#include <algorithm>
class bitstream {
private:
    std::vector<std::uint8_t> storage;
    unsigned int bits_used:3;
    void alloc_space();
public:
    bitstream() : bits_used(0) { }

    void push_bit(bool bit);

    template <typename T>
    void push(T t);

    std::uint8_t *get_array();

    size_t size() const;

    // beware: no reference!
    bool operator[](size_t pos) const;
};

void bitstream::alloc_space()
{
    if (bits_used == 0) {
        std::uint8_t push = 0;
        storage.push_back(push);
    }
}

void bitstream::push_bit(bool bit)
{
    alloc_space();
    // Bits are stored MSB first; bits_used is a 3-bit field, so it wraps to 0 after 7
    storage.back() |= bit << (7 - bits_used++);
}

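template <typename T>
void bitstream::push(T t)
{
    // Append the object representation of t, byte by byte
    std::uint8_t *t_byte = reinterpret_cast<std::uint8_t*>(&t);
    for (size_t i = 0; i < sizeof(t); i++) {
        std::uint8_t byte = t_byte[i];
        if (bits_used > 0) {
            // The high (8 - bits_used) bits fill the rest of the current byte...
            storage.back() |= byte >> bits_used;
            // ...and the low bits_used bits start a new byte, left-aligned
            std::uint8_t to_push = (byte & ((1 << bits_used) - 1)) << (8 - bits_used);
            storage.push_back(to_push);
        } else {
            storage.push_back(byte);
        }
    }
}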

std::uint8_t *bitstream::get_array()
{
    return &storage.front();
}

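size_t bitstream::size() const
{
    // Number of bits stored so far
    if (storage.empty())
        return 0;
    // bits_used == 0 with non-empty storage means the last byte is full
    return bits_used == 0 ? storage.size() * 8
                          : (storage.size() - 1) * 8 + bits_used;
}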

bool bitstream::operator[](size_t pos) const
{
    // No range checking
    return static_cast<bool>((storage[pos / 8] >> (7 - (pos % 8))) & 0x1);
}

int main(int argc, char **argv)
{
    bitstream bs;
    bs.push_bit(true);
    std::cout << bs[0] << std::endl;
    bs.push_bit(false);
    std::cout << bs[0] << "," << bs[1] << std::endl;
    bs.push_bit(true);
    bs.push_bit(true);
    std::uint8_t to_push = 0xF0;
    bs.push(to_push);
    for (size_t i = 0; i < bs.size(); i++)
        std::cout << bs[i] << ",";
    std::cout << std::endl;
}

Answer 2

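I hope this code can help you.

You start from a sequence of bytes (each holding 1 or 0) that represents the consecutive encodings of every character of the input file.
You take each byte of the sequence and add one bit to a temporary byte (char byte).
Every time a byte is filled, you write it to the file (for efficiency, you could also wait until you have a larger chunk of data).
At the end, you write the remaining bits to the file, for example padded with trailing zeros.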
As akappa correctly pointed out, the else branch can be removed if byte is set to 0 after every file write (or, more generally, every time it has been completely filled and flushed somewhere else), so that only the 1s need to be written.

void writeBinary(char *huffmanEncoding, int sequenceLength)
{
    char byte = 0;
    // For each bit of the sequence
    for (int i = 0; i < sequenceLength; i++) {
        char bit = huffmanEncoding[i];
        // Add a single bit to byte
        if (bit == 1) {
            // MSB of the sequence to MSB of the file
            byte |= (1 << (7 - (i % 8)));
        } else {
            // MSB of the sequence to MSB of the file
            byte &= ~(1 << (7 - (i % 8)));
        }
        // The byte is complete once its eighth bit (i % 8 == 7) has been set
        if ((i % 8) == 7) {
            //writeByteToFile(byte);
        }
    }
    // Fill the last incomplete byte, if any, with trailing zeros and write it to the file
}

Answer 3

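You can't write individual bits to a binary file; the smallest unit of data you can write is one byte (that is, 8 bits).

So what you should do is create a buffer (of any size).

char BitBuffer = 0;

Writing to the buffer: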

int Location;
bool Value;

if (Value)
    BitBuffer |= (1 << Location);
else
    BitBuffer &= ~(1 << Location);

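The expression (1 << Location) produces a number that is all zeros except at the position given by Location. Then, if Value is true, the corresponding bit in the buffer is set to 1; otherwise it is cleared to 0. The binary operations used are quite simple; if you don't understand them, they should be covered in any good C++ book or tutorial.

Location should be a number in the range <0, 8 * sizeof(BitBuffer) - 1>, so <0, 7> in this case.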
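
To make the masking concrete, here is a small sketch of the same set/clear logic wrapped in a function. The name set_bit is hypothetical; bit 0 is the least significant bit, matching 1 << Location above.

```cpp
// Return buffer with the bit at location (0 = least significant,
// 7 = most significant) set to value; all other bits are left unchanged.
unsigned char set_bit(unsigned char buffer, int location, bool value)
{
    const unsigned char mask = static_cast<unsigned char>(1u << location);
    if (value)
        return static_cast<unsigned char>(buffer | mask);   // force the bit to 1
    return static_cast<unsigned char>(buffer & ~mask);      // force the bit to 0
}
```

For example, set_bit(0, 3, true) yields 8 (binary 00001000), and set_bit(0xFF, 0, false) yields 0xFE.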

Writing the buffer to a file is relatively simple with fstream. Just remember to open the file in binary mode.

std::ofstream File;
File.open("file.txt", std::ios::out | std::ios::binary);
File.write(&BitBuffer, sizeof(char));

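EDIT: Noticed a mistake and fixed it.

EDIT2: The << operator does formatted output, so it is not the right tool for writing raw bytes in binary mode; I had forgotten that.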
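
A hedged illustration of the difference: operator<< performs formatted output, so streaming an integer writes its decimal digits as text, whereas write (or put) emits the raw byte. The two helpers below are hypothetical and use std::ostringstream only so that the result can be inspected; the same calls apply to a std::ofstream opened with std::ios::binary.

```cpp
#include <sstream>
#include <string>

// Formatted output: the value 65 is written as the two characters '6' and '5'.
std::string formatted_output(int value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Unformatted output: the value 65 is written as a single byte (0x41).
std::string raw_output(unsigned char value)
{
    std::ostringstream out;
    const char c = static_cast<char>(value);
    out.write(&c, 1);
    return out.str();
}
```

So formatted_output(65) returns the two-character string "65", while raw_output(65) returns a one-character string holding the byte 0x41.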

Alternative solution: use std::vector<bool> or std::bitset as the buffer.

This should be even simpler, but I thought I could help you a little more.

void WriteData (std::vector<bool> const& data, std::ofstream& str)
{
    char Buffer = 0;
    for (unsigned int i = 0; i < data.size(); ++i)
    {
        // Buffer setting code: Location = i % 8, Value = data[i]
        if (data[i])
            Buffer |= (1 << (i % 8));
        else
            Buffer &= ~(1 << (i % 8));
        // Write the buffer once all eight of its bits have been set
        if (i % 8 == 7)
        {
            str.write(&Buffer, 1);
            Buffer = 0;
        }
    }
    // If data.size() % 8 != 0, the last buffer is only partly filled;
    // its unused bits are still zero, so write it as is.
    if (data.size() % 8 != 0)
        str.write(&Buffer, 1);
}

How to write a binary file in C++
