Question

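I have a text file, and I believe I implemented the LZW algorithm correctly, but the compressed file comes out larger than the original.

I am not running LZW over the bytes of the text; I am running it over strings.

I build a dictionary [string:int] and run with it. I am wondering whether I should be using bytes instead of strings.

It also runs line by line, rather than building a single dictionary for the whole file.

Here is my LZW: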

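#include <fstream>
#include <iostream>
#include <map>
#include <string>
#include <vector>
using namespace std;

map<string, int> D;                      // dictionary: string -> code

int init(){                              // reset dictionary to all single chars
    D.clear();
    for(int i = 0; i < 256; i++){
        D[string(1, char(i))] = i + 1;   // codes 1..256; 0 is kept as the end-of-line marker
    }
    return 257;                          // first free code
}

void encode(const char* file){           // LZW encoding method
    ifstream in(file);
    if (!in.is_open()) {cout<<"Could not open file"<<endl; return;}

    ofstream out("compressed.txt");
    for(string text; getline(in, text); ){
        int value = init();              // the dictionary is rebuilt for every line
        vector<int> idx;
        string p = "";

        for(size_t i = 0; i < text.size(); i++){
            string c(1, text[i]);
            string pc = p + c;
            if(D.find(pc) != D.end()){
                p = pc;                  // keep extending the current match
            }
            else{
                idx.push_back(D[p]);     // emit the code of the longest match
                D[pc] = value++;         // add the new string to the dictionary
                p = c;
            }
        }
        if(!p.empty()) idx.push_back(D[p]);   // flush the last match
        for(size_t i = 0; i < idx.size(); i++) out<<idx[i]<<" ";
        out<<"0"<<endl;                  // 0 terminates each line
    }
    in.close();
    out.close();
    cout<<"File compressed successfully"<<endl;
}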

It just takes the path of a file and compresses it into the "compressed.txt" file.

Answer 1

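The heart of LZW is translating repeated byte sequences into symbols and then writing those symbols to a bitstream. The more repetition the input contains, the higher the compression ratio you get. Packing the symbols into bits is what saves most of the space.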
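To illustrate the point about repetition, here is a minimal sketch of the symbol-generation step run over a whole input with a single dictionary. The name lzw_codes is hypothetical, and unlike the code in the question it numbers single bytes 0..255 and never resets the dictionary; both choices are assumptions made for the example.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (not from the question): runs LZW over the whole
// input with one dictionary and returns the sequence of codes.
std::vector<std::uint32_t> lzw_codes(const std::string& input) {
    // Seed the dictionary with every single-byte string, codes 0..255.
    std::unordered_map<std::string, std::uint32_t> dict;
    for (int i = 0; i < 256; ++i) {
        dict[std::string(1, static_cast<char>(i))] = static_cast<std::uint32_t>(i);
    }
    std::uint32_t next_code = 256;  // first free code

    std::vector<std::uint32_t> codes;
    std::string p;  // longest match found so far
    for (char ch : input) {
        std::string pc = p + ch;
        if (dict.count(pc)) {
            p = pc;                      // keep extending the match
        } else {
            codes.push_back(dict[p]);    // emit the longest known match
            dict[pc] = next_code++;      // learn the new string
            p = std::string(1, ch);
        }
    }
    if (!p.empty()) codes.push_back(dict[p]);  // flush the final match
    return codes;
}
```

With this sketch, lzw_codes("aaaaaaaa") yields the four codes 97, 256, 257, 256, while lzw_codes("abc") yields three codes for three bytes, so only the repetitive input shrinks. Short lines with a fresh dictionary each time behave more like the second case.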

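When you write a symbol to the ofstream as an int this way, it is formatted as decimal text: each digit takes one byte and so does the separating space, so a single symbol can cost 4 bytes or more. With packed bits, a symbol should take between 9 and 16 bits, depending on how you set it up. I think this is the main reason your output is larger than expected.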
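A minimal sketch of what packing could look like, assuming a fixed code width chosen by the caller (12 bits is a common choice, but that is an assumption here). The names pack_codes and unpack_codes are hypothetical. Many LZW implementations instead start at 9 bits and widen the code as the dictionary grows, which this sketch does not attempt.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: packs each code into `width` bits, most significant
// bit first, and pads the final byte with zero bits.
std::vector<std::uint8_t> pack_codes(const std::vector<std::uint32_t>& codes, int width) {
    assert(width >= 9 && width <= 16);
    std::vector<std::uint8_t> out;
    std::uint32_t buffer = 0;  // bits waiting to be written
    int bits = 0;              // number of valid bits in buffer (< 8 between codes)
    for (std::uint32_t code : codes) {
        assert(code < (1u << width));
        buffer = (buffer << width) | code;
        bits += width;
        while (bits >= 8) {    // write out every complete byte
            out.push_back(static_cast<std::uint8_t>((buffer >> (bits - 8)) & 0xFF));
            bits -= 8;
        }
        buffer &= (1u << bits) - 1;  // keep only the unwritten bits
    }
    if (bits > 0) {            // flush the remainder, zero-padded
        out.push_back(static_cast<std::uint8_t>((buffer << (8 - bits)) & 0xFF));
    }
    return out;
}

// Hypothetical inverse: reads `count` codes of `width` bits each.
std::vector<std::uint32_t> unpack_codes(const std::vector<std::uint8_t>& bytes,
                                        int width, std::size_t count) {
    assert(width >= 9 && width <= 16);
    std::vector<std::uint32_t> codes;
    std::uint32_t buffer = 0;
    int bits = 0;
    for (std::uint8_t b : bytes) {
        if (codes.size() == count) break;  // the rest is padding
        buffer = (buffer << 8) | b;
        bits += 8;
        while (bits >= width && codes.size() < count) {
            codes.push_back((buffer >> (bits - width)) & ((1u << width) - 1));
            bits -= width;
        }
        buffer &= (1u << bits) - 1;
    }
    return codes;
}
```

At 12 bits, four codes occupy 6 bytes, or 1.5 bytes per code, whereas a three-digit code written as text plus its separator occupies 4 bytes. The packed bytes should be written with out.write(...) to a stream opened with std::ios::binary rather than with operator<<.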

Good luck.
