Compression problem in LZW

Time: 2015-05-20 11:49:59

Tags: c compression lzw

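I am having trouble implementing an LZW compressor. The compressor seems to work, but for some streams it does not emit the end-of-stream code (defined as the value 256), and as a result the decompressor loops forever. The compressor code is as follows: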

int compress1(FILE* input, BIT_FILE* output) {
    CODE next_code;         // next node
    CODE current_code;      // current node
    CODE index;             // node of the found character
    int character;
    int ret;

    next_code = FIRST_CODE;

    dictionary_init();

    if ((current_code = getc(input)) == EOF)
        current_code = EOS;

    while ((character = getc(input)) != EOF) {
        index = dictionary_lookup(current_code, (SYMBOL)character);
        if (dictionary[index].code != UNUSED) {
            current_code = dictionary[index].code;
        }
        else {
            if (next_code <= MAX_CODE-1) {
                dictionary[index].code = next_code++;
                dictionary[index].parent = current_code;
                dictionary[index].symbol = (SYMBOL)character;
            }
            else {
                // handling full dictionary
                dictionary_init();
                next_code = FIRST_CODE;
            }
            ret = bit_write(output, (uint64_t) current_code, BITS);
            if (ret != 0)
                return -1;

            current_code = (CODE)character;
        }
    }
    ret = bit_write(output, (uint64_t) current_code, BITS);
    if (ret != 0)
        return -1;

    ret = bit_write(output, (uint64_t) EOS, BITS);
    if (ret != 0)
        return -1;

    if (bit_close(output) == -1) {
        printf("Ops: error during closing\n");
        return -1;
    }

    return 0;
}

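CODE and SYMBOL are typedefs of uint32_t and uint16_t respectively, and FIRST_CODE is defined as 257. The function dictionary_init() simply initializes the dictionary, and dictionary_lookup() returns the index of the child that has symbol "character" and parent node "current_code" (if it exists).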
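The question does not show the dictionary itself. As a minimal sketch only, a lookup with the behaviour described above could be built on an open-addressing hash table. The table size, the hash function, the value of UNUSED and the struct layout below are all assumptions made for illustration, not the asker's actual code; only the field names code, parent and symbol come from the compressor above.

```c
#include <stdint.h>

typedef uint32_t CODE;
typedef uint16_t SYMBOL;

#define TABLE_SIZE 5021u      /* assumption: a prime larger than the 4096 12-bit codes */
#define UNUSED ((CODE)-1)     /* assumption: marker for an empty slot */

struct entry {
    CODE code;      /* code assigned to this (parent, symbol) pair */
    CODE parent;    /* code of the prefix string */
    SYMBOL symbol;  /* character appended to the prefix */
};

static struct entry dictionary[TABLE_SIZE];

/* Mark every slot as empty. */
static void dictionary_init(void) {
    for (unsigned i = 0; i < TABLE_SIZE; i++)
        dictionary[i].code = UNUSED;
}

/* Return the slot holding (parent, symbol), or the empty slot where that
 * pair would be inserted. The caller tells the two cases apart by testing
 * dictionary[index].code against UNUSED, as compress1 does. */
static unsigned dictionary_lookup(CODE parent, SYMBOL symbol) {
    unsigned index = (unsigned)((((uint32_t)symbol << 4) ^ parent) % TABLE_SIZE);
    while (dictionary[index].code != UNUSED &&
           !(dictionary[index].parent == parent &&
             dictionary[index].symbol == symbol)) {
        index = (index + 1) % TABLE_SIZE;   /* linear probing */
    }
    return index;
}
```

The probe loop terminates because the table always keeps free slots: with 12-bit codes at most 4096 entries are ever inserted into 5021 slots.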

Writing to the binary file is defined as:

int bit_write(BIT_FILE* bf, uint64_t data, int len) {
    int space, result, offset, wbits, udata;
    uint64_t* p;
    uint64_t tmp;

    udata = (int)data;

    if (bf == NULL || len < 1 || len > (8* sizeof(data)))
        return -1;
    if (bf->reading == true)
        return -1;

    while (len > 0) {
        space = bf->end - bf->next;
        if (space < 0) {
            return -1;
        }
        // if buffer is full, flush data to file and reinit BIT_IO struct
        if (space == 0) {
            result = bit_flush(bf);
            if (result < 0)
                return -1;
        }
        p = bf->buf + (bf->next/64);
        offset = bf->next % 64;
        wbits = 64 - offset;
        if (len < wbits)
            wbits = len;
        tmp = le64toh(*p);
        tmp |= (data << offset);
        *p = htole64(tmp);
        bf->next += wbits;
        len -= wbits;
        data >>= wbits;
    }
    return 0;
}

I have already opened the file with another function, so bit_write takes a pointer to the bf structure as input. Can someone help me find the error?

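An example in which the problem occurs is the following:

If the input string is "Nel mezzo del cammi", everything works and I get the following compressed file (in hex, with symbols encoded on 12 bits):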

4E 50 06 6C 00 02 6D 50 06 7A A0 07 6F 00 02 64
20 10 20 30 06 61 D0 06 6D 90 06 0D A0 00 00 01

If I add one more character to the string, namely "Nel mezzo del cammin", I get the following result:

4E 50 06 6C 00 02 6D 50 06 7A A0 07 6F 00 02 64
20 10 20 30 06 61 D0 06 6D 90 06 6E D0 00 0A 00
10

In the second case, the End of Stream is not written correctly.
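To read hex dumps like the two above, it can help to unpack the bytes back into 12-bit codes. The helper below is a sketch that assumes the codes are packed least-significant-bit first, which is what the shift in bit_write (data << offset into little-endian 64-bit words) suggests. The name unpack12 is hypothetical and not part of the original program.

```c
#include <stddef.h>
#include <stdint.h>

/* Unpack 12-bit codes stored LSB-first in a byte stream.
 * Returns the number of codes written to out (at most out_max).
 * Trailing bits that do not form a full code are treated as padding. */
static size_t unpack12(const uint8_t *in, size_t in_len,
                       uint16_t *out, size_t out_max) {
    uint32_t acc = 0;   /* bit accumulator, oldest bits in the low positions */
    int bits = 0;       /* number of valid bits currently in acc */
    size_t n = 0;

    for (size_t i = 0; i < in_len; i++) {
        acc |= (uint32_t)in[i] << bits;
        bits += 8;
        while (bits >= 12 && n < out_max) {
            out[n++] = (uint16_t)(acc & 0x0FFF);
            acc >>= 12;
            bits -= 12;
        }
    }
    return n;
}
```

Applied to the first six bytes of either dump (4E 50 06 6C 00 02), this yields the codes 0x04E, 0x065, 0x06C and 0x020, which are the characters 'N', 'e', 'l' and ' '.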

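Solution: check whether the buffer has enough space for the entire encoded symbol I want to write. Only a small change is needed (the original before and after snippets are missing here).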
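The asker's actual change is not reproduced here. As a rough sketch of the idea only, a bit writer can compare the free space with len before writing, and flush first when the whole symbol does not fit. Everything below (demo_bits, demo_flush, demo_write, demo_close, the one-word buffer and the in-memory sink that stands in for the file) is a hypothetical simplification, not the BIT_FILE code from the question.

```c
#include <stddef.h>
#include <stdint.h>

#define BUF_BITS 64   /* assumption: the whole buffer is one 64-bit word */
#define SINK_MAX 64   /* assumption: in-memory stand-in for the output file */

typedef struct {
    uint64_t word;            /* pending bits, LSB first */
    int next;                 /* number of valid bits in word */
    uint8_t sink[SINK_MAX];   /* bytes that would have gone to the file */
    size_t sink_len;
} demo_bits;

/* Move all complete bytes to the sink and keep the leftover (< 8) bits. */
static int demo_flush(demo_bits *b) {
    int nbytes = b->next / 8;
    if (b->sink_len + (size_t)nbytes > SINK_MAX)
        return -1;
    for (int i = 0; i < nbytes; i++)
        b->sink[b->sink_len++] = (uint8_t)(b->word >> (8 * i));
    b->word = (nbytes == 8) ? 0 : (b->word >> (8 * nbytes));
    b->next -= 8 * nbytes;
    return 0;
}

/* Write the low len bits of data; flush first if they would not all fit. */
static int demo_write(demo_bits *b, uint64_t data, int len) {
    if (b == NULL || len < 1 || len > BUF_BITS - 7)
        return -1;
    if (BUF_BITS - b->next < len) {          /* room for the whole symbol? */
        if (demo_flush(b) < 0)
            return -1;
    }
    data &= (UINT64_C(1) << len) - 1;        /* drop stray high bits */
    b->word |= data << b->next;
    b->next += len;
    return 0;
}

/* Flush everything, padding the final partial byte with zero bits. */
static int demo_close(demo_bits *b) {
    if (demo_flush(b) < 0)
        return -1;
    if (b->next > 0) {
        if (b->sink_len >= SINK_MAX)
            return -1;
        b->sink[b->sink_len++] = (uint8_t)b->word;
        b->word = 0;
        b->next = 0;
    }
    return 0;
}
```

Because the check happens before any bit is stored, a symbol is never split across a flush. After a flush at most 7 bits remain buffered, which is why len is limited to BUF_BITS - 7 in this sketch.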

0 answers