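RLE algorithm for compression in C

Time: 2018-06-27 05:41:26

Tags: c

I know there are many questions about this algorithm, but I couldn't find a good answer for compressing bytes. I am new to C. I have the following code: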

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

//compress function here...

int main(int argc, char **argv) {

    if(argc != 2){
        fprintf(stderr, "Wrong argument number\n");
        exit(1);
    } 

    FILE *source = fopen(argv[1], "rb");
    if(source == NULL){
        fprintf(stderr, "Cannot open the file to be read\n");
        exit(1);
    }

    FILE *destination;
    char name = printf("%s.rle", argv[1]);
    while((destination = fopen(&name, "wb")) == NULL){
        fprintf(stderr, "Can't create the file to be written\n");
        exit(1);
    }

    compress_file(source, destination);

    int error;
    error = fclose(source);
    if(error != 0){
        fprintf(stderr, "Error: fclose failed for source file\n");
    }
    error = fclose(destination);
    if(error != 0){
        fprintf(stderr, "Error: fclose failed for destination file\n");
    }
}

If this is test.c and the executable is test, I need to run it from the terminal/command prompt as "./test file.txt". My file.txt contains bytes such as:

20 21 20 20 8F 8F 21 21 64 60 70 20 21 90 90

and the desired output is:

01 20 01 21 02 20 02 8F 02 21 01 64 01 60 01 70 01 20 01 21 02 90

My code creates a file that contains:

0b00 0000 0106 0000 0000 0000 0000 0000 0000 0000 0a

instead of what I want. What am I missing?

I also want to name my output file file.txt.rle, but it ends up with no name.

Edit:

char name[30];
sprintf(name, "%s.rle", argv[1]);

solved the naming problem.

2 answers:

Answer 0: (score: 2)

  

I also want to name my output file file.txt.rle, but it ends up with no name.

Well, this code

char name = printf("%s.rle", argv[1]);
while((destination = fopen(&name, "wb")) == NULL){

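does not give you a string like "file.txt.rle": printf only prints to the terminal and returns a character count. Instead, try something like: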

size_t len = strlen(argv[1]) + 4 + 1;
char name[len];
sprintf(name, "%s.rle", argv[1]);
while((destination = fopen(name, "wb")) == NULL){
  

instead of what I want. What am I missing?

Well, you missed that you need to put the data into str.

This code

char str[BUF_SIZE];
fwrite(str, sizeof(str), 1, destination);

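just writes an uninitialized array to the file.

I won't give you a complete solution, but here is something you can start from and then work out the rest yourself.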

void compress_file(FILE *source, FILE *destination){

    char str[BUF_SIZE];
    int index = 0;

    int repeat_count = 0;
    int previous_character = EOF;
    int current_character;

    while((current_character = fgetc(source)) != EOF){
        if(current_character != previous_character) {

            if (previous_character != EOF) {
                // Save the values to str (note: no bounds check against
                // BUF_SIZE, and a count above 255 does not fit in one byte)
                str[index++] = repeat_count;
                str[index++] = previous_character;
            }

            previous_character = current_character;
            repeat_count = 1;
        }
        else{
            repeat_count++;
        }
    }
    if (repeat_count != 0)
    {
        str[index++] = repeat_count;
        str[index++] = previous_character;
    }

    fwrite(str, index, 1, destination);
}

Example 1:

Say file.txt is:

ABBCCC

On Linux, its content can be displayed in hex like this:

# hexdump -C file.txt
00000000  41 42 42 43 43 43                                 |ABBCCC|

After running the program, you will have:

hexdump -C file.txt.rle
00000000  01 41 02 42 03 43                                 |.A.B.C|

Example 2:

Say file.txt looks like

# hexdump -C file.txt
00000000  20 21 20 20 8f 8f 21 21  64 60 70 20 21 90 90     | !  ..!!d`p !..|

The result will be

# hexdump -C file.txt.rle
00000000  01 20 01 21 02 20 02 8f  02 21 01 64 01 60 01 70  |. .!. ...!.d.`.p|
00000010  01 20 01 21 02 90                                 |. .!..|
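
If you want to check the expected bytes without touching any files, a minimal in-memory sketch of the same (count, value) scheme could look like this. The name rle_encode and its signature are my own assumptions and not part of the code above; it also splits runs longer than 255, which the starter code does not do.

```c
#include <stddef.h>

/* Hypothetical helper (not from the answer's code): run-length encode
   n bytes from in[] into out[] as (count, value) pairs.
   out must have room for 2 * n bytes (worst case: no repeated bytes).
   Runs longer than 255 are split so each count fits in one byte.
   Returns the number of bytes written to out. */
size_t rle_encode(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t o = 0;   /* write position in out */
    size_t i = 0;   /* read position in in */

    while (i < n) {
        unsigned char value = in[i];
        size_t run = 1;

        /* extend the run while the next byte matches and the count fits */
        while (i + run < n && in[i + run] == value && run < 255) {
            run++;
        }
        out[o++] = (unsigned char)run;
        out[o++] = value;
        i += run;
    }
    return o;
}
```

Feeding it the 15 bytes of Example 2 should give back the 22 bytes shown in the hexdump above.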

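Answer 1: (score: 1)

As pointed out in the comments, you have two problems:

  • using printf instead of sprintf
  • never actually writing what you counted to the file.

Name creation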

char name = printf("%s.rle", argv[1]);
destination = fopen(&name, "wb");

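The first line stores the number of characters in argv[1] plus 4 into name, because, from man printf:

  

Upon successful return, these functions return the number of characters printed (excluding the null byte used to end output to strings).

The second line is more problematic: you ask fopen to open a file, but you give it a pointer to a single char instead of a real, null-terminated string.

A correct way to do what you want is: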

/* reserve memory to store the file name
   NOTE: 256 here might not be large enough */
char name[256];
/* fill name array with original name + '.rle'
   The return value of snprintf is tested to check that the array was big enough */
if (snprintf(name, sizeof name, "%s.rle", argv[1]) >= sizeof name)
{
    fprintf(stderr, "name variable is not big enough to store destination filename\n");
    exit(1);
}
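
If the fixed 256-byte limit bothers you, one alternative is to size the buffer from the input name. This is only a sketch; make_rle_name is a name I made up for illustration, and the caller is responsible for calling free() on the result.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the original code): returns a malloc'd
   "<path>.rle" string, or NULL if the allocation fails.
   The caller must free() the returned pointer. */
char *make_rle_name(const char *path)
{
    const char *suffix = ".rle";
    size_t len = strlen(path) + strlen(suffix) + 1; /* +1 for the '\0' */
    char *name = malloc(len);

    if (name == NULL) {
        return NULL;
    }
    /* len is exact, so snprintf cannot truncate here */
    snprintf(name, len, "%s%s", path, suffix);
    return name;
}
```

In main you would then call make_rle_name(argv[1]), check the result for NULL, pass it to fopen, and free it once the file is open.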

Writing the file

The code

char str[BUF_SIZE];
fwrite(str, sizeof(str), 1, destination);

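reserves a large array and writes it to the file without ever initializing it. To do what you want, you can take this approach:

  • create a function that writes just two bytes to the file: the number of times the character was found, and the character itself
  • call this function each time it is needed (when the character changes, but not when one of the two characters is EOF)

Let's see: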

void write_char_to_file(FILE *f, int count, char car)
{
    /* char array to be stored in file */
    char str[2];   
    /* number of repeating characters */
    str[0] = count;
    /* the character */
    str[1] = car;
    /* write it to file */
    fwrite(str, sizeof str, 1, f);    
}

This function has two potential problems:

  • it does not handle char overflow (what happens if count exceeds 255?)
  • it does not test the return value of fwrite.
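
One possible way to address both points is sketched below. The name write_run, the unsigned long count, and the choice to split long runs into several (255, value) pairs are my assumptions; a decoder would have to follow the same convention.

```c
#include <stdio.h>

/* Hypothetical variant of write_char_to_file (a sketch, not the
   function used in the full code): writes count repetitions of value
   as one or more (count, value) pairs, each count being at most 255.
   Returns 0 on success, -1 if fwrite fails. */
int write_run(FILE *f, unsigned long count, unsigned char value)
{
    while (count > 0) {
        unsigned char pair[2];
        unsigned long chunk = count > 255 ? 255 : count;

        pair[0] = (unsigned char)chunk; /* run length, fits in one byte */
        pair[1] = value;                /* the repeated byte */

        /* fwrite returns the number of items written; we asked for 1 */
        if (fwrite(pair, sizeof pair, 1, f) != 1) {
            return -1;
        }
        count -= chunk;
    }
    return 0;
}
```

With this version, a run of 300 'A' bytes would be stored as the two pairs (255, 'A') and (45, 'A').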

Next, when should this function be called? Whenever the current character changes:

EOF A A B C C EOF

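In this example the character changes 4 times, but we only want 3 writes to the file, so:

  • the change that happens when the previous character is EOF must be ignored (otherwise we would write something like 0 (char)EOF at the start of the file)
  • one write must be added after the while loop, because when the last read returns EOF we still have the 2 C's to write to the file.

Let's look at the code: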

while((current_character = fgetc(source)) != EOF) {
    if(current_character != previous_character) { 
        /* ignore initial change */            
        if (previous_character != EOF) {  
            write_char_to_file(destination, repeat_count, previous_character);
        }
        previous_character = current_character;
        repeat_count = 1;
    } else {
        repeat_count++;
    }
}
/* write last change */
write_char_to_file(destination, repeat_count, previous_character);

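This code also has a problem: what if the input file is empty (the very first read returns EOF)? The final write would then emit a count of 0 followed by (char)EOF.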
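
A minimal sketch of the same loop with that case handled, assuming you are free to change the function: compress_stream is a hypothetical name, it writes with fputc instead of the helper above, and it also caps each run at 255 so the count always fits in one byte.

```c
#include <stdio.h>

/* Hypothetical variant of compress_file (a sketch under the assumptions
   stated above). Writes (count, value) pairs to destination and writes
   nothing at all for an empty input.
   Returns 0 on success, -1 on a read or write error. */
int compress_stream(FILE *source, FILE *destination)
{
    int previous = EOF;      /* EOF means "no run started yet" */
    int current;
    unsigned char count = 0;

    while ((current = fgetc(source)) != EOF) {
        if (current == previous && count < 255) {
            count++;         /* same byte, run still fits in one byte */
            continue;
        }
        /* the byte changed (or the run is full): flush the previous run */
        if (previous != EOF) {
            if (fputc(count, destination) == EOF ||
                fputc(previous, destination) == EOF) {
                return -1;
            }
        }
        previous = current;
        count = 1;
    }
    if (ferror(source)) {
        return -1;
    }
    /* flush the last run, but only if at least one byte was read */
    if (previous != EOF) {
        if (fputc(count, destination) == EOF ||
            fputc(previous, destination) == EOF) {
            return -1;
        }
    }
    return 0;
}
```

The only behavioral differences from the loop above are the guard around the last write and the 255 cap.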


Full code:

#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>

#define BUF_SIZE 5096

void write_char_to_file(FILE *f, int count, char car)
{
    /* char array to be stored in file */
    char str[2];   
    /* number of repeating characters */
    str[0] = count;
    /* the character */
    str[1] = car;
    /* write it to file */
    fwrite(str, sizeof str, 1, f);    
}

void compress_file(FILE *source, FILE *destination)
{
    int repeat_count = 0;
    int previous_character = EOF;
    int current_character;

    while((current_character = fgetc(source)) != EOF) {
        if(current_character != previous_character) {            
            if (previous_character != EOF) {  
                write_char_to_file(destination, repeat_count, previous_character);
            }
            previous_character = current_character;
            repeat_count = 1;
        } else {
            repeat_count++;
        }
    }
    /* write the last run, unless the input file was empty */
    if (previous_character != EOF) {
        write_char_to_file(destination, repeat_count, previous_character);
    }
}

int main(int argc, char **argv) {
    if(argc != 2) {
        fprintf(stderr, "Wrong argument number\n");
        exit(1);
    } 

    FILE *source = fopen(argv[1], "rb");
    if(source == NULL) {
        fprintf(stderr, "Cannot open the file to be read\n");
        exit(1);
    }

    FILE *destination;
    /* reserve memory to store the file name
       NOTE: 256 here might not be large enough */
    char name[256];
    /* fill name array with original name + '.rle'
       The return value of snprintf is tested to check that the array was big enough */
    if (snprintf(name, sizeof name, "%s.rle", argv[1]) >= sizeof name)
    {
        fprintf(stderr, "name variable is not big enough to store destination filename\n");
        exit(1);
    }

    /* while is not needed here; if does the job */
    if((destination = fopen(name, "wb")) == NULL) {
        fprintf(stderr, "Can't create the file to be written\n");
        exit(1);
    }

    compress_file(source, destination);

    int error;
    error = fclose(source);
    if(error != 0) {
        fprintf(stderr, "Error: fclose failed for source file\n");
    }
    error = fclose(destination);
    if(error != 0) {
        fprintf(stderr, "Error: fclose failed for destination file\n");
    }

    /* main must return an integer */
    return 0;
}