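Sequential byte comparison

Time: 2014-07-16 21:07:46

Tags: c hash byte sequential deduplication

How can I use the bitwise XOR operation to perform a byte-by-byte comparison in C when comparing two files?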

#include<stdio.h>
int main()
{
    FILE *fp1, *fp2;
    int ch1, ch2;
    char fname1[40], fname2[40] ;

    printf("Enter name of first file :") ;
    gets(fname1);

    printf("Enter name of second file:");
    gets(fname2);

    fp1 = fopen( fname1,  "r" );
    fp2 = fopen( fname2,  "r" ) ;

    if ( fp1 == NULL )
       {
       printf("Cannot open %s for reading ", fname1 );
       exit(1);
       }
    else if (fp2 == NULL)
       {
       printf("Cannot open %s for reading ", fname2 );
       exit(1);
       }
    else
       {
       ch1  =  getc( fp1 ) ;
       ch2  =  getc( fp2 ) ;

       while( (ch1!=EOF) && (ch2!=EOF) && (ch1 == ch2))
        {
            ch1 = getc(fp1);
            ch2 = getc(fp2) ;
        }

        if (ch1 == ch2)
            printf("Files are identical \n");
        else if (ch1 !=  ch2)
            printf("Files are Not identical \n");

        fclose ( fp1 );
        fclose ( fp2 );
       }
return(0);
 }

I get the following warnings, and then when I run the program it says my test2.txt is null, even though the file has data in it?

hb@hb:~/Desktop$ gcc -o check check.c
check.c: In function ‘main’:
check.c:21:8: warning: incompatible implicit declaration of built-in function ‘exit’ [enabled by default]
check.c:26:8: warning: incompatible implicit declaration of built-in function ‘exit’ [enabled by default]
hb@hb:~/Desktop$ 


hb@hb:~/Desktop$ ./check
Enter name of first file :test1.txt
Enter name of second file:test2.txt
Cannot open test2.txt for reading hb@hb:~/Desktop$

Any ideas?

1 answer:

Answer 0 (score: 1)

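There are many ways to do this. If you have both files on the same machine, the simplest approach is to read them in parallel and compare the buffers.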

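#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BUFFERSIZE 4096

int main(void)
{
    FILE *filp1, *filp2;
    char *buf1, *buf2;
    bool files_equal;
    size_t read1, read2;   // fread returns size_t, not int

    filp1 = fopen("file1", "rb");
    filp2 = fopen("file2", "rb");

    // Don't forget to check that they opened correctly.

    buf1 = malloc(sizeof(*buf1)*BUFFERSIZE);
    buf2 = malloc(sizeof(*buf2)*BUFFERSIZE);

    files_equal = true;

    while ( true ) {
        read1 = fread(buf1, sizeof(*buf1), BUFFERSIZE, filp1);
        read2 = fread(buf2, sizeof(*buf2), BUFFERSIZE, filp2);

        // Different lengths or different bytes: the files differ.
        if (read1 != read2 || memcmp( buf1, buf2, read1)) {
            files_equal = false;
            break;
        }

        // A short read on both sides means both files ended together.
        if (read1 < BUFFERSIZE)
            break;
    }

    free(buf1);
    free(buf2);
    fclose(filp1);
    fclose(filp2);

    return files_equal ? EXIT_SUCCESS : EXIT_FAILURE;
}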
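
Since the question asks about XOR specifically: two bytes are equal exactly when their XOR is zero, so the memcmp call above could in principle be swapped for a loop like the sketch below. The name buffers_equal_xor is hypothetical, and in practice memcmp is usually the better choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true when the first n bytes of a and b match.
   a[i] ^ b[i] is zero only for equal bytes, so OR-ing every XOR result
   into diff leaves diff at zero exactly when no byte differed. */
bool buffers_equal_xor(const unsigned char *a, const unsigned char *b, size_t n)
{
    unsigned char diff = 0;

    for (size_t i = 0; i < n; i++)
        diff |= (unsigned char)(a[i] ^ b[i]);

    return diff == 0;
}
```

This version always walks the whole buffer instead of stopping at the first mismatch, which is simpler to reason about but does more work than necessary for files that differ early.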

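If an error occurs while reading the files you may get some false negatives, but you can add some extra checks for that.

On the other hand, if your files are on two different computers, or if you want to process a large number of files and find out whether any of them are identical, the best approach is to use checksums.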
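
One way to add those checks, as a rough sketch: call ferror after each read and report an error separately from "files differ". The function name compare_files, the CHUNK constant, and the 1 / 0 / -1 return convention are assumptions made for this example.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 4096

/* Hypothetical helper. Returns 1 if the files match, 0 if they differ,
   and -1 if either file cannot be opened or a read error occurs. */
int compare_files(const char *path_a, const char *path_b)
{
    FILE *fa = fopen(path_a, "rb");
    FILE *fb = fopen(path_b, "rb");
    unsigned char buf_a[CHUNK], buf_b[CHUNK];
    int result = 1;

    if (fa == NULL || fb == NULL)
        result = -1;

    while (result == 1) {
        size_t na = fread(buf_a, 1, CHUNK, fa);
        size_t nb = fread(buf_b, 1, CHUNK, fb);

        if (ferror(fa) || ferror(fb))
            result = -1;   /* I/O error: equality is unknown */
        else if (na != nb || memcmp(buf_a, buf_b, na) != 0)
            result = 0;    /* different length or different bytes */
        else if (na < CHUNK)
            break;         /* both files reached EOF together */
    }

    if (fa != NULL)
        fclose(fa);
    if (fb != NULL)
        fclose(fb);
    return result;
}
```

A caller can then treat -1 as "could not decide" rather than reporting the files as different.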

A good checksum comes from a good hash function. Depending on your security requirements, common implementations use:

  • SHA-1,SHA-2或SHA-3
  • MD5

Many others exist as well (see Wiki).
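
To illustrate the checksum idea without pulling in a crypto library, here is a minimal sketch that hashes a file with 64-bit FNV-1a. FNV-1a is a simple non-cryptographic hash swapped in purely for illustration; it is not one of the functions listed above, and for anything security-sensitive you would use a SHA-2 implementation from a library instead. The name file_checksum is hypothetical.

```c
#include <stdint.h>
#include <stdio.h>

/* FNV-1a, 64-bit: a NON-cryptographic stand-in for SHA-2 or MD5.
   Stores the hash in *out and returns 0, or returns -1 on failure. */
int file_checksum(const char *path, uint64_t *out)
{
    unsigned char buf[4096];
    uint64_t hash = 14695981039346656037ULL;   /* FNV offset basis */
    size_t n;
    FILE *fp = fopen(path, "rb");

    if (fp == NULL)
        return -1;

    while ((n = fread(buf, 1, sizeof buf, fp)) > 0) {
        for (size_t i = 0; i < n; i++) {
            hash ^= buf[i];                    /* mix in one byte */
            hash *= 1099511628211ULL;          /* FNV prime */
        }
    }

    int failed = ferror(fp);
    fclose(fp);
    if (failed)
        return -1;

    *out = hash;
    return 0;
}
```

Files with different checksums are certainly different. Equal checksums only suggest equal contents, so a byte-by-byte comparison of the candidates is still a reasonable final check.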