Question

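I'm trying to read the content of several files, compute the SHA1 of each, and compare it with a previously computed hash to find out whether the content has been modified. My problem: sometimes, when I compute the SHA1 hash, I get the same hash for completely different content. I have a file containing URLs. I read a URL, then fetch the stored checksum of its last known content from my hash table. I use the function read_file to get its content, compute the hash of the current content, and compare the two hashes.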

The main code that runs:

GHashTable *table = loadChecksums("checksums.txt"); //here i load some checksums i calculated before and stored
char *result=malloc(SIZE_MAX_URL*sizeof(char));
for(int i=0; i<SIZE_MAX_LIST; i++){
  if(fgets(result, SIZE_MAX_URL, fichier_url)!=NULL) {
    char *aux=strchr(result,'\n');
    aux[0]='\0';
    // i get the url from the file, now i read the content
    char *contenu = read_file(result);
    //get previously calculated hash to compare it with the one we will calculate
    char *hash = g_hash_table_lookup(table, result);
    unsigned char hashCalc[SHA_DIGEST_LENGTH];
    printf("Contenu: %s\n",contenu); // just to check it is the content i expect from the file
    SHA1(contenu, sizeof(contenu)-1, hashCalc); // calculate hash
    char *hexHash = toHex(hashCalc); // turn it into hex

    if(compareHash(hexHash,hash)){ // nothing changed
      printf("Le site n'a pas changé\n");
    }else{// content changed : i update my hashtable
      printf("Checksum different\n");
      g_hash_table_insert(table,g_strdup(result),g_strdup(hexHash));
    }
  }
}
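
A note on the SHA1 call above, offered as a sketch rather than a confirmed diagnosis: contenu is a char *, so sizeof(contenu) is the size of the pointer (commonly 8 on 64-bit targets), not the length of the text. If that is what happens here, only the first few bytes of each page are hashed, and two pages that both start with "<!doctype html>" would produce the same digest. The helper names below are hypothetical, and the sketch assumes read_file returns a NUL-terminated string.

```c
#include <stddef.h>
#include <string.h>

/* Length the call in the question passes to SHA1: sizeof on a char *
   yields the pointer size, whatever the text it points to. */
size_t length_via_sizeof(const char *contenu) {
    (void)contenu;               /* the result never depends on the content */
    return sizeof(contenu) - 1;
}

/* Length of the actual text, assuming contenu is NUL-terminated. */
size_t length_via_strlen(const char *contenu) {
    return strlen(contenu);
}

/* Returns 1 if both buffers agree on their first n bytes, i.e. if a hash
   computed over only n bytes of each would necessarily be identical. */
int same_prefix(const char *a, const char *b, size_t n) {
    return strlen(a) >= n && strlen(b) >= n && memcmp(a, b, n) == 0;
}
```

Under the same assumption, the call would become SHA1((const unsigned char *)contenu, strlen(contenu), hashCalc); if read_file can return binary data or embedded NUL bytes, it would need to report the length itself.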

The hash functions:

int compareHash(char *hash1, char* hash2){
  if(hash1 == NULL || hash2 == NULL){
    return 0;
  }
  for (int i = 0; i < 40; i++){
    if(hash1[i] != hash2[i]){
      return 0;
    }
  }
  return 1;
}

char * toHex(unsigned char*hash){
  char *hashHex = malloc(41);
  for(int i = 0 ; i < SHA_DIGEST_LENGTH ; i++){
    sprintf(hashHex+(i*2),"%02x",(unsigned char)hash[i]);
  }
  hashHex[40]='\0';
  return hashHex;
}
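
As a side sketch, not a claim about where the problem lies: toHex allocates 41 bytes on every call and the loop never frees them, and compareHash reads 40 characters from both strings without checking that the stored one is that long. A possible variant, with hypothetical names (to_hex_buf, same_hex_hash, DIGEST_LEN, HEX_LEN), writes into a caller-provided buffer and checks the length before comparing.

```c
#include <stdio.h>
#include <string.h>

#define DIGEST_LEN 20                 /* SHA-1 digest size in bytes */
#define HEX_LEN (2 * DIGEST_LEN + 1)  /* 40 hex digits plus the NUL */

/* Writes the lowercase hex form of a 20-byte digest into out. */
void to_hex_buf(const unsigned char digest[DIGEST_LEN], char out[HEX_LEN]) {
    for (int i = 0; i < DIGEST_LEN; i++) {
        /* each call writes two hex digits and a NUL; the last NUL ends the string */
        snprintf(out + 2 * i, 3, "%02x", digest[i]);
    }
}

/* Returns 1 only when both strings are non-NULL, 40 characters long, and equal. */
int same_hex_hash(const char *a, const char *b) {
    if (a == NULL || b == NULL) {
        return 0;
    }
    return strlen(a) == HEX_LEN - 1 && strcmp(a, b) == 0;
}
```

With this shape the caller can declare char hexHash[HEX_LEN]; on the stack, so nothing needs to be freed per iteration.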

The first file I used as an example:

<!doctype html>
<html>
<head>
    <title>Test Test8</title>
</head>
<body>
    <p>Test1</p>
    <p>Tesythythytht2</p>
    <p>Test3</p>
    <p>Test4</p>
</body>
</html>

The second file:

<!doctype html>
<html>
<head>
    <title>Test Test5</title>
</head>
<body>
    <p>Test16518489498</p>
    <p>Test2</p>
    <p>Tetyhtyhtyhst3</p>
    <p>Test4</p>
    orejgioejgiojeoijgreoigjeoij
</body>
</html>

With this code and these examples, I get the following hash for both files:

7a820ae07b87e220980556ea12d79e65eb3c19d3

Does anyone see what I'm doing wrong? Thanks for your help.

PS: I'm not sure whether I've given too much or too little information in this post; sorry if it's unclear.

Same SHA1 hash for different strings
