Question

So, here is my SFH function:

/*  
 * Hash function (found at: 'http://www.azillionmonkeys.com/qed/hash.html')  
 */ 
int32_t SuperFastHash(const char * data, int len)  {
    uint32_t hash = len, tmp;
    int rem;

    if (len <= 0 || data == NULL) return 0;

    rem = len & 3;
    len >>= 2;

    /* Main loop */
    for (;len > 0; len--) {
        hash  += get16bits (data);
        tmp    = (get16bits (data+2) << 11) ^ hash;
        hash   = (hash << 16) ^ tmp;
        data  += 2*sizeof (uint16_t);
        hash  += hash >> 11;
    }

    /* Handle end cases */
    switch (rem) {
        case 3: hash += get16bits (data);
                hash ^= hash << 16;
                hash ^= ((signed char)data[sizeof (uint16_t)]) << 18;
                hash += hash >> 11;
                break;
        case 2: hash += get16bits (data);
                hash ^= hash << 11;
                hash += hash >> 17;
                break;
        case 1: hash += (signed char)*data;
                hash ^= hash << 10;
                hash += hash >> 1;
    }

    /* Force "avalanching" of final 127 bits */
    hash ^= hash << 3;
    hash += hash >> 5;
    hash ^= hash << 4;
    hash += hash >> 17;
    hash ^= hash << 25;
    hash += hash >> 6;

    // Limits hashes to be within the hash table    
    return hash % HT_LENGTH; 
}

It appears to work fine (I haven't changed anything except the last line).

Here is my function that loads the dictionary into the hash table; the hash table itself also seems to work.

bool load(const char* dictionary)
{
    // declares file pointer
    FILE* dictptr = fopen(dictionary, "r");

    // declare temp index
    uint32_t index = 0;

    // read words, one by one
    while(true)
    {

        // malloc node
        node* new_node = malloc(node_size);

        // insert word into node, if fscanf couldn't scan word; we're done
        if (fscanf(dictptr, "%s", new_node->word) != 1)
        {
            return true;
        }

        // hash word - HASH FUNCTION CALL -
        index = SuperFastHash(&new_node->word[0], sizeof(new_node->word));

        // check if head node has been assigned with value
        if (!strcmp(hashtable[index].word,""))
        {
            // declare hashtable[index] to new_node
            hashtable[index] = *new_node;

            //increment size
            hashtablesize++;
        }

        else
        {
            // if node is initialized, insert after head 
            new_node->next = hashtable[index].next;
            hashtable[index].next = new_node;

            //increment size
            hashtablesize++;
        }
    } 
}

Finally, my check function looks a word up in the hash table.

bool check(const char* keyword)
{

    // gets index from SFH
    uint32_t index = SuperFastHash(keyword, sizeof(keyword));

    // declares head pointer to the pointer of the index'd element of hashtable
    node* head = &hashtable[index];

    // if word of head is equal to keyword, return true 
    // else continue down chain till head is null or key is found
    while (head != NULL)
    {
        if (!strcmp(head->word, keyword))
        {
            return true;
        }
        head = head->next;
    }
    return false;
}

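Note: everything works when I use a different hash function, so I suspect the problem is related to the len parameter or to the SFH function itself.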

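I have checked the returned indexes with lldb: the index computed for, e.g., "cat" in check is not equal to the index of the "cat" that resides in the hash table, that is, the index returned by the function call in load.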

Answer 1

A few things...

As the commenters mentioned, sizeof() will not give you the correct string length. For example, change

index = SuperFastHash(&new_node->word[0], sizeof(new_node->word));

to

index = SuperFastHash(&new_node->word[0], strlen(new_node->word));
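The same applies to the call in check(). To illustrate why the two call sites disagree, here is a minimal sketch. It assumes that node stores word as a fixed-size char array (the declaration is not shown in the question) and that the maximum word length is 45; LENGTH and the three function names are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <string.h>

#define LENGTH 45  /* assumed maximum word length; not shown in the question */

/* What check() effectively passes as len: the size of a pointer. */
size_t len_seen_by_check(const char *keyword)
{
    return sizeof(keyword);      /* sizeof(const char *), commonly 4 or 8 */
}

/* What load() effectively passes as len, if word is an array member. */
size_t len_seen_by_load(void)
{
    char word[LENGTH + 1] = "cat";
    return sizeof(word);         /* always LENGTH + 1, whatever the contents */
}

/* What both callers should pass: the number of chars before the '\0'. */
size_t len_from_strlen(const char *keyword)
{
    return strlen(keyword);
}
```

Under these assumptions, load() hashes the whole array, including the bytes after the terminating '\0' (which malloc() leaves uninitialized), while check() hashes only a pointer-sized prefix. The two calls therefore hash different byte ranges for the same word, which would explain the differing indexes. Passing strlen() in both places makes the hashed range depend only on the word itself.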

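You never call fclose() after reading the dictionary file. If fopen() succeeds, you should call fclose(). You also never check whether fopen() succeeded before using the returned pointer.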
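A minimal sketch of the open/read/close pattern, kept separate from the hash table so it stands on its own. The helper count_words and the LENGTH value of 45 are assumptions made for the example, not part of the original code.

```c
#include <stdio.h>

#define LENGTH 45  /* assumed maximum word length */

/* Hypothetical helper: counts whitespace-separated words in a file.
 * Returns -1 if the file cannot be opened. */
long count_words(const char *path)
{
    FILE *fp = fopen(path, "r");
    if (fp == NULL)
    {
        return -1;               /* nothing to close if fopen() failed */
    }

    char word[LENGTH + 1];
    long count = 0;

    /* The width in %45s caps each read at LENGTH chars, so word[] cannot overflow. */
    while (fscanf(fp, "%45s", word) == 1)
    {
        count++;
    }

    fclose(fp);                  /* reached on every path after a successful fopen() */
    return count;
}
```

In load() the same shape would apply: return false when fopen() fails, and call fclose(dictptr) before the return true that ends the loop. Note that the node allocated just before the failing fscanf() is never stored anywhere, so it would probably also need a free() at that point.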

The following code looks a bit suspicious:

// check if head node has been assigned with value
if (!strcmp(hashtable[index].word,""))
{
    // declare hashtable[index] to new_node
    hashtable[index] = *new_node;

    //increment size
    hashtablesize++;
}

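If the hash table is fully initialized at the start, do you need to increment hashtablesize here? If the hash table is not fully initialized, calling strcmp() on entries that have not been initialized yet could be a problem. You didn't show the declaration or initialization code, so it isn't 100% clear whether this is actually an issue, but it may be worth double-checking.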
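One way to sidestep the "is this slot initialized?" question is to store pointers in the table instead of whole nodes. This is a sketch of an alternative layout, not the asker's actual declarations: the node definition, LENGTH, the HT_LENGTH value, and every function name below are assumptions, and bucket_of is a stand-in for SuperFastHash(word, strlen(word)).

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45        /* assumed maximum word length */
#define HT_LENGTH 1024   /* assumed table size; the question does not show it */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
} node;

/* File-scope objects are zero-initialized, so every bucket starts out NULL. */
static node *buckets[HT_LENGTH];
static unsigned int word_count = 0;

/* Placeholder hash for this sketch only. */
static unsigned int bucket_of(const char *word)
{
    unsigned int h = 0;
    while (*word != '\0')
    {
        h = h * 31u + (unsigned char)*word++;
    }
    return h % HT_LENGTH;
}

/* Prepends a copy of word to its bucket; false if too long or out of memory. */
bool insert_word(const char *word)
{
    if (strlen(word) > LENGTH)
    {
        return false;
    }
    node *n = malloc(sizeof *n);
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, word);
    unsigned int i = bucket_of(word);
    n->next = buckets[i];        /* NULL for the first word in a bucket */
    buckets[i] = n;
    word_count++;                /* one increment, no special case for the head */
    return true;
}

/* Walks the chain of the word's bucket looking for an exact match. */
bool contains_word(const char *word)
{
    for (const node *n = buckets[bucket_of(word)]; n != NULL; n = n->next)
    {
        if (strcmp(n->word, word) == 0)
        {
            return true;
        }
    }
    return false;
}

/* Number of words inserted so far. */
unsigned int table_size(void)
{
    return word_count;
}
```

With this layout an empty bucket is simply NULL, so there is no strcmp() against a possibly uninitialized word, and the size counter is incremented in exactly one place. It also avoids copying *new_node into the table and then losing the only pointer to the allocated node.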

SuperFastHash returns different hashes for equal strings, but only when computed by different function calls
