Question

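I have two .txt files containing phone numbers, one number per line. The files are very large (282 MB), and I am writing a program that checks both files (the raw data and the DO NOT CALL list) and filters out the numbers that do not appear in the DO NOT CALL list. Something similar to grep -f raw.txt donotcall.txt -v > filtered.txt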

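I have implemented a very simple form of hash table (separate chaining, using linked lists). My code currently reads the phone numbers from DoNotCall.txt and stores them in the hash table. This is the function I use to generate the hash. The table size is 100.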

int hashgen(char s[])
{
  int hash;

  //sum of the first four characters, scaled, then reduced to a valid index (0..99)
  hash = ((s[0] + s[1] + s[2] + s[3]) * 100 / 13) % 100;
  return hash;
}

The hash table, the way I did it:

#include <stdlib.h>
#include <string.h>

#define TABLESIZE 100

struct node {
  char str[30];
  struct node *next;
};
struct node *ht[TABLESIZE];

struct node *hashtable_alloc(void) //allocates space for a node in memory
{
  struct node *tmp = calloc(1, sizeof(struct node));
  strcpy(tmp->str, "~"); //placeholder string for the initial node of each list
  tmp->next = NULL;
  return tmp;
}

void hashinit(void) //puts one placeholder node in every bucket
{
  int i;
  for (i = 0; i < TABLESIZE; i++)
    ht[i] = hashtable_alloc();
}

void hashtable_add(char s[]) //pushes s onto the front of its bucket's list
{
  struct node *t = NULL;
  int arrnum = hashgen(s);

  t = calloc(1, sizeof(struct node));
  strcpy(t->str, s);
  t->next = ht[arrnum];

  ht[arrnum] = t;
}
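The code so far only inserts numbers; the filtering step also needs a lookup. Here is a minimal sketch of how that could look. It assumes the goal is to keep the raw numbers that are not on the DO NOT CALL list. `hashtable_contains` and `filter` are hypothetical helpers that are not part of my code, the buckets start out empty instead of holding a placeholder node, and the hash is reduced modulo TABLESIZE so that it is always a valid index.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TABLESIZE 100

struct node {
    char str[30];
    struct node *next;
};

struct node *ht[TABLESIZE];  /* zero-initialized, so every chain starts empty */

/* Formula from the question, reduced modulo TABLESIZE.
   Assumption: s holds at least four characters. */
int hashgen(const char s[])
{
    const unsigned char *u = (const unsigned char *)s;
    return ((u[0] + u[1] + u[2] + u[3]) * 100 / 13) % TABLESIZE;
}

/* Pushes a copy of s onto the front of its bucket's chain. */
void hashtable_add(const char s[])
{
    int idx = hashgen(s);
    struct node *t = calloc(1, sizeof *t);
    if (t == NULL)
        return;                              /* out of memory: skip this entry */
    strncpy(t->str, s, sizeof t->str - 1);   /* calloc left the last byte as '\0' */
    t->next = ht[idx];
    ht[idx] = t;
}

/* Hypothetical helper: returns 1 if s is stored in the table, 0 otherwise. */
int hashtable_contains(const char s[])
{
    const struct node *t;
    for (t = ht[hashgen(s)]; t != NULL; t = t->next)
        if (strcmp(t->str, s) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: copies to out every line of raw that is NOT in the table. */
void filter(FILE *raw, FILE *out)
{
    char line[30];
    while (fgets(line, sizeof line, raw) != NULL) {
        line[strcspn(line, "\r\n")] = '\0';  /* drop the line ending */
        if (strlen(line) >= 4 && !hashtable_contains(line))
            fprintf(out, "%s\n", line);
    }
}
```

The idea would be to call hashtable_add for every line of DoNotCall.txt and then call filter on the raw file. With only 100 buckets each chain gets very long for a file of this size, so every lookup walks a long list.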

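No doubt I am a naive programmer when it comes to hash tables. Please suggest a better hash function. I have read articles about hash tables, but it would be great if someone could point me to a better approach: either something better than a hash table, or a better way of doing the hash table approach. Thanks in advance.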
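To make the request concrete, this is the kind of function I have in mind. It is only a sketch, not something I have measured on my data: a 32-bit FNV-1a hash over the whole string, reduced modulo the table size. The value 1000003 for TABLESIZE (a prime just above one million) is an assumption chosen for illustration, and the return type is changed to unsigned int.

```c
#include <stdint.h>

#define TABLESIZE 1000003u  /* assumption: many more buckets than the original 100 */

/* 32-bit FNV-1a over every byte of s, reduced to an index in [0, TABLESIZE). */
unsigned int hashgen(const char s[])
{
    uint32_t hash = 2166136261u;      /* FNV-1a 32-bit offset basis */
    for (; *s != '\0'; s++) {
        hash ^= (unsigned char)*s;    /* mix in the next byte */
        hash *= 16777619u;            /* FNV-1a 32-bit prime */
    }
    return hash % TABLESIZE;
}
```

Unlike my current function, this uses every digit of the number, not just the first four, so numbers that share an area code should not all end up in the same bucket.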

Suggest a proper hash function

0 answers: