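Does Wireshark use a well-known hash function to store TCP streams? (For those interested, the streams are kept in a GHashTable.) Or is it something the Wireshark developers came up with themselves? Also, is there any data on how uniform the hash function is for the input it is used on (i.e., addresses and ports)?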
For reference, here is the conversation_key structure definition:
    typedef struct conversation_key {
        struct conversation_key *next;
        address addr1;
        address addr2;
        port_type ptype;
        guint32 port1;
        guint32 port2;
    } conversation_key;
This is the hash function itself:
    static guint
    conversation_hash_exact(gconstpointer v)
    {
        const conversation_key *key = (const conversation_key *)v;
        guint hash_val;
        address tmp_addr;

        hash_val = 0;
        tmp_addr.len = 4;

        ADD_ADDRESS_TO_HASH(hash_val, &key->addr1);

        tmp_addr.data = &key->port1;
        ADD_ADDRESS_TO_HASH(hash_val, &tmp_addr);

        ADD_ADDRESS_TO_HASH(hash_val, &key->addr2);

        tmp_addr.data = &key->port2;
        ADD_ADDRESS_TO_HASH(hash_val, &tmp_addr);

        hash_val += ( hash_val << 3 );
        hash_val ^= ( hash_val >> 11 );
        hash_val += ( hash_val << 15 );

        return hash_val;
    }
The ADD_ADDRESS_TO_HASH macro expands to a function call:
    static inline guint
    add_address_to_hash(guint hash_val, const address *addr) {
        const guint8 *hash_data = (const guint8 *)(addr)->data;
        int idx;

        for (idx = 0; idx < (addr)->len; idx++) {
            hash_val += hash_data[idx];
            hash_val += ( hash_val << 10 );
            hash_val ^= ( hash_val >> 6 );
        }
        return hash_val;
    }

    #define ADD_ADDRESS_TO_HASH(hash_val, addr) do { hash_val = add_address_to_hash(hash_val, (addr)); } while (0)
Answer 0 (score: 0)
It's Jenkins's one-at-a-time hash: https://en.wikipedia.org/wiki/Jenkins_hash_function
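To make the correspondence with the code in the question easier to see, here is a minimal standalone sketch of one-at-a-time in plain C. The names `oaat_update`, `oaat_finish` and `one_at_a_time` are made up for this illustration and are not Wireshark or GLib identifiers. The per-byte loop matches add_address_to_hash, and the three closing statements match the end of conversation_hash_exact.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mixing step: fold len bytes into a running hash.
 * Same arithmetic as the loop in add_address_to_hash. */
uint32_t oaat_update(uint32_t hash, const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        hash += data[i];
        hash += hash << 10;
        hash ^= hash >> 6;
    }
    return hash;
}

/* Final avalanche: same three statements that close
 * conversation_hash_exact. */
uint32_t oaat_finish(uint32_t hash)
{
    hash += hash << 3;
    hash ^= hash >> 11;
    hash += hash << 15;
    return hash;
}

/* One-shot form over a single contiguous buffer. */
uint32_t one_at_a_time(const uint8_t *data, size_t len)
{
    return oaat_finish(oaat_update(0, data, len));
}
```

Feeding the fields one after another through `oaat_update` gives the same result as hashing their bytes concatenated, which is effectively what conversation_hash_exact does with addr1, port1, addr2 and port2. As a sanity check, `one_at_a_time((const uint8_t *)"a", 1)` returns 0xca2e9442.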
I've forgotten why we picked it specifically, but it's relatively fast and we haven't had collision problems with it.
It used to be worse (https://code.wireshark.org/review/gitweb?p=wireshark.git;a=blob;f=epan/address.h;h=f1848adfe104237f52ff02bc40fb9116e5efbc86;hb=7f57fe33573d04cb94598942d807abd3d2d85ebc#l188); it was updated in this commit: https://code.wireshark.org/review/gitweb?p=wireshark.git;a=commit;h=7f57fe33573d04cb94598942d807abd3d2d85ebc
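I don't have uniformity numbers to point to, but if you want to measure it on your own traffic, something along these lines could be a starting point. This is only a sketch under several assumptions: `flow_key`, `flow_hash`, `bucket_counts` and `chi_squared` are hypothetical names, the keys are synthetic IPv4 address and port combinations rather than real captures, and reducing the hash with a plain modulo is a stand-in that does not claim to reproduce how GHashTable maps hashes to buckets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for conversation_key: IPv4 addresses and ports only. */
struct flow_key {
    uint8_t  addr1[4];
    uint8_t  addr2[4];
    uint32_t port1;
    uint32_t port2;
};

/* Per-byte one-at-a-time mixing step. */
static uint32_t mix(uint32_t h, const void *p, size_t len)
{
    const uint8_t *b = (const uint8_t *)p;
    for (size_t i = 0; i < len; i++) {
        h += b[i];
        h += h << 10;
        h ^= h >> 6;
    }
    return h;
}

/* Same field order as conversation_hash_exact. Ports are hashed as raw
 * in-memory bytes, so the value depends on host byte order. */
uint32_t flow_hash(const struct flow_key *k)
{
    uint32_t h = 0;
    h = mix(h, k->addr1, sizeof k->addr1);
    h = mix(h, &k->port1, sizeof k->port1);
    h = mix(h, k->addr2, sizeof k->addr2);
    h = mix(h, &k->port2, sizeof k->port2);
    h += h << 3;
    h ^= h >> 11;
    h += h << 15;
    return h;
}

/* Hash n synthetic keys and count how many land in each bucket. */
void bucket_counts(unsigned *counts, size_t nbuckets, size_t n)
{
    assert(nbuckets > 0);
    memset(counts, 0, nbuckets * sizeof counts[0]);
    for (size_t i = 0; i < n; i++) {
        struct flow_key k = {
            .addr1 = {10, 0, (uint8_t)(i >> 8), (uint8_t)i}, /* varying client */
            .addr2 = {192, 168, 1, 1},                       /* fixed server */
            .port1 = 1024 + (uint32_t)(i % 60000),           /* varying source port */
            .port2 = 443,
        };
        counts[flow_hash(&k) % nbuckets]++;
    }
}

/* Pearson chi-squared statistic against a perfectly uniform spread. */
double chi_squared(const unsigned *counts, size_t nbuckets, size_t n)
{
    double expected = (double)n / (double)nbuckets;
    double chi2 = 0.0;
    for (size_t b = 0; b < nbuckets; b++) {
        double d = (double)counts[b] - expected;
        chi2 += d * d / expected;
    }
    return chi2;
}
```

For a well-behaved hash, the statistic should come out in the neighborhood of `nbuckets - 1`; values far above that would suggest clustering. Replacing the synthetic generator with keys extracted from a real capture would give a more meaningful answer.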