Linear hashing: C++ implementation, rehashing problem

Time: 2018-06-22 22:58:11

Tags: c++ c++11 pointers hashtable

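I am working on a C++ implementation of linear hashing.

In short, the structure is organized as an array of so-called buckets, and each bucket can have an overflow bucket (which can in turn have its own overflow, and so on).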

  

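In the code, the overflow pointer looks like this: bucket* overflowBucket{nullptr}; // pointer to the overflow bucket. The bucket itself is declared with struct bucket {}; each bucket has its own fixed-size array of elements and an overflow pointer.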
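
For illustration, the layout described above could look roughly like the sketch below. The template parameters Key and N, the nested slot type, the destructor and the chainLength helper are assumptions made for this example; only the names Bucket, overflowBucket, State::free and State::taken come from the code in the question.

```cpp
#include <cassert>
#include <cstddef>

// Slot states, mirroring the State::free / State::taken names used in the question.
enum class State { free, taken };

// Sketch of one bucket: N fixed slots plus a pointer to an optional overflow bucket.
// Key and N are template parameters assumed for illustration.
template <typename Key, std::size_t N>
struct bucket {
    struct slot {
        Key key{};
        State state{State::free};
    };

    slot Bucket[N];                   // fixed-size array of elements
    bucket* overflowBucket{nullptr};  // pointer to the overflow bucket, if any

    // Assumption: deleting a bucket also deletes the rest of its overflow chain.
    ~bucket() { delete overflowBucket; }
};

// Hypothetical helper: counts the buckets in a chain (primary plus all overflows).
template <typename Key, std::size_t N>
std::size_t chainLength(const bucket<Key, N>* b) {
    assert(b != nullptr);
    std::size_t n = 0;
    for (; b != nullptr; b = b->overflowBucket) ++n;
    return n;
}
```

A freshly constructed bucket has a chain length of 1; attaching one overflow bucket with new makes it 2.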

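When an element is inserted into the table, the code should first check whether the element already exists and insert it only if it does not. However, if the bucket the element belongs in is full, a new overflow bucket should be added, but before that the "next to split" bucket should be rehashed with the new hash function.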

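So far my program does most things correctly. However, when I insert a new element into a bucket whose overflow is already full, instead of creating a new overflow and triggering a rehash of the next-to-split bucket, it only creates the new overflow.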

The relevant parts of the code are shown below. The main function for inserting an element:

void insertElementInTable(const key_type& key) {
     //compute the index of the bucket that the key should be inserted in
     size_type start_idx{hashIndex(key)};
     bool check = table[start_idx]->Full();
     if(check == true) { //if the bucket is full, then...
        //...rehash the nts bucket
        rehashNextToSplit();
     }

     //then compute index again and insert the element in the bucket finally
     size_type final_idx = hashIndex(key);
     table[final_idx]->insertElementInBucket(key);

     //then check if round needs to change and if yes then double the size of the table for future splits
     if((nextToSplit-1) == roundNum) {
         roundNum += 1;
         nextToSplit = 0;
         replaceOldTable(roundNum);
     }

     //increase the number of elements in the table
     overallElementCount += 1;

 }

The Full function (I suspect this is where the problem lies, but I can't see it):

    bool Full() {
        size_type numBuckets{0}, numElements{0};
        bucket* helper = this;

        //counts all the elements in primary and all overflow buckets
        while(helper != nullptr) {
            numBuckets += 1;
            for(unsigned i=0; i<N; ++i) {
                if(helper->Bucket[i].state == State::taken) {
                    numElements += 1;
                }
            }
            helper = helper->overflowBucket;
        }

        //set up real size and max size of bucket
        size_type real_sz = numElements*numBuckets;
        size_type max_sz = N*numBuckets;

        //if the real size matches the max size, bucket is full and return true
        return real_sz == max_sz;
    }
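
One thing that may be worth checking in Full(): both sides of the final comparison are multiplied by numBuckets, so the test reduces to numElements == N. The sketch below reproduces that arithmetic with the counts passed in as plain integers and compares it with a capacity-based check. The function names and the slotsPerBucket parameter (standing in for N) are hypothetical, and the second function is only one possible alternative, not necessarily the intended fix.

```cpp
#include <cassert>
#include <cstddef>

// Same arithmetic as Full() in the question, with the counts passed in directly.
// slotsPerBucket plays the role of N.
bool fullAsWritten(std::size_t numElements, std::size_t numBuckets,
                   std::size_t slotsPerBucket) {
    assert(numBuckets > 0);
    std::size_t real_sz = numElements * numBuckets;
    std::size_t max_sz = slotsPerBucket * numBuckets;
    return real_sz == max_sz;  // equivalent to numElements == slotsPerBucket
}

// Possible alternative: the chain is full when every slot in every bucket is taken.
bool fullByCapacity(std::size_t numElements, std::size_t numBuckets,
                    std::size_t slotsPerBucket) {
    assert(numBuckets > 0);
    return numElements == slotsPerBucket * numBuckets;
}
```

With two slots per bucket, a primary bucket and one overflow that are both full hold 4 elements. The first function then compares 4 * 2 with 2 * 2 and reports "not full", which would be consistent with the symptom that no rehash is triggered once an overflow exists.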

The function that rehashes the next-to-split bucket:

 void rehashNextToSplit() {
        //store the contents of nts bucket and its overflows in temporary vector
        //COPY ELEMENTS AND CLEAR BUCKET
        std::vector<value_type> vec;
        vec = table[nextToSplit]->contentCopyAndClear();

        //PROCEED WITH REHASHING
        //first increment nts
        nextToSplit += 1;
        //then insert again the values from the vector
        size_type sz = vec.size();

        for(unsigned i=0; i<sz; ++i) {
            size_type idx = hashIndex(vec.at(i));
            table[idx]->insertElementInBucket(vec.at(i));
        }

        //when done free the memory used by the help vector
        vec.clear(); 
        vec.shrink_to_fit();

    }

The contentCopyAndClear function, which is called by the rehash method:

std::vector<value_type> contentCopyAndClear() {
        std::vector<value_type> vect;
        bucket* ptr = this;
        bool cond{true};

        while (cond == true) { //goes through all buckets and copies the values into the vector
            for(unsigned i=0; i<N; ++i) {
                if(ptr->Bucket[i].state == State::taken) {
                    vect.push_back(ptr->Bucket[i].key); //copy element
                    ptr->Bucket[i].state = State::free; //free the elements
                }
            }
            if(ptr->overflowBucket == nullptr) { cond = false; }
            ptr = ptr->overflowBucket;
        }

        //reset ptr to first overflow
        ptr = this->overflowBucket;
        delete ptr;

        return vect;
    }
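
A point that might matter here: after delete ptr, the member overflowBucket of the primary bucket still holds the old address, so later traversals would walk into freed memory. The sketch below shows one way to drain a chain and reset the pointer. It uses a simplified stand-in bucket with int keys and two slots, and the free function drainChain is a hypothetical name; it also assumes the bucket type has no destructor that deletes its own overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class State { free, taken };

// Simplified stand-in for the question's bucket: int keys, two slots per bucket.
struct bucket {
    static constexpr std::size_t N = 2;
    struct slot {
        int key{0};
        State state{State::free};
    };
    slot Bucket[N];
    bucket* overflowBucket{nullptr};
};

// Copies every stored key out of the chain, marks the slots free,
// deletes the overflow buckets and resets the head's overflow pointer.
std::vector<int> drainChain(bucket* head) {
    assert(head != nullptr);
    std::vector<int> keys;
    for (bucket* ptr = head; ptr != nullptr; ptr = ptr->overflowBucket) {
        for (std::size_t i = 0; i < bucket::N; ++i) {
            if (ptr->Bucket[i].state == State::taken) {
                keys.push_back(ptr->Bucket[i].key);
                ptr->Bucket[i].state = State::free;
            }
        }
    }
    // Delete the overflow buckets one by one so none of them is leaked.
    bucket* ptr = head->overflowBucket;
    while (ptr != nullptr) {
        bucket* next = ptr->overflowBucket;
        delete ptr;
        ptr = next;
    }
    head->overflowBucket = nullptr;  // without this, the pointer would dangle
    return keys;
}
```

After the call, the primary bucket is empty, has no overflow, and the returned vector holds the keys in chain order.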

The hash function:

size_type hashIndex(const key_type& key) const {
    size_type idx = hasher{}(key) % (1<<roundNum);

    if(idx < nextToSplit) {
        size_type d{roundNum + 1};
        idx = hasher{}(key) % (1<<d);
    }

    return idx;
 }
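
To make the addressing rule easier to check by hand, here is a standalone sketch of the same computation. It takes an already-hashed value h and receives roundNum and nextToSplit as parameters instead of reading class members; the function name bucketIndex is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Linear-hashing address for an already-hashed value h.
// roundNum and nextToSplit mirror the names in the question; passing them
// as parameters (instead of using class members) is a simplification.
std::size_t bucketIndex(std::size_t h, std::size_t roundNum,
                        std::size_t nextToSplit) {
    assert(roundNum + 1 < 8 * sizeof(std::size_t));  // keep the shifts in range
    std::size_t idx = h % (std::size_t{1} << roundNum);
    if (idx < nextToSplit) {
        // This bucket has already been split in the current round, so one
        // more bit of the hash decides between it and its split image.
        idx = h % (std::size_t{1} << (roundNum + 1));
    }
    return idx;
}
```

With roundNum = 2 and nextToSplit = 1, a hash value of 12 lands in bucket 4 (12 mod 8), 8 stays in bucket 0 (8 mod 8), and 5 stays in bucket 1 (5 mod 4, because bucket 1 has not been split yet).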

The insert function, in turn, calls insertElementInBucket:

void insertElementInBucket(const key_type& key) {
        bucket* next = this;

        while(true) {
            for(unsigned i=0; i<N; ++i) {
                if(next->Bucket[i].state == State::free) {
                    next->Bucket[i].key = key;
                    next->Bucket[i].state = State::taken;
                    return;
                } else {
                    if(key_equal{}(next->Bucket[i].key, key)) {
                        return;
                    }
                }
            }
            if(next->overflowBucket == nullptr) {
                next->overflowBucket = new bucket;
            }
            next = next->overflowBucket;
        }
    }

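To keep the post readable and simple, I won't paste the rest of the code here, but you can see it at the following link: https://pastebin.com/TtdN5tBU

To summarize the problem once more:

In the picture below, after the number 17 is inserted, the number 4 from the first table should be rehashed (using 4 mod 2^3) into bucket no. 2.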

[insert problem]

The numbers 3 and 55 in the picture below should appear only once (after rehashing, they should have been removed from bucket no. 1).

[rehashing problem]

I would be very grateful for any hints, as I have been trying to figure this out for 2 days and am not making any progress.
