Problem removing items from a hash table that is in the middle of rehashing

Time: 2017-12-30 03:20:42

Tags: c++ hashtable

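I'm stress-testing this remove function to check its robustness. It seems to work fine for a small number of words, but it breaks down once I go up to hundreds or thousands. In the test I add 900 words and then remove 100, which is where I first noticed the problem. The words are added fine, but they don't seem to get removed.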

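Based on my print statements, I can tell that this happens in the code path where an item is removed while the table is being incrementally rehashed into the new table.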

Normally I wouldn't post this much code, but this is just the rehashing if-branch of remove, and I don't know where the problem occurs.

char * HashTable::remove(const char *str) 
{
  // if rehashing, check both tables
  // if not rehashing check if we need to
  // check new table if we have to rehash
  if (is_rehashing)
    {
      // first check in the old table
      // copy over any items we pass to new table
      int found = -1 ;
      char * cstring = NULL ;
      int index = hashCode(str) % m_tableSize ;
      int index2 = hashCode(str) % m_tableSize2 ;
      int count = 1;
      while (true)
        {
          if (m_hash[index] == NULL) { break ; } // not in first array
          if (m_hash[index] == DELETED) { } // not in this index
          else // its a cstring, copy over
            {
          //     int count = 1 ;
              int new_index = hashCode(m_hash[index]) % m_tableSize2 ;
              // insert to new table
              while( true )
                {
                  if (m_rehash[new_index] == NULL) break ;
                  if (m_rehash[new_index] == DELETED) break ;
                  new_index = (new_index + 1) % m_tableSize2 ;
                  count ++ ;
                }
              // copy to new table, and delete from old
              m_rehash[new_index] = strdup(m_hash[index]) ;
              m_size2++ ;
              free(m_hash[index]) ;
              m_hash[index] = NULL ;
              m_size-- ;



              if (strcmp(m_rehash[new_index], str) == 0) // FOUND!! now DELETE IT
                {
                  cstring = strdup(m_rehash[new_index]) ;
                  free(m_rehash[new_index]) ;
                  m_rehash[new_index] = DELETED ;
                  found = 0 ;
                }
            }
          index = (index + 1) % m_tableSize ;
        }

      // iterate backwards, get the entire cluster.
      index = ((hashCode(str) % m_tableSize) - 1 ) %m_tableSize ;
      while (true)
        {
          if (m_hash[index] == NULL) { break ; } // not in first array
          if (m_hash[index] == DELETED) { } // not in this index
          else  // found a cstring
            {
              //              int count = 1 ;
              int new_index = hashCode(m_hash[index]) % m_tableSize2 ;
              // insert to new table TODO: maybe make a new function for this
              while( true )
                {
                  if (m_rehash[new_index] == NULL) break ;
                  if (m_rehash[new_index] == DELETED) break ;
                  new_index = (new_index + 1) % m_tableSize2 ;
                  count ++ ;
                }
              // copy to new table, and delete from old
              m_rehash[new_index] = strdup(m_hash[index]) ;
              m_size2++ ;
              free(m_hash[index]) ;
              m_hash[index] = NULL ;
              m_size-- ;

              // check if 2nd table needs rehashing
              //if (count >= 10) { rehash() ; }
              //else if (m_size2 >= (m_tableSize2 / 2)) { rehash() ; }
              // found it!
              if (strcmp(m_rehash[new_index], str) == 0) { found = new_index ; }
            }
          index = (index - 1) % m_tableSize ;
        }

      // check if 2nd table needs rehashing
      if (count >= 10) { rehash() ; }
      else if (m_size2 >= (m_tableSize2 / 2)) { rehash() ; }

      if (found != -1)
        {
          finishRehash() ;
          return cstring ;
        }

      // couldn't find it in the 1st array, lets try the second
      count = 1 ;
      while (true)
        {
          if (m_rehash[index2] == NULL) { finishRehash() ; return NULL ; } // not in either array
          if (m_rehash[index2] == DELETED) { } // nada
          else if (strcmp(m_rehash[index2], str) == 0)  // FOUND HERE
            {
              char * cstring = strdup(m_rehash[index2]) ;
              free(m_rehash[index2]) ;
              m_rehash[index2] = DELETED ;
              finishRehash() ;
              return cstring;
            }
          index2 = (index2 + 1) % m_tableSize2 ;
          count++ ;
        }
      if (count >= 10) { rehash() ; }
      else if (m_size2 >= (m_tableSize2 / 2)) { rehash() ; }
    }
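
One thing that may be worth checking in the backward scan above: in C++ the `%` operator truncates toward zero, so `(x - 1) % m_tableSize` evaluates to -1 when `x` is 0, and `m_hash[-1]` is then read out of bounds. The sketch below isolates that expression and compares it with a wrap-safe variant. `prevIndex` and `prevIndexAsPosted` are hypothetical helper names used only for illustration; they are not part of the posted class.

```cpp
#include <cassert>

// Hypothetical helper (not in the posted class): step one slot backwards
// in a circular table of tableSize slots without ever going negative.
int prevIndex(int index, int tableSize)
{
    assert(tableSize > 0 && index >= 0 && index < tableSize);
    return (index + tableSize - 1) % tableSize;
}

// The expression used in the backward scan of the question, isolated
// so the two results can be compared side by side.
int prevIndexAsPosted(int index, int tableSize)
{
    return (index - 1) % tableSize;
}
```

With a table of 101 slots, `prevIndexAsPosted(0, 101)` yields -1 while `prevIndex(0, 101)` yields 100. The problem would only show up when a cluster touches slot 0, which becomes more likely as the table fills up.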

Any input would be very helpful, thanks!
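
Another point that might explain why removals appear to have no effect, assuming the test checks the element count: in the posted branch `m_size2` is incremented when a string is copied into `m_rehash`, but it is not decremented when that slot is then set to `DELETED`. Below is a minimal sketch of a single linear-probing table that keeps the count in step with removals and hands the stored pointer to the caller instead of doing `strdup` followed by `free`. `ProbeTable`, its placeholder hash, and `duplicate` are assumptions made for illustration, since the real `hashCode` and class layout are not shown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-in for one of the two tables in the question.
struct ProbeTable
{
    std::vector<char *> slots;
    int size = 0;            // number of live strings
    static char tombstone;   // its address plays the role of DELETED

    explicit ProbeTable(std::size_t capacity) : slots(capacity, nullptr) {}

    static char *deleted() { return &tombstone; }

    // Placeholder hash; the real hashCode() is not shown in the question.
    static unsigned hashCode(const char *s)
    {
        unsigned h = 5381;
        while (*s) h = h * 33 + static_cast<unsigned char>(*s++);
        return h;
    }

    // Portable replacement for strdup(); caller releases with free().
    static char *duplicate(const char *s)
    {
        char *copy = static_cast<char *>(std::malloc(std::strlen(s) + 1));
        if (copy) std::strcpy(copy, s);
        return copy;
    }

    // Assumes at least one slot stays NULL so that probing terminates.
    void insert(const char *str)
    {
        assert(static_cast<std::size_t>(size) + 1 < slots.size());
        std::size_t i = hashCode(str) % slots.size();
        while (slots[i] != nullptr && slots[i] != deleted())
            i = (i + 1) % slots.size();
        slots[i] = duplicate(str);
        ++size;
    }

    // Returns the stored string (caller frees it) or nullptr if absent.
    char *remove(const char *str)
    {
        std::size_t i = hashCode(str) % slots.size();
        while (slots[i] != nullptr)
        {
            if (slots[i] != deleted() && std::strcmp(slots[i], str) == 0)
            {
                char *found = slots[i];  // hand over the pointer directly
                slots[i] = deleted();    // keep the probe chain intact
                --size;                  // keep the count in step
                return found;
            }
            i = (i + 1) % slots.size();
        }
        return nullptr;
    }
};

char ProbeTable::tombstone = 0;
```

In this sketch a successful `remove` always lowers `size` by one, so adding 900 words and removing 100 of them would leave a count of 800, provided all words are distinct.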

0 answers:

No answers yet