我仍然不熟悉C ++,并且无法实现实际可行的添加方法。我已经在java中完成了哈希映射,但是将其转换为C ++已经证明是困难的。除此之外,我必须使用约束(例如不更改头文件中的任何内容),并且除了std :: string和std :: cout / cin之外不能使用任何std库函数。
基本上,我必须创建一个最终存储用户名(作为键)和密码(作为值)的哈希映射。此时,用户名/密码组合并不重要,因为实施非常通用的课程是练习的重点。
只是尝试使用默认构造函数测试我的HashMap并添加一个值最终会给我一个分段错误。在尝试实现此哈希表时,我100%确定我做了一些可怕的错误。要么我没有正确地将桶索引与节点连接,要么没有正确初始化。
以下是我正在使用的头文件:
#ifndef HASHMAP_HPP
#define HASHMAP_HPP
#include <functional>
#include <string>
class HashMap
{
public:
// Hash functions must conform to these properties:
//
// (1) Given a particular string s repeatedly, they must always
// return the same hash value.
// (2) They do not take the number of buckets into account (as they
// do not receive a parameter that tells them how many buckets
// there are). Any unsigned int value is fair game as a result.
// It will be the job of the HashMap class to reduce the results
// to the range of available bucket indices (e.g., by using the
// % operator).
typedef std::function<unsigned int(const std::string&)> HashFunction;
// This constant specifies the number of buckets that a HashMap will
// have when it is initially constructed.
static constexpr unsigned int initialBucketCount = 10;
public:
// This constructor initializes the HashMap to use whatever default
// hash function you'd like it to use. A little research online will
// yield some good ideas about how to write a good hash function for
// strings; don't just return zero or, say, the length of the string.
HashMap();
// This constructor instead initializes the HashMap to use a particular
// hash function instead of the default. (We'll use this in our unit
// tests to control the scenarios more carefully.)
HashMap(HashFunction hasher);
// The "Big Three" need to be implemented appropriately, so that HashMaps
// can be created, destroyed, copied, and assigned without leaking
// resources, interfering with one another, or causing crashes or
// undefined behavior.
HashMap(const HashMap& hm);
~HashMap();
HashMap& operator=(const HashMap& hm);
// add() takes a key and a value. If the key is not already stored in
// this HashMap, the key/value pair is added; if the key is already
// stored, the function has no effect.
//
// If adding the new key/value pair will cause the load factor of this
// HashMap to exceed 0.8, the following must happen:
//
// (1) The number of buckets should be increased by doubling it and
// adding 1 (i.e., if there were 10 buckets, increase it to
// 2 * 10 + 1 = 21).
// (2) All key/value pairs should be rehashed into their new buckets,
// important because changing the number of buckets will likely
// change which bucket a particular key hashes to (especialy if
// you're using % to determine the index of that bucket).
void add(const std::string& key, const std::string& value);
// remove() takes a key and removes it (and its associated value) from
// this HashMap if it is already present; if not, the function has no
// effect.
void remove(const std::string& key);
// contains() returns true if the given key is in this HashMap, false
// if not.
bool contains(const std::string& key) const;
// value() returns the value associated with the given key in this HashMap
// if the key is stored in this HashMap; if not, the empty string is
// returned. (Going forward, we'll discover that throwing an exception
// is a better way to handle the scenario where the key is not present,
// but we'll conquer that at a later date.)
std::string value(const std::string& key) const;
// size() returns the number of key/value pairs stored in this HashMap.
unsigned int size() const;
// bucketCount() returns the number of buckets currently allocated in
// this HashMap.
unsigned int bucketCount() const;
// loadFactor() returns the proportion of the number of key/value pairs
// to the number of buckets, a measurement of how "full" the HashMap is.
// For example, if there are 20 key/value pairs and 50 buckets, we would
// say that the load factor is 20/50 = 0.4.
double loadFactor() const;
// maxBucketSize() returns the number of key/value pairs stored in this
// HashMap's largest bucket.
unsigned int maxBucketSize() const;
private:
// This structure describes the nodes that make up the linked lists in
// each of this HashMap's buckets.
struct Node
{
std::string key;
std::string value;
Node* next;
};
// Store the hash function (either the default hash function or the one
// passed to the constructor as a parameter) in this member variable.
// When you want to hash a key, call this member variable (i.e., follow
// it with parentheses and a parameter) just like you would any other
// function.
HashFunction hasher;
// You will no doubt need to add at least a few more private members
public:
// our hash function
unsigned int hashFunc(const std::string& key) const;
private:
Node** hashTable;
// We need a variable that will always let us know what the current amount
// of buckets is. bucketCount will use this and return this variable.
unsigned int amountOfBuckets;
// we also need the number of keys currently in the hash map. This is stored here
unsigned int sz;
};
#endif // HASHMAP_HPP
以下是我实现我的类(HashMap.cpp)的方法:
#include "HashMap.hpp"
// default constructor will initialize Node to default values
// Create a new hash table with the initial bucket count
// Set the amount of buckets to the initial bucket count
// Set the current amount of key/value pairs to zero.
HashMap::HashMap()
: hashTable{new Node*[initialBucketCount]}, amountOfBuckets{initialBucketCount}, sz{0}
{
}
// constructor that initializes HashMap to use a different hash function other
// than the default
HashMap::HashMap(HashFunction hashFunc)
: hasher{hashFunc}, hashTable{new Node*[initialBucketCount]}, amountOfBuckets{initialBucketCount}, sz{0}
{
}
// copy constructor, initializes a new HashMap to be a copy of an existing one
HashMap::HashMap(const HashMap& hm)
// commented out for now :
{
}
// destructor: deallocate the HashMap
HashMap::~HashMap()
{
// delete something here
}
// Assignment operator that overloads equals
HashMap& HashMap::operator=(const HashMap& hm)
{
// FIX COMPILER WARNINGS, DELETE
return *this;
}
// our hash function, this is for our type def HashFunction
// pass this through the constructor
unsigned int HashMap::hashFunc(const std::string& key) const
{
unsigned int hashValue = 0; // what we end up returning
for(int i = 0; i < key.length(); i++) { // iterate through string
int letterIndex = key.at(i) - 96; // create an index for each ASCII char
// first multiply our current hashValue by a prime number
// add to the letter index, to maintain a stable result
// mod by the current amount of buckets on each iteration to prevent overflow
hashValue = (hashValue * 27 + letterIndex) % bucketCount();
}
return hashValue;
}
// add function
void HashMap::add(const std::string& key, const std::string& value)
{
// Check if key being stored matches key already in hashmap
/* BASIC ADD FUNCTION, JUST TO IMPLEMENT UNIT TESTS */
unsigned int hashVal = hashFunc(key);
//Node* prev = nullptr; // keeps track of where we are
Node* current = hashTable[hashVal]; // the place we store a data item into
while(current != nullptr) {
//prev = current; // update previous node to point to current
// this lets us move current without losing our place
current = current->next; // move current to the next node
} // stop once we find an empty node
// current should equal a nullptr
current->key = key; // set key (user)
current->value = value; // set password
current->next = nullptr; // set the next ptr to be null
}
// takes in a key (username), removes it and the value (password) associated
// with it, otherwise, it has no effect
void HashMap::remove(const std::string& key)
{
}
// returns true if given key is in hash map, otherwise returns false
// this acts as a find method
bool HashMap::contains(const std::string& key) const
{
unsigned int hashedValue = hashFunc(key); // hash the key given to get an index
if(hashTable[hashedValue] == nullptr) { // if there are no nodes at given index
return false;
} else { // there are some nodes in the hash table
// iterate through each node in the linked list
// Node* current = hashTable[hashedValue];
// start at first node (this is current)
Node* current = hashTable[hashedValue];
while(current != nullptr && current->key == key) {
current = current->next;
} // end while
if(current == nullptr) { // we reached the end of our linked list
return false; // couldn't find a value
} else { // we found the key provided
return true;
}
} // end if-else
}
// value() returns the value associated with the given key in this HashMap
// if the key is stored in this HashMap; if not, the empty string is returned.
std::string HashMap::value(const std::string& key) const
{
// HANDLES COMPILER WARNINGS, DELETE LATER
return "";
}
// size() returns the number of key/value pairs stored in this HashMap.
unsigned int HashMap::size() const
{
return sz;
}
// bucketCount() returns the number of buckets currently allocated in this HashMap.
// each bucket is an index for the array, we do not include the linked lists.
unsigned int HashMap::bucketCount() const
{
return amountOfBuckets;
}
// loadFactor() returns the proportion of the number of key/value pairs
// to the number of buckets, a measurement of how "full" the HashMap is.
// For example, if there are 20 key/value pairs and 50 buckets, we would
// say that the load factor is 20/50 = 0.4.
double HashMap::loadFactor() const
{
return sz / amountOfBuckets;
}
// maxBucketSize() returns the number of key/value pairs stored in this
// HashMap's largest bucket.
unsigned int HashMap::maxBucketSize() const
{
// HANDLE COMPILER WARNINGS, DELETE LATER
return 0;
}
我现在正在实施的主要方法是添加。目前我只是尝试使用main函数测试类,甚至可以查看是否可以向地图添加值,并测试它是否识别地图中是否包含了某些内容。我意识到课堂上很少有人完成,而且这些功能本身并不完整,但我只是试图测试最基本的案例。
最后,这是我的主要内容:
#include <iostream>
#include <string>
#include "HashMap.hpp"
int main()
{
// initialize test
HashMap test1;
std::cout << "TEST 1 HASHMAP OBJECT CREATED" << std::endl;
// add some values
// at this point (11/16/2014), I only have contains, add, and hashFunc
// test these methods below
// constructor doesn't quite work right
std::string key1 = "Alex";
std::string value1 = "password1";
std::string key2 = "Danielle";
std::string value2 = "password2";
std::cout << "strings have been created" << std::endl;
// add to hash map
test1.add(key1, value1);
test1.add(key2, value2);
std::cout << "Keys and values have been added to hash map" << std::endl;
// does key1 contain the word "hi"? no, should return false
std::cout << "Hash map contains word hi?: " << test1.contains("hi") << std::endl;
// does key2 contain word "Danielle"? yes, should return true
std::cout << "Hash map contains word Danielle?: " << test1.contains("Danielle") << std::endl;
return 0;
}
我在构建它之后使用预构建的脚本来运行程序。运行时,我得到这个输出:
TEST 1 HASHMAP OBJECT CREATED
strings have been created
./run: line 43: 10085 Segmentation fault (core dumped) $SCRIPT_DIR/out/bin/a.out.$WHAT_TO_RUN
基本上,分段故障发生在添加功能期间。那么添加实际上是怎么回事?我怎样才能理解哈希映射应该做得更好?
答案 0 :(得分:1)
你自己评论当前应该等于nullptr
正确预示下一行的失败:
// current should equal a nullptr
current->key = key; // set key (user)
假设新分配的数组将充满nullptr
,这通常是不好的做法。
您需要在构造函数中将其设置为所有nullptr
。
您的add()
功能存在其他问题。
这里至少要使它至少适合目的:
void HashMap::add(const std::string& key, const std::string& value)
{
// Check if key being stored matches key already in hashmap
/* BASIC ADD FUNCTION, JUST TO IMPLEMENT UNIT TESTS */
unsigned int hashVal = hashFunc(key);
Node* head=hashTable[hashVal];
Node* current = head;
if(current==nullptr){
//Nothing in this bucket so it's definitely new.
current=new Node();
current->key = key; // set key (user)
current->value = value; // set password
current->next = nullptr; // set the next ptr to be nullptr.
hashTable[hashVal]=current;
return;
}
do { //It's a do-while because the if statement above has handled current==nullptr.
if(current->key==key){
//We've found a match for the key.
//Common hash-table behavior is to overwrite the value. So let's do that.
current->value=value;
return;
}
current = current->next; // move current to the next node
} while(current != nullptr) // stop once we go past the last node.
//Finally, we found hash collisions but no match on key.
//So we add a new node and chain it to the node(s) already there.
//
//Sometimes it's a good idea to put the new one at the end or if it's likely to get looked up
//it's also a good idea to put it at the start.
//It might be a good idea to keep the collision chain sorted and insert into it accordingly.
//If we sort we can dive out of the loop above when we pass the point the key would be.
//
//However for a little example like this let's put the new node at the head.
current=new Node();
current->key = key; // set key (user)
current->value = value; // set password
current->next = head; // set the next pointer to be the old head.
hashTable[hashVal]=current;
}
PS:您的哈希函数也存在问题。 您在主循环内以模数形式表示散列表大小。 这只会产生限制分配和可怕传播的效果。 至少将它移到该函数的最后一行:
return hashValue % bucketCount() ;
但我建议将其移至add
功能:
unsigned int hashVal = hashFunc(key); //Assume modified to not use bucketCount().
unsigned int tableIndex=hashVal % bucketCount();//Reduce hash to a valid index.
Node* head=hashTable[tableIndex]; //TODO: Do same in the other accesses to hashTable...
然后,您可以将完整哈希值存储在Node
结构中,并在current->key==key
之前将其用作更强的预比较。如果哈希表可能非常满,则可以获得很大的性能提升。这取决于您是否要为每unsigned int
Node
节省字节。
如果你这样做并且你接受了提示来对碰撞链进行排序,你可以通过哈希码进行排序,并且通常可以避免比较add()
新密钥或get()
不是那里的任何字符串,通常只有一旦成功get()
或覆盖add()
。
答案 1 :(得分:0)
在您的HashMap :: add中,您将取消引用nullpointer。您可以在构造函数中创建一个大小为10个元素的节点指针数组,但在代码中,您永远不会创建任何节点对象并分配给数组中的指针。所以当你这样做时:
current->key = key;
您正在访问一个0的节点指针。