About a Bloom filter in Python

Time: 2018-08-06 16:21:53

Tags: python-3.x bloom-filter korean-nlp

from bitarray import bitarray
import mmh3

class BloomFilter:

    def __init__(self, size, hash_count):
        self.size = size
        self.hash_count = hash_count
        self.bit_array = bitarray(size)
        self.bit_array.setall(0)

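    def add(self, string):
        # Set one bit per hash function; the seed selects the hash function.
        for seed in range(self.hash_count):
            result = mmh3.hash(string, seed) % self.size
            self.bit_array[result] = 1

    def lookup(self, string):
        # A single unset bit proves the string was never added.
        for seed in range(self.hash_count):
            result = mmh3.hash(string, seed) % self.size
            if self.bit_array[result] == 0:
                return "Nope"
        # All bits set: the string was probably added (false positives are possible).
        return "Probably"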

bf = BloomFilter(500000, 7)

# Read the word list, one entry per line, and close the file afterwards.
with open("esw.txt") as f:
    lines = f.read().splitlines()
for line in lines:
    bf.add(line)

input1 = input("단어를 입력하세요:")  # prompt text: "Enter a word:"

print(bf.lookup(input1))

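By the nature of a Bloom filter, there should be no false negatives. However, when the input data passed to the add function contains spaces, false negatives occur. Why does this happen, and how can I fix it?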
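One possible cause, which the question does not confirm, is that the string stored from the file and the string typed at the prompt differ in ways that are hard to see: trailing or doubled spaces, or a different Unicode form of the same Hangul text. Different bytes hash to different bit positions, so the lookup reports "Nope" even though the entry looks identical. The sketch below normalizes keys before hashing. The `normalize` helper and the `NormalizingBloomFilter` class are hypothetical names, and `hashlib.sha256` with a seed prefix stands in for `mmh3.hash` only so the example runs with the standard library alone.

```python
import hashlib
import unicodedata


def normalize(key):
    """Hypothetical helper: NFC-normalize, then trim and collapse whitespace."""
    key = unicodedata.normalize("NFC", key)
    return " ".join(key.split())


class NormalizingBloomFilter:
    """Sketch of a Bloom filter that normalizes every key before hashing."""

    def __init__(self, size, hash_count):
        self.size = size
        self.hash_count = hash_count
        self.bits = bytearray(size)  # one byte per bit, for simplicity

    def _positions(self, key):
        data = normalize(key).encode("utf-8")
        for seed in range(self.hash_count):
            # Assumption: a seed-prefixed SHA-256 replaces mmh3.hash(key, seed).
            digest = hashlib.sha256(seed.to_bytes(4, "little") + data).digest()
            yield int.from_bytes(digest[:8], "little") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def lookup(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


bf = NormalizingBloomFilter(500000, 7)
bf.add("안녕  하세요 ")           # doubled inner space and a trailing space
print(bf.lookup("안녕 하세요"))   # same key after normalization
```

If this is the cause, printing `repr(line)` for a failing entry and `repr(input1)` side by side should show the difference. Whether collapsing inner whitespace is acceptable depends on the data; applying only `strip()` is a narrower alternative.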

0 answers:

No answers