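I wrote a method that generates hashes and returns them as a list of dictionaries. It works fine for a small number of records, for example 100, but it takes about 17 minutes to generate hashes for 10,000 records.
How can I improve the code below so that 10,000 records are processed faster (within a few minutes)? Would multithreading help?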
import hashlib

def generate_hashes(self, records):
    def get_year(date):
        return str(date.year)

    def create_hash(string):
        md5 = hashlib.md5()
        # hashlib needs bytes on Python 3; this assumes the input is a text string
        md5.update(string.encode('utf-8'))
        return md5.hexdigest()

    result = []
    for rec in records:
        rec_dict = {}
        if rec.dob is not None and rec.priv_number is not None:
            org_hash = "{0}_{1}".format(create_hash(rec.priv_number), get_year(rec.dob))
            group_hash = create_hash("{0}_{1}".format(create_hash(org_hash), '144C5A0013EDE1B0ACF585'))
            rec_hash = group_hash
            print("Generate hash for %s rec." % rec.pub_number.pub_number)
        else:
            # Placeholder hash for records missing dob or priv_number
            rec_hash = '0a' * 16
            print("There is not enough data to create hash for rec %s." % rec.pub_number.pub_number)
        rec_dict.update({'hash': rec_hash, 'pub_number': rec.pub_number.pub_number})
        result.append(rec_dict)
    return result
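
One way to check whether the MD5 hashing is really the slow part is to time the same steps on in-memory stand-in records. The following is only a minimal sketch under stated assumptions: `FakeRecord`, `FakePubNumber`, `hash_record` and `time_hashing` are hypothetical names that are not part of the original code, and `priv_number` is assumed to be a text string.

```python
import hashlib
import time
from collections import namedtuple
from datetime import date

# Hypothetical in-memory stand-ins for the real record objects (assumption).
FakePubNumber = namedtuple("FakePubNumber", ["pub_number"])
FakeRecord = namedtuple("FakeRecord", ["dob", "priv_number", "pub_number"])

SALT = "144C5A0013EDE1B0ACF585"  # constant taken from the method above


def create_hash(string):
    # Assumes text input; hashlib needs bytes on Python 3.
    return hashlib.md5(string.encode("utf-8")).hexdigest()


def hash_record(rec):
    """Same hashing steps as generate_hashes, without the print calls."""
    if rec.dob is not None and rec.priv_number is not None:
        org_hash = "{0}_{1}".format(create_hash(rec.priv_number), rec.dob.year)
        rec_hash = create_hash("{0}_{1}".format(create_hash(org_hash), SALT))
    else:
        rec_hash = "0a" * 16
    return {"hash": rec_hash, "pub_number": rec.pub_number.pub_number}


def time_hashing(n=10000):
    """Hash n synthetic records and return (result, elapsed seconds)."""
    records = [
        FakeRecord(date(1980, 1, 1), "priv{0}".format(i), FakePubNumber(i))
        for i in range(n)
    ]
    start = time.perf_counter()
    result = [hash_record(rec) for rec in records]
    return result, time.perf_counter() - start


if __name__ == "__main__":
    result, elapsed = time_hashing()
    print("Hashed %d records in %.3f s" % (len(result), elapsed))
```

If this finishes quickly on the same machine, the hashing itself is unlikely to account for 17 minutes, and the time is probably spent elsewhere, for example in attribute access such as `rec.pub_number.pub_number` (if the records are lazily loaded from a database, which is an assumption) or in the per-record `print` calls. In that case multithreading the hashing would probably not help much.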