How to add a new dataset to an existing pickle file in Python

Asked: 2018-02-23 06:50:53

Tags: python pickle naivebayes jsonpickle

I have this pickle file, which was generated from a JSON file:

pickle: https://github.com/Nilabhra/ethnicity/blob/master/models/ethnicity_classifier_last_name.pkl
JSON: https://github.com/Nilabhra/ethnicity/blob/master/json_counts/last_name_ethnicity.json

My question: how do I remove the old dataset and put a new dataset into the .pkl file?

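import pickle

# New dataset: last name -> {label: value}
ethnicity = {"Kumari": {"Hindu,Brahmin": 1.0}, "Choopra": {"Jain,Digambar": 1.0}}

# "wb" overwrites any existing file with the same name
with open("ethnicity_classifier_last_name.pkl", "wb") as f:
    pickle.dump(ethnicity, f)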

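However, the pickle file produced by the code above has a different structure from the original one, so an error is thrown when I run the code.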

2 Answers:

Answer 0 (score: 1)

Delete the old pickle file and dump a new pickle file containing the new dataset.
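
A minimal sketch of this approach, not the answerer's own code. The helper names `replace_pickle` and `load_pickle` and the `old_data` entry are made up for illustration, and the demo writes to a temporary directory instead of the real model file. Opening a file in "wb" mode truncates it, so a separate delete step should not be needed.

```python
import os
import pickle
import tempfile


def replace_pickle(path, new_data):
    """Write new_data to path, discarding whatever the file held before."""
    # "wb" truncates an existing file, so the old contents are dropped
    with open(path, "wb") as f:
        pickle.dump(new_data, f)


def load_pickle(path):
    """Read back the object stored in the pickle file at path."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Demo in a temporary directory so that no real file is touched
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "ethnicity_classifier_last_name.pkl")

old_data = {"Smith": {"Placeholder": 1.0}}  # made-up stand-in for the old dataset
new_data = {"Kumari": {"Hindu,Brahmin": 1.0}, "Choopra": {"Jain,Digambar": 1.0}}

replace_pickle(demo_path, old_data)  # simulate the existing file
replace_pickle(demo_path, new_data)  # overwrite it with the new dataset
result = load_pickle(demo_path)
print(result == new_data)
```

Note that this only swaps the stored object. If the program that reads the file expects something other than a plain dict, the new file has to contain an object of that same type.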

Answer 1 (score: 0)

Before writing the pickle file, you need to update the old dict with the new entries.

import json
import pickle

# Load the old data from the JSON file
with open("last_name_ethnicity.json", "rb") as f:
    old_ethnicity = json.load(f)

ethnicity = {"Kumari": {"Hindu,Brahmin": 1.0}, "Choopra": {"Jain,Digambar": 1.0}}

# Add the new entries to the old dict; an existing last name is replaced
new_ethnicity = dict(old_ethnicity, **ethnicity)

with open("ethnicity_classifier_last_name.pkl", "wb") as f:
    pickle.dump(new_ethnicity, f)
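
One caveat about the `dict(old, **new)` pattern: it is a shallow merge, so when a last name already exists, its whole inner dict is replaced by the new one. If the intent is to keep the old labels and add the new ones (an assumption about what is wanted here), a nested merge along these lines may be closer. The helper `merge_nested` and the sample `old_ethnicity` entry are hypothetical.

```python
import copy


def merge_nested(old, new):
    """Return a copy of old where each name's labels are updated from new."""
    merged = copy.deepcopy(old)  # leave the caller's dict untouched
    for name, labels in new.items():
        # Create the inner dict if the name is new, then add or overwrite labels
        merged.setdefault(name, {}).update(labels)
    return merged


old_ethnicity = {"Kumari": {"Hindu,Rajput": 0.5}}  # made-up existing entry
new_entries = {"Kumari": {"Hindu,Brahmin": 1.0}, "Choopra": {"Jain,Digambar": 1.0}}

shallow = dict(old_ethnicity, **new_entries)    # inner dict for "Kumari" is replaced
deep = merge_nested(old_ethnicity, new_entries)  # inner dict for "Kumari" is extended

print(shallow["Kumari"])
print(deep["Kumari"])
```

The sketch does not rescale the values after merging. Whether they need to be renormalized depends on how the consuming code interprets them.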