Updating key-value pairs in a defaultdict

Time: 2018-07-20 03:09:05

Tags: python python-3.x pandas collections defaultdict

I have a pandas DataFrame, duplicateProductList, generated by the following code:

import pandas as pd

newCols = ['Book-1', 'Book-2', 'Similarity Score']

l1 = ['b1', 'b1', 'b2']
l2 = ['b2', 'b3', 'b3']
score1 = [0.95, 0.87, 0.84]

duplicateProductList = pd.DataFrame(columns=newCols)

duplicateProductList['Book-1'] = l1
duplicateProductList['Book-2'] = l2
duplicateProductList['Similarity Score'] = score1

print(duplicateProductList)

I then built a dictionary from this DataFrame with the following code:

from collections import defaultdict

new_dict = {}

# Pair each Book-1 value with [Book-2, Similarity Score] from the same row.
my_list = [(i, [a, b]) for i, a, b in zip(duplicateProductList['Book-1'], duplicateProductList['Book-2'], duplicateProductList['Similarity Score'])]
for key, value in my_list:
    if key in new_dict:
        new_dict[key].append(value)
    else:
        new_dict[key] = [value]

print(new_dict)

The snippet above produces the following dictionary:

{'b1': [['b2', 0.95], ['b3', 0.87]], 'b2': [['b3', 0.84]]}

Instead, I want a dictionary in which every book, whether it appears in Book-1 or Book-2, is a key that maps to all of the books paired with it and their similarity scores.

Can someone help me modify the dictionary comprehension to produce that dictionary?

1 answer:

Answer 0 (score: 2)

>>> import collections
>>> from pprint import pprint

>>> df
  Book-1 Book-2  Similarity Score
0     b1     b2              0.95
1     b1     b3              0.87
2     b2     b3              0.84
>>> 
>>> d = collections.defaultdict(list)
>>> for row in df.itertuples(index=False):
...     a, b, c = row
...     d[a].append((b, c))
...     d[b].append((a, c))
...
>>> pprint(d)
defaultdict(<class 'list'>,
            {'b1': [('b2', 0.95), ('b3', 0.87)],
             'b2': [('b1', 0.95), ('b3', 0.84)],
             'b3': [('b1', 0.87), ('b2', 0.84)]})
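
For readers who want to run the idea outside an interactive session, here is a minimal sketch of the same approach. It uses only the standard library: the `rows` list stands in for the tuples that `df.itertuples(index=False, name=None)` would yield from the question's DataFrame, and the helper name `build_similarity_map` is hypothetical, introduced only for illustration.

```python
from collections import defaultdict

# Same values as the question's l1, l2 and score1 lists. With pandas, these
# rows could instead come from df.itertuples(index=False, name=None).
l1 = ['b1', 'b1', 'b2']
l2 = ['b2', 'b3', 'b3']
score1 = [0.95, 0.87, 0.84]
rows = list(zip(l1, l2, score1))


def build_similarity_map(rows):
    """Hypothetical helper: map each book to its (other_book, score) pairs."""
    pairs = defaultdict(list)
    for first, second, score in rows:
        # Record the pair under both books so the mapping is symmetric.
        pairs[first].append((second, score))
        pairs[second].append((first, score))
    # Convert to a plain dict so later lookups of missing keys raise KeyError
    # instead of silently creating empty lists.
    return dict(pairs)


similarity = build_similarity_map(rows)
print(similarity)
```

Because `defaultdict(list)` creates an empty list the first time a key is accessed, the `if key in new_dict ... else ...` branch from the question is no longer needed. Appending under both `first` and `second` is what makes 'b3' appear as a key even though it never occurs in the Book-1 column.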