Python: Splitting the terms in a list and assigning values

Time: 2018-10-10 08:15:49

Tags: python

I used a keyword extractor and got a list like the following:

[('solutions design team', 0.5027793039863974), 
('communication skills', 0.039048703166463736), 
('internal stakeholders', 0.03230578820017667),
('potential customers', 0.020380881551651655), ('utilize', 0.002776174060064261)]

I am trying to split each phrase into its individual words and assign each word the value of its phrase (the number on the right).

For example, turn 'solutions design team' = 0.5027793039863974 into

'solutions' = 0.5027793039863974, 
'design' = 0.5027793039863974 ,   
'team' = 0.5027793039863974.

2 answers:

Answer 0: (score: 1)

One way is a doubly nested (flattening) list comprehension that rebuilds the list of tuples with the words separated:

inlist = [('solutions design team', 0.5027793039863974),
('communication skills', 0.039048703166463736),
('internal stakeholders', 0.03230578820017667),
('potential customers', 0.020380881551651655), ('utilize', 0.002776174060064261)]

outlist = [(word,value) for words,value in inlist for word in words.split()]

Result:

>>> outlist
[('solutions', 0.5027793039863974),
 ('design', 0.5027793039863974),
 ('team', 0.5027793039863974),
 ('communication', 0.039048703166463736),
 ('skills', 0.039048703166463736),
 ('internal', 0.03230578820017667),
 ('stakeholders', 0.03230578820017667),
 ('potential', 0.020380881551651655),
 ('customers', 0.020380881551651655),
 ('utilize', 0.002776174060064261)]

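Note that if a word occurs more than once, the list of tuples will contain duplicates. If you want to accumulate their values, a collections.defaultdict(float) object is a convenient way to build a dictionary mapping word => accumulated value.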

import collections

# Missing keys start at 0.0, so values can be summed directly.
accumulated = collections.defaultdict(float)
for word,value in outlist:
    accumulated[word] += value
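
To make the accumulation concrete, here is a minimal self-contained sketch. The input list `sample` is hypothetical (it is not the asker's data) and was chosen so that the word 'design' appears in two phrases:

```python
from collections import defaultdict

# Hypothetical input: 'design' appears in two phrases.
sample = [('solutions design', 0.5), ('design team', 0.25)]

# Flatten to (word, value) pairs with the same nested comprehension.
pairs = [(word, value) for words, value in sample for word in words.split()]

# Sum the values of repeated words; missing keys default to 0.0.
accumulated = defaultdict(float)
for word, value in pairs:
    accumulated[word] += value

print(dict(accumulated))
# {'solutions': 0.5, 'design': 0.75, 'team': 0.25}
```

Whether summing is the right way to combine scores depends on what the extractor's values mean, so treat this as one option rather than the required behaviour.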

Answer 1: (score: 0)

A dictionary comprehension combined with dict.update may help, as shown here:

corpus = [('solutions design team', 0.5027793039863974), ('communication skills', 0.039048703166463736), ('internal stakeholders', 0.03230578820017667), ('potential customers', 0.020380881551651655), ('utilize', 0.002776174060064261)]

final_dict = {}
for phrase, prob in corpus:
    # Map every word of the phrase to the phrase's score.
    final_dict.update({word:prob for word in phrase.split()})

print(final_dict['solutions'])
print(final_dict['design'])
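
One thing to be aware of with the dict.update approach: if the same word occurs in more than one phrase, the value from the later phrase silently replaces the earlier one. The sketch below illustrates this with a hypothetical input (`sample` is not the asker's data) and shows one possible alternative, keeping the highest score per word. Which policy fits is an assumption about the use case, not something the question specifies.

```python
# Hypothetical input: 'design' appears in two phrases with different scores.
sample = [('solutions design', 0.5), ('design team', 0.25)]

# dict.update: the value from the last phrase containing the word wins.
last_wins = {}
for phrase, prob in sample:
    last_wins.update({word: prob for word in phrase.split()})

# Alternative: keep the highest score seen for each word.
highest = {}
for phrase, prob in sample:
    for word in phrase.split():
        highest[word] = max(prob, highest.get(word, prob))

print(last_wins)  # {'solutions': 0.5, 'design': 0.25, 'team': 0.25}
print(highest)    # {'solutions': 0.5, 'design': 0.5, 'team': 0.25}
```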