I have a list of strings, as follows:
A = [
'philadelphia court excessive disappointed court hope hope',
'hope hope jurisdiction obscures acquittal court',
'mention hope maryland signal held mention problem internal reform life bolster level grievance'
]
and another list:
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
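I want to create a list of dictionaries that records, for each string in A, how many times each word from list B occurs in it. Like this: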
C = [
{'court':2,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
{'court':1,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
{'court':0,'hope':1,'mention':2,'life':1,'bolster':1,'internal':1,'level':1}
]
What I tried:
dic={}
for i in A:
    t=i.split()
    for j in B:
        dic[j]=t.count(j)
However, it only returns the dictionary for the last string:
print(dic)
{'court': 0,
'hope': 1,
'mention': 2,
'life': 1,
'bolster': 1,
'internal': 1,
'level': 1}
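Answer 0 (score: 2)
You aren't creating a list of dictionaries as in your sample output; you create only a single dictionary, and its word counts are overwritten each time you check a phrase. You can use re.findall to count the word occurrences in each phrase. This has the advantage of not failing if any of your phrases contain a word followed by punctuation (e.g. "hope?").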
import re
words = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
phrases = ['philadelphia court excessive disappointed court hope hope','hope hope jurisdiction obscures acquittal court','mention hope maryland signal held mention problem internal reform life bolster level grievance']
counts = [{w: len(re.findall(r'\b{}\b'.format(w), p)) for w in words} for p in phrases]
print(counts)
# [{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
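To illustrate the punctuation point, here is a minimal sketch comparing counting via `str.split` with the `re.findall` approach. The sample phrase and the helper name `count_words` are made up for illustration, and `re.escape` is an addition that is not in the answer; it is there to guard against words that contain regex metacharacters.

```python
import re

def count_words(phrase, words):
    """Count whole-word occurrences of each word in phrase (hypothetical helper)."""
    return {w: len(re.findall(r'\b{}\b'.format(re.escape(w)), phrase)) for w in words}

words = ['court', 'hope']
phrase = 'hope? the court, we hope.'  # made-up sample with punctuation

# split() leaves punctuation attached ('hope?', 'court,'), so exact matches fail
split_counts = {w: phrase.split().count(w) for w in words}
# \b matches at word boundaries, so trailing punctuation is ignored
regex_counts = count_words(phrase, words)

print(split_counts)
print(regex_counts)
```

With this sample, the split-based counts are all zero, while the regex-based counts find both occurrences of "hope" and the one of "court".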
Answer 1 (score: 1)
Two problems: you are initializing dic in the wrong place, and you are not collecting those dics in a list. Here is the fix:
C = []
for i in A:
    dic = {}
    t=i.split()
    for j in B:
        dic[j]=t.count(j)
    C.append(dic)
# Result:
[{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
{'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
{'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
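On why `dic` has to be created inside the outer loop: the sketch below, which uses shortened made-up data, shows what happens if you only add the `append` call but leave `dic = {}` outside the loop. Every list entry then refers to the same dictionary object, so all entries end up showing the counts of the last string.

```python
A = ['court court hope', 'hope mention']  # shortened, made-up sample data
B = ['court', 'hope', 'mention']

# dic created once, outside the loop: every append stores the same object
shared = []
dic = {}
for i in A:
    t = i.split()
    for j in B:
        dic[j] = t.count(j)
    shared.append(dic)

# dic created per string: each append stores a separate dictionary
separate = []
for i in A:
    dic = {}
    t = i.split()
    for j in B:
        dic[j] = t.count(j)
    separate.append(dic)

print(shared)    # both entries are the same dict, holding the last string's counts
print(separate)  # one independent dict per string
```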
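Answer 2 (score: 0)
You always overwrite the existing values in the dictionary dic with dic[j]=t.count(j). You can create a new dictionary for each i and append it to a list, like this: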
dic=[]
for i in A:
    i_dict = {}
    t=i.split()
    for j in B:
        i_dict[j]=t.count(j)
    dic.append(i_dict)
print(dic)
Answer 3 (score: 0)
To avoid overwriting existing values, check whether the entry is already in the dictionary. Try adding:
if j in dic:
    dic[j] += t.count(j)
else:
    dic[j] = t.count(j)
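Note that accumulating with `+=` yields one dictionary of totals across all strings in A, not one dictionary per string. A minimal sketch of that behaviour, assuming the dictionary is created once before the loops; the name `totals` and the use of `dict.get` in place of the explicit `if`/`else` are choices made here for illustration:

```python
A = ['philadelphia court excessive disappointed court hope hope',
     'hope hope jurisdiction obscures acquittal court',
     'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

totals = {}
for i in A:
    t = i.split()
    for j in B:
        # dict.get returns 0 the first time a word is seen
        totals[j] = totals.get(j, 0) + t.count(j)

print(totals)
# {'court': 3, 'hope': 5, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}
```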
Answer 4 (score: 0)
Try this:
from collections import Counter
A = ['philadelphia court excessive disappointed court hope hope',
'hope hope jurisdiction obscures acquittal court',
'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
result = [{b: dict(Counter(i.split())).get(b, 0) for b in B} for i in A]
print(result)
Output:
[{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
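A possible variation, offered as a sketch: the comprehension above rebuilds the `Counter` for every word in B. Building it once per string avoids the repeated work, and since a `Counter` returns 0 for missing keys, no `.get` call is needed. The helper name `count_per_string` and the shortened sample data are hypothetical.

```python
from collections import Counter

def count_per_string(strings, words):
    """Return one {word: count} dict per string (hypothetical helper)."""
    result = []
    for s in strings:
        counts = Counter(s.split())  # built once per string
        # indexing a Counter with a missing key gives 0 rather than raising KeyError
        result.append({w: counts[w] for w in words})
    return result

A = ['court court hope', 'hope mention']  # shortened, made-up sample data
B = ['court', 'hope', 'mention']
result = count_per_string(A, B)
print(result)
```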