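I have a dictionary in which each session has a list of queries. The keys are session IDs and the values are the queries searched in that session, as shown below: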
1000 , [ Malaria, Cholera ]
1001 , [ Disease, Malaria, Fever]
1002 , [ Fever, Cholera, AIDS, Cancer, Sickness]
1003 , [ Sickness, Disease, Fever, Constipation]
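I want to find the co-occurrences of a specific query across all sessions (e.g. Disease, 2 occurrences: [(Fever, 2 times), (Malaria, 1 time), (Sickness, 1 time), (Constipation, 1 time)]). I have tried the code below, using a library I read could help me, itertools: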
for x in occurrences.values():
    if len(x) > 2:
        for y in x:
            for pair in itertools.combinations(y, 2):
                coccurr[pair]+=1
for k in cooccurr.keys():
    print k, len(cooccurr[k])
The script runs without errors, but it prints nothing, not even an empty list. What is my mistake? Am I using itertools correctly?
Answer 0 (score: 3)
from collections import Counter

def findForQuery(queries, value):
    related = Counter()  # other queries seen in the same sessions as `value`
    count = 0            # number of sessions that contain `value`
    for query in queries.values():
        if value in query:
            count += 1
            # count each other query once per session
            related.update({item: 1 for item in query if item != value})
    return count, related

queries = {
    1000: ['Malaria', 'Cholera'],
    1001: ['Disease', 'Malaria', 'Fever'],
    1002: ['Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: ['Sickness', 'Disease', 'Fever', 'Constipation']
}
Use it like this:
>>> findForQuery(queries, 'Disease')
(2, Counter({'Fever': 2, 'Malaria': 1, 'Constipation': 1, 'Sickness': 1}))
>>> findForQuery(queries, 'Sickness')
(2, Counter({'Fever': 2, 'AIDS': 1, 'Constipation': 1, 'Cancer': 1, 'Disease': 1, 'Cholera': 1}))
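For the itertools approach attempted in the question, a minimal sketch might look like the following. It is not part of the answer above, and it assumes the goal is to count every unordered pair of queries that share a session; the names `count_pairs` and `related_to` are hypothetical. The key point is that `itertools.combinations` is applied to a session's whole query list rather than to each query string, since iterating over a string yields pairs of characters. As the snippet in the question is written, it also increments `coccurr` but prints from `cooccurr`, which may be why nothing is printed.

```python
from collections import Counter
from itertools import combinations

def count_pairs(sessions):
    """Count how often each unordered pair of queries shares a session."""
    pairs = Counter()
    for queries in sessions.values():
        # sorted(set(...)) removes duplicates within a session and gives a
        # canonical order, so each pair is always stored under the same key.
        for pair in combinations(sorted(set(queries)), 2):
            pairs[pair] += 1
    return pairs

def related_to(pairs, value):
    """Collect the co-occurrence counts for one query from the pair counts."""
    related = Counter()
    for (a, b), n in pairs.items():
        if a == value:
            related[b] += n
        elif b == value:
            related[a] += n
    return related

queries = {
    1000: ['Malaria', 'Cholera'],
    1001: ['Disease', 'Malaria', 'Fever'],
    1002: ['Fever', 'Cholera', 'AIDS', 'Cancer', 'Sickness'],
    1003: ['Sickness', 'Disease', 'Fever', 'Constipation'],
}

pairs = count_pairs(queries)
print(pairs[('Fever', 'Sickness')])  # 2
print(related_to(pairs, 'Disease'))
```

Counting all pairs once is useful when several different queries will be looked up afterwards; for a single lookup, the direct loop in the answer above does less work.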