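Please help, I can't seem to find a way to do this. I'm working on a network science project, and this is my third Python project.
I need to compare the first item in a dictionary with every other item in the same dictionary, but the items are themselves dictionaries.
For example, I have a dictionary with the following values: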
{'25': {'Return of the Jedi (1983)': 5.0},
'42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
'8': {'Return of the Jedi (1983)': 5.0},
'542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
'7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0}}
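So in this case I need to check whether keys '25' and '42' contain the same movie, 'Return of the Jedi (1983)', then whether '25' and '8' have the same movie, and so on. Once I've done that, I need to know how many movies overlap.
This is only a sample of the dictionary; the full one contains 1000 keys, and the sub-dictionaries are larger as well.
I've tried iterating, comparing dictionaries, making copies, merging, and joining, but I can't seem to figure out how to do this.
Please help!
The problem is that I still can't compare two sub-dictionaries, because I need to find the keys that have at least two movies in common.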
Answer 0 (score: 2)
You can use collections.Counter:
>>> dic={'25': {'Return of the Jedi (1983)': 5.0}, '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0}, '8': {'Return of the Jedi (1983)': 5.0 }}
>>> from collections import Counter
>>> c=Counter(movie for v in dic.values() for movie in v)
>>> [k for k,v in c.items() if v>1] #returns the name of movies repeated more than once
['Return of the Jedi (1983)']
>>> c
Counter({'Return of the Jedi (1983)': 2,
'Batman (1989)': 1,
'E.T. the Extra-Terrestrial (1982)': 1})
To get the keys associated with each movie, you can use collections.defaultdict:
>>> from collections import defaultdict
>>> movie_keys=defaultdict(list)
>>> for k,v in dic.items():
...     for movie in v:
...         movie_keys[movie].append(k)
...
>>> movie_keys
defaultdict(<type 'list'>, {'Batman (1989)': ['42'], 'Return of the Jedi (1983)': ['25', '8'], 'E.T. the Extra-Terrestrial (1982)': ['42']})
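The follow-up in the question asks for keys that share at least two movies. One way to get there from the movie_keys index is to count, for each pair of keys, how many titles list both of them. The sketch below is an assumption about what is wanted, not part of the original answer; shared_movie_counts, pair_counts, and at_least_two are illustrative names, and dic is extended here to the full five-key sample from the question.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Sample data from the question: key -> {movie title: rating}
dic = {
    '25': {'Return of the Jedi (1983)': 5.0},
    '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
    '8': {'Return of the Jedi (1983)': 5.0},
    '542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
    '7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
}

def shared_movie_counts(ratings):
    """Count, for every pair of keys, how many movie titles they share."""
    # Inverted index: movie title -> list of keys that contain it
    movie_keys = defaultdict(list)
    for key, movies in ratings.items():
        for movie in movies:
            movie_keys[movie].append(key)

    pair_counts = Counter()
    for keys in movie_keys.values():
        # sorted() gives each pair one canonical order, e.g. ('25', '8')
        for pair in combinations(sorted(keys), 2):
            pair_counts[pair] += 1
    return pair_counts

pair_counts = shared_movie_counts(dic)

# Keep only the pairs with at least two movies in common
at_least_two = {pair: n for pair, n in pair_counts.items() if n >= 2}
print(at_least_two)
```

With the sample data, only '542' and '7' share two titles. Keep in mind that a movie appearing under very many keys produces a number of pairs that grows quadratically with the number of those keys.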
Answer 1 (score: 0)
There isn't really a "first" item in a dictionary, but you can find all the keys that contain a given movie like this:
movies = {}
for k in data:
    for movie in data[k]:
        movies.setdefault(movie, []).append(k)
The resulting movies dictionary looks like this (the keys stay strings, as in the input):
{'Return of the Jedi (1983)': ['25', '8'], 'Batman (1989)': ['42'], ...}
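If the goal is to compare one particular key against all the others, dictionary key views support set intersection directly, so no copying or merging is needed. This is a minimal sketch, assuming Python 3 and the sample data from the question; overlaps_with is a hypothetical helper name, and the starting key is passed explicitly rather than relying on dictionary order.

```python
# Sample data from the question: key -> {movie title: rating}
data = {
    '25': {'Return of the Jedi (1983)': 5.0},
    '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
    '8': {'Return of the Jedi (1983)': 5.0},
    '542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
    '7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
}

def overlaps_with(data, key):
    """Map every other key to the set of titles it shares with `key`."""
    mine = data[key].keys()
    # The & operator on two key views returns a plain set of common titles
    return {other: mine & movies.keys()
            for other, movies in data.items()
            if other != key}

shared = overlaps_with(data, '25')
for other, titles in shared.items():
    # len(titles) is the number of overlapping movies for this pair
    print(other, len(titles), sorted(titles))
```

Filtering shared for entries with len(titles) >= 2 would give the keys that have at least two movies in common with the chosen key.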
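Answer 2 (score: 0)
My answer returns a dictionary of 'title', ['offender1', ...] pairs only for movies that appear more than once, i.e. not 'E.T. the Extra-Terrestrial (1982)', but it will report 'Return of the Jedi (1983)'. This can be changed by simply returning overlaps in the solution instead of the result of the dictionary comprehension.
Where d is: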
d = {'25': {'Return of the Jedi (1983)': 5.0},
'42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
'8': {'Return of the Jedi (1983)': 5.0 },
'542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
'7': {'Alice in Wonderland (1951)': 3.0,'Blade Runner (1982)': 4.0}
}
The following:
from collections import defaultdict

def findOverlaps(d):
    overlaps = defaultdict(list)
    # children is the dictionary containing movie_title, rating pairs
    for (parentKey, children) in d.items():
        # we're only interested in the titles, not the ratings, hence keys() not items()
        for childKey in children.keys():
            # add the parent 'id' where the movie_title came from
            overlaps[childKey].append(parentKey)
    # return a dictionary containing only the movie titles that had more than one 'id' associated with them
    return dict((overlap, offenders) for (overlap, offenders) in overlaps.items() if len(offenders) > 1)

print(findOverlaps(d))
Produces:
>>>
{'Blade Runner (1982)': ['7', '542'], 'Return of the Jedi (1983)': ['25', '8'], 'Alice in Wonderland (1951)': ['7', '542']}
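The reasoning behind the code:
Each entry in d represents id : { movie_title1: rating, movie_title2: rating }. Now say movie_title1 occurs in the values associated with two or more separate id keys. We want to store the movie_title, together with the id keys whose values contain that movie. So we want a resulting dictionary like this:
{ movie_title1: {'id1','id2'}, movie_title2: {'id2','id5'} }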
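The target shape above uses sets of ids, while findOverlaps collects them in lists. If sets are preferred (for example, to intersect the ids of two movies later), a variant could look like the sketch below. This is an illustrative assumption rather than the answer's own code; find_overlaps_as_sets and ids_by_title are made-up names.

```python
from collections import defaultdict

# Same sample data as d above: id -> {movie title: rating}
d = {
    '25': {'Return of the Jedi (1983)': 5.0},
    '42': {'Batman (1989)': 3.0, 'E.T. the Extra-Terrestrial (1982)': 5.0},
    '8': {'Return of the Jedi (1983)': 5.0},
    '542': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
    '7': {'Alice in Wonderland (1951)': 3.0, 'Blade Runner (1982)': 4.0},
}

def find_overlaps_as_sets(d):
    """Return {movie_title: set of ids} for titles seen under more than one id."""
    ids_by_title = defaultdict(set)
    for user_id, ratings in d.items():
        # Iterating a dict yields its keys, i.e. the movie titles
        for title in ratings:
            ids_by_title[title].add(user_id)
    return {title: ids for title, ids in ids_by_title.items() if len(ids) > 1}

result = find_overlaps_as_sets(d)
print(result)
```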