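I have a list of tuples (mytuples) and a list of lists (mylist).
For each tuple in mytuples, I want to count how many of the lists in mylist contain it, that is, contain all of its elements.
For example, the elements of (2,3) appear in [1,2,3,4], [2,3,4,5], and [2,3], so the count for (2,3) is 3.
The tuples and the lists can be of different sizes.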
mytuples = [(2,3), (3,6), (1,2)]
mylist = [[1,2,3,4],[2,3,4,5],[2,3],[4,5,6]]
count = {}
for m in mytuples:
    counter = 0
    for i in mylist:
        if set(m).issubset(i):
            counter = counter + 1
    count[m] = counter
My output is {(2,3):3, (3,6): 0, (1,2):1}
This approach works, but it becomes slow when my list is large, say 1000 records. Can this be done faster? Any suggestions?
Answer 0 (score: 3)
Using a dict comprehension, we can reduce everything to one line.
Assuming the tuples are always pairs:
count = {(x,y):sum((x in i and y in i) for i in mylist) for x,y in mytuples}
# {(1, 2): 1, (2, 3): 3, (3, 6): 0}
If the tuple size is unknown, you can use all() inside the sum:
count = {t:sum(all(x in i for x in t) for i in mylist) for t in mytuples}
# {(1, 2): 1, (2, 3): 3, (3, 6): 0}
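In case that is unclear:
For each tuple, we build a list of booleans like this one, with one entry per list in mylist: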
[all(x in i for x in (2,3)) for i in mylist]
# [True, True, True, False]
# sum([True, True, True, False]) = 3
# And we assign them back to the tuple
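To make that intermediate step concrete, here is a small sketch that prints the per-list flags for every tuple before summing them. The helper name `membership_flags` is made up for illustration and is not part of the answer above; the input data is the same as in the question.

```python
mytuples = [(2, 3), (3, 6), (1, 2)]
mylist = [[1, 2, 3, 4], [2, 3, 4, 5], [2, 3], [4, 5, 6]]


def membership_flags(t, lists):
    """Return one boolean per list: True if every element of t is in that list."""
    # Hypothetical helper, named here only for illustration.
    return [all(x in lst for x in t) for lst in lists]


count = {}
for t in mytuples:
    flags = membership_flags(t, mylist)
    # True counts as 1 and False as 0, so sum() gives the number of matching lists.
    count[t] = sum(flags)
    print(t, flags, count[t])
```

Note that `x in lst` on a plain list is a linear scan, so this form is mainly useful for understanding the logic rather than for speed.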
Answer 1 (score: 3)
With a small adjustment, your current algorithm can be made somewhat faster:
# Your input data.
tuples = [(2,3), (3,6), (1,2)]
lists = [[1,2,3,4],[2,3,4,5],[2,3],[4,5,6]]
# Convert to sets just once, rather than repeatedly
# within the nested for-loops.
subsets = {t : set(t) for t in tuples}
mainsets = [set(xs) for xs in lists]
# Same as your algorithm, but written differently.
tallies = {
    tup : sum(s.issubset(m) for m in mainsets)
    for tup, s in subsets.items()
}
print(tallies)
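Whether the pre-conversion actually helps on your data is easy to check. The following is a rough timing sketch, not a rigorous benchmark: the function names `count_original` and `count_preconverted`, the synthetic data sizes, and the random seed are all assumptions chosen for illustration. It first confirms that both versions agree, then times each with `timeit`.

```python
import random
import timeit


def count_original(tuples, lists):
    """The question's nested loop: builds set(t) again for every list."""
    count = {}
    for t in tuples:
        counter = 0
        for xs in lists:
            if set(t).issubset(xs):
                counter += 1
        count[t] = counter
    return count


def count_preconverted(tuples, lists):
    """Same logic, but each tuple and each list is converted to a set only once."""
    subsets = {t: set(t) for t in tuples}
    mainsets = [set(xs) for xs in lists]
    return {t: sum(s.issubset(m) for m in mainsets) for t, s in subsets.items()}


# Synthetic data (sizes are arbitrary assumptions): 1000 lists of 10 distinct
# values each, and up to 200 distinct pairs drawn from the same value range.
random.seed(0)
lists = [random.sample(range(50), 10) for _ in range(1000)]
tuples = list({tuple(random.sample(range(50), 2)) for _ in range(200)})

# Both versions must produce identical counts before timing means anything.
assert count_original(tuples, lists) == count_preconverted(tuples, lists)

for fn in (count_original, count_preconverted):
    elapsed = timeit.timeit(lambda: fn(tuples, lists), number=5)
    print(f"{fn.__name__}: {elapsed:.3f}s for 5 runs")
```

The actual numbers depend on your machine and on the shape of your data, so run it against input that resembles your real workload.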