If I have a list ts of tuples in Python:
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
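How can I get a list of the elements that are shared by 2 or more of these tuples?
Assume that both the tuples in ts and the elements within each tuple are already sorted numerically.
For this example, the expected output would be: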
ts_output = [703, 803, 903]
Here is what I have so far:
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
ts = set(ts)
t1 = set(w for w,x in ts for y,z in ts if w == y) # t1 should only contain 803
print("t1: ", t1)
t2 = set(y for w,x in ts for y,z in ts if x == y) # t2 should only contain 703
print("t2: ", t2)
t3 = set(x for w,x in ts for y,z in ts if x == z) # t3 should only contain 903
print("t3: ", t3)
This is the corresponding output:
t1: {803, 901, 902, 702, 703}
t2: {703}
t3: {704, 805, 806, 903, 703}
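Of the three, only t2 gives the expected output, and I'm not sure what is going on with t1 and t3.
You can use this alternative input to test your code; it should give exactly the same output: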
ts = [(701,703), (702,703), (703,704), (803,805), (803,806), (901,903), (902,903), (903,904)]
Answer 0: (score: 5)
import collections
ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
flat_list = [item for sublist in ts for item in sublist]
duplicates = [item for item, count in collections.Counter(flat_list).items() if count > 1]
print(duplicates)
Given your input, you first want to flatten your list.
#1 Simple and pythonic
flat_list = [item
             for sublist in ts
             for item in sublist]
#2 More efficient.
import itertools
flat_list = itertools.chain.from_iterable(ts)
With method #1, flat_list is a list object; with method #2, it is a lazy iterator (an itertools.chain object). For a single pass of iteration, both behave the same.
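One practical difference between the two methods is worth seeing in code. The sketch below, which assumes ts is the list from the question, shows that the list can be traversed repeatedly while the chain iterator is exhausted after one pass. The variable names are only illustrative.

```python
import itertools

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

as_list = [item for sublist in ts for item in sublist]  # method #1
as_iter = itertools.chain.from_iterable(ts)             # method #2

first_pass = list(as_iter)   # consumes the iterator
second_pass = list(as_iter)  # nothing is left to yield

list_first = list(as_list)   # a list can be read again and again
list_second = list(as_list)

print(first_pass == as_list)      # True
print(second_pass)                # []
print(list_first == list_second)  # True
```

If flat_list needs to be read more than once, wrapping method #2 in list(...) is one way to get both behaviors.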
Now you can count the elements of flat_list. Any element whose count is greater than 1 is a duplicate.
for item, count in collections.Counter(flat_list).items():
    if count > 1:
        print(item)
Or you can use a more Pythonic list comprehension.
duplicates = [item
              for item, count in collections.Counter(flat_list).items()
              if count > 1]
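The flatten-and-count steps can be combined into one helper and checked against both inputs from the question. This is a minimal sketch, not part of the original answer: the name common_elements is hypothetical, and deduplicating each tuple with set(t) is an added assumption so that a value repeated inside a single tuple is not mistaken for one shared between tuples.

```python
from collections import Counter
from itertools import chain


def common_elements(ts):
    """Return the sorted values that occur in two or more tuples of ts."""
    # Assumption: count each value at most once per tuple.
    counts = Counter(chain.from_iterable(set(t) for t in ts))
    return sorted(value for value, count in counts.items() if count > 1)


ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]
ts_alt = [(701, 703), (702, 703), (703, 704), (803, 805), (803, 806),
          (901, 903), (902, 903), (903, 904)]

print(common_elements(ts))      # [703, 803, 903]
print(common_elements(ts_alt))  # [703, 803, 903]
```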
Answer 1: (score: 5)
You need to flatten the list of tuples. You can use itertools.chain:
>>> from itertools import chain
>>> flat_list = list(chain(*ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]
Alternatively, you can use itertools.chain.from_iterable to do the same thing without unpacking the iterable:
>>> flat_list = list(chain.from_iterable(ts))
>>> flat_list
[702, 703, 703, 704, 803, 805, 803, 806, 901, 903, 902, 903]
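After this step, you can use collections.Counter to count the occurrences of each element in the flat list, and then keep only the elements that occur more than once.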
>>> from collections import Counter
>>> c = Counter(flat_list)
>>> c
Counter({803: 2, 903: 2, 703: 2, 704: 1, 805: 1, 806: 1, 901: 1, 902: 1, 702: 1})
Finally, filter c:
>>> [k for k,v in c.items() if v>1]
[803, 903, 703]
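The order of the filtered result depends on the Python version. On Python 3.7 and later, Counter remembers insertion order, so the same expression may come back in a different order than the transcript shows. If the sorted order from the question matters, one option is to sort explicitly, as in this sketch (the name result is only illustrative):

```python
from collections import Counter
from itertools import chain

ts = [(702, 703), (703, 704), (803, 805), (803, 806), (901, 903), (902, 903)]

c = Counter(chain.from_iterable(ts))

# Sort the surviving keys so the result does not depend on dict ordering.
result = sorted(k for k, v in c.items() if v > 1)
print(result)  # [703, 803, 903]
```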
Answer 2: (score: 1)
>>> from collections import Counter
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> c = Counter(el for t in ts for el in t)
>>> [k for k in c if c[k] >= 2]
[703, 803, 903]
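Answer 3: (score: 1)
Here is an answer that solves it in a single pass, building the result as it goes, instead of counting first and filtering afterwards (I'm not sure whether it is faster or slower in practice for a very large ts):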
>>> from collections import Counter
>>> from itertools import chain
>>> ts = [(702,703), (703,704), (803,805), (803,806), (901,903), (902,903)]
>>> def find_common(ts):
...     c = Counter()
...     for num in chain.from_iterable(ts):
...         c[num] += 1
...         if c[num] == 2:
...             yield num
...
>>> list(find_common(ts))
[703, 803, 903]
Without Counter:
>>> def find_common(ts):
...     seen, dupes = set(), set()
...     for num in chain.from_iterable(ts):
...         if num in seen and num not in dupes:
...             dupes.add(num)
...             yield num
...         seen.add(num)
...
>>> list(find_common(ts))
[703, 803, 903]
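As a quick check, the set-based version can be run against the alternative input from the question, where 703 and 903 each appear three times. The sketch below repeats the function so it is self-contained; ts_alt is just an illustrative name for that input. Each value is yielded only once, at its second occurrence, so the output order follows the input order rather than being sorted by the function itself.

```python
from itertools import chain


def find_common(ts):
    seen, dupes = set(), set()
    for num in chain.from_iterable(ts):
        # Yield a value the first time it is seen again, and never after that.
        if num in seen and num not in dupes:
            dupes.add(num)
            yield num
        seen.add(num)


ts_alt = [(701, 703), (702, 703), (703, 704), (803, 805), (803, 806),
          (901, 903), (902, 903), (903, 904)]

result = list(find_common(ts_alt))
print(result)  # [703, 803, 903]
```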