考虑以下列表:
tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
我怎样才能做到这一点?
new_tuple_list = [('c', 'e', 'd'), ('a', 'b')]
我试过了:
for tuple in tuple_list:
for tup in tuple_list:
if tuple[0] == tup[0]:
new_tup = (tuple[0],tuple[1],tup[1])
new_tuple_list.append(new_tup)
但它只有在我按特定顺序拥有元组的元素时才有效,这意味着它会导致这样:
new_tuple_list = [('c', 'e', 'd'), ('a', 'b'), ('d', 'e')]
答案 0 :(得分:9)
您可以将元组视为图表中的边缘,将目标视为在图表中查找connected components。然后你可以简单地遍历顶点(元组中的项)和你尚未访问的每个顶点,然后执行DFS以生成组件:
from collections import defaultdict
def dfs(adj_list, visited, vertex, result, key):
visited.add(vertex)
result[key].append(vertex)
for neighbor in adj_list[vertex]:
if neighbor not in visited:
dfs(adj_list, visited, neighbor, result, key)
edges = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
adj_list = defaultdict(list)
for x, y in edges:
adj_list[x].append(y)
adj_list[y].append(x)
result = defaultdict(list)
visited = set()
for vertex in adj_list:
if vertex not in visited:
dfs(adj_list, visited, vertex, result, vertex)
print(result.values())
输出:
[['a', 'b'], ['c', 'e', 'd']]
请注意,在上面,组件中的组件和元素都是随机顺序。
答案 1 :(得分:3)
如果您不需要重复值(例如,保留['a', 'a', 'b']
的能力),这是一种通过集合执行所需操作的简单快捷方式:
iset = set([frozenset(s) for s in tuple_list]) # Convert to a set of sets
result = []
while(iset): # While there are sets left to process:
nset = set(iset.pop()) # Pop a new set
check = len(iset) # Does iset contain more sets
while check: # Until no more sets to check:
check = False
for s in iset.copy(): # For each other set:
if nset.intersection(s): # if they intersect:
check = True # Must recheck previous sets
iset.remove(s) # Remove it from remaining sets
nset.update(s) # Add it to the current set
result.append(tuple(nset)) # Convert back to a list of tuples
给出
[('c', 'e', 'd'), ('a', 'b')]
答案 2 :(得分:1)
在NetworkX(用于图形操作的库)中,任务变得微不足道。与this answer by @niemmi类似,您需要找到connected components:
import networkx as nx
tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
graph = nx.Graph(tuple_list)
result = list(nx.connected_components(graph))
print(result)
# [{'e', 'c', 'd'}, {'b', 'a'}]
要获取结果作为元组列表:
result = list(map(tuple, nx.connected_components(G)))
print(result)
# [('d', 'e', 'c'), ('a', 'b')]
答案 3 :(得分:0)
这表现不佳,因为列表包含的检查是O(n)
,但它很短:
result = []
for tup in tuple_list:
for idx, already in enumerate(result):
# check if any items are equal
if any(item in already for item in tup):
# tuples are immutable so we need to set the result item directly
result[idx] = already + tuple(item for item in tup if item not in already)
break
else:
# else in for-loops are executed only if the loop wasn't terminated by break
result.append(tup)
这有很好的副作用,即保留订单:
>>> result
[('c', 'e', 'd'), ('a', 'b')]
答案 4 :(得分:0)
使用套装。您正在检查(最初很小的)集合的重叠和累积,并且Python具有以下数据类型:
#!python3
#tuple_list = [('c', 'e'), ('c', 'd'), ('a', 'b'), ('d', 'e')]
tuple_list = [(1,2), (3,4), (5,), (1,3,5), (3,'a'),
(9,8), (7,6), (5,4), (9,'b'), (9,7,4),
('c', 'e'), ('e', 'f'), ('d', 'e'), ('d', 'f'),
('a', 'b'),
]
set_list = []
print("Tuple list:", tuple_list)
for t in tuple_list:
#print("Set list:", set_list)
tset = set(t)
matched = []
for s in set_list:
if tset & s:
s |= tset
matched.append(s)
if not matched:
#print("No matches. New set: ", tset)
set_list.append(tset)
elif len(matched) > 1:
#print("Multiple Matches: ", matched)
for i,iset in enumerate(matched):
if not iset:
continue
for jset in matched[i+1:]:
if iset & jset:
iset |= jset
jset.clear()
set_list = [s for s in set_list if s]
print('\n'.join([str(s) for s in set_list]))
答案 5 :(得分:0)
我在设置时遇到了问题,因此我正在为此提供解决方案。它尽可能地将集合与更常见的元素之一结合在一起。
我的示例数据:
data = [['A','B','C'],['B','C','D'],['D'],['X'],['X','Y'],['Y','Z'],['M','N','O'],['M','N','O'],['O','A']]
data = list(map(set,data))
我要解决的代码:
oldlen = len(data)+1
while len(data)<oldlen:
oldlen = len(data)
for i in range(len(data)):
for j in range(i+1,len(data)):
if len(data[i]&data[j]):
data[i] = data[i]|data[j]
data[j] = set()
data = [data[i] for i in range(len(data)) if data[i]!= set()]
结果:
[{'A', 'B', 'C', 'D', 'M', 'N', 'O'}, {'X', 'Y', 'Z'}]
答案 6 :(得分:0)
我在解决共指问题时遇到了这个问题,我需要将集合合并到具有共同元素的集合列表中:
import copy
def merge(list_of_sets):
# init states
list_of_sets = copy.deepcopy(list_of_sets)
result = []
indices = find_fist_overlapping_sets(list_of_sets)
while indices:
# Keep other sets
result = [
s
for idx, s in enumerate(list_of_sets)
if idx not in indices
]
# Append merged set
result.append(
list_of_sets[indices[0]].union(list_of_sets[indices[1]])
)
# Update states
list_of_sets = result
indices = find_fist_overlapping_sets(list_of_sets)
return list_of_sets
def find_fist_overlapping_sets(list_of_sets):
for i, i_set in enumerate(list_of_sets):
for j, j_set in enumerate(list_of_sets[i+1:]):
if i_set.intersection(j_set):
return i, i+j+1