I have a list of ordered sublists, each containing two string objects.
mylist = [['x1','red'],['x2','blue'],['x2','green'],['x1','yellow']]
I am trying to find a way to implement the following rule:
for sublist in mylist:
    if sublist[0] == sublist[0] - 1
        combine sublist and sublist - 1
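In other words, I need to compare the first string in each sublist with the first string in the previous sublist and, if the two match, combine the sublists. The result should look like this: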
mySortedlist = [['x1','red'],['x2','blue','x2','green'],['x1','yellow']]
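Note: I am only interested in the immediately preceding sublist, not in whether the item has already appeared elsewhere in the list.
Update: following a helpful comment from another user, it is worth pointing out that my actual data file contains thousands of lines of text, and these anomalies can occur anywhere.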
Answer 0 (score: 3)
You can use itertools.groupby to find consecutive elements:
from itertools import chain, groupby
from operator import itemgetter
[[*chain.from_iterable(v)] for _,v in groupby(mylist, key=itemgetter(0))]
# [['x1', 'red'], ['x2', 'blue', 'x2', 'green'], ['x1', 'yellow']]
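To see why this satisfies the "immediately preceding sublist only" requirement: groupby starts a new group every time the key changes and never sorts, so the two non-adjacent 'x1' entries stay apart. A minimal sketch using the sample data from the question (the names `runs`, `keys` and `merged` are only illustrative):

```python
from itertools import chain, groupby
from operator import itemgetter

mylist = [['x1', 'red'], ['x2', 'blue'], ['x2', 'green'], ['x1', 'yellow']]

# groupby only groups *adjacent* items with equal keys; it does not sort.
runs = [(key, list(group)) for key, group in groupby(mylist, key=itemgetter(0))]

# 'x1' shows up as two separate runs because its entries are not adjacent.
keys = [key for key, _ in runs]
print(keys)    # ['x1', 'x2', 'x1']

# Flatten each run of sublists into one list.
merged = [list(chain.from_iterable(group)) for _, group in runs]
print(merged)  # [['x1', 'red'], ['x2', 'blue', 'x2', 'green'], ['x1', 'yellow']]
```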
Answer 1 (score: 2)
You can do it like this:
from itertools import groupby
mylist = [['x1','red'],['x2','blue'],['x2','green'],['x1','yellow']]
[[j for i in v for j in i] for k, v in groupby(mylist, key=lambda x: x[0])]
#[['x1', 'red'], ['x2', 'blue', 'x2', 'green'], ['x1', 'yellow']]
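Since the question mentions input with thousands of lines, the same groupby idea could be wrapped in a generator so that merged groups are produced one at a time instead of all at once. This is only a sketch under that assumption; the function name `merge_consecutive` is hypothetical and not part of the answer:

```python
from itertools import groupby

def merge_consecutive(rows):
    """Yield one flattened list per run of adjacent rows sharing the same first item."""
    for _, group in groupby(rows, key=lambda row: row[0]):
        # Flatten the run of sublists into a single list.
        yield [item for row in group for item in row]

mylist = [['x1', 'red'], ['x2', 'blue'], ['x2', 'green'], ['x1', 'yellow']]
result = list(merge_consecutive(mylist))
print(result)
```

Because groupby consumes its input lazily, `rows` can be any iterable, for example a generator that parses the file line by line.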
Answer 2 (score: 0)
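I just wrote something simple. If you step back and think about what "if sublist[0] == sublist[0] - 1" would look like in real code, it is not too complicated. One possible stumbling block is the try-except block, which is there to handle the first item in the list (the first item raises an "index out of range" error).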
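mylist = [['x1','red'],['x2','blue'],['x2','green'],['x1','yellow']]
mylist2 = []
for sublist in mylist:
    # Debug output: the current sublist and the result built so far
    print(sublist)
    print(mylist2)
    try:
        # Compare the first string with that of the previously stored sublist
        if sublist[0] == mylist2[-1][0]:
            mylist2[-1].append(sublist[0])
            mylist2[-1].append(sublist[1])
        else:
            mylist2.append(sublist)
    except Exception as e:
        # On the first item mylist2 is empty, so mylist2[-1] raises IndexError
        if len(mylist2) == 0:
            mylist2.append(sublist)
        else:
            raise e
print(mylist2)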
->
['x1', 'red']
[]
['x2', 'blue']
[['x1', 'red']]
['x2', 'green']
[['x1', 'red'], ['x2', 'blue']]
['x1', 'yellow']
[['x1', 'red'], ['x2', 'blue', 'x2', 'green']]
[['x1', 'red'], ['x2', 'blue', 'x2', 'green'], ['x1', 'yellow']]
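As a possible variation on the loop above, the try-except can be replaced by checking whether the result list is still empty before looking at its last element. This is a sketch, not the answer author's code: the function name `merge_with_previous` is made up for illustration, and each sublist is copied so the input list is left unmodified.

```python
def merge_with_previous(rows):
    """Merge each row into the previous one when their first items match."""
    merged = []
    for row in rows:
        # `merged and ...` skips the comparison while merged is still empty,
        # so no IndexError can occur on the first row.
        if merged and merged[-1][0] == row[0]:
            merged[-1].extend(row)
        else:
            merged.append(list(row))  # copy, so the input rows stay untouched
    return merged

mylist = [['x1', 'red'], ['x2', 'blue'], ['x2', 'green'], ['x1', 'yellow']]
result = merge_with_previous(mylist)
print(result)
# [['x1', 'red'], ['x2', 'blue', 'x2', 'green'], ['x1', 'yellow']]
```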