I have a list of lists of tuples like A:
A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
[(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
[(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
[(128, 3, 4)],
[(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
In each row of this list, there may be tuples whose second and third elements are the same. For example, in A[0]:
A[0] = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
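(90, 1, 5), (1000, 1, 5) and (176, 1, 5) have the same second and third elements. Of these, I need to keep the one with the largest first element and remove the other two. So I should end up keeping (1000, 1, 5) and removing (90, 1, 5) and (176, 1, 5) from A[0].
It would be better to keep the order of the list.
Is there a way to do this iteratively for all the rows in A? Any help would be much appreciated!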
Answer 0 (score: 3)
If I understand correctly, here is an itertools.groupby solution. I am assuming that the order of the final result does not matter.
from itertools import groupby
def keep_max(lst, groupkey, maxkey):
    'groups lst w.r.t. groupkey, keeps maximum of each group w.r.t. maxkey'
    # groupby only merges adjacent items, so sort by the grouping key first
    sor = sorted(lst, key=groupkey)
    groups = (tuple(g) for _, g in groupby(sor, key=groupkey))
    return [max(g, key=maxkey) for g in groups]
In action:
>>> from operator import itemgetter
>>> groupkey = itemgetter(1, 2)
>>> maxkey = itemgetter(0)
>>> A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)], [(160, 2, 5), (1000, 2, 5), (111, 1, 2)], [(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)], [(128, 3, 4)], [(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
>>>
>>> [keep_max(sub, groupkey, maxkey) for sub in A]
[[(111, 1, 2), (139, 1, 3), (1000, 1, 5)],
[(111, 1, 2), (1000, 2, 5)],
[(139, 1, 3), (128, 3, 4), (134, 3, 5)],
[(128, 3, 4)],
[(1000, 1, 5), (1000, 2, 5), (134, 3, 5)]]
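The sort inside keep_max matters because itertools.groupby only groups consecutive items with equal keys. A minimal sketch of the difference on A[0] (the variable names row, unsorted_keys and sorted_keys are just for illustration):

```python
from itertools import groupby
from operator import itemgetter

row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
groupkey = itemgetter(1, 2)

# Without sorting, the (1, 5) tuples are split across three separate runs.
unsorted_keys = [k for k, _ in groupby(row, key=groupkey)]

# After sorting by the same key, every key forms exactly one run.
sorted_keys = [k for k, _ in groupby(sorted(row, key=groupkey), key=groupkey)]

print(unsorted_keys)  # [(1, 5), (1, 3), (1, 5), (1, 2), (1, 5)]
print(sorted_keys)    # [(1, 2), (1, 3), (1, 5)]
```

This is also why the result comes back ordered by the grouping key rather than in the row's original order.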
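Answer 1 (score: 2)
This solution preserves the original order of the tuples, provided each tuple (as a whole) is unique within its row; if a tuple is duplicated, it is placed at the position of its last occurrence: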
from operator import itemgetter
A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
[(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
[(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
[(128, 3, 4)],
[(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
def uniques(lst):
    # Group tuples by their (second, third) elements.
    groups = {}
    for t in lst:
        groups.setdefault(t[1:], []).append(t)
    # Position of each tuple in the row (for duplicates, the last occurrence wins).
    lookup = {t: i for i, t in enumerate(lst)}
    index = lookup.get
    first = itemgetter(0)
    return sorted(map(lambda x: max(x, key=first), groups.values()), key=index)
result = [uniques(a) for a in A]
print(result)
Output
[[(139, 1, 3), (1000, 1, 5), (111, 1, 2)], [(1000, 2, 5), (111, 1, 2)], [(134, 3, 5), (128, 3, 4), (139, 1, 3)], [(128, 3, 4)], [(134, 3, 5), (1000, 2, 5), (1000, 1, 5)]]
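If duplicated tuples should be placed at their first occurrence instead, one possible variation is to record each tuple's index only the first time it is seen. This is a sketch, not part of the answer above; the name uniques_first is hypothetical:

```python
from operator import itemgetter

def uniques_first(lst):
    groups = {}       # (second, third) -> tuples sharing that pair
    first_index = {}  # tuple -> index of its first occurrence in lst
    for i, t in enumerate(lst):
        groups.setdefault(t[1:], []).append(t)
        first_index.setdefault(t, i)  # keeps the earliest index only
    winners = [max(g, key=itemgetter(0)) for g in groups.values()]
    return sorted(winners, key=first_index.get)

print(uniques_first([(5, 1, 1), (3, 2, 2), (5, 1, 1)]))
# [(5, 1, 1), (3, 2, 2)]
```

With the last-occurrence lookup, the same input would place (5, 1, 1) after (3, 2, 2).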
Answer 2 (score: 1)
Using a dictionary:
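fin = []
for row in A:
    best = {}  # (second, third) -> largest first element seen so far
    for tup in row:
        key = tup[1:]  # both the second and the third element
        best[key] = max(tup[0], best.get(key, tup[0]))
    fin.append(best)
# Python 3: items() replaces iteritems(); rebuild one list of tuples per row
A = [[(value, t1, t2) for (t1, t2), value in best.items()] for best in fin]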
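With this, the dictionary turns A[0] from
A[0] = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
into
{ (1,5): 1000, (1,3): 139, (1,2): 111 } (as a dict)
which can then be converted back into a list by iterating over its items.
This way the order is also preserved, in the sense that on Python 3.7+ (where dicts keep insertion order) each surviving tuple appears where its (second, third) pair first occurred in the row.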
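If each surviving tuple should stay at its own position in the row, rather than at the position where its key first appeared, a two-pass variant of the dictionary idea is one option. The following is a minimal sketch under that assumption; the function name keep_max_ordered is hypothetical:

```python
def keep_max_ordered(row):
    # Pass 1: largest first element for each (second, third) pair.
    best = {}
    for t in row:
        key = t[1:]
        best[key] = max(t[0], best.get(key, t[0]))
    # Pass 2: walk the row in its original order and keep each winner once.
    kept, seen = [], set()
    for t in row:
        key = t[1:]
        if t[0] == best[key] and key not in seen:
            kept.append(t)
            seen.add(key)
    return kept

row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
print(keep_max_ordered(row))
# [(139, 1, 3), (1000, 1, 5), (111, 1, 2)]
```

The seen set only matters when the maximum tuple itself is duplicated in a row; it ensures that one copy is kept.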
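Answer 3 (score: 1)
If you can ignore the ordering, you can use itertools.groupby to group the elements by their second and third elements, after sorting the list in ascending order of the second and third elements and descending order of the first element. The first element of each group is then the one you want: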
from itertools import groupby
A = [[(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)],
[(160, 2, 5), (1000, 2, 5), (111, 1, 2)],
[(134, 3, 5), (126, 1, 3), (128, 3, 4), (139, 1, 3)],
[(128, 3, 4)],
[(90, 1, 5), (160, 2, 5), (134, 3, 5), (1000, 2, 5), (1000, 1, 5), (176, 1, 5)]]
def max_duplicate(lst):
    res = []
    # Sort by (second, third) ascending and first descending, so each group starts with its maximum.
    for k, g in groupby(sorted(lst, key=lambda x: (x[1], x[2], -x[0])), key=lambda x: (x[1], x[2])):
        res.append(next(g))
    return res
result = [max_duplicate(l) for l in A]
for r in result:
    print(r)
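Negating the first element in the sort key only works when that element is numeric. A possible alternative, sketched here under the assumption that the first elements are merely comparable, relies on Python's sort being stable: sort by the first element in descending order, then sort again by the grouping key. The name max_duplicate_stable is hypothetical:

```python
from itertools import groupby
from operator import itemgetter

def max_duplicate_stable(lst):
    key = itemgetter(1, 2)
    # First sort: largest first element comes first.
    by_first_desc = sorted(lst, key=itemgetter(0), reverse=True)
    # Second sort is stable, so within each key the largest first element stays in front.
    ordered = sorted(by_first_desc, key=key)
    return [next(g) for _, g in groupby(ordered, key=key)]

row = [(90, 1, 5), (126, 1, 3), (139, 1, 3), (1000, 1, 5), (111, 1, 2), (176, 1, 5)]
print(max_duplicate_stable(row))
# [(111, 1, 2), (139, 1, 3), (1000, 1, 5)]
```

As with the version above, the result is ordered by the grouping key, not by the row's original order.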
Answer 4 (score: 1)
You can do this using a hashmap, as follows:
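l = []
for a in A:
    d = {}  # reset for every row: (second, third) -> largest first element
    for aa in a:
        v, k1, k2 = aa
        if (k1, k2) in d:
            d[(k1, k2)] = max(v, d[(k1, k2)])
        else:
            d[(k1, k2)] = v
    # Python 3: items() replaces iteritems()
    l.append([(v, k1, k2) for (k1, k2), v in d.items()])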