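I have a list of tuples of the form (float, string). How can I remove the duplicates that share the same float value?
The list is sorted by the float values, and I want to keep that order.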
[(0.10507038451969995,
'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.078586381821416265,
'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.072031446647399661, '- Emergency personnel help victims.'),
(0.072031446647399661, 'Emergency personnel help victims.')]
Look at the last two entries.
Answer 0 (score: 5)
You can use itertools.groupby, since your values are already sorted. Here is the data:
>>> lot
[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.07203144664739966, '- Emergency personnel help victims.'),
(0.07203144664739966, 'Emergency personnel help victims.')]
Demo:
>>> import itertools
>>> [next(t) for _, t in itertools.groupby(lot, lambda x: x[0])]
[(0.10507038451969995,
'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.07858638182141627,
'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.07203144664739966, '- Emergency personnel help victims.')]
This gives you the first tuple from each group of equal float values.
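Note that groupby only merges adjacent items with equal keys, which is why the sorted order matters here. A minimal sketch of the same idea wrapped in a function follows; the name `first_per_score` and the sample data are made up for illustration and are not part of the answer above.

```python
from itertools import groupby
from operator import itemgetter

def first_per_score(pairs):
    """Keep the first tuple from each run of consecutive equal floats."""
    # groupby yields (key, group_iterator); next() takes the first item of each run.
    return [next(group) for _, group in groupby(pairs, key=itemgetter(0))]

sample = [(0.3, 'a'), (0.2, 'b'), (0.2, 'c'), (0.1, 'd')]
print(first_per_score(sample))
# [(0.3, 'a'), (0.2, 'b'), (0.1, 'd')]
```

If equal floats are not next to each other, for example `[(0.2, 'b'), (0.3, 'a'), (0.2, 'c')]`, groupby treats them as separate runs and nothing is removed.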
Answer 1 (score: 4)
You can keep a set of the values seen so far and append a tuple only if its value is not yet in seen:
>>> lst
[(0.10507038451969995,
'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.078586381821416265,
'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.072031446647399661, '- Emergency personnel help victims.'),
(0.072031446647399661, 'Emergency personnel help victims.')]
>>> seen = set()
>>> result = []
>>> for a, b in lst:
...     if a not in seen:
...         seen.add(a)
...         result.append((a, b))
...
>>> print(result)
[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.07203144664739966, '- Emergency personnel help victims.')]
Here is another way, as a list comprehension:
>>> seen = set()
>>> [(a, b) for a, b in lst if not (a in seen or seen.add(a))]
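The condition in the comprehension works because set.add() returns None. A small sketch of how that expression evaluates, using a hypothetical helper named `already_seen` that is not part of the answer:

```python
seen = set()

def already_seen(value):
    """Return True if value was seen before; otherwise record it and return False."""
    # `value in seen` short-circuits to True for repeats.
    # For a new value, seen.add(value) runs and returns None, which is falsy.
    return bool(value in seen or seen.add(value))

print(already_seen(0.5))  # False: first occurrence, now recorded
print(already_seen(0.5))  # True: repeat
```

Unlike the groupby approach, this does not depend on equal values being adjacent, at the cost of keeping every distinct float in memory.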
Answer 2 (score: 2)
>>> L = [(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
... (0.078586381821416265, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
... (0.072031446647399661, '- Emergency personnel help victims.'),
... (0.072031446647399661, 'Emergency personnel help victims.')]
>>> from collections import OrderedDict
>>> list(OrderedDict(L).items())
[(0.10507038451969995, 'Deadly stampede in Shanghai - Emergency personnel help victims.'),
(0.07858638182141627, 'Deadly stampede in Shanghai - Police and medical staff help injured people after the stampede.'),
(0.07203144664739966, 'Emergency personnel help victims.')]
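Notice that the result above keeps the last string for a repeated float ('Emergency personnel help victims.'), because later pairs overwrite the value while the key keeps its original position. If the first string should win instead, one possible variant is to fill the OrderedDict with setdefault. This is a sketch under that assumption; the function name `dedupe_keep_first` is illustrative only.

```python
from collections import OrderedDict

def dedupe_keep_first(pairs):
    """Drop tuples whose float was already seen, keeping the first string."""
    unique = OrderedDict()
    for value, text in pairs:
        # setdefault only stores text if value is not already a key.
        unique.setdefault(value, text)
    return list(unique.items())

sample = [(0.3, 'a'), (0.2, 'b'), (0.2, 'c')]
print(dedupe_keep_first(sample))
# [(0.3, 'a'), (0.2, 'b')]
print(list(OrderedDict(sample).items()))
# [(0.3, 'a'), (0.2, 'c')]
```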