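I have a list of three-item tuples. The first two items (GPS coordinates) are often duplicated, while the last item is a score (signal strength):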
[(62.45807, -114.41026, 8),
(62.45807, -114.41026, 11),
(62.45807, -114.41026, 18),
(62.45807, -114.41026, 16),
(62.45807, -114.41026, 9),
(62.45785, -114.41003, 23),
(62.45785, -114.41003, 19),
(62.45785, -114.41003, 11),
(62.45785, -114.41003, 17),
(62.45785, -114.41003, 14),
(62.45785, -114.41003, 11),
(62.45785, -114.41003, 15),
(62.45765, -114.40978, 28),
(62.45765, -114.40978, 16),
(62.45765, -114.40978, 10),
(62.45765, -114.40978, 15),
(62.45765, -114.40978, 25)]
I'd like to know how to remove the duplicate GPS coordinates while keeping the highest score for each, so that I end up with this:
[(62.45807, -114.41026, 18),
(62.45785, -114.41003, 23),
(62.45765, -114.40978, 28)]
And how would I do the same thing but average the scores instead, ending up with something like this:
[(62.45807, -114.41026, 12),
(62.45785, -114.41003, 16),
(62.45765, -114.40978, 19)]
Answer 0 (score: 2)
Sounds like a job for itertools.groupby:
>>> from itertools import groupby
Max:
>>> [max(g, key=lambda x: x[-1]) for k, g in groupby(data, key=lambda x: x[:2])]
[(62.45807, -114.41026, 18),
(62.45785, -114.41003, 23),
(62.45765, -114.40978, 28)]
Average:
>>> [a + (round(sum(c for _, _, c in b) / float(len(b))),)
...  for a, b in ((k, list(g)) for k, g in
...               groupby(data, key=lambda x: x[:2]))]
[(62.45807, -114.41026, 12.0),
(62.45785, -114.41003, 16.0),
(62.45765, -114.40978, 19.0)]
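One caveat worth spelling out: itertools.groupby only merges adjacent items that share a key, so the one-liners above rely on the duplicates already sitting next to each other, as they do in the question's data. The following is a minimal sketch, assuming Python 3 and a small interleaved sample invented for illustration, that sorts by coordinate first. The names coords, ordered, best and averaged are hypothetical and not part of the original answer.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical sample: the same readings as in the question, but interleaved.
data = [
    (62.45807, -114.41026, 8),
    (62.45785, -114.41003, 23),
    (62.45807, -114.41026, 18),
    (62.45785, -114.41003, 11),
]

coords = itemgetter(0, 1)  # key function: the (lat, lon) pair

# groupby only groups adjacent equal keys, so sort by the key first.
ordered = sorted(data, key=coords)

# Highest score per coordinate.
best = [max(group, key=itemgetter(2)) for _, group in groupby(ordered, key=coords)]

# Rounded mean score per coordinate.
averaged = []
for key, group in groupby(ordered, key=coords):
    scores = [score for _, _, score in group]
    averaged.append(key + (round(sum(scores) / len(scores)),))

print(best)      # [(62.45785, -114.41003, 23), (62.45807, -114.41026, 18)]
print(averaged)  # [(62.45785, -114.41003, 17), (62.45807, -114.41026, 13)]
```

Note that sorting changes the output order (here ascending by latitude), and that under Python 3 round() with a single argument returns an int, so the averages print as 17 and 13 rather than 17.0 and 13.0.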
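Answer 1 (score: 0)
You can write a function that maps the data into a dictionary whose keys are the GPS coordinates and whose values are lists of scores: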
def create_gps_score_dict(gps_score_list):
    # Map each (lat, lon) pair to the list of scores recorded there.
    gps_score_dict = {}
    for gps_score in gps_score_list:
        key = (gps_score[0], gps_score[1])
        if key in gps_score_dict:
            gps_score_dict[key].append(gps_score[2])
        else:
            gps_score_dict[key] = [gps_score[2]]
    return gps_score_dict
Now you can produce the result by looking through this simple dictionary:
def max_gps_scores(gps_score_dict):
    # Build one (lat, lon, max_score) tuple per coordinate.
    gps_score_list = []
    for gps, scores in gps_score_dict.items():
        gps_score_list.append((gps[0], gps[1], max(scores)))
    return gps_score_list
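If only the maximum is needed, the intermediate lists of scores are not strictly necessary. As a rough sketch of that variation (the function name best_score_per_location and the shortened sample are hypothetical, not from the answer), a single pass can keep just the best score seen so far for each coordinate:

```python
def best_score_per_location(readings):
    """Return one (lat, lon, score) tuple per coordinate, keeping the top score."""
    best = {}
    for lat, lon, score in readings:
        key = (lat, lon)
        # Store the score if this coordinate is new or the score beats the old one.
        if key not in best or score > best[key]:
            best[key] = score
    return [(lat, lon, score) for (lat, lon), score in best.items()]


# Hypothetical, shortened sample based on the question's data.
sample = [
    (62.45807, -114.41026, 8),
    (62.45785, -114.41003, 23),
    (62.45807, -114.41026, 18),
    (62.45785, -114.41003, 19),
]
result = best_score_per_location(sample)
print(result)  # [(62.45807, -114.41026, 18), (62.45785, -114.41003, 23)]
```

This does not depend on duplicates being adjacent. On Python 3.7 and later, where dicts preserve insertion order, the results come back in the order each coordinate was first seen.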
Example:
>>> gps_score_list=[(62.45807, -114.41026, 8),
(62.45807, -114.41026, 11),
(62.45807, -114.41026, 18),
(62.45807, -114.41026, 16),
(62.45807, -114.41026, 9),
(62.45785, -114.41003, 23),
(62.45785, -114.41003, 19),
(62.45785, -114.41003, 11),
(62.45785, -114.41003, 17),
(62.45785, -114.41003, 14),
(62.45785, -114.41003, 11),
(62.45785, -114.41003, 15),
(62.45765, -114.40978, 28),
(62.45765, -114.40978, 16),
(62.45765, -114.40978, 10),
(62.45765, -114.40978, 15),
(62.45765, -114.40978, 25)]
>>> max_gps_scores(create_gps_score_dict(gps_score_list))
[(62.45807, -114.41026, 18), (62.45765, -114.40978, 28), (62.45785, -114.41003, 23)]
I'll leave the average to you!
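For readers who want to see the averaging step, here is one possible sketch under the same dictionary-of-lists idea, assuming Python 3. The helper names group_scores and average_gps_scores are hypothetical, and collections.defaultdict is swapped in for the explicit if/else used above.

```python
from collections import defaultdict


def group_scores(gps_score_list):
    """Map each (lat, lon) pair to the list of scores recorded there."""
    grouped = defaultdict(list)
    for lat, lon, score in gps_score_list:
        grouped[(lat, lon)].append(score)
    return grouped


def average_gps_scores(gps_score_dict):
    """Return one (lat, lon, rounded_mean_score) tuple per coordinate."""
    return [(lat, lon, round(sum(scores) / len(scores)))
            for (lat, lon), scores in gps_score_dict.items()]


# The data from the question.
gps_score_list = [
    (62.45807, -114.41026, 8), (62.45807, -114.41026, 11),
    (62.45807, -114.41026, 18), (62.45807, -114.41026, 16),
    (62.45807, -114.41026, 9),
    (62.45785, -114.41003, 23), (62.45785, -114.41003, 19),
    (62.45785, -114.41003, 11), (62.45785, -114.41003, 17),
    (62.45785, -114.41003, 14), (62.45785, -114.41003, 11),
    (62.45785, -114.41003, 15),
    (62.45765, -114.40978, 28), (62.45765, -114.40978, 16),
    (62.45765, -114.40978, 10), (62.45765, -114.40978, 15),
    (62.45765, -114.40978, 25),
]

averages = average_gps_scores(group_scores(gps_score_list))
print(averages)
# [(62.45807, -114.41026, 12), (62.45785, -114.41003, 16), (62.45765, -114.40978, 19)]
```

The order shown assumes Python 3.7 or later, where dicts keep insertion order. On older interpreters the three tuples may come back in a different order, as in the max example above.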