So I have a list like this:
my_list = [{"id":21313,"remark":"","marks":"100"},
{"id":21314,"remark":"","marks":"29"},
{"id":21315,"remark":"","marks":"15"},
{"id":21316,"remark":"","marks":"50"},
{"id":21317,"remark":"","marks":"20"}]
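The list contains many elements. What I am trying to do is iterate through the whole list, taking the current element i as a point from which we check whether the marks of the next two points are more or less than this point. If the current marks are higher, the remark should be "good"; otherwise it should be "bad". This is how I want it to look: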
my_list = [{"id":21313,"remark":"good","marks":"100"},
{"id":21314,"remark":"bad","marks":"29"},
{"id":21315,"remark":"bad","marks":"15"},
{"id":21316,"remark":"NaN","marks":"50"},
{"id":21317,"remark":"NaN","marks":"20"}]
The last two are "NaN" because there are not enough entries after them to compare against. Is there a way to do this?
Answer 0: (score: 1)
You can use a slightly modified version of a sliding window iterator that also includes the last few elements:
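from itertools import islice

def diminishing_window(seq, n=2):
    """
    (s0, ..., s[n-1]), (s1, ..., sn), ..., (s[-2], s[-1]), (s[-1])
    """
    it = iter(seq)
    result = tuple(islice(it, n))
    # Yield the first window even when seq has fewer than n items;
    # checking len(result) == n here would silently drop the first element.
    if result:
        yield result
    # Slide the full-width window across the rest of the input.
    for elem in it:
        result = result[1:] + (elem,)
        yield result
    # Input exhausted: keep yielding shrinking windows until none are left.
    result = result[1:]
    while result:
        yield result
        result = result[1:]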
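This gives you windows of width n over the data until the last few windows, which get progressively smaller. If we treat the first item in each window as the "current" item, we can compare it with the other items in the window to determine its remark.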
def dict_replace(d, **kwargs):
    # Return an updated copy so the original dicts are left untouched.
    res = d.copy()
    res.update(kwargs)
    return res

def get_remark(a, b):
    # b holds the entries that follow a; fewer than two means no comparison is possible.
    if len(b) < 2:
        return "NaN"
    elif all(int(a["marks"]) > int(d["marks"]) for d in b):
        return "good"
    else:
        return "bad"

new_list = [dict_replace(a, remark=get_remark(a, b)) for a, *b in diminishing_window(my_list, 3)]
print(new_list)
# [{'id': 21313, 'remark': 'good', 'marks': '100'}, {'id': 21314, 'remark': 'bad', 'marks': '29'},
# {'id': 21315, 'remark': 'bad', 'marks': '15'}, {'id': 21316, 'remark': 'NaN', 'marks': '50'},
# {'id': 21317, 'remark': 'NaN', 'marks': '20'}]
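To make the window shapes concrete, here is a minimal sketch that builds the same shrinking windows with plain slicing instead of an iterator. The helper name windows_by_slicing is hypothetical (it is not part of the answer above), and it assumes the input supports slicing and len(), such as a list:

```python
def windows_by_slicing(seq, n=3):
    """Return windows of up to n items starting at each index (hypothetical helper)."""
    # Slicing past the end is safe in Python, so the trailing windows simply get shorter.
    return [tuple(seq[i:i + n]) for i in range(len(seq))]


marks = [100, 29, 15, 50, 20]
windows = windows_by_slicing(marks)
print(windows)
# [(100, 29, 15), (29, 15, 50), (15, 50, 20), (50, 20), (20,)]
```

Unlike the generator, this version needs a sequence of known length, so it will not work on arbitrary iterators.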
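Answer 1: (score: 1)
"taking the current element i as a point from which we check whether the marks of the next two points are more or less than this point."
You can use zip_longest with a for loop to iterate in groups of 3: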
from itertools import zip_longest

# dictionary mapping for remark strings
rems = {1: 'good', 0: 'bad'}

for d1, d2, d3 in zip_longest(my_list, my_list[1:], my_list[2:], fillvalue={}):
    if not (d2 and d3):
        d1['remark'] = 'NaN'
    else:
        d1['remark'] = rems[int(d1['marks']) > max(int(d2['marks']), int(d3['marks']))]

print(my_list)
# [{'id': 21313, 'marks': '100', 'remark': 'good'},
# {'id': 21314, 'marks': '29', 'remark': 'bad'},
# {'id': 21315, 'marks': '15', 'remark': 'bad'},
# {'id': 21316, 'marks': '50', 'remark': 'NaN'},
# {'id': 21317, 'marks': '20', 'remark': 'NaN'}]
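The rems[...] lookup works because bool is a subclass of int in Python, so True and False hash and compare equal to 1 and 0. A small standalone check of that behaviour, using a few of the sample marks as plain integers:

```python
rems = {1: 'good', 0: 'bad'}

# True == 1 and False == 0, so a comparison result can index the mapping directly.
higher = 100 > max(29, 15)
lower = 29 > max(15, 50)
print(rems[higher], rems[lower])
# good bad
```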
By the way, you may want to store these marks as integers (or floats) rather than strings. That avoids having to call int every time you compare them.
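A minimal sketch of that suggestion, assuming every marks value is a string holding a whole number (anything else would make int raise ValueError). It converts the values once up front and then compares plain integers, using simple slicing rather than either approach above:

```python
my_list = [{"id": 21313, "remark": "", "marks": "100"},
           {"id": 21314, "remark": "", "marks": "29"},
           {"id": 21315, "remark": "", "marks": "15"},
           {"id": 21316, "remark": "", "marks": "50"},
           {"id": 21317, "remark": "", "marks": "20"}]

# Convert once, in place, so later comparisons need no int() calls.
for d in my_list:
    d["marks"] = int(d["marks"])

for i, d in enumerate(my_list):
    following = my_list[i + 1:i + 3]  # up to two entries after the current one
    if len(following) < 2:
        d["remark"] = "NaN"
    else:
        d["remark"] = "good" if all(d["marks"] > f["marks"] for f in following) else "bad"

print([d["remark"] for d in my_list])
# ['good', 'bad', 'bad', 'NaN', 'NaN']
```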