How to iterate over the next n elements of a list from a given point

Time: 2019-01-29 14:49:16

Tags: python python-3.x

I have a list like this:

my_list = [{"id":21313,"remark":"","marks":"100"}, 
{"id":21314,"remark":"","marks":"29"},
{"id":21315,"remark":"","marks":"15"},
{"id":21316,"remark":"","marks":"50"},
{"id":21317,"remark":"","marks":"20"}]

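The list contains many elements. I want to iterate over the whole list, taking the current element i as a reference point and checking whether its marks are higher or lower than those of the next two elements. If higher, the remark should be "good"; if lower, "bad". This is the result I want: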

my_list = [{"id":21313,"remark":"good","marks":"100"}, 
{"id":21314,"remark":"bad","marks":"29"},
{"id":21315,"remark":"bad","marks":"15"},
{"id":21316,"remark":"NaN","marks":"50"},
{"id":21317,"remark":"NaN","marks":"20"}]

The last two are NaN because there are not enough entries after them to compare against. Is there a way to do this?

2 Answers:

Answer 0: (score: 1)

You can use a slightly modified version of the sliding window iterator that also includes the last few elements:

from itertools import islice

def diminishing_window(seq, n=2):
    """
    (s0, ..., s[n-1]), (s1, ..., sn), ..., (s[-2], s[-1]), (s[-1])
    """
    it = iter(seq)
    result = tuple(islice(it, n))
    if result:  # also yield the first window when seq has fewer than n items
        yield result
    for elem in it:
        result = result[1:] + (elem,)
        yield result
    result = result[1:]
    while result:
        yield result
        result = result[1:]

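This gives you windows of width n over the data until the last few windows, which get progressively smaller. If we treat the first item in each window as the "current" item, we can compare it with the other items in the window to determine its remark.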

def dict_replace(d, **kwargs):
    res = d.copy()
    res.update(kwargs)
    return res

def get_remark(a, b):
    if len(b) < 2:
        return "NAN"
    elif all(int(a["marks"]) > int(d["marks"]) for d in b):
        return "good"
    else:
        return "bad"

new_list = [dict_replace(a, remark=get_remark(a, b)) for a, *b in diminishing_window(my_list, 3)]

print(new_list)
# [{'id': 21313, 'remark': 'good', 'marks': '100'}, {'id': 21314, 'remark': 'bad', 'marks': '29'}, 
#  {'id': 21315, 'remark': 'bad', 'marks': '15'}, {'id': 21316, 'remark': 'NAN', 'marks': '50'}, 
#  {'id': 21317, 'remark': 'NAN', 'marks': '20'}]
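
To make the shrinking windows concrete, here is a minimal sketch that builds the same kind of windows with plain slicing. It is not the generator above: it assumes the input is a sliceable sequence such as a list (the generator also accepts arbitrary iterables), and the helper name `windows_by_slicing` is made up for illustration.

```python
def windows_by_slicing(seq, n=3):
    """Return a window of width n starting at each index; the last ones shrink."""
    # Slicing past the end of a list simply returns a shorter slice,
    # so the final windows get smaller without any special handling.
    return [tuple(seq[i:i + n]) for i in range(len(seq))]

print(windows_by_slicing([1, 2, 3, 4, 5], 3))
# [(1, 2, 3), (2, 3, 4), (3, 4, 5), (4, 5), (5,)]
```

Each tuple starts with the "current" item, followed by up to n - 1 items that come after it.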

Answer 1: (score: 1)

  

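  taking the current element i as a point from which we check whether that point's marks are more or less than those of the next two points.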

You can use zip_longest in a for loop to iterate in groups of 3:

from itertools import zip_longest

# dictionary mapping for remark strings
rems = {1: 'good', 0: 'bad'}

for d1, d2, d3 in zip_longest(my_list, my_list[1:], my_list[2:], fillvalue={}):
    if not (d2 and d3):
        d1['remark'] = 'NaN'
    else:
        d1['remark'] = rems[int(d1['marks']) > max(int(d2['marks']), int(d3['marks']))]

print(my_list)

# [{'id': 21313, 'marks': '100', 'remark': 'good'},
#  {'id': 21314, 'marks': '29', 'remark': 'bad'},
#  {'id': 21315, 'marks': '15', 'remark': 'bad'},
#  {'id': 21316, 'marks': '50', 'remark': 'NaN'},
#  {'id': 21317, 'marks': '20', 'remark': 'NaN'}]
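
The same idea can be extended from the next two elements to the next n. The following is only a sketch and not part of the original answer: the function name `add_remarks` and its `n` parameter are assumptions, and it returns new dictionaries instead of modifying the input list.

```python
from itertools import zip_longest

def add_remarks(records, n=2):
    """Compare each record's marks with the next n records (hypothetical helper)."""
    # Shifted views of the list: records[0:], records[1:], ..., records[n:]
    shifted = [records[i:] for i in range(n + 1)]
    result = []
    for current, *following in zip_longest(*shifted, fillvalue=None):
        if any(f is None for f in following):
            remark = "NaN"  # fewer than n records follow this one
        elif all(int(current["marks"]) > int(f["marks"]) for f in following):
            remark = "good"
        else:
            remark = "bad"
        result.append({**current, "remark": remark})
    return result

sample = [{"id": 1, "marks": "100"}, {"id": 2, "marks": "29"},
          {"id": 3, "marks": "15"}, {"id": 4, "marks": "50"}]
print([d["remark"] for d in add_remarks(sample, n=2)])
# ['good', 'bad', 'NaN', 'NaN']
```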

As an aside, you may want to store these marks as integers (or floats) rather than strings. That avoids calling int every time you compare them.
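
A minimal sketch of that conversion, assuming every "marks" value is a valid integer string (the two records are taken from the question):

```python
my_list = [{"id": 21313, "remark": "", "marks": "100"},
           {"id": 21314, "remark": "", "marks": "29"}]

# Strings compare character by character, so this is not a numeric comparison.
print("100" > "29")
# False

# Convert once, up front, so later comparisons work on numbers directly.
for d in my_list:
    d["marks"] = int(d["marks"])

print(my_list[0]["marks"] > my_list[1]["marks"])
# True
```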