Question

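I have two lists containing the times and values of a time series. There is a corresponding list of booleans that marks where the NaN values are in the time series. What I need to do is: if a True value (i.e. a NaN) repeats more than 5 times (6 consecutive NaN values), split the list in two at the start and end of that run, so that neither resulting list contains those NaN values. So basically, I need to split the list into a list of smaller lists that start and end wherever there is a gap of 6 or more consecutive NaN values. I tried something along these lines: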

    for i in range(len(nan_list)-5):
        if nan_list[i] == True and nan_list[i+1] == True and nan_list[i+2] == True and nan_list[i+3] == True and nan_list[i+4] == True and nan_list[i+5] == True:

I'm not sure what the best way to go from here is, and I'm sure there is a better approach.

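After that, wherever a run of NaN values is shorter than that (fewer than 6 NaN values in a row), I need to replace those values with ones calculated using a B-spline from scipy. I'm not quite sure how to go about this. Thanks!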

Answer 1

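If I understand you correctly, you want to split one list based on the indices of another list (assuming the two have the same length), such that n repeated elements in that other list define where the cuts should happen. An elegant, though not the most efficient, way would be to iterate over n-sized slices of the other list and check whether all(nan_list[i:i+n]) holds at the current index. If it does, put everything from the first list before that index into the result as a slice, then skip n positions and repeat the process. However, I prefer a procedural approach: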

def split_list(source, nan_data, nan_len=6):
    result = []  # a list for our final result
    last_pos = 0  # start index of the slice currently being built
    counter = 0  # length of the current run of consecutive NaNs
    for i, is_nan in enumerate(nan_data):
        if not is_nan:  # encountered a non-NaN, check how many consecutive NaNs we had
            if counter >= nan_len:  # the run was long enough to split on
                if i - counter > last_pos:  # skip the empty slice a leading run would produce
                    result.append(source[last_pos:i - counter])  # add a new slice to our result
                last_pos = i  # the next slice starts right after the gap
            counter = 0  # reset the counter
        else:
            counter += 1  # NaN found, increase the counter
    # add whatever remains, dropping a trailing run that is long enough to split on
    left_over = source[last_pos:] if counter < nan_len else source[last_pos:-counter]
    if left_over:
        result.append(left_over)
    return result  # return the result

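You can then use it to split any source list based on runs of nan_len consecutive True values (or any values that evaluate as truthy) in the nan_data list, for example: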

base_list = ["01", "02", "03", "04", "05", "06", "07", "08", "09", "10",
             "11", "12", "13", "14", "15", "16", "17", "18", "19", "20"]
nan_list = [True, False, True, False, True, True, True, True, True, True,
            False, True, False, True, True, False, True, True, True, False]

print(split_list(base_list, nan_list, 3))
# [['01', '02', '03', '04'], ['11', '12', '13', '14', '15', '16'], ['20']]
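
For comparison, the slice-based idea mentioned earlier (checking all(nan_list[i:i+n]) at each index) could be sketched roughly as follows. The name split_list_sliced is made up for illustration, and the sketch assumes both lists have the same length and that nan_len is at least 1. Unlike the literal description, it skips the whole run of NaNs rather than exactly n positions, so that it returns the same slices as split_list for the example above:

```python
def split_list_sliced(source, nan_data, nan_len=6):
    """Slice-based sketch: split source at runs of nan_len or more truthy flags."""
    if nan_len < 1:
        raise ValueError("nan_len must be at least 1")
    result = []
    start = 0  # start index of the slice currently being built
    i = 0
    while i < len(nan_data):
        window = nan_data[i:i + nan_len]
        if len(window) == nan_len and all(window):
            if i > start:  # keep everything before the run, if there is anything
                result.append(source[start:i])
            i += nan_len
            # skip the rest of the run as well, not just nan_len positions
            while i < len(nan_data) and nan_data[i]:
                i += 1
            start = i
        else:
            i += 1
    if start < len(source):  # whatever is left after the last long run
        result.append(source[start:])
    return result


values = list(range(10))
flags = [False, False, True, True, True, False, True, False, False, True]
print(split_list_sliced(values, flags, 3))
# [[0, 1], [5, 6, 7, 8, 9]]
```

This rebuilds a window at every index, so it does more work than the counter-based loop, which is the efficiency trade-off noted above.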
