Question

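I currently have six separate for loops that iterate over a list of numbers, looking for specific short sequences of numbers inside the larger sequence and replacing them as follows: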

[...0,1,0...] => [...0,0,0...]
[...0,1,1,0...] => [...0,0,0,0...]
[...0,1,1,1,0...] => [...0,0,0,0,0...]

And their inverses:

[...1,0,1...] => [...1,1,1...]
[...1,0,0,1...] => [...1,1,1,1...]
[...1,0,0,0,1...] => [...1,1,1,1,1...]

My existing code looks like this:

for i in range(len(output_array)-2):
    if output_array[i] == 0 and output_array[i+1] == 1 and output_array[i+2] == 0:
        output_array[i+1] = 0

for i in range(len(output_array)-3):
    if output_array[i] == 0 and output_array[i+1] == 1 and output_array[i+2] == 1 and output_array[i+3] == 0:
        output_array[i+1] = output_array[i+2] = 0

In short, I brute-force my way through the same output_array six times. Is there a faster way?

Answer 1

# I would create a map between the string searched and the new one.

patterns = {}
patterns['010'] = '000'
patterns['0110'] = '0000'
patterns['01110'] = '00000'

# I would loop over the lists

lists = [[0,1,0,0,1,1,0,0,1,1,1,0]]

for lista in lists:

    # I would join the list elements into a single string
    string_list = ''.join(map(str,lista))

    # we loop over the patterns
    for pattern,value in patterns.items():

        # if a pattern is detected, we replace it
        string_list = string_list.replace(pattern, value)
    # convert back to a list of ints once every pattern has been applied
    lista = [int(ch) for ch in string_list]
    print(lista)
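
As a variation on the string idea above, the sketch below uses regular expressions instead of a dictionary of fixed patterns. It is not the answer's original code: the function name smooth_runs and the max_run parameter are made up for illustration, and it assumes the list contains only the integers 0 and 1. Lookbehind and lookahead assertions check the bordering digits without consuming them, so a border digit can be shared by two neighbouring runs (for example 0,1,0,1,0).

```python
import re

def smooth_runs(bits, max_run=3):
    """Flip short runs of 1s enclosed by 0s, then short runs of 0s enclosed by 1s.

    Hypothetical helper; assumes `bits` holds only the integers 0 and 1.
    """
    s = ''.join(map(str, bits))
    # Runs of 1 to max_run ones with a 0 on each side become zeros.
    s = re.sub(r'(?<=0)1{1,%d}(?=0)' % max_run,
               lambda m: '0' * len(m.group()), s)
    # Runs of 1 to max_run zeros with a 1 on each side become ones.
    s = re.sub(r'(?<=1)0{1,%d}(?=1)' % max_run,
               lambda m: '1' * len(m.group()), s)
    return [int(ch) for ch in s]

print(smooth_runs([0, 1, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0]))
```

The 1-runs are handled before the 0-runs, mirroring the order in the question. Results may still differ from the six sequential loops on inputs where one replacement creates a new match for another pattern, so it is worth checking against real data.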

Answer 2

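Although this question is related to the questions Here and Here, the OP's question is about searching for several sequences quickly in one go. The accepted answer works well, but we may not want to loop through every search sequence at each position of the base sequence.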

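The algorithm below checks for a sequence of i integers only if its first (i-1) integers are already present at the current position in the base sequence: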

# This is the driver function which takes in a) the search sequences and 
# replacements as a dictionary and b) the full sequence list in which to search 

def findSeqswithinSeq(searchSequences,baseSequence):
    seqkeys = [[int(i) for i in elem.split(",")] for elem in searchSequences]
    maxlen = max([len(elem) for elem in seqkeys])
    decisiontree = getdecisiontree(seqkeys)
    i = 0
    while i < len(baseSequence):
        (increment,replacement) = get_increment_replacement(decisiontree,baseSequence[i:i+maxlen])
        if replacement != -1:
            baseSequence[i:i+len(replacement)] = searchSequences[",".join(map(str,replacement))]
        i +=increment
    return  baseSequence

#the following function gives the dictionary of intermediate sequences allowed
def getdecisiontree(searchsequences):
    dtree = {}
    for elem in searchsequences:
        for i in range(len(elem)):
            if i+1 == len(elem):
                dtree[",".join(map(str,elem[:i+1]))] = True
            else:
                dtree[",".join(map(str,elem[:i+1]))] = False
    return dtree

# the following function does most of the work, giving us a) how many
# positions we can skip in the search and b) whether the search seq was found
def get_increment_replacement(decisiontree,sequence):
    if str(sequence[0]) not in decisiontree:
        return (1,-1)
    for i in range(1,len(sequence)):
        key = ",".join(map(str,sequence[:i+1]))
        if key not in decisiontree:
            return (1,-1)
        elif decisiontree[key]:
            matched = [int(n) for n in key.split(",")]
            return (len(matched), matched)
    return 1, -1

You can test the above code with the following snippet:

if __name__ == "__main__":
    inputlist = [5,4,0,1,1,1,0,2,0,1,0,99,15,1,0,1]
    patternsandrepls = {'0,1,0':[0,0,0],
                        '0,1,1,0':[0,0,0,0],
                        '0,1,1,1,0':[0,0,0,0,0],
                        '1,0,1':[1,1,1],
                        '1,0,0,1':[1,1,1,1],
                        '1,0,0,0,1':[1,1,1,1,1]}

    print(findSeqswithinSeq(patternsandrepls,inputlist))

The proposed solution represents the sequences to be searched as a decision tree.
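
To make the decision-tree idea concrete, here is a small sketch of the kind of prefix table that getdecisiontree builds, shown for just two of the patterns. It is an illustration rather than the answer's exact code: it keys the table by tuples instead of comma-joined strings, the name build_prefix_table is hypothetical, and it adds a guard so that a pattern which is also a prefix of a longer pattern stays marked as complete.

```python
def build_prefix_table(patterns):
    """Map every prefix of every pattern to True if it is a complete pattern."""
    table = {}
    for pattern in patterns:
        for end in range(1, len(pattern) + 1):
            prefix = tuple(pattern[:end])
            is_complete = end == len(pattern)
            # Never downgrade a prefix that is already a complete pattern.
            table[prefix] = table.get(prefix, False) or is_complete
    return table

table = build_prefix_table([[0, 1, 0], [0, 1, 1, 0]])
for prefix, is_complete in table.items():
    print(prefix, is_complete)
```

A window of the base sequence can be abandoned as soon as its current prefix is missing from the table. For instance, (0, 0) is not a key here, so a window starting with 0, 0 is rejected after looking at only two elements.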

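Because many search positions are skipped, we should be able to do better than O(m * n) with this approach, where m is the number of search sequences and n is the length of the base sequence.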

Edit: Changed the answer for more clarity, based on the edited question.

Search and replace multiple specific sequences of elements in a Python list/array
