Question

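I am trying to solve the following problem:

Given a set of integers, e.g. {1,3,2}, and an array of random integers, e.g.

[1, 2, 2, -5, -4, 0, 1, 1, 2, 2, 0, 3, 3]

find the shortest contiguous subarray that contains all of the values from the set. If no such subarray exists, return an empty array.

Result: [1, 2, 2, 0, 3]

Or

[1, 2, 2, -5, -4, 3, 1, 1, 2, 0], {1,3,2}.

Result: [3, 1, 1, 2]

I tried the following, but something seems to be wrong with the second loop. I'm not sure what I need to change: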

def find_sub(l, s):
    i = 0
    counts = dict()
    end = 0
    while i < len(s):
        curr = l[end]
        if curr in s:
            if curr in counts:
                counts[curr] = counts[curr] + 1
            else:
                counts[curr] = 1
                i += 1
        end += 1
    curr_len = end

    start = 0
    for curr in l:
        if curr in counts:
            if counts[curr] == 1:
                if end < len(l):
                    next_item = l[end]
                    if next_item in counts:
                        counts[next_item] += 1
                    end += 1
            else:
                counts[curr] -= 1
                start += 1
        else:
            start += 1
    if (end - start) < curr_len:
        return l[start:end]
    else:
        return l[:curr_len]

Answer 1

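You can use a sliding window approach (with generators). The idea is to generate every contiguous window of the list, from size n (the size of the set) up to size N (the size of the list), check whether a window contains all of the set's values, and stop at the first one found.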
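A minimal sketch of that idea, not the answerer's original code. The names `windows` and `find_sub_bruteforce` are hypothetical, and returning an empty list when nothing matches follows the question's requirement:

```python
def windows(seq, size):
    """Lazily yield every contiguous slice of `seq` with exactly `size` items."""
    for start in range(len(seq) - size + 1):
        yield seq[start:start + size]


def find_sub_bruteforce(l, s):
    """Return the first shortest window of `l` that contains every value in `s`."""
    s = set(s)
    # Window sizes are tried in increasing order, so the first hit is a shortest one.
    for size in range(len(s), len(l) + 1):
        for window in windows(l, size):
            if s.issubset(window):
                return window
    return []  # no window contains all the values


print(find_sub_bruteforce([1, 2, 2, -5, -4, 0, 1, 1, 2, 2, 0, 3, 3], {1, 3, 2}))
# [1, 2, 2, 0, 3]
print(find_sub_bruteforce([1, 2, 2, -5, -4, 3, 1, 1, 2, 0], {1, 3, 2}))
# [3, 1, 1, 2]
```

This is simple to reason about, but it inspects on the order of N^2 windows in the worst case, so it is much slower than a two-pointer scan on long lists.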


Answer 2

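You are using a two-pointer approach, but you move each index only once, until the first match is found. You should repeat the move right - move left pattern to find the best index interval.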

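def find_sub(l, s):
    left = 0
    right = 0
    ac = 0  # number of distinct values from s currently inside the window
    lens = len(s)
    counts = dict.fromkeys(s, 0)  # per-value counts inside the window
    minlen = len(l) + 1  # longer than any real window
    bestleft = bestright = 0  # l[0:0] == [] is returned when nothing matches
    while left < len(l):
        # Move right until the window contains every value of s
        while right < len(l):
            curr = l[right]
            right += 1
            if curr in s:
                c = counts[curr]
                counts[curr] = c + 1
                if c == 0:
                    ac += 1
                    if ac == lens:
                        break
        if ac < lens:
            break

        # Move left until the window loses one of the values
        while left < right:
            curr = l[left]
            left += 1
            if curr in s:
                c = counts[curr]
                counts[curr] = c - 1
                if c == 1:
                    ac -= 1
                    break

        # l[left - 1:right] was the last window that still had every value
        if right - left + 1 < minlen:
            minlen = right - left + 1
            bestleft = left - 1
            bestright = right

    return l[bestleft:bestright]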

print(find_sub([1, 2, 2, -5, -4, 3, 1, 0, 1, 2, 2, 0, 3, 3], {1,3,2}))
print(find_sub([1, 2, 2, -5, -4, 3, 1, 0, 1, 2, 2, 1, 0, 3, 3], {1,3,2}))
>>[2, -5, -4, 3, 1]
>>[2, 1, 0, 3]

Answer 3

This should be the best-performing solution, running in O(n) time:

def find_sub(l, s):
    if len(l) < len(s):
        return None

    # Keep track of how many elements are in the interval
    counters = {e: 0 for e in s}

    # Current and best interval
    lo = hi = 0
    best_lo = 0
    best_hi = len(l)

    # Increment hi until all elements are in the interval
    missing_elements = set(s)
    while hi < len(l) and missing_elements:
        e = l[hi]
        if e in counters:
            counters[e] += 1
        if e in missing_elements:
            missing_elements.remove(e)
        hi += 1

    if missing_elements:
        # Array does not contain all needed elements
        return None

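    # Move the two pointers. Keep shrinking after hi reaches the end,
    # otherwise a window completed by the last element is never tightened.
    missing_element = None
    while hi < len(l) or (missing_element is None and lo < hi):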
        if missing_element is None:
            # We have all the elements
            if hi - lo < best_hi - best_lo:
                best_lo = lo
                best_hi = hi

            # Increment lo
            e = l[lo]
            if e in counters:
                counters[e] -= 1
                if counters[e] == 0:
                    missing_element = e
            lo += 1
        else:
            # We need more elements, increment hi
            e = l[hi]
            if e in counters:
                counters[e] += 1
                if missing_element == e:
                    missing_element = None
            hi += 1

    return l[best_lo:best_hi]


assert find_sub([1, 2, 2, -5, -4, 3, 1, 0, 1, 2, 2, 0, 3, 3], {1, 3, 2}) == [2, -5, -4, 3, 1]
assert find_sub([1, 2, 2, -5, -4, 3, 1, 0, 1, 2, 2, 1, 0, 3, 3], {1, 3, 2}) == [2, 1, 0, 3]
assert find_sub([1, 2, 2, -5, -4, 3, 1, 0, 1, 2, 2, 1, 0, 3, 3], {1, 3, 7}) is None
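Hand-picked asserts can miss edge cases, for example a window that only becomes complete at the very last element. One way to gain confidence is to compare a candidate against a brute-force reference on random inputs. The following is a minimal sketch; `brute_force_len` and `cross_check` are hypothetical helper names that are not part of the answer, and the check only compares window lengths and contents, not positions:

```python
import random


def brute_force_len(l, s):
    """Length of the shortest window of `l` containing all of `s`, or None."""
    s = set(s)
    best = None
    for i in range(len(l)):
        seen = set()
        for j in range(i, len(l)):
            if l[j] in s:
                seen.add(l[j])
            if len(seen) == len(s):
                # Shortest window starting at i ends at j
                if best is None or j - i + 1 < best:
                    best = j - i + 1
                break
    return best


def cross_check(candidate, trials=500, seed=0):
    """Return the first random input where `candidate` disagrees, else None."""
    rng = random.Random(seed)
    wanted = {1, 2, 3}  # assumed test set, chosen only for illustration
    for _ in range(trials):
        l = [rng.randint(0, 4) for _ in range(rng.randint(0, 8))]
        expected = brute_force_len(l, wanted)
        got = candidate(l, wanted)
        got_len = len(got) if got else None  # treat None and [] alike
        if got_len != expected or (got and not wanted <= set(got)):
            return l
    return None
```

Calling `cross_check(find_sub)` returns `None` when every trial agrees, otherwise the first input list on which the candidate's result differs from the brute-force length.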

Answer 4

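Joining in the fun, here is my attempt. I'm not familiar with algorithm names, but this looks like a sliding window approach, along the lines of @Netwave's answer.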

I = {1, 3, 2}
A = [1, 2, 2, -5, -4, 0, 1, 1, 2, 2, 0, 3, 3]

setcount = {i: 0 for i in I}
stage = []
shortest = A

for i in range(len(A)):
    # Subset
    stage.append(A[i])
    # Update the count
    if A[i] in I:
        setcount[A[i]] += 1

    while 0 not in setcount.values():
        # Check if new subset is shorter than existing's
        if len(stage) < len(shortest):
            shortest = stage.copy()

        # Consume the head to get progressively shorter subsets
        if stage[0] in I:
            setcount[stage[0]] -= 1

        stage.pop(0)


>>> print(shortest)
[1, 2, 2, 0, 3]
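The same loop can be wrapped in a function. `list.pop(0)` shifts every remaining element, while `collections.deque.popleft()` removes the head in constant time, and a counter of missing values avoids scanning `setcount.values()` on every step. A minimal sketch under those assumptions; the name `shortest_window` is hypothetical, and returning an empty list when nothing matches follows the question rather than the script above:

```python
from collections import deque


def shortest_window(values, wanted):
    """Return the first shortest contiguous run of `values` containing all of `wanted`."""
    wanted = set(wanted)
    if not wanted:
        return []
    counts = {w: 0 for w in wanted}
    missing = len(wanted)  # how many wanted values are absent from the stage
    stage = deque()
    shortest = None
    for v in values:
        stage.append(v)
        if v in counts:
            if counts[v] == 0:
                missing -= 1
            counts[v] += 1
        # While the stage holds every wanted value, record it and drop the head
        while missing == 0:
            if shortest is None or len(stage) < len(shortest):
                shortest = list(stage)  # snapshot copy, kept for simplicity
            head = stage.popleft()
            if head in counts:
                counts[head] -= 1
                if counts[head] == 0:
                    missing += 1
    return shortest if shortest is not None else []


print(shortest_window([1, 2, 2, -5, -4, 0, 1, 1, 2, 2, 0, 3, 3], {1, 3, 2}))
# [1, 2, 2, 0, 3]
```

Copying the stage on each improvement still costs time proportional to the window; tracking start and end indices instead would avoid that.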

Algorithm to find the shortest contiguous subarray containing all values from a set

4 answers: