Question

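I'm working on problem 368B on CodeForces using Python 3. It basically asks you to print the number of unique elements in a series of "suffixes" of a given array. Here's my solution (with some extra redirection code for testing):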

import sys

if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in

    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    l = [None] * m
    for i in range(m):
        l[i] = int(sys.stdin.readline().rstrip())

    l_sorted = sorted(l)
    l_order = sorted(range(m), key=lambda k: l[k])

    # the ranks of elements in l
    l_rank = sorted(range(m), key=lambda k: l_order[k])

    # unique_elem[i] = non-duplicated elements between l_sorted[i] and l_sorted[i+1]
    unique_elem = [None] * m

    for i in range(m):
        unique_elem[i] = set(a[(l_sorted[i] - 1): (l_sorted[i + 1] - 1)]) if i < m - 1 else set(a[(l_sorted[i] - 1): n])

    # unique_elem_cumulative[i] = non-duplicated elements between l_sorted[i] and a's end
    unique_elem_cumulative = unique_elem[-1]

    # unique_elem_cumulative_count[i] = #unique_elem_cumulative[i]
    unique_elem_cumulative_count = [None] * m
    unique_elem_cumulative_count[-1] = len(unique_elem[-1])

    for i in range(m - 1):
        i_rev = m - i - 2
        unique_elem_cumulative = unique_elem[i_rev] | unique_elem_cumulative
        unique_elem_cumulative_count[i_rev] = len(unique_elem_cumulative)

    with open('b.out', 'w') as f_out:
        for i in range(m):
            idx = l_rank[i]
            f_out.write('%d\n' % unique_elem_cumulative_count[idx])

    sys.stdin = original_stdin
    f_in.close()

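The code produces correct results, except possibly for the last big test, with n = 81220 and m = 48576 (a simulated input file is here, and the expected output created by a naive solution is here). The time limit is 1 second, and I'm failing to solve the problem within it. So is it possible to solve it within 1 second in Python 3? Thank you.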

UPDATE: added an "expected" output file, which was created by the following code:

import sys

if __name__ == "__main__":
    f_in = open('b.in', 'r')
    original_stdin = sys.stdin
    sys.stdin = f_in

    n, m = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    a = [int(i) for i in sys.stdin.readline().rstrip().split(' ')]
    with open('b_naive.out', 'w') as f_out:
        for i in range(m):
            l_i = int(sys.stdin.readline().rstrip())
            f_out.write('%d\n' % len(set(a[l_i - 1:])))

    sys.stdin = original_stdin
    f_in.close()

Answer 1

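I think you'll be cutting it close. On my admittedly rather old machine, the I/O alone takes 0.9 seconds per run.

I think an efficient algorithm would be to iterate through the array backwards, keeping track of the distinct elements you have found. Whenever you find a new element, add its index to a list. This will therefore be a list sorted in descending order.

Then for each l_i, the position of l_i in this list (that is, the number of entries greater than or equal to l_i) will be the answer.

For the small sample dataset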

10 10
1 2 3 4 1 2 3 4 100000 99999
1
2
3
4
5
6
7
8
9
10

the list would contain [10, 9, 8, 7, 6, 5], since reading from the right, the first distinct value occurs at index 10, the second at index 9, and so on.
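As a minimal sketch of that backward scan (the function name `first_seen_from_right` is made up for illustration, and positions are reported 1-based to match the problem statement):

```python
def first_seen_from_right(a):
    """Return the 1-based positions at which each distinct value is first
    met when scanning a from right to left (so the result is descending)."""
    seen = set()
    positions = []
    for i in range(len(a) - 1, -1, -1):  # down to and including index 0
        if a[i] not in seen:
            seen.add(a[i])
            positions.append(i + 1)
    return positions


sample = [1, 2, 3, 4, 1, 2, 3, 4, 100000, 99999]
print(first_seen_from_right(sample))  # [10, 9, 8, 7, 6, 5]
```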

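So if l_i = 5, it has index 6 in the generated list, so 6 distinct values are found at indices >= l_i. The answer is 6.

If l_i = 8, it has index 3 in the generated list, so 3 distinct values are found at indices >= l_i. The answer is 3.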

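It's a bit fiddly that the exercise numbers things 1-indexed while Python counts 0-indexed. To find this index quickly using existing library functions, I reversed the list and then used bisect.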
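The lookup can be sketched like this, assuming the list has already been reversed into ascending order. The helper name `count_distinct_from` is hypothetical; `bisect_left` is the standard-library function.

```python
from bisect import bisect_left

# Ascending form of the sample's list [10, 9, 8, 7, 6, 5].
indices = [5, 6, 7, 8, 9, 10]


def count_distinct_from(indices, l):
    """Number of entries in the ascending list that are >= l."""
    return len(indices) - bisect_left(indices, l)


print(count_distinct_from(indices, 5))  # 6
print(count_distinct_from(indices, 8))  # 3
print(count_distinct_from(indices, 1))  # 6, l itself need not be in the list
```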

import timeit
from bisect import bisect_left

def doit():
    with open('b.in', 'r') as f_in:
        n, m = [int(i) for i in f_in.readline().rstrip().split(' ')]
        a = [int(i) for i in f_in.readline().rstrip().split(' ')]
        found = set()
        indices = []
        # Scan right to left, down to and including index 0, recording the
        # 1-based position at which each distinct value is first seen.
        for i in range(n - 1, -1, -1):
            if a[i] not in found:
                indices.append(i + 1)
                found.add(a[i])

        # Reverse into ascending order so that bisect can be used.
        indices.reverse()
        length = len(indices)
        for i in range(m):
            l = int(f_in.readline().rstrip())
            index = bisect_left(indices, l)
            print(length - index)

if __name__ == "__main__":
    print(timeit.timeit('doit()', setup="from bisect import bisect_left;from __main__ import doit", number=10))

On my machine this prints 12 seconds for 10 runs. Still too slow.
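One variation that might shave off more time, and which is not what the code above does: store the running distinct count for every position during the same backward scan, so each query becomes a plain list lookup, and handle all input and output in one go. This is only a sketch. `suffix_distinct_counts` and `solve` are names invented here, and whether it fits in the 1-second limit on the judge is untested.

```python
def suffix_distinct_counts(a):
    """counts[i] = number of distinct values in a[i:] (0-based i)."""
    counts = [0] * len(a)
    seen = set()
    for i in range(len(a) - 1, -1, -1):
        seen.add(a[i])
        counts[i] = len(seen)
    return counts


def solve(data):
    """Take the whole input as bytes and return the whole output as a string."""
    tokens = data.split()
    n, m = int(tokens[0]), int(tokens[1])
    a = [int(t) for t in tokens[2:2 + n]]
    counts = suffix_distinct_counts(a)
    # Queries are 1-based, the table is 0-based.
    queries = tokens[2 + n:2 + n + m]
    return "\n".join(str(counts[int(q) - 1]) for q in queries)


demo = b"10 10\n1 2 3 4 1 2 3 4 100000 99999\n1\n2\n3\n4\n5\n6\n7\n8\n9\n10\n"
print(solve(demo))
```

On a judge this would presumably be driven by something like `sys.stdout.write(solve(sys.stdin.buffer.read()))`, which avoids calling `readline` and `print` once per query.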
