Question

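I have a function that recursively gives me the substrings of a string. Can anyone tell me what its complexity is? I'm guessing it's O(2**n), because an input of length n can have 2**n substrings, but I'm not 100% sure.

Here is the code: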

def build_substrings(string):
    """ Returns all subsets that can be formed with letters in string. """
    result = []
    if len(string) == 1:
        result.append(string)
    else:
        for substring in build_substrings(string[:-1]):
            result.append(substring)
            substring = substring + string[-1]
            result.append(substring)
        result.append(string[-1])
    return result

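I actually have one more question that I don't think deserves a new topic. What is the complexity of searching for a key in a dictionary in Python (if item in dictionary)? Thanks for the help!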

Answer 1

First, here are two other ways to write your function.

# this one's about the same speed
import itertools
def build_substrings_2(s):
    return [''.join(r) for r in itertools.product(*(['',ch] for ch in s))]

# this one's about 4 times faster
def build_substrings_3(s):
    res = [""]
    for ch in s:
        res += [r+ch for r in res]
    return res
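
As a quick check that the rewrites are comparable to the original, the sketch below compares their outputs on a short input. The question's function is restated in condensed form so the block runs on its own; the names original, via_product and via_doubling are only illustrative. It suggests that the two rewrites agree with each other and differ from the original only by also including the empty string.

```python
import itertools

def build_substrings(string):
    """Condensed restatement of the question's recursive function."""
    if len(string) == 1:
        return [string]
    result = []
    for substring in build_substrings(string[:-1]):
        result.append(substring)
        result.append(substring + string[-1])
    result.append(string[-1])
    return result

def build_substrings_2(s):
    return [''.join(r) for r in itertools.product(*(['', ch] for ch in s))]

def build_substrings_3(s):
    res = [""]
    for ch in s:
        res += [r + ch for r in res]
    return res

original = set(build_substrings("abc"))
via_product = set(build_substrings_2("abc"))
via_doubling = set(build_substrings_3("abc"))

print(via_product == via_doubling)   # True
print(via_product - original)        # {''}
```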

Here is how to measure the speed:

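import matplotlib.pyplot as plt
import timeit

# the question's function, under the name used in the benchmark below
build_substrings_1 = build_substrings

xs = range(3, 25)
fns = ['build_substrings_1', 'build_substrings_2', 'build_substrings_3']
res = [(fn, []) for fn in fns]
for i,s in ((chars,"a"*chars) for chars in xs):
    # one Timer per function, each calling it on the same input string
    ts  = [
        timeit.Timer(
            '{}({})'.format(fn, repr(s)),
            'from __main__ import {}'.format(fn)
        )
        for fn in fns
    ]
    for t,r in zip(ts, res):    # zip replaces Python 2's itertools.izip
        r[1].append(min(t.repeat(number=10)))    # best repeat, 10 calls each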

fig = plt.figure()
ax = fig.add_subplot(111, yscale='log')
for label,dat in res:
    ax.plot(xs, dat, label=label)
legend = plt.legend(loc='upper left')

(Plot of runtime against input length: the y-axis is the log of the runtime in seconds, and the x-axis is the length of the input string in characters.)

Here is how to find the best-fit line through the log of the runtimes:

import numpy

data = [numpy.log10(r[1]) for r in res]       # take log of data
best = [numpy.polyfit(xs[5:], dat[5:], 1) for dat in data]   # find best-fit line
big_o = [10**(b[0]) for b in best]         # convert slope back to the base of the exponential

(Thanks to DSM for suggesting this simplified approach!)

which gives

[2.0099844256336676, 2.0731239717002787, 2.0204035253442099]

...so your function is about O(2.00998 ** n), that is, roughly O(2 ** n).
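
The timing result can also be cross-checked by counting outputs rather than measuring. In the question's function, every string returned for the first n-1 characters is emitted twice (with and without the last character), plus the last character on its own, so the output size follows T(1) = 1 and T(n) = 2*T(n-1) + 1. The sketch below just evaluates that recurrence; count_outputs is a hypothetical helper name, not part of the original code.

```python
def count_outputs(n):
    """Number of strings build_substrings returns for an input of length n."""
    if n == 1:
        return 1
    # each earlier result appears twice, plus the final character alone
    return 2 * count_outputs(n - 1) + 1

for n in range(1, 11):
    print(n, count_outputs(n), 2**n - 1)
```

The two printed columns match, so the list alone has 2**n - 1 entries, and the running time has to grow at least that fast.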

Answer 2

Let N be the length of the string. The number of substrings with length >= 1 and <= N is N * (N + 1) / 2.

So the time complexity would be O(N ** 2).
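
The count can be checked by enumeration. This is a minimal sketch that treats a substring as a contiguous slice s[i:j], so a non-contiguous pick such as "ac" from "abc" is not counted; contiguous_substrings is a hypothetical helper name.

```python
def contiguous_substrings(s):
    """All slices s[i:j] with 0 <= i < j <= len(s)."""
    n = len(s)
    return [s[i:j] for i in range(n) for j in range(i + 1, n + 1)]

for n in range(1, 8):
    s = "abcdefg"[:n]
    # the number of slices matches N * (N + 1) / 2
    assert len(contiguous_substrings(s)) == n * (n + 1) // 2

print(contiguous_substrings("abc"))   # ['a', 'ab', 'abc', 'b', 'bc', 'c']
```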

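A Python dict is a hash map, so its worst case is O(n), which happens if the hash function is bad and causes a lot of collisions. That is a very rare case, though, in which every item added has the same hash and so ends up in the same chain, and it is extremely unlikely with the major Python implementations. The average time complexity is of course O(1).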
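
To make the worst case concrete, here is a rough sketch that forces it on purpose. CollidingKey is a hypothetical class whose hash is deliberately constant, and the count of equality comparisons stands in for lookup cost; the exact numbers depend on the Python implementation, so none are shown.

```python
class CollidingKey:
    """Hypothetical key type: every instance has the same hash."""
    eq_calls = 0

    def __init__(self, value):
        self.value = value

    def __hash__(self):
        return 1  # deliberately bad: all keys collide

    def __eq__(self, other):
        CollidingKey.eq_calls += 1
        return isinstance(other, CollidingKey) and self.value == other.value

def comparisons_for_lookup(n):
    """Count __eq__ calls for one lookup among n colliding keys."""
    table = {CollidingKey(i): i for i in range(n)}
    CollidingKey.eq_calls = 0
    found = CollidingKey(n - 1) in table   # look up the last key inserted
    return CollidingKey.eq_calls if found else -1

for n in (10, 100, 1000):
    print(n, comparisons_for_lookup(n))
```

With ordinary keys such as ints or strings, a lookup needs only a handful of comparisons regardless of the dictionary's size, which is the O(1) average case.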

Time complexity of dictionary lookup in Python
