Question

I have two approaches for finding the anagrams of a word in a list of words.

The first uses Counter from Python's collections module:

def anagram_checker(cls, word=None, words=[]):

    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = Counter(word)
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if Counter(given_word) == counter_word:
                    anagrams.append(given_word)
    return anagrams

The second function does the same thing but uses the built-in sorted function:

def anagram_checker(cls, word=None, words=[]):

    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = list(sorted(word))
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if list(sorted(given_word)) == counter_word:
                    anagrams.append(given_word)
    return anagrams

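Which one has the better time complexity? That is, is it cheaper to compare Python Counter objects or to compare sorted lists?

If I'm not mistaken, comparing two lists is O(n), right? What is the complexity of comparing two Counter objects?

I have searched various documentation but could not find a satisfactory answer.

Please help.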

Answer 1

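I ran some measurements. Comparing two lists and comparing two Counters are both O(n), and building a Counter is O(n), which is asymptotically better than sorting at O(n log n). Even so, the sorting-based anagram_checker is the faster of the two: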

from timeit import timeit
from collections import Counter

def anagram_checker_1(word=None, words=[]):
    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = Counter(word)
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if Counter(given_word) == counter_word:
                    anagrams.append(given_word)
    return anagrams


def anagram_checker_2(word=None, words=[]):
    anagrams = []

    if len(words) == 0:  # if the second arg list is empty then there is nothing to check
        return anagrams
    else:
        counter_word = list(sorted(word))
        for given_word in words:
            if given_word != word:  # Cannot be same as the word in question
                if list(sorted(given_word)) == counter_word:
                    anagrams.append(given_word)
    return anagrams

print(timeit("anagram_checker_1('battle', ['battet', 'batlet', 'battel', 'tablet'])", globals=globals(), number=100_000))
print(timeit("anagram_checker_2('battle', ['battet', 'batlet', 'battel', 'tablet'])", globals=globals(), number=100_000))

Output (seconds, anagram_checker_1 first):

2.3342012430075556
0.47786532100872137

Profiling anagram_checker_1 shows the following:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   500000    0.662    0.000    2.512    0.000 /usr/lib/python3.6/collections/__init__.py:517(__init__)
   500000    0.501    0.000    0.821    0.000 /usr/lib/python3.6/abc.py:180(__instancecheck__)
   500000    0.438    0.000    1.808    0.000 /usr/lib/python3.6/collections/__init__.py:586(update)
   100000    0.433    0.000    2.978    0.000 example.py:4(anagram_checker_1)
  1000006    0.320    0.000    0.320    0.000 /usr/lib/python3.6/_weakrefset.py:70(__contains__)
   500000    0.283    0.000    0.283    0.000 {built-in method _collections._count_elements}
   500002    0.225    0.000    1.047    0.000 {built-in method builtins.isinstance}
  1100000    0.090    0.000    0.090    0.000 {built-in method builtins.len}
        1    0.042    0.042    3.020    3.020 <timeit-src>:2(inner)
   300000    0.025    0.000    0.025    0.000 {method 'append' of 'list' objects}

So it is clear that, for words this short, the overhead of creating Counter objects in Python outweighs any advantage in algorithmic complexity.
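To see whether that overhead still dominates as the words grow, the two comparisons can be timed at several lengths. The following is only a rough sketch: the helper names is_anagram_counter, is_anagram_sorted and time_both are made up for illustration, the test words are random lowercase strings, and the timings will vary by machine and Python version, so no expected numbers are given.

```python
import random
import string
from collections import Counter
from timeit import timeit


def is_anagram_counter(a, b):
    # Builds one Counter per word (O(n) each), then compares them.
    return Counter(a) == Counter(b)


def is_anagram_sorted(a, b):
    # Sorts each word (O(n log n) each), then compares the lists (O(n)).
    return sorted(a) == sorted(b)


def time_both(length, number=1000, seed=0):
    """Return (counter_seconds, sorted_seconds) for random words of `length`."""
    rng = random.Random(seed)
    letters = [rng.choice(string.ascii_lowercase) for _ in range(length)]
    word = "".join(letters)
    rng.shuffle(letters)
    shuffled = "".join(letters)  # a guaranteed anagram of `word`
    t_counter = timeit(lambda: is_anagram_counter(word, shuffled), number=number)
    t_sorted = timeit(lambda: is_anagram_sorted(word, shuffled), number=number)
    return t_counter, t_sorted


if __name__ == "__main__":
    # Hypothetical lengths, chosen only to span a few orders of magnitude.
    for length in (6, 60, 600, 6000):
        t_counter, t_sorted = time_both(length)
        print(f"{length:>5} chars  Counter: {t_counter:.4f}s  sorted: {t_sorted:.4f}s")
```

If the Counter version overtakes the sorted version at some length, that length is roughly where the asymptotic advantage starts to pay for the construction overhead on that machine.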

Edit:

Here is the profiling report for anagram_checker_2, for comparison:

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   500000    0.372    0.000    0.372    0.000 {built-in method builtins.sorted}
   100000    0.353    0.000    0.762    0.000 example.py:18(anagram_checker_2)
        1    0.041    0.041    0.803    0.803 <timeit-src>:2(inner)
   300000    0.028    0.000    0.028    0.000 {method 'append' of 'list' objects}
   100000    0.009    0.000    0.009    0.000 {built-in method builtins.len}
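The reports above do not say how they were produced. One way to get a similar table is the standard-library cProfile and pstats modules, sketched below. This is an assumption about tooling, not a record of what was actually run; find_anagrams_counter, find_anagrams_sorted and profile_report are hypothetical names, and the two finder functions are condensed versions of the ones in this answer.

```python
import cProfile
import io
import pstats
from collections import Counter


def find_anagrams_counter(word, words):
    # Condensed equivalent of anagram_checker_1.
    target = Counter(word)
    return [w for w in words if w != word and Counter(w) == target]


def find_anagrams_sorted(word, words):
    # Condensed equivalent of anagram_checker_2.
    target = sorted(word)
    return [w for w in words if w != word and sorted(w) == target]


def profile_report(func, *args, repeat=10_000, top=5):
    """Call func(*args) `repeat` times under cProfile and return the text report."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(repeat):
        func(*args)
    profiler.disable()
    buffer = io.StringIO()
    # Sort by time spent inside each function itself, as in the tables above.
    pstats.Stats(profiler, stream=buffer).sort_stats("tottime").print_stats(top)
    return buffer.getvalue()


if __name__ == "__main__":
    candidates = ["battet", "batlet", "battel", "tablet"]
    print(profile_report(find_anagrams_counter, "battle", candidates))
    print(profile_report(find_anagrams_sorted, "battle", candidates))
```

Sorting by tottime puts the functions that consume the most time in their own bodies at the top, which makes the Python-level work inside Counter construction easy to compare against the single built-in sorted call.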
