Question

I would like to optimize the running time of this code as much as possible:

aDictionary= {"key":["value", "value2", ...

rests = \
         list(map((lambda key: Resp(key=key)),
                     [key for key, values in
                      aDictionary.items() if (test1 in values or test2 in values)]))

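Using Python 3. Willing to use as much memory as needed.

I am considering putting the two dictionary lookups on separate processes for a speedup (does that make sense?). Any other optimization ideas are welcome.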

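values can definitely be sorted and turned into a set; it is precomputed and very large.
It is always the case that len(values) >>>> len(tests), though both are growing over time.
len(tests) grows very slowly, and it has new values for each execution.
Currently the lookups are on strings (I am considering a string -> integer mapping).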

Answer 1

For starters, there is no reason to use map when you are already using a list comprehension, so you can remove it, along with the outer list call:

rests = [Resp(key=key) for key, values in aDictionary.items()
         if (test1 in values or test2 in values)]

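A second possible optimization is to turn each list of values into a set. That costs time up front, but it changes your lookups (the uses of in) from linear time to constant time. You may want to create a separate helper function for this. Something like: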

def anyIn(checking, checkingAgainst):
    checkingAgainst = set(checkingAgainst)
    for val in checking:
        if val in checkingAgainst:
            return True
    return False

Then you can change the end of your list comprehension to read:

...if anyIn([test1, test2], values)]

However, this is probably only worth it if you are checking more than two values, or if the lists in values are very long.
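Note that the helper above rebuilds the set on every call, which is itself a linear pass over values. Since the question says the values are precomputed, one option is to do the conversion once up front and reuse it for every lookup. A minimal sketch, assuming a small stand-in dictionary and test values (a_dictionary, value_sets, and matching_keys are illustrative names, not from the original code):

```python
# Illustrative data; the real dictionary is assumed to be much larger.
a_dictionary = {
    "k1": ["a", "b", "c"],
    "k2": ["d", "e"],
    "k3": ["c", "f"],
}

# One-time cost: convert every value list to a set (linear in the total
# number of values). This trades memory for faster membership tests.
value_sets = {key: set(values) for key, values in a_dictionary.items()}


def matching_keys(value_sets, tests):
    """Return the keys whose value set contains at least one of the tests."""
    # Each `t in values` is an average O(1) hash lookup on a set.
    return [key for key, values in value_sets.items()
            if any(t in values for t in tests)]


print(matching_keys(value_sets, ["c", "z"]))  # ['k1', 'k3']
```

Whether the up-front conversion pays off depends on how often the same dictionary is queried; for a single pass it is unlikely to help.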

Answer 2

If tests are sufficiently numerous, it will surely pay off to switch to set operations:

tests = set([test1, test2, ...])
resps = map(Resp, (k for k, values in dic.items() if not tests.isdisjoint(values)))  
# resps is a lazy iterable, not a list, and it uses a
# generator inside, thus saving the overhead of building 
# the inner list.

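Converting the dict values to sets will not gain you anything: the conversion is O(N), with N being the combined size of all the values lists, while the isdisjoint operation above only iterates each values list until it encounters one of the tests, doing an O(1) lookup per element.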
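The early-exit behaviour described here can be checked directly. A small sketch, assuming CPython and illustrative test values; consumed_until_match is a hypothetical helper written only for this demonstration:

```python
tests = {"c", "z"}  # illustrative test values

# isdisjoint accepts any iterable, so the values can stay as plain lists.
print(tests.isdisjoint(["a", "b", "c"]))  # False: "c" is shared
print(tests.isdisjoint(["d", "e"]))       # True: nothing in common


def consumed_until_match(tests, values):
    """Count how many items isdisjoint pulls from values before returning."""
    count = 0

    def counting():
        nonlocal count
        for v in values:
            count += 1
            yield v

    tests.isdisjoint(counting())
    return count


# In CPython the iteration stops at the first shared element.
print(consumed_until_match(tests, ["a", "b", "c", "d", "e"]))  # 3
```

When there is no match, every element of values is still visited once, so the worst case per key remains linear in len(values).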

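map may be more performant than a comprehension if you do not have to use a lambda, e.g. if key can be used as the first positional argument in Resp's __init__, but certainly not with the lambda! (Python List Comprehension Vs. Map). Otherwise, a generator or comprehension will be better.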
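A minimal sketch of the generator and comprehension variants, assuming a stand-in Resp class that takes key as a keyword argument (the real Resp is not shown in the question, and the data below is illustrative):

```python
class Resp:
    # Stand-in for the real Resp class, which the question does not show.
    def __init__(self, key):
        self.key = key


dic = {"k1": ["a", "b"], "k2": ["c"], "k3": ["b", "d"]}  # illustrative data
tests = {"b", "z"}                                        # illustrative tests

# Generator expression: lazy, builds each Resp only when iterated.
resps_lazy = (Resp(key=k) for k, values in dic.items()
              if not tests.isdisjoint(values))

# List comprehension: eager, builds the whole list at once.
resps_list = [Resp(key=k) for k, values in dic.items()
              if not tests.isdisjoint(values)]

print([r.key for r in resps_list])  # ['k1', 'k3']
```

The generator form suits a single pass over the results; the list form is the one to use if the results are indexed or iterated more than once.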

What is the most efficient way to perform multiple matching lookups in a Python dictionary?
