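This is an interesting one. I was actually writing an answer for another question when I ran into some unexpected results using filter or generators. I have a list of file paths: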
paths = ['/directoryb/baba.txt', '/directorya/nigel.txt', '/directoryb/ralph.txt', '/directorya/jim.txt']
I build a set of the distinct directories in the list of paths:
from os.path import dirname
dirs = {dirname(path) for path in paths}
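Now I want to make a list of generators (or even a generator of generators), where each generator yields the elements of paths that are in the same directory. So I do this: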
dirs_iter = [(path for path in paths if path.startswith(dir)) for dir in dirs]
To my surprise, running:
for dir_iter in dirs_iter:
    for path in dir_iter:
        print(path)
gives the following:
/directorya/nigel.txt
/directorya/jim.txt
/directorya/nigel.txt
/directorya/jim.txt
This is clearly wrong. However, if I use the following statement:
# now I'm generating the lists instead of using generators
dirs_iter = [[path for path in paths if path.startswith(dir)] for dir in dirs]
the printing loop shows the expected answer:
/directoryb/baba.txt
/directoryb/ralph.txt
/directorya/nigel.txt
/directorya/jim.txt
If I use filter and/or map instead of generators:
dirs_iter = map(lambda dir: filter(lambda path: path.startswith(dir), paths), dirs)
I get the wrong answer as well. Edit: the map / filter version does work.
What is going on here?
Answer 0 (score: 2)
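The name dir is a closure variable: it is looked up when the generator is executed, not when it is defined. By that time the list comprehension has finished, and dir is bound to the last value in dirs: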
>>> from os.path import dirname
>>> paths = ['/directoryb/baba.txt', '/directorya/nigel.txt', '/directoryb/ralph.txt', '/directorya/jim.txt']
>>> dirs = {dirname(path) for path in paths}
>>> def echo(value):
...     print('echoing:', value)
...     return value
...
>>> dirs_iter = [(path for path in paths if path.startswith(echo(dir))) for dir in dirs]
>>> for dir_iter in dirs_iter:
...     print('Iterating over the next dir_iter generator')
...     for path in dir_iter:
...         print(path)
...
Iterating over the next dir_iter generator
echoing: /directoryb
/directoryb/baba.txt
echoing: /directoryb
echoing: /directoryb
/directoryb/ralph.txt
echoing: /directoryb
Iterating over the next dir_iter generator
echoing: /directoryb
/directoryb/baba.txt
echoing: /directoryb
echoing: /directoryb
/directoryb/ralph.txt
echoing: /directoryb
>>> list(dirs)
['/directorya', '/directoryb']
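Because Python 3 uses a randomized hash seed, /directoryb rather than /directorya came last in my run. Either way, you can see that the dir value is only accessed (and echoed) when we actually iterate over each dir_iter generator, and by that point it is set to a single value. The list(dirs) line shows the order in which the dirs set produces its values.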
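One way to sidestep the late lookup, as a rough sketch rather than the only fix: create each generator inside a function call, so the directory is a local of that call and later loop iterations cannot rebind it. The helper name paths_in is hypothetical, and sorted() is used here only to make the order reproducible.

```python
from os.path import dirname

paths = ['/directoryb/baba.txt', '/directorya/nigel.txt',
         '/directoryb/ralph.txt', '/directorya/jim.txt']
# sorted() is an addition for a stable order; a plain set works too.
dirs = sorted({dirname(path) for path in paths})


def paths_in(directory):
    # `directory` is local to this call, so the generator closes over
    # a binding that the list comprehension below never rebinds.
    return (path for path in paths if path.startswith(directory))


dirs_iter = [paths_in(d) for d in dirs]
result = [list(dir_iter) for dir_iter in dirs_iter]
print(result)
```

Each generator now filters on its own directory, because the value was captured at the time paths_in() was called rather than when the generator was finally consumed.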
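Note that filter() does not have this problem; your map() and filter() combination works correctly. In that version dir is a parameter of the outer lambda, so every filter() object gets its own binding.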
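As a quick check of the map() / filter() behaviour, here is a small sketch using the same data. The parameter is named d instead of dir only to avoid shadowing the builtin, and sorted() is added just for a reproducible order; neither is part of the original code.

```python
from os.path import dirname

paths = ['/directoryb/baba.txt', '/directorya/nigel.txt',
         '/directoryb/ralph.txt', '/directorya/jim.txt']
dirs = sorted({dirname(path) for path in paths})

# Each call of the outer lambda has its own `d`, so the inner lambda
# closes over a separate binding per directory.
dirs_iter = map(lambda d: filter(lambda path: path.startswith(d), paths), dirs)

grouped = [list(dir_iter) for dir_iter in dirs_iter]
print(grouped)
```

Even though map() and filter() are lazy in Python 3, the paths end up grouped under the right directory, since nothing rebinds d after the outer lambda has been called.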