itertools.chain returns an unexpected iterator

Time: 2019-05-16 14:05:02

Tags: python python-3.x itertools

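As I understand it, Python's itertools.chain is meant to chain multiple iterators together.

When the first generator yields ['a/a.jpg', 'a/b.jpg'] and the second generator is empty, the expected output is ['a/a.jpg', 'a/b.jpg'].

But the code below gives me a confusing result: ['a/b/a.jpg', 'a/b/b.jpg']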

import itertools
import os

jpeg_paths = iter([])
# jpeg_paths = []

walk = [("a", ["a.jpg", "b.jpg"]), ("a/b", ["a.txt"])]

for dirpath, filenames in walk:
    # select image files
    jpg_filenames = filter(lambda name: str.endswith(name, "jpg"), filenames)
    # generate absolute path
    image_fullpath = map(lambda name: os.path.join(dirpath, name), jpg_filenames)

    jpeg_paths = itertools.chain(jpeg_paths, image_fullpath)
    # jpeg_paths += image_fullpath

a = list(jpeg_paths)
print(a)

1 answer:

Answer 0 (score: 2)

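The reason is that the iterables are evaluated with the last dirpath, a/b. itertools.chain returns an iterator, and so do map and filter, so nothing is executed until the result is iterated. By the time list(jpeg_paths) consumes the chain, the for loop has already finished, and the lambda looks up dirpath only when it is called, so it sees the final value.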
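The late lookup can be reproduced without itertools.chain at all. The following is a minimal sketch; the file name "x.jpg" and the variable names are made up for illustration and do not come from the question.

```python
import os

lazy_iterators = []

for dirpath in ["a", "a/b"]:
    # map() is lazy: the lambda body does not run here, and `dirpath`
    # is looked up only when the lambda is eventually called.
    lazy_iterators.append(
        map(lambda name: os.path.join(dirpath, name), ["x.jpg"])
    )

# The loop is over, so every pending lambda now sees dirpath == "a/b".
late_bound = [path for it in lazy_iterators for path in it]
print(late_bound)
```

Both entries are joined with "a/b", even though the first map object was created while dirpath was "a".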

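So, to bind dirpath to each iteration of the for loop, we can use a simple function such as mapfunc. Inside mapfunc, dirpath is a parameter local to each call, so each lambda closes over its own value. The resulting code looks like this: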

import itertools
import os

jpeg_paths = []

walk = [("a", ["a.jpg", "b.jpg"]), ("a/b", ["a.txt"])]

def mapfunc(filenames, dirpath=None): # `dirpath` will be associated with each function object
    return map(lambda name: os.path.join(dirpath, name), filenames)


for dirpath, filenames in walk:
    # select image files
    jpg_filenames = filter(lambda name: name.endswith("jpg"), filenames)
    # build the full path
    image_fullpath = mapfunc(jpg_filenames, dirpath=dirpath)  # bind the current `dirpath` to this map object
    jpeg_paths = itertools.chain(jpeg_paths, image_fullpath)

print(list(jpeg_paths))
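A variation on the same idea, not part of the original answer, is to let functools.partial capture the current dirpath instead of writing a helper function. This is only a sketch under the question's own sample data; the name join_here is hypothetical.

```python
import functools
import itertools
import os

walk = [("a", ["a.jpg", "b.jpg"]), ("a/b", ["a.txt"])]
jpeg_paths = iter([])

for dirpath, filenames in walk:
    # select image files
    jpg_filenames = filter(lambda name: name.endswith("jpg"), filenames)
    # partial() stores the current value of dirpath immediately,
    # while map() still defers the actual joining until iteration.
    join_here = functools.partial(os.path.join, dirpath)
    jpeg_paths = itertools.chain(jpeg_paths, map(join_here, jpg_filenames))

result = list(jpeg_paths)
print(result)
```

The chain stays lazy, but each map object holds its own partial, so reassigning dirpath on later iterations no longer affects earlier ones.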

Or you can exhaust the iterator on each iteration, for example:

image_fullpath = tuple(map(lambda name: os.path.join(dirpath, name), jpg_filenames))

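That way the call is evaluated with the dirpath and jpg_filenames of that moment. But this keeps all the objects in memory, which is not a good idea if what you are walking is large :)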
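If memory is a concern, one possible alternative (again, not from the original answer) is a generator function: it stays lazy, but each path is joined while dirpath still holds the value for that directory. The function name iter_jpeg_paths is hypothetical.

```python
import os


def iter_jpeg_paths(walk):
    """Lazily yield the joined path of every name ending in "jpg"."""
    for dirpath, filenames in walk:
        for name in filenames:
            if name.endswith("jpg"):
                # The join runs before the outer loop advances,
                # so `dirpath` is always the matching directory.
                yield os.path.join(dirpath, name)


walk = [("a", ["a.jpg", "b.jpg"]), ("a/b", ["a.txt"])]
paths = list(iter_jpeg_paths(walk))
print(paths)
```

Only one path is produced at a time, so nothing beyond the current item has to be held in memory until the caller decides to collect the results.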