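I am trying to see whether I can recursively build an iterator of file paths. Basically, given a list of base paths and an ordered list of subdirectory lists, I want to generate every sub-path as a combination of the two inputs.
For example, given
base_path = ["/a", "/b"], subdir_lists = [ ["1", "2"], ["c", "d"] ]
the output should be
[ "/a", "/a/1", "/a/1/c", "/a/1/d", "/a/2", "/a/2/c", "/a/2/d", "/b", "/b/1", ... "/b/2/d" ]
My Python code looks like this. I call appendpaths() recursively.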
import os
from itertools import chain, product, starmap

def appendpaths(subdir_lists, base_path):
    if not subdir_lists or len(subdir_lists) == 0:
        return base_path
    if len(subdir_lists) == 1:
        return starmap(os.path.join, product(base_path, subdir_lists[0]))
    right = subdir_lists[1:]
    iter_list = [base_path, appendpaths(right, starmap(os.path.join, product(base_path, subdir_lists[0])))]
    return chain(*iter_list)

def main():
    subdir_lists = [["1", "2"], ["c", "d"]]
    it = appendpaths(subdir_lists, ["/a", "/b"])
    for x in it:
        print(x)

main()
My output is missing some of the combinations:
/a
/b
/a/1/c
/a/1/d
/a/2/c
/a/2/d
/b/1/c
/b/1/d
/b/2/c
/b/2/d
You can see that I am missing /a/1, /a/2, /b/1 and /b/2. My guess is that somewhere in my code I have already exhausted the generators that iterate over those combinations?
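Answer 0 (score: 0)
You are overcomplicating it a bit. If all you want is a successive product of the lists, a simple recursion that joins onto the previously joined paths (or the bases) and moves one level down on each call is all you need: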
import os

def append_paths(base, children):
    paths = []
    for e in base:
        paths.append(e)
        if children:  # dig deeper
            paths += append_paths([os.path.join(e, c) for c in children[0]], children[1:])
    return paths
And to test it:
base_path = ["/a", "/b"] # you might want to prepend with os.path.sep for cross-platform use
subdir_lists = [["1", "2"], ["c", "d"]]
print(append_paths(base_path, subdir_lists))
# ['/a', '/a/1', '/a/1/c', '/a/1/d', '/a/2', '/a/2/c', '/a/2/d',
# '/b', '/b/1', '/b/1/c', '/b/1/d', '/b/2', '/b/2/c', '/b/2/d']
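Since the question asks for an iterator rather than a list, the same recursion can also be written as a generator. This is only a minimal sketch, not part of the original answer: the name iter_paths is hypothetical, and posixpath.join is used instead of os.path.join on the assumption that forward-slash paths are wanted on every platform.

```python
import posixpath


def iter_paths(bases, children):
    """Lazily yield each base path, followed by everything beneath it."""
    for base in bases:
        yield base
        if children:  # dig deeper, one level per recursive call
            deeper = (posixpath.join(base, c) for c in children[0])
            yield from iter_paths(deeper, children[1:])


for path in iter_paths(["/a", "/b"], [["1", "2"], ["c", "d"]]):
    print(path)
```

Each level is walked exactly once by its own for loop, so no iterator is read twice. That matters because itertools.product reads its input iterables to the end as soon as it is created, which is the kind of exhaustion the question suspects.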
Answer 1 (score: 0)
Given
>>> import pathlib
>>> import itertools as it
>>> base = ["/a", "/b"]
>>> subdirs = [["1", "2"], ["c", "d"]]
A helper itertools recipe:
>>> def powerset(iterable):
...     "powerset([1,2,3]) --> () (1,) (2,) (3,) (1,2) (1,3) (2,3) (1,2,3)"
...     s = list(iterable)
...     return it.chain.from_iterable(it.combinations(s, r) for r in range(len(s)+1))
Code
>>> def subsequence(iterable, pred=None):
...     """Return a non-contiguous subsequence."""
...     if pred is None: pred = lambda x: x
...     return (x for x in powerset(iterable) if x and pred(x))
>>> prods = list(it.product(base, subdirs[0], subdirs[1]))
>>> pred = lambda x: x[0].startswith("/")
>>> result = sorted(set(it.chain.from_iterable(subsequence(p, pred) for p in prods)))
>>> result
[('/a',),
('/a', '1'),
('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/a', 'c'),
('/a', 'd'),
('/b',),
('/b', '1'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2'),
('/b', '2', 'c'),
('/b', '2', 'd'),
('/b', 'c'),
('/b', 'd')]
Applications
Join the paths as strings or as pathlib objects.
>>> ["/".join(x) for x in result]
['/a', '/a/1', '/a/1/c', ...]
>>> [pathlib.Path(*x) for x in result]
[WindowsPath('/a'), WindowsPath('/a/1'), WindowsPath('/a/1/c'), ...]
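The WindowsPath objects above reflect the platform the answer was run on. If forward-slash paths are wanted regardless of the operating system, pathlib.PurePosixPath is one option. A minimal sketch, where parts is just a hand-copied sample of the first three tuples in result:

```python
from pathlib import PurePosixPath

# Sample tuples copied from the start of `result` above (illustrative only).
parts = [("/a",), ("/a", "1"), ("/a", "1", "c")]

# PurePosixPath never touches the filesystem and always joins with "/".
paths = [PurePosixPath(*p) for p in parts]
print([str(p) for p in paths])  # ['/a', '/a/1', '/a/1/c']
```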
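Details
Steps
prods: an itertools.product, which accepts iterables and creates the unique combinations (the Cartesian product) of their elements, much like a date picker dialog application. See the example below.
subsequence: just a wrapper around the powerset itertools recipe. It accepts a predicate, pred, used here to keep only the results whose first element starts with a slash, such as the entries in base.
result: the sorted, flattened set of subsequences generated for each product. You can join each element however you like. See Code - Applications.
Example
Here is the Cartesian product: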
>>> prods
[('/a', '1', 'c'),
('/a', '1', 'd'),
('/a', '2', 'c'),
('/a', '2', 'd'),
('/b', '1', 'c'),
('/b', '1', 'd'),
('/b', '2', 'c'),
('/b', '2', 'd')]
Without the predicate, unwanted subsequences are allowed:
>>> list(subsequence(prods[0]))
[('/a',),
('1',), # bad
('c',), # bad
('/a', '1'),
('/a', 'c'),
('1', 'c'), # bad
('/a', '1', 'c')]
So we filter out the unwanted elements with the predicate pred.
>>> list(subsequence(prods[0], pred=pred))
[('/a',), ('/a', '1'), ('/a', 'c'), ('/a', '1', 'c')]
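Note that the predicate only checks the first element, so ('/a', 'c') still passes, and result contains ('/a', 'c') and ('/a', 'd'), which are not in the expected output given in the question. If only contiguous leading pieces of each product are wanted, one possible variation is to take prefixes instead of the full powerset. This is a sketch under that assumption, and prefixes is a hypothetical helper, not an itertools recipe:

```python
import itertools as it

base = ["/a", "/b"]
subdirs = [["1", "2"], ["c", "d"]]


def prefixes(seq):
    """Yield every non-empty leading slice of seq: (a,), (a, b), (a, b, c), ..."""
    for end in range(1, len(seq) + 1):
        yield tuple(seq[:end])


prods = it.product(base, *subdirs)
# A set removes the duplicates shared between products, e.g. ('/a',).
prefix_result = sorted(set(it.chain.from_iterable(prefixes(p) for p in prods)))
print(["/".join(x) for x in prefix_result])
```

Because every prefix starts with an entry from base, no predicate is needed, and the sorted tuples come out in the depth-first order the question asks for.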