I want to take an arbitrary number of paths that describe nested tar archives and perform an operation on the innermost archive. The trouble is that the nesting can be arbitrarily deep, so the number of context managers I need is arbitrary too.

For example:

    from tarfile import TarFile

    ARCHIVE_PATH = "path/to/archive.tar"
    INNER_PATHS = (
        "nested/within/archive/one.tar",
        "nested/within/archive/two.tar",
        # Arbitrary number of these
    )

    def list_inner_contents(archive_path, inner_paths):
        with TarFile(archive_path) as tf1:
            with TarFile(fileobj=tf1.extractfile(inner_paths[0])) as tf2:
                with TarFile(fileobj=tf2.extractfile(inner_paths[1])) as tf3:
                    # ...arbitrary level of these!
                    return tfX.getnames()

    contents = list_inner_contents(ARCHIVE_PATH, INNER_PATHS)

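I can't use the with statement's nesting syntax because there could be any number of levels to nest. I can't use contextlib.nested because the documentation itself says:

> ...using nested() to open two files is a programming error as the first file will not be closed promptly if an exception is thrown when opening the second file.

Is there a way to do this with language constructs, or do I need to manually manage my own stack of open file objects?
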
Answer 0 (score: 4)

You could use recursion here. It feels like the most natural fit for this case (unless, of course, Python already has a dedicated construct for it):

    from tarfile import TarFile

    ARCHIVE_PATH = "path/to/archive.tar"
    INNER_PATHS = [
        "nested/within/archive/one.tar",
        "nested/within/archive/two.tar",
        # Arbitrary number of these
    ]

    def list_inner_contents(archive_path, inner_paths):
        def rec(tf, rest_paths):
            # Base case: no paths left, so tf is the innermost archive
            if not rest_paths:
                return tf.getnames()
            # Open the next nested archive and recurse while it is still open
            with TarFile(fileobj=tf.extractfile(rest_paths[0])) as tf2:
                return rec(tf2, rest_paths[1:])

        with TarFile(archive_path) as tf:
            try:
                return rec(tf, inner_paths)
            except RuntimeError:
                # We come here in case the inner_paths list is too long
                # and we go too deeply in the recursion
                return None
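
As a non-recursive alternative to the approach above, the "stack of open file objects" the question mentions can be handled by `contextlib.ExitStack` (available in Python 3.3 and later). This is only a minimal sketch, not part of the original answer. It assumes `tarfile.open` is acceptable in place of calling `TarFile` directly, and it chooses to raise `ValueError` when a path does not point at a regular file; both are illustrative decisions.

```python
import tarfile
from contextlib import ExitStack


def list_inner_contents(archive_path, inner_paths):
    """Return the member names of the innermost archive.

    Each entry of inner_paths names an archive nested inside the previous one.
    """
    with ExitStack() as stack:
        # Everything entered on the stack is closed in reverse order when the
        # with block exits, even if opening a later archive raises.
        tf = stack.enter_context(tarfile.open(archive_path))
        for inner_path in inner_paths:
            member = tf.extractfile(inner_path)
            if member is None:
                # extractfile() returns None for members that are not
                # regular files or links (a directory, for example).
                raise ValueError(f"{inner_path!r} is not a regular file")
            fileobj = stack.enter_context(member)
            tf = stack.enter_context(tarfile.open(fileobj=fileobj))
        return tf.getnames()
```

Because the loop replaces the recursion, the nesting depth is no longer bounded by the interpreter's recursion limit, and an empty `inner_paths` simply lists the outer archive.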