Question

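I have a legacy Python script that downloads the Boost libraries, then extracts and builds them.

On Windows the extraction step fails because the paths of some files in the Boost archive are too long. For example: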

IOError: [Errno 2] No such file or directory: 'C:\\<my_path>\\boost_1_57_0\\libs\\geometry\\doc\\html\\geometry\\reference\\spatial_indexes\\boost__geometry__index__rtree\\rtree_parameters_type_const____indexable_getter_const____value_equal_const____allocator_type_const___.html'

Is there a way to make tarfile's extractall skip every file with the .html extension?

Alternatively, is there a way to allow paths that exceed the Windows path-length limit (MAX_PATH, 260 characters)?

Answer 1

You can iterate over all the members of the tar archive and extract only those whose names do not end in ".html".
import os
import tarfile

def custom_files(members):
    for tarinfo in members:
        if os.path.splitext(tarinfo.name)[1] != ".html":
            yield tarinfo

tar = tarfile.open("sample.tar.gz")
tar.extractall(members=custom_files(tar))
tar.close()

The example code and more information about the module can be found in the Python tarfile documentation.
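Since the real problem is path length rather than the .html extension as such, a variation on the generator above could also skip any member whose extracted path would be too long. This is only a sketch: the names members_within_limit and extract_filtered are made up for illustration, and the default of 259 characters is an assumption based on MAX_PATH (260, including the terminating null), so adjust it if your setup differs.

```python
import os
import tarfile


def members_within_limit(tar, dest, max_len=259, skip_suffixes=(".html",)):
    """Yield only the members that look safe to extract under `dest`.

    max_len=259 is an assumed limit (MAX_PATH minus the terminating null);
    skip_suffixes is a tuple of file-name endings to leave out entirely.
    """
    dest = os.path.abspath(dest)
    for tarinfo in tar:
        # Skip by extension, as in the original answer.
        if tarinfo.name.endswith(skip_suffixes):
            continue
        # Skip anything whose full target path would exceed the limit.
        if len(os.path.join(dest, tarinfo.name)) > max_len:
            continue
        yield tarinfo


def extract_filtered(archive, dest=".", **kwargs):
    """Extract `archive` into `dest`, leaving out the filtered members."""
    with tarfile.open(archive) as tar:
        tar.extractall(path=dest,
                       members=members_within_limit(tar, dest, **kwargs))
```

For example, extract_filtered("sample.tar.gz", "build") would unpack everything except .html files and over-long paths. Skipped files are dropped silently here; logging them may be worthwhile if the build later depends on one of them.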

To overcome the path-length limit, see the Microsoft documentation (https://msdn.microsoft.com/en-us/library/aa365247(VS.85).aspx).
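The Microsoft page describes an extended-length prefix, \\?\, that lets many Windows file APIs accept paths longer than MAX_PATH. The helper below is a rough sketch of how such a prefix could be built; the function name to_extended_length is made up for illustration, and whether tarfile accepts the resulting path in your Python version is an assumption you would need to verify.

```python
import ntpath


def to_extended_length(path):
    """Return an absolute Windows path with the extended-length prefix.

    Hypothetical helper: it only builds the string and never touches
    the file system.
    """
    prefix = "\\\\?\\"
    if path.startswith(prefix):
        return path  # already prefixed
    if not ntpath.isabs(path):
        raise ValueError("an absolute Windows path is required")
    # Prefixed paths are not normalized by Windows, so turn "/" into "\"
    # and resolve ".." segments up front.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC share: \\server\share\... becomes \\?\UNC\server\share\...
        return prefix + "UNC\\" + path[2:]
    return prefix + path
```

On Windows one might then try tar.extractall(path=to_extended_length(os.path.abspath("build"))). On Python 2 the path may need to be a unicode string for the prefix to take effect, so treat this as something to test rather than a guaranteed fix.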

Python tarfile extractall, excluding files that match a string
