Different directory structures when unzipping the same file on Ubuntu and Windows

Time: 2020-02-07 19:24:21

Tags: python zip

I am trying to extract the contents of a zip file, which can be seen here:

https://www.geoboundaries.org/data/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1-all.zip

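On Ubuntu 18.04.04, using the "Extract" option in the right-click menu, I get a folder structure from this zip file that includes various empty folders and directories, plus additional parent folders. If I extract the same file with 7Zip (on Windows or on the same Linux box), I get the expected result of 6 files.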

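So, what is the difference here?

(Note that I already have a solution: shutil archive works. I am just trying to understand the different behavior.)

Here is the code (Python) currently used to build the problematic zip: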

import os
import zipfile

def zipdir(dirPath=None, zipFilePath=None, includeDirInZip=False, citeUsePath=False):
  # Default to "<dirPath>.zip" when no output path is given.
  if not zipFilePath:
    zipFilePath = dirPath + ".zip"
  if not os.path.isdir(dirPath):
    raise OSError("dirPath argument must point to a directory. "
            "'%s' does not." % dirPath)
  parentDir, dirToZip = os.path.split(dirPath)

  def trimPath(path):
    # Turn an on-disk path into the name stored in the archive.
    archivePath = path.replace(parentDir, "", 1)
    if parentDir:
      archivePath = archivePath.replace(os.path.sep, "", 1)
    if not includeDirInZip:
      archivePath = archivePath.replace(dirToZip + os.path.sep, "", 1)
    return os.path.normcase(archivePath)

  outFile = zipfile.ZipFile(zipFilePath, "w", compression=zipfile.ZIP_DEFLATED)
  for (archiveDirPath, dirNames, fileNames) in os.walk(dirPath):
    for fileName in fileNames:
      # Skip any file whose name matches the output archive's file name.
      if(not fileName == zipFilePath.split("/")[-1]):
        filePath = os.path.join(archiveDirPath, fileName)
        outFile.write(filePath, trimPath(filePath))

  # Add the citation/use file at the top level of the archive.
  outFile.write(citeUsePath, os.path.basename(citeUsePath))
  outFile.close()
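
For comparison, here is a minimal sketch of what a shutil-based version could look like. The question only says that "shutil archive works" and does not show that code, so the function name `zip_directory` and its parameters are assumptions made for illustration, and the citation file handling from `zipdir` is left out.

```python
import os
import shutil


def zip_directory(dir_path, zip_file_path=None):
    """Zip the contents of dir_path and return the path of the archive written.

    Hypothetical helper, not the asker's actual code.
    """
    if not os.path.isdir(dir_path):
        raise OSError("%r is not a directory" % dir_path)
    if zip_file_path is None:
        zip_file_path = dir_path.rstrip(os.sep) + ".zip"
    # shutil.make_archive appends ".zip" itself, so pass the name without it.
    if zip_file_path.endswith(".zip"):
        base_name = zip_file_path[:-len(".zip")]
    else:
        base_name = zip_file_path
    # root_dir makes the stored names relative to dir_path (no parent folders).
    return shutil.make_archive(base_name, "zip", root_dir=dir_path)
```

In this sketch the archive should be written outside `dir_path`, so that it does not end up including itself.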

1 Answer:

Answer 0: (score: 1)

The zip file geoBoundaries-2_0_0-NGA-ADM1-all.zip is non-standard.

On Linux, unzip thinks there are 5 files with no path components:

$ unzip -l geoBoundaries-2_0_0-NGA-ADM1-all.zip
Archive:  geoBoundaries-2_0_0-NGA-ADM1-all.zip
  Length      Date    Time    Name
---------  ---------- -----   ----
   374953  2020-01-15 21:04   geoBoundaries-2_0_0-NGA-ADM1-shp.zip
  1512980  2020-01-15 21:04   geoBoundaries-2_0_0-NGA-ADM1.geojson
      804  2020-01-15 21:04   geoBoundaries-2_0_0-NGA-ADM1-metaData.json
      750  2020-01-15 21:04   geoBoundaries-2_0_0-NGA-ADM1-metaData.txt
     4656  2020-01-15 21:04   CITATION-AND-USE-geoBoundaries-2_0_0.txt
---------                     -------
  1894143                     5 files

If I try to extract the contents, I get a lot of warnings:

$ unzip geoBoundaries-2_0_0-NGA-ADM1-all.zip
Archive:  geoBoundaries-2_0_0-NGA-ADM1-all.zip
geoBoundaries-2_0_0-NGA-ADM1-shp.zip:  mismatching "local" filename (release/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1-shp.zip),
         continuing with "central" filename version
  inflating: geoBoundaries-2_0_0-NGA-ADM1-shp.zip
geoBoundaries-2_0_0-NGA-ADM1.geojson:  mismatching "local" filename (release/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1.geojson),
         continuing with "central" filename version
  inflating: geoBoundaries-2_0_0-NGA-ADM1.geojson
geoBoundaries-2_0_0-NGA-ADM1-metaData.json:  mismatching "local" filename (release/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1-metaData.json),
         continuing with "central" filename version
  inflating: geoBoundaries-2_0_0-NGA-ADM1-metaData.json
geoBoundaries-2_0_0-NGA-ADM1-metaData.txt:  mismatching "local" filename (release/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1-metaData.txt),
         continuing with "central" filename version
  inflating: geoBoundaries-2_0_0-NGA-ADM1-metaData.txt
CITATION-AND-USE-geoBoundaries-2_0_0.txt:  mismatching "local" filename (tmp/CITATION-AND-USE-geoBoundaries-2_0_0.txt),
         continuing with "central" filename version
  inflating: CITATION-AND-USE-geoBoundaries-2_0_0.txt

Analysis

The details of each entry in a zip file (including the file name) are stored twice: once in a local-header, immediately before the compressed data, and again in a central-header at the end of the file. So for every file stored in a zip file there is a local-header / central-header pair of fields. The data in each pair should (mostly) be identical.

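In this case they are not.

For example, consider the central-header entry for geoBoundaries-2_0_0-NGA-ADM1-shp.zip. The matching local-header has release/geoBoundaries-2_0_0/NGA/ADM1/geoBoundaries-2_0_0-NGA-ADM1-shp.zip.

The same is true of every entry in this zip file.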
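
To see the two names side by side without relying on unzip's warnings, something like the following sketch could be used. It reads the central-header names through Python's `zipfile` module and then reads each local-header name straight from the raw bytes at the offset the central-header points to. The function name `header_names` is made up for this illustration, and the sketch assumes an unencrypted archive on local disk.

```python
import struct
import zipfile

# Fixed-size part of a local file header (30 bytes, APPNOTE.TXT section 4.3.7).
LOCAL_HEADER = struct.Struct("<4s5H3L2H")
LOCAL_SIGNATURE = b"PK\x03\x04"


def header_names(zip_path):
    """Return a list of (central_name, local_name) pairs, one per entry."""
    pairs = []
    with zipfile.ZipFile(zip_path) as zf, open(zip_path, "rb") as raw:
        for info in zf.infolist():  # entries as listed in the central directory
            raw.seek(info.header_offset)
            fields = LOCAL_HEADER.unpack(raw.read(LOCAL_HEADER.size))
            signature, flags, name_len = fields[0], fields[2], fields[9]
            if signature != LOCAL_SIGNATURE:
                raise zipfile.BadZipFile(
                    "no local header at offset %d" % info.header_offset)
            # Bit 11 of the general-purpose flags marks a UTF-8 name; else CP437.
            encoding = "utf-8" if flags & 0x800 else "cp437"
            local_name = raw.read(name_len).decode(encoding)
            pairs.append((info.filename, local_name))
    return pairs
```

Calling `header_names("geoBoundaries-2_0_0-NGA-ADM1-all.zip")` and printing the pairs where the two names differ should list the same mismatches that unzip warns about, assuming the file is in the current directory.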

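Given that this is a non-standard/invalid zip file, the behavior when unzipping depends on whether the unzip utility takes the file name from the local-header entry or from the equivalent data in the central-header.

It looks like Ubuntu is using the local-header fields (hence the extra release/... folders), while 7zip is using the central-header fields.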
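
As an illustration of the central-header choice, the sketch below extracts each entry under its central-header name and never looks at the name in the local-header. It reads the raw bytes by hand because, at least in CPython, `zipfile.ZipFile.open` raises `BadZipFile` when the two names differ. The function name `extract_using_central_names` is hypothetical, and the sketch only handles stored and deflated, unencrypted entries.

```python
import os
import struct
import zipfile
import zlib

# Fixed-size part of a local file header (30 bytes, APPNOTE.TXT section 4.3.7).
LOCAL_HEADER = struct.Struct("<4s5H3L2H")


def extract_using_central_names(zip_path, dest_dir):
    """Extract every file entry under its central-directory name.

    Hypothetical helper: stored and deflated, unencrypted entries only.
    """
    dest_root = os.path.realpath(dest_dir)
    written = []
    with zipfile.ZipFile(zip_path) as zf, open(zip_path, "rb") as raw:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Read the local header only to learn how many bytes to skip.
            raw.seek(info.header_offset)
            fields = LOCAL_HEADER.unpack(raw.read(LOCAL_HEADER.size))
            name_len, extra_len = fields[9], fields[10]
            raw.seek(name_len + extra_len, os.SEEK_CUR)
            payload = raw.read(info.compress_size)
            if info.compress_type == zipfile.ZIP_DEFLATED:
                payload = zlib.decompress(payload, -15)  # raw deflate stream
            elif info.compress_type != zipfile.ZIP_STORED:
                raise NotImplementedError(
                    "unsupported compression: %d" % info.compress_type)
            target = os.path.realpath(os.path.join(dest_root, info.filename))
            # Refuse names that would escape dest_dir (for example "../x").
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError("unsafe entry name: %r" % info.filename)
            os.makedirs(os.path.dirname(target), exist_ok=True)
            with open(target, "wb") as out:
                out.write(payload)
            written.append(target)
    return written
```

A tool that instead trusted the local-header names would create the release/geoBoundaries-2_0_0/NGA/ADM1/ and tmp/ folders seen in the warnings above.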

For reference, the specification for zip files is APPNOTE.TXT.