Question

是否存在现有方法，或者在将数据传递给ZipInputStream之前是否需要手动解析并跳过exe块？

Answer 1

在审核EXE file format和ZIP file format并测试各种选项之后，最简单的解决方案就是忽略第一个zip本地文件头的任何前导码。

Zip file layout

Zip local file header

我编写了一个输入流过滤器以绕过前导码并且它完美地运行：

ZipInputStream zis = new ZipInputStream(
    new WinZipInputStream(
    new FileInputStream("test.exe")));
while ((ze = zis.getNextEntry()) != null) {
    . . .
    zis.closeEntry();
}
zis.close();

<强> WinZipInputStream.java

import java.io.FilterInputStream;
import java.io.InputStream;
import java.io.IOException;

public class WinZipInputStream extends FilterInputStream {
    public static final byte[] ZIP_LOCAL = { 0x50, 0x4b, 0x03, 0x04 };
    protected int ip;
    protected int op;

    public WinZipInputStream(InputStream is) {
        super(is);
    }

    public int read() throws IOException {
        while(ip < ZIP_LOCAL.length) {
            int c = super.read();
            if (c == ZIP_LOCAL[ip]) {
                ip++;
            }
            else ip = 0;
        }

        if (op < ZIP_LOCAL.length)
            return ZIP_LOCAL[op++];
        else
            return super.read();
    }

    public int read(byte[] b, int off, int len) throws IOException {
        if (op == ZIP_LOCAL.length) return super.read(b, off, len);
        int l = 0;
        while (l < Math.min(len, ZIP_LOCAL.length)) {
            b[l++] = (byte)read();
        }
        return l;
    }
}

Answer 2

ZIP文件的优点在于它们的顺序结构：每个条目都是一组独立的字节，最后是中央目录索引，它列出了文件中的所有条目及其偏移量。

不好的是，java.util.zip.*类忽略了该索引，只是开始读入该文件并期望第一个条目是本地文件头块，这不是自解压ZIP存档的情况（这些从EXE部分开始）。

几年前，我编写了一个自定义ZIP解析器来提取依赖于CDI的单个ZIP条目（LFH +数据），以查找这些条目在文件中的位置。我刚刚检查过，它实际上可以列出自行提取的ZIP存档的条目，而不会给你带来偏移 - 所以你可以：

使用该代码在EXE部分之后找到第一个LFH，并将之后的所有复制到另一个File，然后将该File提供给{ 1}}到java.util.zip.ZipFile ：

编辑：只是跳过EXE部分似乎无法正常工作，ZipFile仍然无法读取它，我的原生ZIP程序会抱怨新的ZIP文件已损坏且确切我跳过的字节数被称为“缺失”（因此它实际上读取了CDI）。我想有些标题需要重写，所以下面给出的第二种方法看起来更有希望 - 或者

使用该代码进行完整的ZIP提取（类似于java.util.zip）;这需要一些额外的管道，因为代码最初不是作为替换ZIP库而是具有非常特定的用例（通过HTTP差异更新ZIP文件）

代码托管在SourceForge（project page，website）并在Apache License 2.0下获得许可，因此商业用途很好--AFAIK有一个商业游戏，使用它作为游戏资产的更新程序。

从ZIP文件中获取偏移量的有趣部分位于Indexer.parseZipFile，返回LinkedHashMap<Resource, Long>（因此第一个映射条目在文件中的偏移量最小）。这是我用来列出自解压ZIP存档的条目的代码（使用WinZIP SE创建者在Ubuntu上从acra版本文件创建）：

public static void main(String[] args) throws Exception { File archive = new File("/home/phil/downloads", "acra-4.2.3.exe"); Map<Resource, Long> resources = parseZipFile(archive); for (Entry<Resource, Long> resource : resources.entrySet()) { System.out.println(resource.getKey() + ": " + resource.getValue()); } }

除了包含所有头解析类的Indexer类和zip包之外，你可以删掉大部分代码。

Answer 3

某些自解压ZIP文件中存在伪本地文件标头标记。我认为最好向后扫描文件以查找 End Of Central Directory 记录。 EOCD 记录包含中央目录的偏移量， CD 包含第一个本地文件头的偏移量。如果您从本地文件标题的第一个字节开始读取 ZipInputStream正常工作。

显然，下面的代码不是最快的解决方案。如果要处理大型文件，则应实现某种缓冲或使用内存映射文件。

import org.apache.commons.io.EndianUtils;
...

public class ZipHandler {
    private static final byte[] EOCD_MARKER = { 0x06, 0x05, 0x4b, 0x50 };

    public InputStream openExecutableZipFile(Path zipFilePath) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(zipFilePath.toFile(), "r")) {
            long position = raf.length() - 1;
            int markerIndex = 0;
            byte[] buffer = new byte[4];
            while (position > EOCD_MARKER.length) {
                raf.seek(position);
                raf.read(buffer, 0 ,1);
                if (buffer[0] == EOCD_MARKER[markerIndex]) {
                    markerIndex++;
                } else {
                    markerIndex = 0;
                }
                if (markerIndex == EOCD_MARKER.length) {
                    raf.skipBytes(15);
                    raf.read(buffer, 0, 4);
                    int centralDirectoryOffset = EndianUtils.readSwappedInteger(buffer, 0);
                    raf.seek(centralDirectoryOffset);
                    raf.skipBytes(42);
                    raf.read(buffer, 0, 4);
                    int localFileHeaderOffset = EndianUtils.readSwappedInteger(buffer, 0);
                    return new SkippingInputStream(Files.newInputStream(zipFilePath), localFileHeaderOffset);
                }
                position--;
            }
            throw new IOException("No EOCD marker found");
        }
    }
}

public class SkippingInputStream extends FilterInputStream {
    private int bytesToSkip;
    private int bytesAlreadySkipped;

    public SkippingInputStream(InputStream inputStream, int bytesToSkip) {
        super(inputStream);
        this.bytesToSkip = bytesToSkip;
        this.bytesAlreadySkipped = 0;
    }

    @Override
    public int read() throws IOException {
        while (bytesAlreadySkipped < bytesToSkip) {
            int c = super.read();
            if (c == -1) {
                return -1;
            }
            bytesAlreadySkipped++;
        }
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (bytesAlreadySkipped == bytesToSkip) {
            return super.read(b, off, len);
        }
        int count = 0;
        while (count < len) {
            int c = read();
            if (c == -1) {
                break;
            }
            b[count++] = (byte) c;
        }
        return count;
    }
}

Answer 4

在这种情况下，TrueZip效果最佳。（至少在我的情况下）

自解压缩zip的格式为code1 header1 file1（普通zip格式为header1 file1）...代码告诉如何解压缩

虽然Truezip提取实用程序会抱怨额外的字节并引发异常

这是代码

 private void Extract(String src, String dst, String incPath) {
    TFile srcFile = new TFile(src, incPath);
    TFile dstFile = new TFile(dst);
    try {
        TFile.cp_rp(srcFile, dstFile, TArchiveDetector.NULL);
        } 
    catch (IOException e) {
       //Handle Exception
        }
}

您可以将此方法称为Extract（new String（“C：\ 2006Production.exe”），new String（“c：\”），“”）;

文件在c盘中解压缩......您可以对文件执行自己的操作。我希望这会有所帮助。

感谢。

如何从Java中读取Winzip自解压缩（exe）zip文件？

4 个答案: