How do I create a compressed Zip archive with ZipOutputStream so that ZipEntry's getSize() method returns the correct size?

Time: 2015-03-16 15:51:57

Tags: java java-8 zipfile zipinputstream zipoutputstream

Consider a code example that puts a single file, test_file.pdf, into the zip archive test.zip and then reads the archive back:

import java.io.*;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class Main {
    public static void main(String[] args) {
        File infile = new File("test_file.pdf");
        try (
                FileInputStream fis = new FileInputStream(infile);
                ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("test.zip"));
        ) {
            int bytesRead;
            byte[] buffer = new byte[1024];
            ZipEntry entry = new ZipEntry("data");
            entry.setSize(infile.length());

            zos.putNextEntry(entry);
            while ((bytesRead = fis.read(buffer)) >= 0)
            {
                zos.write(buffer, 0, bytesRead);
            }
            zos.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

        try (
                ZipInputStream zis = new ZipInputStream(new BufferedInputStream(
                        new FileInputStream(new File("test.zip"))));
        ) {
            ZipEntry entry = zis.getNextEntry();
            System.out.println("Entry size: " + entry.getSize());
            zis.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

Output: Entry size: -1

However, if an uncompressed zip archive is created (method ZipEntry.STORED), getSize() returns the correct size:

import java.io.*;
import java.util.zip.CRC32;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class Main {
    public static void main(String[] args) {
        File infile = new File("test_file.pdf");
        try (
                FileInputStream fis = new FileInputStream(infile);
                ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("test.zip"));
        ) {
            int bytesRead;
            byte[] buffer = new byte[1024];
            CRC32 crc = new CRC32();
            try (
                    BufferedInputStream bis = new BufferedInputStream(new FileInputStream(infile));
             ) {
                crc.reset();
                while ((bytesRead = bis.read(buffer)) != -1) {
                    crc.update(buffer, 0, bytesRead);
                }
            }
            ZipEntry entry = new ZipEntry("data");
            entry.setMethod(ZipEntry.STORED);
            entry.setCompressedSize(infile.length());
            entry.setSize(infile.length());
            entry.setCrc(crc.getValue());

            zos.putNextEntry(entry);
            while ((bytesRead = fis.read(buffer)) >= 0)
            {
                zos.write(buffer, 0, bytesRead);
            }
            zos.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

        try (
                ZipInputStream zis = new ZipInputStream(new BufferedInputStream(
                        new FileInputStream(new File("test.zip"))));
        ) {
            ZipEntry entry = zis.getNextEntry();
            System.out.println("Entry size: " + entry.getSize());
            zis.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

Output (an example value, but a correct one): Entry size: 9223192

Compressed zip archives that report a correct entry.getSize() do exist (for example, zip archives created by the Ark program).

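So the question is: how can I create a compressed zip archive (ZipEntry.DEFLATED, or another method if one exists) that returns the correct entry size, using only the standard library?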

I tried this recommendation, but it does not work either:

import java.io.*;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class Main {
    public static void main(String[] args) {
        File infile = new File("test_file.pdf");
        try (
                FileInputStream fis = new FileInputStream(infile);
                ZipOutputStream zos = new ZipOutputStream(new FileOutputStream("test.zip"));
        ) {
            int bytesRead;
            byte[] buffer = new byte[1024];
            ZipEntry entry = new ZipEntry("data");
            entry.setSize(infile.length());

            zos.putNextEntry(entry);
            while ((bytesRead = fis.read(buffer)) >= 0)
            {
                zos.write(buffer, 0, bytesRead);
            }
            zos.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

        try (
                ZipInputStream zis = new ZipInputStream(new BufferedInputStream(
                        new FileInputStream(new File("test.zip"))));
        ) {
            ZipEntry entry = zis.getNextEntry();
            byte[] buffer = new byte[1];
            zis.read(buffer);
            System.out.println("Entry size: " + entry.getSize());
            zis.closeEntry();

        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

Output: Entry size: -1

2 Answers:

Answer 0 (score: 4)

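You can set the uncompressed size only if you also set the CRC and the compressed size. This information is stored in a header that precedes the actual data, and ZipOutputStream cannot rewind an arbitrary OutputStream, so it cannot compute these values while writing and store them afterwards (it does compute them, though, to verify the values you provide).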
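As background to the header explanation: when the sizes are unknown up front, ZipOutputStream writes them after the entry data and again in the central directory at the end of the archive. ZipInputStream reads sequentially and only sees the local header, whereas java.util.zip.ZipFile reads the central directory. The following is a minimal sketch of that difference, assuming random access to the archive file is acceptable; the class and method names are hypothetical and chosen only for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class CentralDirectorySizeDemo {

    /** Writes one DEFLATED entry named "data" without setting size, CRC or compressed size. */
    static Path writeZip(byte[] payload) {
        try {
            Path zip = Files.createTempFile("demo", ".zip");
            zip.toFile().deleteOnExit();
            try (ZipOutputStream zos = new ZipOutputStream(Files.newOutputStream(zip))) {
                zos.putNextEntry(new ZipEntry("data"));
                zos.write(payload, 0, payload.length);
                zos.closeEntry();
            }
            return zip;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Size as seen by a sequential reader, which only has the local header at this point. */
    static long sizeViaStream(Path zip) {
        try (ZipInputStream zis = new ZipInputStream(Files.newInputStream(zip))) {
            return zis.getNextEntry().getSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Size as seen by ZipFile, which reads the central directory. */
    static long sizeViaZipFile(Path zip) {
        try (ZipFile zf = new ZipFile(zip.toFile())) {
            return zf.getEntry("data").getSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path zip = writeZip(new byte[10_000]);
        System.out.println("ZipInputStream: " + sizeViaStream(zip));
        System.out.println("ZipFile:        " + sizeViaZipFile(zip));
    }
}
```

With the 10,000-byte payload used in main, the sequential reader reports -1 and ZipFile reports 10000. This only helps when the archive is a file on disk; for a pure stream consumer the sizes still have to be in the local header.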

Here is a solution that computes the values in a separate pass before writing. It exploits the fact that a stream backed by a file can be rewound.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class Main {
    public static void main(String[] args) throws IOException {
        File infile  = new File("test_file.pdf");
        File outfile = new File("test.zip");
        try (FileInputStream  fis = new FileInputStream(infile);
             FileOutputStream fos = new FileOutputStream(outfile);
             ZipOutputStream  zos = new ZipOutputStream(fos) ) {

            byte[]  buffer = new byte[1024];
            ZipEntry entry = new ZipEntry("data");
            // Dry run: fills in size, compressed size and CRC, then rewinds the file.
            precalc(entry, fis.getChannel());
            zos.putNextEntry(entry);
            for(int bytesRead; (bytesRead = fis.read(buffer)) >= 0; )
                zos.write(buffer, 0, bytesRead);
            zos.closeEntry();
        }

        try(FileInputStream fin = new FileInputStream(outfile);
            ZipInputStream  zis = new ZipInputStream(fin) ) {

            ZipEntry entry = zis.getNextEntry();
            System.out.println("Entry size: " + entry.getSize());
            System.out.println("Compressed size: " + entry.getCompressedSize());
            System.out.println("CRC: " + entry.getCrc());
            zis.closeEntry();
        }
    }

    // Reads the file once to compute the CRC and, unless the entry is STORED,
    // the deflated size. The compressed bytes themselves are discarded.
    private static void precalc(ZipEntry entry, FileChannel fch) throws IOException {
        long uncompressed = fch.size();
        int method = entry.getMethod();
        CRC32 crc = new CRC32();
        Deflater def;
        byte[] drain;
        if(method != ZipEntry.STORED) {
            // Same settings as ZipOutputStream's default: default level, raw deflate (nowrap).
            def   = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
            drain = new byte[1024];
        }
        else {
            def   = null;
            drain = null;
        }
        // Fixed, non-empty buffer: a zero-capacity buffer would make read() return 0 forever.
        ByteBuffer buf = ByteBuffer.allocate(4096);
        for(int bytesRead; (bytesRead = fch.read(buf)) != -1; buf.clear()) {
            crc.update(buf.array(), buf.arrayOffset(), bytesRead);
            if(def!=null) {
                def.setInput(buf.array(), buf.arrayOffset(), bytesRead);
                while(!def.needsInput()) def.deflate(drain, 0, drain.length);
            }
        }
        entry.setSize(uncompressed);
        if(def!=null) {
            def.finish();
            while(!def.finished()) def.deflate(drain, 0, drain.length);
            entry.setCompressedSize(def.getBytesWritten());
            def.end(); // release the native zlib resources
        }
        entry.setCrc(crc.getValue());
        fch.position(0); // rewind so the caller can read the data again
    }
}

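It handles both uncompressed and compressed entries but, unfortunately, only at the default compression level, because ZipOutputStream has no method for querying the current level. So if you change the compression level, you have to keep the precalc code in sync. Alternatively, you could move the logic into a subclass of ZipOutputStream and use the same Deflater, so that it automatically has the same configuration.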
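The subclass idea could look roughly like the sketch below. It relies on the protected def field that ZipOutputStream inherits from DeflaterOutputStream, and it assumes DEFLATED entries only, entry data that is already in memory, and a compression level that is set before the first entry is written. PrecalcZipOutputStream, putEntry and roundTripSize are hypothetical names, not part of the standard library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical subclass that measures each entry with the stream's own Deflater. */
class PrecalcZipOutputStream extends ZipOutputStream {

    PrecalcZipOutputStream(OutputStream out) {
        super(out);
    }

    /** Writes a complete DEFLATED entry whose sizes and CRC are known before the header is written. */
    void putEntry(String name, byte[] data) throws IOException {
        closeEntry(); // no entry may be open while the shared Deflater is borrowed

        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);

        ZipEntry entry = new ZipEntry(name);
        entry.setSize(data.length);
        entry.setCrc(crc.getValue());
        entry.setCompressedSize(measure(data));

        putNextEntry(entry);
        write(data, 0, data.length);
        closeEntry();
    }

    /** Dry run on the inherited Deflater, so level and strategy match the real write. */
    private long measure(byte[] data) {
        byte[] drain = new byte[1024];
        def.reset();
        def.setInput(data, 0, data.length);
        def.finish();
        while (!def.finished()) {
            def.deflate(drain, 0, drain.length); // output is discarded, only the count matters
        }
        long compressed = def.getBytesWritten();
        def.reset(); // leave the Deflater clean for the real entry
        return compressed;
    }

    /** Builds an archive in memory and returns the size a sequential reader sees. */
    static long roundTripSize(byte[] data, int level) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (PrecalcZipOutputStream zos = new PrecalcZipOutputStream(bos)) {
                zos.setLevel(level);
                zos.putEntry("data", data);
            }
            try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return zis.getNextEntry().getSize();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Entry size: " + roundTripSize(new byte[5000], Deflater.BEST_COMPRESSION));
    }
}
```

If the precomputed compressed size ever disagreed with what the real write produces, closeEntry() would throw a ZipException, so a mismatch cannot go unnoticed.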

A solution that works with arbitrary source input streams would require buffering the entire entry data.
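A sketch of that buffering approach, assuming the entry fits in memory and the ZipOutputStream keeps its default compression level (the standalone Deflater below has to match it). BufferedEntryWriter, putBuffered and roundTripSize are hypothetical names used only for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.Deflater;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class BufferedEntryWriter {

    /** Buffers the whole stream, precomputes CRC and sizes, then writes one DEFLATED entry. */
    static void putBuffered(ZipOutputStream zos, String name, InputStream in) throws IOException {
        // 1. Read everything into memory, since the source cannot be rewound.
        ByteArrayOutputStream collected = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        for (int n; (n = in.read(chunk)) != -1; ) {
            collected.write(chunk, 0, n);
        }
        byte[] data = collected.toByteArray();

        // 2. CRC over the uncompressed bytes.
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);

        // 3. Dry-run deflate with default level and raw format to learn the compressed size.
        Deflater def = new Deflater(Deflater.DEFAULT_COMPRESSION, true);
        long compressed;
        try {
            byte[] drain = new byte[1024];
            def.setInput(data, 0, data.length);
            def.finish();
            while (!def.finished()) {
                def.deflate(drain, 0, drain.length);
            }
            compressed = def.getBytesWritten();
        } finally {
            def.end();
        }

        // 4. All three values are set, so they can go into the local header.
        ZipEntry entry = new ZipEntry(name);
        entry.setSize(data.length);
        entry.setCompressedSize(compressed);
        entry.setCrc(crc.getValue());
        zos.putNextEntry(entry);
        zos.write(data, 0, data.length);
        zos.closeEntry();
    }

    /** Writes the data through putBuffered and returns the size a sequential reader sees. */
    static long roundTripSize(byte[] data) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ZipOutputStream zos = new ZipOutputStream(bos)) {
                putBuffered(zos, "data", new ByteArrayInputStream(data));
            }
            try (ZipInputStream zis = new ZipInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return zis.getNextEntry().getSize();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Entry size: " + roundTripSize(new byte[12345]));
    }
}
```

The cost is that the data is held in memory and compressed twice, which is the trade-off for not being able to rewind the source.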

Answer 1 (score: -2)

Quick and dirty:

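import java.io.ByteArrayOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public static void main(String[] args) throws IOException 
{
    // Read the whole source file; available() followed by a single read()
    // is not guaranteed to return the complete content.
    byte[] buf = Files.readAllBytes( Paths.get( "source.txt" ) );
    ZipEntry e = new ZipEntry( "source.txt" );

    // Dry run against a throwaway archive before the real write.
    updateEntry(e, buf);

    try (ZipOutputStream zos = new ZipOutputStream( new FileOutputStream( "result.zip" ) )) {
        zos.putNextEntry(e);
        zos.write(buf);
        zos.closeEntry();
    }
}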

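// Writes the entry to a throwaway in-memory archive. The idea is that closeEntry()
// records size, compressed size and CRC on the ZipEntry object as a side effect.
private static void updateEntry(ZipEntry entry, byte[] buffer) throws IOException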
{
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    ZipOutputStream zos = new ZipOutputStream( bos );
    zos.putNextEntry(entry);
    zos.write(buffer);
    zos.closeEntry();
    zos.close();
    bos.close();
}