Not in GZIP format - Java

Date: 2015-12-03 17:06:55

Tags: java formatting gzip eofexception

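I am trying to write compressed data to a file, then read the data back and decompress it with the GZIP library. I have tried changing every encoding to StandardCharsets.UTF_8 and to ISO-8859-1, and neither fixed the GZIP format error. Could the problem be related to the file I am reading? Here is the compress function: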

public static byte[] compress(String originalFile, String compressFile) throws IOException {

    // read in data from text file
    // The name of the file to open.
    String fileName = originalFile;

    // This will reference one line at a time
    String line = null;
    String original = "";

    try {
        // FileReader reads text files in the default encoding.
        FileReader fileReader = 
            new FileReader(fileName);

        // Always wrap FileReader in BufferedReader.
        BufferedReader bufferedReader = 
            new BufferedReader(fileReader);

        while((line = bufferedReader.readLine()) != null) {
            original.concat(line);
        }   

        // Always close files.
        bufferedReader.close();         
    }
    catch(FileNotFoundException ex) {
        System.out.println(
            "Unable to open file '" + 
            fileName + "'");                
    }
    catch(IOException ex) {
        System.out.println(
            "Error reading file '" 
            + fileName + "'");                  
        // Or we could just do this: 
        // ex.printStackTrace();
    }


    // create a new output stream for original string
    try (ByteArrayOutputStream out = new ByteArrayOutputStream())
    {
        try (GZIPOutputStream gzip = new GZIPOutputStream(out))
        {
            gzip.write(original.getBytes(StandardCharsets.UTF_8));
        }
        byte[] compressed = out.toByteArray();
        out.close();

        String compressedFileName = compressFile;

        try {
            // Assume default encoding.
            FileWriter fileWriter =
                new FileWriter(compressedFileName);

            // Always wrap FileWriter in BufferedWriter.
            BufferedWriter bufferedWriter =
                new BufferedWriter(fileWriter);

            // Note that write() does not automatically
            // append a newline character.
            String compressedStr = compressed.toString();
            bufferedWriter.write(compressedStr);

            // Always close files.
            bufferedWriter.close();
        }
        catch(IOException ex) {
            System.out.println(
                "Error writing to file '"
                + fileName + "'");
            // Or we could just do this:
            // ex.printStackTrace();
        }
        return compressed;
    }
}

(I get the error at the following line of the decompress function):

GZIPInputStream compressedByteArrayStream = new GZIPInputStream(new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8)));

The decompress function:

 public static String decompress(String file) throws IOException {

    byte[] compressed = {};
    String s = "";

    File fileName = new File(file);
    FileInputStream fin = null;
    try {
        // create FileInputStream object
        fin = new FileInputStream(fileName);

        // Reads up to certain bytes of data from this input stream into an array of bytes.
        fin.read(compressed);
        //create string from byte array
        s = new String(compressed);
        System.out.println("File content: " + s);
    }
    catch (FileNotFoundException e) {
        System.out.println("File not found" + e);
    }
    catch (IOException ioe) {
        System.out.println("Exception while reading file " + ioe);
    }
    finally {
        // close the streams using close method
        try {
            if (fin != null) {
                fin.close();
            }
        }
        catch (IOException ioe) {
            System.out.println("Error while closing stream: " + ioe);
        }
    }


    // create a new input string for compressed byte array
    GZIPInputStream compressedByteArrayStream = new GZIPInputStream(new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8)));
    ByteArrayOutputStream byteOutput = new ByteArrayOutputStream();

    byte[] buffer = new byte[8192];

    // create a string builder and byte reader for the compressed byte array
    BufferedReader decompressionBr = new BufferedReader(new InputStreamReader(compressedByteArrayStream, StandardCharsets.UTF_8));
    StringBuilder decompressionSb = new StringBuilder();

    // write data to decompressed string
    String line1;
    while((line1 = decompressionBr.readLine()) != null) {
        decompressionSb.append(line1);
    }
    decompressionBr.close();

    int len;
    String uncompressedStr = "";
    while((len = compressedByteArrayStream.read(buffer)) > 0) {
        uncompressedStr = byteOutput.toString();
    }

    compressedByteArrayStream.close();  
    return uncompressedStr;
}

This is the error message I get:

[B@7852e922
File content: 
java.io.EOFException
    at java.util.zip.GZIPInputStream.readUByte(GZIPInputStream.java:268)
    at java.util.zip.GZIPInputStream.readUShort(GZIPInputStream.java:258)
    at java.util.zip.GZIPInputStream.readHeader(GZIPInputStream.java:164)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:79)
    at java.util.zip.GZIPInputStream.<init>(GZIPInputStream.java:91)
    at org.kingswoodoxford.Compression.decompress(Compression.java:136)
    at org.kingswoodoxford.Compression.main(Compression.java:183)

Any suggestions on how I can fix this?

1 answer:

Answer 0 (score: 0)

When you read the file with readLine(), you discard the newline at the end of each line.
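As a small illustration of that point, the sketch below joins lines the same way the question's loop does. The class and method names (`ReadLineDemo`, `joinWithReadLine`) are made up for this example and are not part of the question's code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReadLineDemo {
    // Hypothetical helper: reads 'text' line by line and appends each line
    // without re-adding the terminator that readLine() strips.
    static String joinWithReadLine(String text) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader br = new BufferedReader(new StringReader(text))) {
            String line;
            while ((line = br.readLine()) != null) {
                sb.append(line); // the "\n" that ended this line is already gone
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }
}
```

Calling `ReadLineDemo.joinWithReadLine("a\nb\n")` gives `"ab"`, so a file compressed this way could not be restored with its original line breaks.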

A more efficient option is to copy one block at a time, i.e. a char[]. You can also convert the text as you go instead of building a String or a byte[].

BTW, original.concat(line); returns the concatenated string, which you are discarding, so original stays empty.
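A minimal sketch of the difference, assuming nothing beyond the standard `String` and `StringBuilder` classes. The names `ConcatDemo`, `discardResult` and `keepResult` are hypothetical.

```java
class ConcatDemo {
    // Mirrors the question's loop: the value returned by concat() is thrown away.
    static String discardResult(String... lines) {
        String original = "";
        for (String line : lines) {
            original.concat(line); // returns a new String; 'original' is unchanged
        }
        return original;
    }

    // Keeps the text by appending to a StringBuilder instead.
    static String keepResult(String... lines) {
        StringBuilder original = new StringBuilder();
        for (String line : lines) {
            original.append(line);
        }
        return original.toString();
    }
}
```

`discardResult("a", "b")` returns an empty string, while `keepResult("a", "b")` returns `"ab"`.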

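The real problem is that you write to one stream and close a different one. This means that if there is any buffered data at the end of the file (which is highly likely), the end of the file will be truncated, and when you read it back it will complain either that your file is incomplete or with an EOFException.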
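To make these failure modes concrete, here is a hedged sketch that feeds three kinds of damaged input to `GZIPInputStream` and reports which exception comes back. All names (`GzipFailureDemo`, `gzip`, `outcome`, `truncated`, `throughString`) are invented for illustration. The empty-input case appears to match the stack trace in the question, where the constructor fails inside `readHeader` after "File content:" printed nothing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipFailureDemo {
    // Produces a complete GZIP stream; closing the GZIPOutputStream writes the trailer.
    static byte[] gzip(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(out)) {
            gz.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Reads the stream to the end; returns "ok" or the exception's simple class name.
    static String outcome(byte[] data) {
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[8192];
            while (in.read(buf) != -1) {
                // discard the decompressed bytes; only the outcome matters here
            }
            return "ok";
        } catch (IOException e) {
            return e.getClass().getSimpleName();
        }
    }

    // Simulates a file whose end was never written.
    static byte[] truncated(byte[] data) {
        return Arrays.copyOf(data, data.length / 2);
    }

    // Simulates pushing binary data through a String and back.
    static byte[] throughString(byte[] data) {
        return new String(data, StandardCharsets.UTF_8).getBytes(StandardCharsets.UTF_8);
    }
}
```

With `byte[] good = GzipFailureDemo.gzip("hello")`, `outcome(good)` is `"ok"`, `outcome(new byte[0])` and `outcome(truncated(good))` are both `"EOFException"`, and `outcome(throughString(good))` is `"ZipException"`, which is the exception type that carries the "Not in GZIP format" message.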

Here is a shorter example:

public static void compress(String originalFile, String compressFile) throws IOException {
    char[] buffer = new char[8192];
    try (
            FileReader reader = new FileReader(originalFile);
            Writer writer = new OutputStreamWriter(
                    new GZIPOutputStream(new FileOutputStream(compressFile)));
    ) {
        for (int len; (len = reader.read(buffer)) > 0; )
            writer.write(buffer, 0, len);
    }
}

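In decompress, don't encode the binary data as text and expect to get the same data back. It will almost certainly be corrupted. Try using a buffer and a loop, just as I did for compress; it shouldn't be any more complicated than that.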
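One possible shape for such a decompress method is sketched below, next to a matching compress so the pair can be tried together. This is an illustration, not the answerer's code: the class name `GzipTextFiles`, the `copy` and `roundTrip` helpers, and the choice of explicit UTF-8 on both sides are assumptions made for this example (the compress method above uses the platform default encoding).

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

class GzipTextFiles {
    // Text file in, GZIP file out. Assumption: UTF-8 text.
    static void compress(String originalFile, String compressFile) throws IOException {
        try (Reader reader = Files.newBufferedReader(Paths.get(originalFile), StandardCharsets.UTF_8);
             Writer writer = new OutputStreamWriter(
                     new GZIPOutputStream(new FileOutputStream(compressFile)), StandardCharsets.UTF_8)) {
            copy(reader, writer);
        }
    }

    // GZIP file in, text file out. The compressed bytes never pass through a String.
    static void decompress(String compressFile, String restoredFile) throws IOException {
        try (Reader reader = new InputStreamReader(
                     new GZIPInputStream(new FileInputStream(compressFile)), StandardCharsets.UTF_8);
             Writer writer = Files.newBufferedWriter(Paths.get(restoredFile), StandardCharsets.UTF_8)) {
            copy(reader, writer);
        }
    }

    // Copies one char[] block at a time until the reader is exhausted.
    private static void copy(Reader reader, Writer writer) throws IOException {
        char[] buffer = new char[8192];
        for (int len; (len = reader.read(buffer)) > 0; ) {
            writer.write(buffer, 0, len);
        }
    }

    // Hypothetical helper for trying the pair out with temporary files.
    static String roundTrip(String text) {
        try {
            Path original = Files.createTempFile("original", ".txt");
            Path packed = Files.createTempFile("packed", ".gz");
            Path restored = Files.createTempFile("restored", ".txt");
            try {
                Files.write(original, text.getBytes(StandardCharsets.UTF_8));
                compress(original.toString(), packed.toString());
                decompress(packed.toString(), restored.toString());
                return new String(Files.readAllBytes(restored), StandardCharsets.UTF_8);
            } finally {
                Files.deleteIfExists(original);
                Files.deleteIfExists(packed);
                Files.deleteIfExists(restored);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because both methods copy blocks rather than lines, `GzipTextFiles.roundTrip("line 1\nline 2\n")` returns the input unchanged, newlines included. The try-with-resources blocks close the same streams that were written to, so the GZIP trailer is flushed before the file is read back.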