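Unzip a *.docx file in memory without writing to disk - Java

Time: 2015-06-01 11:38:35

Tags: java zip extract unzip in-memory

I want to unzip a *.docx file in memory without writing the output to disk. I found the following implementation, but it only lets me read the compressed files, not see the directory structure. Knowing where each file sits in the directory tree is very important to me. Can anyone point me in the right direction?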

private static void UnzipFileInMemory() {
    // try-with-resources closes the ZipFile even if an exception is thrown
    try (ZipFile zf = new ZipFile("d:\\a.docx")) {
        for (Enumeration<? extends ZipEntry> e = zf.entries(); e.hasMoreElements();) {
            ZipEntry entry = e.nextElement();
            System.out.println(entry);
            // The entry's data stream is closed automatically, so no null check is needed
            try (InputStream in = zf.getInputStream(entry)) {
                // the entry's content would be read from "in" here
            } catch (IOException ex) {
                //Logger.getLogger(Tester.class.getName()).log(Level.SEVERE, null, ex);
            }
        }
    } catch (IOException ex) {
        //Logger.getLogger(Tester.class.getName()).log(Level.SEVERE, null, ex);
    }
}

3 Answers:

Answer 0: (score: 1)

Use ZipInputStream: zEntry in this example gives you the file location.

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class unzip {

    public static void main(String[] args) {

        String filePath = "D:/Tmp/Tmp.zip";
        String oPath = "D:/Tmp/";

        new unzip().unzipFile(filePath, oPath);
    }

    public void unzipFile(String filePath, String oPath) {

        // try-with-resources closes the streams even when an exception is thrown
        try (ZipInputStream zipIs = new ZipInputStream(
                new BufferedInputStream(new FileInputStream(filePath)))) {
            ZipEntry zEntry;
            byte[] buffer = new byte[8192];
            while ((zEntry = zipIs.getNextEntry()) != null) {
                // zEntry.getName() is the entry's full path inside the archive
                System.out.println(zEntry.getName());

                File outFile = new File(oPath, zEntry.getName());
                if (zEntry.isDirectory()) {
                    outFile.mkdirs();
                    continue;
                }
                // Make sure the parent directories exist before writing the file
                File parent = outFile.getParentFile();
                if (parent != null) {
                    parent.mkdirs();
                }

                // Copy the bytes of the current entry to the output file
                try (FileOutputStream fos = new FileOutputStream(outFile)) {
                    int len;
                    while ((len = zipIs.read(buffer)) != -1) {
                        fos.write(buffer, 0, len);
                    }
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
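
Since the question asks for extraction without touching the disk, the same ZipInputStream loop can collect each entry into a byte array instead of a FileOutputStream. The following is only a minimal sketch under that assumption; the class name InMemoryUnzip and the method readAll are hypothetical and not part of the answer above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper; the names are illustrative, not from the original answer.
class InMemoryUnzip {

    /**
     * Reads every file entry of a zip stream into memory.
     * The map key is the entry's full path inside the archive,
     * for example "word/document.xml".
     */
    static Map<String, byte[]> readAll(InputStream zipped) throws IOException {
        Map<String, byte[]> files = new LinkedHashMap<>();
        try (ZipInputStream zipIs = new ZipInputStream(zipped)) {
            ZipEntry entry;
            byte[] buffer = new byte[8192];
            while ((entry = zipIs.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directory entries carry no data
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                int n;
                // read() returns -1 at the end of the current entry
                while ((n = zipIs.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                files.put(entry.getName(), out.toByteArray());
            }
        }
        return files;
    }
}
```

A call such as InMemoryUnzip.readAll(new FileInputStream("d:\\a.docx")) would return the archive's files keyed by their path, so the directory of each file can be read off the key.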

Answer 1: (score: 0)

You can mount a zip-format file as a virtual file system (FileSystem). Java already has a protocol handler for jar:file://..., so you have to prepend "jar:" to File.toURI().

URI docxUri = ,,, // "jar:file:/C:/... .docx"
Map<String, String> zipProperties = new HashMap<>();
zipProperties.put("encoding", "UTF-8");
try (FileSystem zipFS = FileSystems.newFileSystem(docxUri, zipProperties)) {
    Path documentXmlPath = zipFS.getPath("/word/document.xml");

Now you can use Files.delete() and Files.copy() between the real disk file system and the zip.
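
To see the directory structure the question asks about, the zip file system can be walked like any other file tree. This is a rough sketch, assuming the .docx exists on disk at the given path; the class ZipTreeLister and the method listTree are made-up names for illustration.

```java
import java.io.IOException;
import java.net.URI;
import java.nio.file.FileSystem;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; the names are illustrative only.
class ZipTreeLister {

    /** Returns the path of every directory and file inside the archive, sorted. */
    static List<String> listTree(Path docx) throws IOException {
        // Prepend "jar:" to the file URI so the zip file system provider handles it
        URI zipUri = URI.create("jar:" + docx.toUri());
        try (FileSystem zipFS = FileSystems.newFileSystem(zipUri, new HashMap<String, String>());
             Stream<Path> paths = Files.walk(zipFS.getPath("/"))) {
            // Collect inside the try block: the stream is lazy and the file system closes afterwards
            return paths.map(Path::toString).sorted().collect(Collectors.toList());
        }
    }
}
```

For a typical .docx the result would contain paths such as /word/document.xml. Nothing is extracted to disk; the entries are only read through the virtual file system.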

When working with the XML:

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();

factory.setNamespaceAware(true);
DocumentBuilder builder = factory.newDocumentBuilder();

Document doc = builder.parse(Files.newInputStream(documentXmlPath));
//Element root = doc.getDocumentElement();

You can then use XPath to find the locations you need, and write the XML back again.
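
The XPath step and the write-back could look roughly like the sketch below. It assumes the text of interest sits in WordprocessingML w:t elements and matches them by local name to avoid setting up a NamespaceContext; the class DocxTextEditor and the method replaceText are hypothetical. A placeholder that Word has split across several runs would not be found this way.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper; the names are illustrative only.
class DocxTextEditor {

    /** Replaces a placeholder inside text elements and returns the serialized XML. */
    static byte[] replaceText(byte[] documentXml, String placeholder, String value) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(documentXml));

        // Select every element whose local name is "t" (w:t in WordprocessingML)
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList texts = (NodeList) xpath.evaluate("//*[local-name()='t']", doc, XPathConstants.NODESET);
        for (int i = 0; i < texts.getLength(); i++) {
            String text = texts.item(i).getTextContent();
            if (text.contains(placeholder)) {
                texts.item(i).setTextContent(text.replace(placeholder, value));
            }
        }

        // Serialize the modified DOM back to bytes
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toByteArray();
    }
}
```

The returned bytes could then be written back to documentXmlPath with Files.write, as in the placeholder example in this answer.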

It may even be that you do not need to parse the XML at all and can simply replace placeholders:

byte[] content = Files.readAllBytes(documentXmlPath);
String xml = new String(content, StandardCharsets.UTF_8);
xml = xml.replace("#DATE#", "2014-09-24");
xml = xml.replace("#NAME#", StringEscapeUtils.escapeXml("Sniper"));
...
content = xml.getBytes(StandardCharsets.UTF_8);
Files.delete(documentXmlPath);
Files.write(documentXmlPath, content);

For quick development, rename a copy of the .docx to a name with the .zip file extension and inspect the files.

Answer 2: (score: 0)

Just add a file check inside the loop:

if (!entry.isDirectory()) // Alternatively: if(entry.getName().contains("."))
    System.out.println(entry);
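
Because entry.getName() already holds the full path inside the archive, using "/" as the separator, the directory tree can be rebuilt from the names alone. Below is a small sketch of that idea; the class ZipEntryTree and its methods are hypothetical names, not part of any answer here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the names are illustrative only.
class ZipEntryTree {

    /** Returns the directory part of an entry name, or "" for entries at the archive root. */
    static String parentOf(String entryName) {
        int slash = entryName.lastIndexOf('/');
        return slash < 0 ? "" : entryName.substring(0, slash + 1);
    }

    /** Groups file names by the directory that contains them. */
    static Map<String, List<String>> groupByDirectory(Iterable<String> entryNames) {
        Map<String, List<String>> tree = new TreeMap<>();
        for (String name : entryNames) {
            if (name.endsWith("/")) {
                // Explicit directory entry: register it even if it holds no files
                tree.computeIfAbsent(name, k -> new ArrayList<>());
                continue;
            }
            String dir = parentOf(name);
            tree.computeIfAbsent(dir, k -> new ArrayList<>()).add(name.substring(dir.length()));
        }
        return tree;
    }
}
```

In the question's loop, the names would come from entry.getName() for each ZipEntry; collecting them into a list and passing it to groupByDirectory gives one map key per directory.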