Apache POI: writing a large Excel file to disk

Time: 2019-09-12 09:33:06

Tags: java excel performance apache-poi out-of-memory

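I need to open, modify, and write an Excel file. The modification step is fairly complex, so I have left it out, since it is not where the problem lies. The problem occurs when the file is written to disk. It starts at a file size of about 7 MB (regardless of whether the cells hold formulas or values).

I prepared the following MCVE: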

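import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

import org.apache.poi.EncryptedDocumentException;
import org.apache.poi.openxml4j.exceptions.InvalidFormatException;
import org.apache.poi.openxml4j.util.ZipSecureFile;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public static void main(String[] args) throws EncryptedDocumentException, IOException, InvalidFormatException {
    String filePath = "C:\\temp";
    String outputFilePath = "C:\\temp\\test";
    // Disable POI's zip-bomb check, because the test file compresses extremely well.
    ZipSecureFile.setMinInflateRatio(0);
    File f = new File(filePath, "Test.xlsx");
    // XSSFWorkbook loads the whole workbook into memory.
    try (XSSFWorkbook workBook = new XSSFWorkbook(f)) {
        System.out.println("writing file");
        File outputFile = new File(outputFilePath, f.getName());
        try (FileOutputStream fos = new FileOutputStream(outputFile)) {
            workBook.write(fos);
        }
        // No explicit workBook.close() here: try-with-resources closes the workbook.
    }
    System.out.println("fin");
}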

This code alone already reproduces the problem. The exact stack trace is:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3236)
    at java.io.ByteArrayOutputStream.toByteArray(ByteArrayOutputStream.java:191)
    at org.apache.poi.openxml4j.opc.internal.MemoryPackagePartOutputStream.flush(MemoryPackagePartOutputStream.java:76)
    at org.apache.poi.openxml4j.opc.internal.MemoryPackagePartOutputStream.close(MemoryPackagePartOutputStream.java:51)
    at org.apache.poi.xssf.usermodel.XSSFSheet.commit(XSSFSheet.java:3575)
    at org.apache.poi.ooxml.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:462)
    at org.apache.poi.ooxml.POIXMLDocumentPart.onSave(POIXMLDocumentPart.java:467)
    at org.apache.poi.ooxml.POIXMLDocument.write(POIXMLDocument.java:236)
    at test.TestWriteOriginalWorkbook.main(TestWriteOriginalWorkbook.java:25)

Although the trace itself varies from file to file, the exception is always the same.
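The top two frames of the trace above are `ByteArrayOutputStream.toByteArray` calling `Arrays.copyOf`. As a minimal sketch of what that JDK call does (it says nothing about POI's internals beyond what the trace shows), the following checks that `toByteArray()` returns a newly allocated copy, so the internal buffer and the copy exist in memory at the same time. The class and method names are hypothetical and chosen only for illustration.

```java
import java.io.ByteArrayOutputStream;

class ToByteArrayCopyDemo {
    /**
     * Returns true if toByteArray() hands back a fresh array on each call
     * instead of exposing the stream's internal buffer.
     * size must be at least 1.
     */
    static boolean returnsIndependentCopy(int size) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        buffer.write(new byte[size], 0, size); // fill the stream with zero bytes

        byte[] first = buffer.toByteArray();   // copy number one
        byte[] second = buffer.toByteArray();  // copy number two
        first[0] = 42;                         // change only the first copy

        // Different array objects, same length, and the second copy is unaffected.
        return first != second && first.length == size && second[0] == 0;
    }

    public static void main(String[] args) {
        System.out.println("independent copy: " + returnsIndependentCopy(1024));
    }
}
```

If the buffered data is large, that extra copy is one more allocation of the same size on top of the buffer itself, which is consistent with the `OutOfMemoryError` being raised inside `Arrays.copyOf`.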

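As far as I know, the only way to solve this is to increase the memory available to the application. That is possible, but I would like to avoid it. I just checked MaxHeapSize, and its default is about 260 MB, which seems quite low. So, if necessary, it could be increased to 1 GB.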
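One way to confirm the heap limit from inside the running program, rather than from JVM flags, is `Runtime.maxMemory()`. This is a small sketch; the class and method names are hypothetical, and the value reported depends on the JVM and on any `-Xmx` setting in effect.

```java
class HeapInfo {
    /** Maximum heap the JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Running it once without options and once with `-Xmx1g` shows whether the larger limit is actually being applied to the process.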

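I ran the code with -Xmx1g and still got an OutOfMemoryError, but with a different cause: this time it is "GC overhead limit exceeded", and it is thrown while the workbook is being loaded (in the XSSFWorkbook constructor) rather than while it is being written: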

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3045)
    at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.attr(Cur.java:3065)
    at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3198)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.startElement(AbstractSAXParser.java:509)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.scanStartElement(XMLNSDocumentScannerImpl.java:374)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2784)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:602)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:112)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:505)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:841)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:770)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
    at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3414)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1272)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1259)
    at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.WorksheetDocument$Factory.parse(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFSheet.read(XSSFSheet.java:227)
    at org.apache.poi.xssf.usermodel.XSSFSheet.onDocumentRead(XSSFSheet.java:219)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.parseSheet(XSSFWorkbook.java:452)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.onDocumentRead(XSSFWorkbook.java:417)
    at org.apache.poi.ooxml.POIXMLDocument.load(POIXMLDocument.java:184)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:286)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:323)
    at test.TestWriteOriginalWorkbook.main(TestWriteOriginalWorkbook.java:21)
Cleaning up unclosed ZipFile for archive C:\temp\Test.xlsx

To summarize the question: what can I do to improve the performance of workBook.write(fos); so that it can handle files larger than 7 MB (ideally 15 MB)?


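Test.xlsx is an Excel file that contains only values. I created it by filling the entire first column, and the second column up to row 443269, with the value 1.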
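To put the file size in perspective, the number of cells described above can be worked out directly. This sketch assumes that "the entire first column" means all 1,048,576 rows an .xlsx sheet allows; the class and method names are hypothetical.

```java
class CellCountEstimate {
    /** Maximum number of rows in an .xlsx worksheet. */
    static final long MAX_XLSX_ROWS = 1_048_576L;

    /** Total populated cells, given the number of filled rows in columns A and B. */
    static long totalCells(long rowsInColumnA, long rowsInColumnB) {
        return rowsInColumnA + rowsInColumnB;
    }

    public static void main(String[] args) {
        // Column A is full; column B is filled up to row 443269.
        System.out.println(totalCells(MAX_XLSX_ROWS, 443_269L)); // prints 1491845
    }
}
```

Under that assumption the roughly 7 MB file holds about 1.5 million cells, each of which becomes at least one object once the whole workbook is loaded into memory.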

0 answers:

No answers