Appending to a workbook using POI SXSSFWorkbook

Time: 2015-08-07 20:45:32

Tags: java out-of-memory apache-poi batch-processing fileupdate

I need to append rows to a sheet in a workbook. I am using org.apache.poi.xssf.streaming.SXSSFWorkbook, but I cannot achieve a low memory footprint. Here is the code:

import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelHelper {
    public static void createExcelFileWithLowMemFootprint(
            ArrayList<HashMap<String, Object>> data,
            ArrayList<String> fieldNames, String fileName, int rowNum) {
        try {
            if (rowNum == 0) {
                // Creating a new workbook and writing the top heading here
                SXSSFWorkbook workbook = new SXSSFWorkbook(1000);
                Sheet worksheet = workbook.createSheet("Sheet 1");
                int i = 0;
                Iterator<String> it0 = fieldNames.iterator();
                Row row = worksheet.createRow(i);
                int j = 0;
                while (it0.hasNext()) {
                    Cell cell = row.createCell(j);
                    String fieldName = it0.next();
                    cell.setCellValue(fieldName);
                    j++;
                }
                rowNum++;
                FileOutputStream fileOut = new FileOutputStream(fileName);
                workbook.write(fileOut);
                fileOut.flush();
                fileOut.close();
            }
            InputStream fileIn = new BufferedInputStream(new FileInputStream(
                    fileName), 1000);
            SXSSFWorkbook workbook = new SXSSFWorkbook(
                    new XSSFWorkbook(fileIn), 1000);
            Sheet worksheet = workbook.getSheetAt(0);
            Iterator<HashMap<String, Object>> it = data.iterator();
            int i = rowNum;
            while (it.hasNext()) {
                Row row = worksheet.createRow(i);
                int j = 0;
                HashMap<String, Object> rowContent = it.next();
                Iterator<String> it1 = fieldNames.iterator();
                while (it1.hasNext()) {
                    Cell cell = row.createCell(j);
                    String key = it1.next();
                    Object o = rowContent.get(key);
                    if (o instanceof String) {
                        cell.setCellValue((String) o);
                    } else if (o instanceof Double) {
                        cell.setCellType(Cell.CELL_TYPE_NUMERIC);
                        cell.setCellValue((Double) o);
                    }
                    j++;
                }
                i++;
            }
            fileIn.close();
            FileOutputStream fileOut = new FileOutputStream(fileName);
            workbook.write(fileOut);
            fileOut.flush();
            fileOut.close();
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

I pass the content in batches (to save JVM memory) and append to the file by incrementing the variable rowNum.

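As I understand it, when I reopen the file with

SXSSFWorkbook workbook = new SXSSFWorkbook(new XSSFWorkbook(fileIn), 1000);

the XSSFWorkbook constructor loads the complete file back into memory, which causes the GC overhead limit to be exceeded.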

I went through http://poi.apache.org/spreadsheet/how-to.html but could not find a suitable solution for my use case.

Can you suggest how to fix this so that rows can be appended to the workbook with a low memory footprint?

1 Answer:

Answer 0: (score: 0)

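SXSSFWorkbook does not need to be written out and then reloaded to manage memory well. Just write all the data in one pass. If you load the whole workbook through XSSFWorkbook, it is held entirely in memory, whereas when everything is written in one pass, SXSSF keeps only a window of rows in memory and moves the rest to temporary storage on disk. Also, a window of 1000 rows in the constructor can be a lot on some machines. If needed, try 100 or another lower number in the constructor instead of 1000.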
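
A minimal sketch of the "write everything in one pass" approach described above, not a drop-in replacement for the question's method. It assumes all batches can be produced during a single run, so one SXSSFWorkbook stays open until the last row is written. The class name SinglePassExcelWriter, the method writeAll, and the Iterable-of-batches parameter are hypothetical names chosen for illustration; the window size of 100 follows the suggestion in the answer.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.List;
import java.util.Map;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.xssf.streaming.SXSSFWorkbook;

public class SinglePassExcelWriter {

    /**
     * Hypothetical helper: writes the header and every batch into a single
     * SXSSFWorkbook and saves the file once at the end.
     * Returns the number of rows written, header included.
     */
    public static int writeAll(Iterable<List<Map<String, Object>>> batches,
                               List<String> fieldNames,
                               String fileName) throws IOException {
        // Keep at most 100 rows in memory; older rows are flushed to a temp file.
        SXSSFWorkbook workbook = new SXSSFWorkbook(100);
        try {
            Sheet sheet = workbook.createSheet("Sheet 1");
            int rowNum = 0;

            // Header row with the field names.
            Row header = sheet.createRow(rowNum++);
            for (int col = 0; col < fieldNames.size(); col++) {
                header.createCell(col).setCellValue(fieldNames.get(col));
            }

            // The workbook is never reloaded between batches, so no
            // XSSFWorkbook has to be built from the existing file.
            for (List<Map<String, Object>> batch : batches) {
                for (Map<String, Object> rowContent : batch) {
                    Row row = sheet.createRow(rowNum++);
                    for (int col = 0; col < fieldNames.size(); col++) {
                        Object value = rowContent.get(fieldNames.get(col));
                        Cell cell = row.createCell(col);
                        if (value instanceof String) {
                            cell.setCellValue((String) value);
                        } else if (value instanceof Double) {
                            cell.setCellValue(((Double) value).doubleValue());
                        }
                        // Other value types are left as blank cells, as in the question.
                    }
                }
            }

            // Single write at the very end.
            try (OutputStream out = new FileOutputStream(fileName)) {
                workbook.write(out);
            }
            return rowNum;
        } finally {
            // Delete the temporary files that back the flushed rows.
            workbook.dispose();
        }
    }
}
```

Because batches is an Iterable, the caller can supply an implementation that fetches each batch lazily, so only the current batch plus the 100-row window needs to be in memory at any time. If the rows really have to be added across separate runs of the program, this sketch does not apply, since it never reopens an existing file.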