How do I read a large Excel file in Java using POI?

Time: 2019-09-25 05:17:14

Tags: java excel apache-poi

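I have been reading the earlier answers to this question and trying to write code based on them. So far nothing has helped. Every approach ends in a GC out-of-memory error.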

I am running Eclipse 2019-09 on a Windows 10 PC with 12 GB of RAM. Inside Eclipse I have deployed a WildFly 12 server. I have JDK 1.8.0_152. In the WildFly launch line I changed the option -Xmx512m to -Xmx2048m.
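Since the heap limit is central here, it may be worth confirming that -Xmx2048m actually reached the JVM that runs the servlet, because a server started from an IDE can pick up its VM arguments from more than one place. The following is a minimal sketch using only `Runtime.getRuntime().maxMemory()`; the class name `HeapCheck` and the method `maxHeapMiB` are hypothetical names introduced for illustration.

```java
// Hypothetical helper (not part of the original code): reports the heap
// ceiling of the JVM it runs in, so the effect of -Xmx can be checked.
class HeapCheck {

    /** Maximum heap the running JVM will attempt to use, in MiB. */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMiB() + " MiB");
    }
}
```

Called from inside `doGet`, for example as `out.println("<p>Max heap: " + HeapCheck.maxHeapMiB() + " MiB</p>");`, it reports the limit of the server JVM rather than of Eclipse itself. The reported value can be somewhat lower than the -Xmx setting, depending on the garbage collector.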

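I tried plain POI. I also added monitorjbl. Then I stripped everything else out of my code, but I still cannot read the large file. My XLSX has 12 sheets with 500,000 rows each. Every row has the same format: 1 serial number, 2 (string) IDs made of digits and dashes, and one (string) name column.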

The program ran for about 12 minutes before raising the memory error.

import com.monitorjbl.xlsx.StreamingReader;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.annotation.WebServlet;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;

/**
 * Servlet implementation class StreamImport
 */
@WebServlet(description = "Import data using XLSX stream", urlPatterns = { "/StreamImport" })
public class StreamImport extends HttpServlet {
    private static final long serialVersionUID = 1L;

    /**
     * @see HttpServlet#HttpServlet()
     */
    public StreamImport() {
        super();
        // TODO Auto-generated constructor stub
    }

    /**
     * @see HttpServlet#doGet(HttpServletRequest request, HttpServletResponse response)
     */
    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        final String strFile = "C:\\Users\\Hussain\\Downloads\\2019819515128395ATL_IT1.xlsx";
        int iRows = 0;

        PrintWriter out = response.getWriter();
        out.println("<html><body>");
        out.println("<h1>Importing Using Stream</h1>");
        out.println("<p>Input Excel File: " + strFile + "</p>");

        // Open the file and create the streaming Workbook for the .xlsx file;
        // try-with-resources closes both even if reading fails part-way through
        try (FileInputStream fisFile = new FileInputStream(new File(strFile));
             Workbook workbook = StreamingReader.builder()
                     .rowCacheSize(100)    // number of rows to keep in memory (defaults to 10)
                     .bufferSize(12000)    // buffer size to use when reading InputStream to file (defaults to 1024)
                     .open(fisFile)) {     // InputStream or File for XLSX file (required)

            // Get first/desired sheet from the workbook
            Sheet sheet = workbook.getSheetAt(0);

            // Only count the rows; no cell values are read
            for (Row row : sheet) {
                iRows++;
            }
        }
        out.println("Read " + iRows + " rows <br>");
        out.println("</body></html>");
    }
}

Some answers suggest that monitorjbl is no longer needed because SXSSF does its own streaming. Others say SXSSF is for writing only.
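For context, the POI documentation describes SXSSF as a streaming write API, while its low-memory read path is the XSSF event API (XSSFReader with SAX). To check whether the file itself can be walked in roughly constant memory, independent of either library, the sketch below swaps in a plain JDK technique: it treats the XLSX as a ZIP archive and counts `<row>` elements in each worksheet part with StAX. It is not POI code. The class name `XlsxRowCounter` is hypothetical, and the `xl/worksheets/*.xml` layout is an assumption that holds for files written by Excel but is not guaranteed for every producer.

```java
import java.io.FileInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical diagnostic class: counts rows per worksheet without building a workbook model.
class XlsxRowCounter {

    /** Returns the number of row elements per worksheet part, keyed by ZIP entry name. */
    static Map<String, Long> countRows(InputStream xlsx) throws IOException, XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Worksheet XML needs neither DTDs nor external entities
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        Map<String, Long> counts = new LinkedHashMap<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                // Assumed layout: worksheets live under xl/worksheets/ (relationship files end in .rels)
                if (!name.startsWith("xl/worksheets/") || !name.endsWith(".xml")) {
                    continue;
                }
                // Shield the ZIP stream so the XML parser cannot close it between entries
                InputStream shield = new FilterInputStream(zip) {
                    @Override
                    public void close() {
                        // intentionally left open; the try-with-resources above closes the ZIP stream
                    }
                };
                XMLStreamReader xml = factory.createXMLStreamReader(shield);
                long rows = 0;
                while (xml.hasNext()) {
                    if (xml.next() == XMLStreamConstants.START_ELEMENT && "row".equals(xml.getLocalName())) {
                        rows++;
                    }
                }
                xml.close();
                counts.put(name, rows);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.out.println("Usage: java XlsxRowCounter <file.xlsx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            countRows(in).forEach((sheet, rows) -> System.out.println(sheet + ": " + rows + " rows"));
        }
    }
}
```

The sketch never touches the shared strings part, so it only counts rows and cannot return cell text. If it finishes quickly on the 12-sheet file, that would suggest the memory pressure comes from how the workbook is being loaded rather than from the size of the sheet XML alone.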

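Edit: Yes, as I said, I know this is a duplicate question. First, I do not know whether the monitorjbl solution has been superseded by SXSSF. Second, after following the other solutions I still cannot get rid of the memory error.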

0 answers:

No answers