How to convert a huge .csv file to Excel using POI

Asked: 2014-02-11 11:59:49

Tags: java excel csv apache-poi

I have Java code that converts CSV to xlsx. It works fine for small files. Now I have a CSV file with 200,000 records, and I get an out-of-memory error during the conversion.

I tried changing the workbook to SXSSFWorkbook and increasing the Java heap size to -Xms8G -Xmx10G. Even that did not work. I tried it on a UNIX machine.
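A quick sanity check is whether the -Xms/-Xmx flags actually reach the JVM that runs the converter (for example, they have no effect if a wrapper script drops them). The following is only a minimal sketch using the standard `Runtime` API; the class name `HeapCheck` and the helper `toMiB` are made up for illustration and are not part of the code in this question.

```java
class HeapCheck {
    /** Converts a byte count to whole mebibytes (hypothetical helper). */
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory(): the most heap the JVM will attempt to use (governed by -Xmx).
        System.out.println("max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory(): heap currently reserved by the JVM (starts near -Xms).
        System.out.println("total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory(): unused portion of the currently reserved heap.
        System.out.println("free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Running it with the same flags as the converter, e.g. `java -Xms8G -Xmx10G HeapCheck`, should report a max heap close to 10 GiB if the flags are being honored.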

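While searching, I came across some code that uses BigGridDemo. Can anyone help me customize it to read a .csv file and then write xlsx using its logic, or suggest any other solution?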

     try
     {
     FileReader input = new FileReader(strFileToConvert);
     BufferedReader bufIn = new BufferedReader(input);
     if(strExcel_Typ.equals("-xls"))
     {
       wbAll = new HSSFWorkbook();
     }
     else
     {
        wbAll = new SXSSFWorkbook(100);
        wbAll.setCompressTempFiles(true);
     }
     sheetAll     = wbAll.createSheet();
     rowAll       = null;
     cellAll      = null;
     shoRowNumAll = 0;

     // do buffered reading from a file
     while ((line = bufIn.readLine()) != null)
     {
        intCntr++;
        //if there is any data in the line
        if (line.length() > 0)
        {
            System.out.println(shoRowNumAll);
           //create a new row on the spreadsheet
           rowAll = sheetAll.createRow((int)shoRowNumAll);
           shoRowNumAll++;

           //indexOf returns 0 when the line starts with a quote, so compare against -1
           if (line.indexOf("\"", 0) > -1)
           {
              if (intCntr == 1)
              {
                 //only issue the message the first time quotes are found
                 System.out.println("Double quotes found.  Stripping double quotes from file");
              }

              line = line.replaceAll("\"", "");
           }

           //if its the first row and no delimiters found, there is a problem
           if (line.indexOf(strDelim, 0) == -1  && intCntr == 1)
           {
              System.exit(1);
           }

           processLine(line);
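           //flushRows only exists on the streaming sheet; the cast would fail for -xls (HSSF)
           if (sheetAll instanceof SXSSFSheet)
           {
              ((SXSSFSheet) sheetAll).flushRows(100);
           }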
        }
     }
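     bufIn.close();

     //write the excel file
     try
     {
        String file = strOutPutFile;
        ExcelOutAll = new FileOutputStream(file);
        wbAll.write(ExcelOutAll);
     }
     catch (IOException e)
     {
        System.err.println(e.toString());
     }

     ExcelOutAll.close();
     }
     //closes the outer try opened at the top of this snippet
     catch (IOException e)
     {
        System.err.println(e.toString());
     }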

The processLine method:

void processLine(String line)
{

  //find the first next delimiter starting in position 0
  intNxtComma = line.indexOf(strDelim, 0);
   while (intCurPosInLine < line.trim().replaceAll(",","").length())
  {
     strCellContent = line.substring((intCurPosInLine), intNxtComma);

     //create a new cell on the new row
     cellAll = rowAll.createCell(intCellNum);


     //set the font defaults
     Font font_couriernew_10 = wbAll.createFont();
     font_couriernew_10.setFontHeightInPoints((short)10);
     font_couriernew_10.setFontName("Courier New");

    CellStyle cellStyle = wbAll.createCellStyle();

     //if its the first row, center the text
     if (shoRowNumAll == 1)
     {
        cellStyle.setAlignment(CellStyle.ALIGN_CENTER);
        font_couriernew_10.setBoldweight(XSSFFont.BOLDWEIGHT_BOLD);
     }

     cellStyle.setFont(font_couriernew_10);


     // if the col. needs to be numeric, set the cell format to number
     if ((strNumericCols.indexOf(Integer.toString(intCellNum), 0) > -1) && (intCntr > 1))
     {
        DataFormat datafrmt = wbAll.createDataFormat();
        cellStyle.setDataFormat(datafrmt.getFormat("$#,##0.00"));

     }

     cellAll.setCellStyle(cellStyle);



     //populate the cell
     if ((strNumericCols.indexOf(Integer.toString(intCellNum), 0) > -1) && (intCntr > 1))
     {
        //if the col. needs to be numeric populate with a number
         if(strCellContent != null &&  !"".equals(strCellContent.trim())){
        douCellContent = Double.parseDouble(strCellContent.replaceAll(",",""));
        cellAll.setCellValue(douCellContent);
         }
     }
     else
     {
        cellAll.setCellValue(strCellContent.trim());
     }

     intCellNum++;


     intCurPosInLine = intNxtComma + 1;

     //if we dont find anymore delimiters, set the variable to the line length
     if (line.indexOf(strDelim, intCurPosInLine) == -1)
     {
        intNxtComma = line.trim().length();
     }
     else
     {
        intNxtComma = line.indexOf(strDelim, intNxtComma + 1);
     }
  }
}
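Note that the code above strips every double quote from the line before splitting on the delimiter, so a quoted field that itself contains the delimiter (such as "1,234.50") ends up spread across several cells. A quote-aware split could look roughly like the sketch below. It uses only the standard library; `CsvLineSplitter` and `split` are hypothetical names, not part of the original code, and it assumes a quoted field never spans more than one line.

```java
import java.util.ArrayList;
import java.util.List;

class CsvLineSplitter {
    /**
     * Splits one CSV line on the delimiter, treating delimiters inside
     * double-quoted fields as data. A doubled quote ("") inside a quoted
     * field is read as a single literal quote.
     */
    static List<String> split(String line, char delim) {
        List<String> fields = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"') {
                    if (i + 1 < line.length() && line.charAt(i + 1) == '"') {
                        cur.append('"'); // escaped quote inside a quoted field
                        i++;             // skip the second quote of the pair
                    } else {
                        inQuotes = false; // closing quote
                    }
                } else {
                    cur.append(c); // delimiters in here are plain data
                }
            } else if (c == '"') {
                inQuotes = true; // opening quote
            } else if (c == delim) {
                fields.add(cur.toString()); // field boundary
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        fields.add(cur.toString()); // last field on the line
        return fields;
    }

    public static void main(String[] args) {
        // Prints one field per line so embedded commas are easy to see.
        for (String field : split("id,\"1,234.50\",\"Smith, J\"", ',')) {
            System.out.println(field);
        }
    }
}
```

Each returned field would then map to one cell, which replaces the manual intCurPosInLine / intNxtComma bookkeeping in processLine.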

0 answers:

No answers