Question

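I am using Apache POI to process .xlsx files.

I have two .xlsx files, part.xlsx and full.xlsx,

and they have the same structure.

Each record (a Row object in POI) has three columns: name, age, location.

part.xlsx has nearly 5,000 rows and full.xlsx has 40,000 rows.

Now I want to extract the rows from full.xlsx that have the same values as rows in part.xlsx.

For example:

part.xlsx: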

Name age location
kk   23  USA
bb   24  England
......

full.xlsx:

Name age location
kk   23  USA
bb   24  England
xx   25  USA
......

Now I want to extract the 'kk' and 'bb' rows and save them to a new file.

This is the code:

List<User> usersInpart=new ArrayList<User>();
List<Row> rows_to_be_saved=new ArrayList<Row>();

//read the part.xlsx and save them.
FileInputStream fis_part=new FileInputStream("part.xlsx");
Workbook wb_part=WorkbookFactory.create(fis_part);
Sheet st_part=wb_part.getSheetAt(0);
for(Row row : st_part){
    if(row.getRowNum()==0) continue; //skip the first row(the title)
    User u=new User();
    u.setName(row.getCell(0).getRichStringCellValue().getString().trim());
    u.setAge(row.getCell(1).getNumericCellValue());
    u.setLocation(row.getCell(2).getRichStringCellValue().getString().trim());
    usersInpart.add(u);
}
fis_part.close();


//read the full.xlsx

FileInputStream fis_full=new FileInputStream("full.xlsx");
Workbook wb_full=WorkbookFactory.create(fis_full);
Sheet st_full=wb_full.getSheetAt(0);
for(Row row : st_full){
    if(row.getRowNum()==0) continue; //skip the first row(the title)

    String name=row.getCell(0).getRichStringCellValue().getString().trim();
    double age=row.getCell(1).getNumericCellValue();
    String location=row.getCell(2).getRichStringCellValue().getString().trim();

    for(User u : usersInpart){
        if(u.getName().equals(name) && u.getAge()==age && u.getLocation().equals(location)){
            rows_to_be_saved.add(row);
            break; //stop at the first match so the same row is not added twice
        }
    }
}
fis_full.close();

//write the selected rows to file

//fis_full is already closed, so start from an empty workbook
//(org.apache.poi.xssf.usermodel.XSSFWorkbook)
Workbook wb_res=new XSSFWorkbook();
Sheet st_res=wb_res.createSheet();

    int i=0;
    for (Row row : rows_to_be_saved) {
        Row rw=st_res.createRow(i);

        int k=0;
        for (Cell cell : row) {
            switch (cell.getCellType()) {
                case Cell.CELL_TYPE_STRING:
                    rw.createCell(k).setCellValue(cell.getRichStringCellValue().getString());
                    break;
                case Cell.CELL_TYPE_NUMERIC:
                    if (DateUtil.isCellDateFormatted(cell)) {
                        rw.createCell(k).setCellValue(cell.getDateCellValue());
                    } else {
                        rw.createCell(k).setCellValue(cell.getNumericCellValue());
                    }
                    break;
                case Cell.CELL_TYPE_BOOLEAN:
                    rw.createCell(k).setCellValue(cell.getBooleanCellValue());
                    break;
                case Cell.CELL_TYPE_FORMULA:
                    rw.createCell(k).setCellValue(cell.getCellFormula());
                    break;
                default:
            }
            k++;
        }
        i++;
    }
//save the wb_res
FileOutputStream out=new FileOutputStream("xx.xlsx");
wb_res.write(out);
out.close();

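Now I wonder: is there a better way to save the file?

I already have the selected rows in "rows_to_be_saved".

I create a new sheet "st_res". Is there any way to save these rows to "st_res" directly? Right now I create each row again from the rows in "rows_to_be_saved".

So there are two lists of rows, which I think is a waste of memory.

Any suggestions?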

Answer 1

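If memory usage is a concern, you can use the XSSF Event Model to read full.xlsx, which saves considerably more memory. You are currently loading the whole 40,000-row file into memory, whereas the event model keeps only one row in memory at a time.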
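
To illustrate the idea, here is a minimal sketch that uses only the JDK's SAX parser, not POI itself. It assumes a simplified, made-up sheet XML in which every cell is a plain `<c>` element inside a `<row>`; a real .xlsx sheet keeps strings in a separate shared strings table, and POI's event API (`XSSFReader` in `org.apache.poi.xssf.eventusermodel`) is what hands the real sheet XML to a SAX handler. The class name `StreamingRowFilter` and its methods `key` and `filter` are hypothetical. The sketch also puts the rows of part.xlsx into a `HashSet` of keys, so each row of the large file needs one lookup rather than a loop over all 5,000 users.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical illustration of event-style reading; this is not POI's API.
class StreamingRowFilter {

    // Joins the three columns into one lookup key (NUL cannot occur in XML text).
    static String key(String name, String age, String location) {
        return name + "\u0000" + age + "\u0000" + location;
    }

    // Streams the rows of the simplified sheet XML and keeps only rows whose key is wanted.
    static List<List<String>> filter(String sheetXml, Set<String> wanted) throws Exception {
        List<List<String>> matches = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private List<String> currentRow; // the only unmatched row held at any time
            private final StringBuilder text = new StringBuilder();

            @Override
            public void startElement(String uri, String localName, String qName, Attributes atts) {
                if (qName.equals("row")) {
                    currentRow = new ArrayList<>();
                } else if (qName.equals("c")) {
                    text.setLength(0); // start collecting a new cell value
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (qName.equals("c")) {
                    currentRow.add(text.toString().trim());
                } else if (qName.equals("row")) {
                    if (currentRow.size() == 3
                            && wanted.contains(key(currentRow.get(0), currentRow.get(1), currentRow.get(2)))) {
                        matches.add(currentRow);
                    }
                    currentRow = null; // non-matching rows become garbage immediately
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(sheetXml)), handler);
        return matches;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<sheet>"
                + "<row><c>Name</c><c>age</c><c>location</c></row>"
                + "<row><c>kk</c><c>23</c><c>USA</c></row>"
                + "<row><c>bb</c><c>24</c><c>England</c></row>"
                + "<row><c>xx</c><c>25</c><c>USA</c></row>"
                + "</sheet>";
        Set<String> wanted = new HashSet<>(); // keys built from part.xlsx
        wanted.add(key("kk", "23", "USA"));
        wanted.add(key("bb", "24", "England"));
        System.out.println(filter(xml, wanted)); // prints [[kk, 23, USA], [bb, 24, England]]
    }
}
```

Only the matching rows are retained, so memory use grows with the size of the result rather than with the 40,000 rows of full.xlsx.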
