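I am reading an Excel file from Java code using the POI library. So far so good, but now I have a new requirement. The Excel file contains many records (for example, 1000 rows) and has column headers in the first row. I am applying an Excel filter to it. Say I have a "Year" column and I filter for all rows where year = 2019. That leaves 15 rows.
Question: I want to process only these 15 rows in my Java code. Is there any method in the POI library to determine whether the row being read is filtered out (or, the other way around, not filtered out)?
Thanks.
I already have working code, but now I am looking for a way to read only the filtered rows. Apart from searching the library and forums, I have not tried anything else.
The code below is inside a method. I am not used to formatting on Stack Overflow, so please ignore any formatting issues.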
// For storing data into CSV files
StringBuffer data = new StringBuffer();
try {
    SimpleDateFormat dtFormat = new SimpleDateFormat(CommonConstants.YYYY_MM_DD); // "yyyy-MM-dd"
    String doubleQuotes = "\"";
    FileOutputStream fos = new FileOutputStream(outputFile);
    // Get the workbook object for XLSX file
    XSSFWorkbook wBook = new XSSFWorkbook(new FileInputStream(inputFile));
    wBook.setMissingCellPolicy(Row.RETURN_BLANK_AS_NULL);
    // Get first sheet from the workbook
    //XSSFSheet sheet = wBook.getSheetAt(0);
    XSSFSheet sheet = wBook.getSheet(CommonConstants.METADATA_WORKSHEET);
    //Row row;
    //Cell cell;
    // Iterate through each rows from first sheet
    int rows = sheet.getLastRowNum();
    int totalRows = 0;
    int colTitelNumber = 0;
    Row firstRowRecord = sheet.getRow(1);
    for (int cn = 0; cn < firstRowRecord.getLastCellNum(); cn++) {
        Cell cellObj = firstRowRecord.getCell(cn);
        if(cellObj != null) {
            String str = cellObj.toString();
            if(CommonConstants.COLUMN_TITEL.equalsIgnoreCase(str)) {
                colTitelNumber = cn;
                break;
            }
        }
    }
    // Start with row Number 1. We don't need 0th number row as it is for Humans to read but not required for processing.
    for (int rowNumber = 1; rowNumber <= rows; rowNumber++) {
        StringBuffer rowData = new StringBuffer();
        boolean skipRow = false;
        Row rowRecord = sheet.getRow(rowNumber);
        if (rowRecord == null) {
            LOG.error("Empty/Null record found");
        } else {
            for (int cn = 0; cn < rowRecord.getLastCellNum(); cn++) {
                Cell cellObj = rowRecord.getCell(cn);
                if(cellObj == null) {
                    if(cn == colTitelNumber) {
                        skipRow = true;
                        break; // The first column cell value is empty/null. Which means Titel column cell doesn't have value so don't add this row in csv.
                    }
                    rowData.append(CommonConstants.CSV_SEPARTOR);
                    continue;
                }
                switch (cellObj.getCellType()) {
                    case Cell.CELL_TYPE_BOOLEAN:
                        rowData.append(cellObj.getBooleanCellValue() + CommonConstants.CSV_SEPARTOR);
                        //LOG.error("Boolean:" + cellObj.getBooleanCellValue());
                        break;
                    case Cell.CELL_TYPE_NUMERIC:
                        if (DateUtil.isCellDateFormatted(cellObj)) {
                            Date date = cellObj.getDateCellValue();
                            rowData.append(dtFormat.format(date).toString() + CommonConstants.CSV_SEPARTOR);
                            //LOG.error("Date:" + cellObj.getDateCellValue());
                        } else {
                            rowData.append(cellObj.getNumericCellValue() + CommonConstants.CSV_SEPARTOR);
                            //LOG.error("Numeric:" + cellObj.getNumericCellValue());
                        }
                        break;
                    case Cell.CELL_TYPE_STRING:
                        String cellValue = cellObj.getStringCellValue();
                        // If string contains double quotes then replace it with pair of double quotes.
                        cellValue = cellValue.replaceAll(doubleQuotes, doubleQuotes + doubleQuotes);
                        // If string contains comma then surround the string with double quotes.
                        rowData.append(doubleQuotes + cellValue + doubleQuotes + CommonConstants.CSV_SEPARTOR);
                        //LOG.error("String:" + cellObj.getStringCellValue());
                        break;
                    case Cell.CELL_TYPE_BLANK:
                        rowData.append("" + CommonConstants.CSV_SEPARTOR);
                        //LOG.error("Blank:" + cellObj.toString());
                        break;
                    default:
                        rowData.append(cellObj + CommonConstants.CSV_SEPARTOR);
                }
            }
            if(!skipRow) {
                rowData.append("\r\n");
                data.append(rowData); // Appending one entire row to main data string buffer.
                totalRows++;
            }
        }
    }
    pTransferObj.put(CommonConstants.TOTAL_ROWS, (totalRows));
    fos.write(data.toString().getBytes());
    fos.close();
    wBook.close();
} catch (Exception ex) {
    LOG.error("Exception Caught while generating CSV file", ex);
}
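Answer 0 (score: 1)
All rows that are not visible in the sheet have zero height. So if you only need to read the visible rows, you can check each row with Row.getZeroHeight.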
Example
Sheet: (image not included)
Code:
import java.io.FileInputStream;
import org.apache.poi.ss.usermodel.*;

class ReadExcelOnlyVisibleRows {

    public static void main(String[] args) throws Exception {
        Workbook workbook = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx"));
        DataFormatter dataFormatter = new DataFormatter();
        CreationHelper creationHelper = workbook.getCreationHelper();
        FormulaEvaluator formulaEvaluator = creationHelper.createFormulaEvaluator();
        Sheet sheet = workbook.getSheetAt(0);
        for (Row row : sheet) {
            if (!row.getZeroHeight()) { // if row.getZeroHeight() is true then this row is not visible
                for (Cell cell : row) {
                    String cellContent = dataFormatter.formatCellValue(cell, formulaEvaluator);
                    System.out.print(cellContent + "\t");
                }
                System.out.println();
            }
        }
        workbook.close();
    }
}
Result:
F1 F2 F3 F4
V2 2 2-Mai FALSE
V4 4 4-Mai FALSE
V2 6 6-Mai FALSE
V4 8 8-Mai FALSE
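For an index-based loop like the one in the question, the same check can be applied row by row. The following is only a minimal sketch under some assumptions: the class name VisibleRowLoop and the helper visibleRowIndexes are hypothetical, and the sample workbook is built in memory, with setZeroHeight(true) standing in for what an Excel filter does to the rows it hides. Note that getZeroHeight cannot tell a row hidden by a filter from a row hidden manually; both simply report as hidden.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.ss.usermodel.Sheet;
import org.apache.poi.ss.usermodel.Workbook;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class VisibleRowLoop {

    /**
     * Hypothetical helper: collects the indexes of rows, starting at
     * firstDataRow, that exist in the sheet and are not hidden.
     */
    static List<Integer> visibleRowIndexes(Sheet sheet, int firstDataRow) {
        List<Integer> visible = new ArrayList<>();
        for (int rowNumber = firstDataRow; rowNumber <= sheet.getLastRowNum(); rowNumber++) {
            Row row = sheet.getRow(rowNumber);
            if (row == null) {
                continue; // row was never created in the file
            }
            if (row.getZeroHeight()) {
                continue; // row is hidden (by a filter or manually)
            }
            visible.add(rowNumber);
        }
        return visible;
    }

    public static void main(String[] args) throws Exception {
        try (Workbook workbook = new XSSFWorkbook()) {
            Sheet sheet = workbook.createSheet("Data");
            sheet.createRow(0).createCell(0).setCellValue("Year"); // header row

            int[] years = {2018, 2019, 2020, 2019};
            for (int i = 0; i < years.length; i++) {
                Row row = sheet.createRow(i + 1);
                row.createCell(0).setCellValue(years[i]);
                // Simulate a filter on year = 2019: hide every other row.
                row.setZeroHeight(years[i] != 2019);
            }

            // Only the rows holding 2019 remain visible.
            System.out.println(visibleRowIndexes(sheet, 1)); // prints [2, 4]
        }
    }
}
```

In the question's code, the equivalent change would be to skip the row inside the existing for loop when rowRecord.getZeroHeight() returns true, right after the null check.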
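Answer 1 (score: 0)
You have to use the auto filter provided by the Apache POI library, and also set the freeze pane. I have provided a short code snippet below that you can adapt.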
XSSFSheet sheet = wBook.getSheet(CommonConstants.METADATA_WORKSHEET);
sheet.setAutoFilter(new CellRangeAddress(0, 0, 0, numColumns));
sheet.createFreezePane(0, 1);
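Answer 2 (score: 0)
I had to override some hooks and come up with my own way of filtering out hidden rows so that they are not processed. The code snippet is below. My approach opens a second copy of the same sheet so that I can look up the row currently being processed and check whether it is hidden. The answer above touches on this; what follows expands on it to show how it can be integrated into the Spring Batch Excel framework. One drawback is that you have to open a second copy of the same file. I could not find a way (maybe there is none!) to use the internal workbook sheet, among other reasons because org.springframework.batch.item.excel.poi.PoiSheet is package-private (note that the syntax below is Groovy!):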
/**
 * Produces a reader that knows how to ingest a file in excel format.
 */
private PoiItemReader<String[]> createExcelReader(String filePath) {
    File f = new File(filePath)
    PoiItemReader<String[]> reader = new PoiItemReader<>()
    reader.setRowMapper(new PassThroughRowMapper())
    Resource resource = new DefaultResourceLoader().getResource("file:" + f.canonicalPath)
    reader.setResource(resource)
    reader.setRowSetFactory(new VisibleRowsOnlyRowSetFactory(resource))
    reader.open(new ExecutionContext())
    reader
}
...
// The "hooks" I overwrote to inject my logic
static class VisibleRowsOnlyRowSet extends DefaultRowSet {
    Workbook workbook
    Sheet sheet

    VisibleRowsOnlyRowSet(final Sheet sheet, final RowSetMetaData metaData) {
        super(sheet, metaData)
    }

    VisibleRowsOnlyRowSet(final Sheet sheet, final RowSetMetaData metaData, Workbook workbook) {
        this(sheet, metaData)
        this.workbook = workbook
        this.sheet = sheet
    }

    boolean next() {
        boolean moreLeft = super.next()
        if (moreLeft) {
            Row row = workbook.getSheet(sheet.name).getRow(getCurrentRowIndex())
            if (row?.getZeroHeight()) {
                log.warn("Row $currentRow is hidden in input excel sheet, will omit it from output.")
                currentRow.eachWithIndex { _, int i ->
                    currentRow[i] = ''
                }
            }
        }
        moreLeft
    }
}

static class VisibleRowsOnlyRowSetFactory extends DefaultRowSetFactory {
    Workbook workbook

    VisibleRowsOnlyRowSetFactory(Resource resource) {
        this.workbook = WorkbookFactory.create(resource.inputStream)
    }

    RowSet create(Sheet sheet) {
        new VisibleRowsOnlyRowSet(sheet, super.create(sheet).metaData, workbook)
    }
}