public void processOneSheet(String filename) throws Exception {
    OPCPackage pkg = OPCPackage.open(filename);
    XSSFReader r = new XSSFReader( pkg );
    SharedStringsTable sst = r.getSharedStringsTable();

    XMLReader parser = fetchSheetParser(sst);

    // To look up the Sheet Name / Sheet Order / rID,
    // you need to process the core Workbook stream.
    // Normally it's of the form rId# or rSheet#
    InputStream sheet2 = r.getSheet("rId2");
    InputSource sheetSource = new InputSource(sheet2);
    parser.parse(sheetSource);
    sheet2.close();
}

public void processAllSheets(String filename) throws Exception {
    OPCPackage pkg = OPCPackage.open(filename);
    XSSFReader r = new XSSFReader( pkg );
    SharedStringsTable sst = r.getSharedStringsTable();

    XMLReader parser = fetchSheetParser(sst);

    Iterator<InputStream> sheets = r.getSheetsData();
    while(sheets.hasNext()) {
        System.out.println("Processing new sheet:\n");
        InputStream sheet = sheets.next();
        InputSource sheetSource = new InputSource(sheet);
        parser.parse(sheetSource);
        sheet.close();
        System.out.println("end Processing");
    }
}

public XMLReader fetchSheetParser(SharedStringsTable sst) throws SAXException {
    XMLReader parser =
        XMLReaderFactory.createXMLReader(
            "org.apache.xerces.parsers.SAXParser"
        );
    ContentHandler handler = new SheetHandler(sst);
    parser.setContentHandler(handler);
    return parser;
}

/**
 * See org.xml.sax.helpers.DefaultHandler javadocs
 */
private static class SheetHandler extends DefaultHandler {
    private SharedStringsTable sst;
    private String lastContents;
    private boolean nextIsString;

    private SheetHandler(SharedStringsTable sst) {
        this.sst = sst;
    }

    public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {
        // c => cell
        if(name.equals("c")) {
            // Print the cell reference
            // Figure out if the value is an index in the SST
            String cellType = attributes.getValue("t");
            if(cellType != null && cellType.equals("s")) {
                nextIsString = true;
            } else {
                nextIsString = false;
            }
        }
        // Clear contents cache
        lastContents = "";
    }

    public void endElement(String uri, String localName, String name)
            throws SAXException {
        // Process the last contents as required.
        // Do now, as characters() may be called more than once
        if(nextIsString) {
            int idx = Integer.parseInt(lastContents);
            lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString();
            nextIsString = false;
        }
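My file has the following structure:
A B C D
1 text text text
2 text text text
I am reading an Excel file, making some changes to the data, and writing the result out as a new file. Sometimes a cell's text is empty, and the problem is that when I write the Excel file the empty cell is not taken into account: its place is taken by the content of the next cell. How should I handle this?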
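Answer 0 (score: 2)
Excel files are written "sparsely": cells that are not used are left out to save space. When you read the file, you need to take this into account.
If you are using the simple (but memory-hungry) UserModel, you can use something like a MissingCellPolicy to control how these missing cells are handled.
If you want to go low-level and process the file with SAX events, as you appear to be doing, then you need to handle it yourself. Grab the reference of each cell as it goes past, keep track of the last cell you saw, and handle any missing cells at that point.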
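As a rough illustration of that bookkeeping, here is a minimal sketch that does not depend on POI. The class `SparseRowDemo` and its methods `columnIndex` and `densify` are hypothetical names made up for this example; `columnIndex` is only a stand-in for what `CellReference.getCol()` provides in POI, and it assumes plain A1-style references without `$` markers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class, not part of Apache POI.
final class SparseRowDemo {

    /** Converts the column letters of an A1-style reference to a 0-based index ("A1" -> 0, "D1" -> 3). */
    static int columnIndex(String cellReference) {
        int col = 0;
        for (int i = 0; i < cellReference.length(); i++) {
            char ch = cellReference.charAt(i);
            if (!Character.isLetter(ch)) {
                break; // the row digits start here
            }
            col = col * 26 + (Character.toUpperCase(ch) - 'A' + 1);
        }
        return col - 1;
    }

    /** Rebuilds a dense row from the sparse {reference, value} pairs a SAX handler would see. */
    static List<String> densify(String[][] sparseCells) {
        List<String> row = new ArrayList<>();
        int lastCol = -1; // column of the last cell seen in this row
        for (String[] cell : sparseCells) {
            int thisCol = columnIndex(cell[0]);
            // Every column between the last cell and this one was skipped in the file.
            for (int missing = lastCol + 1; missing < thisCol; missing++) {
                row.add("");
            }
            row.add(cell[1]);
            lastCol = thisCol;
        }
        return row;
    }

    public static void main(String[] args) {
        // C1 is absent, as it would be in a sparsely written sheet.
        String[][] cells = { {"A1", "text"}, {"B1", "text"}, {"D1", "text"} };
        System.out.println(densify(cells)); // prints [text, text, , text]
    }
}
```

The empty string stands in for whatever placeholder the output needs; in a real handler `lastCol` would be reset at the start of every row.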
Apache POI has a great example of doing this, in the SAX-powered XLSX to CSV example converter. You will need to follow largely the same logic, for example:
private class SheetToCSV implements SheetContentsHandler {
    private boolean firstCellOfRow = false;
    private int currentCol = -1;

    public void cell(String cellReference, String formattedValue,
            XSSFComment comment) {
        if (firstCellOfRow) {
            firstCellOfRow = false;
        } else {
            output.append(',');
        }

        // Did we miss any cells?
        int thisCol = (new CellReference(cellReference)).getCol();
        int missedCols = thisCol - currentCol - 1;
        for (int i=0; i<missedCols; i++) {
            output.append(',');
        }
        currentCol = thisCol;

        // .... Rest of cell contents handling method goes here ....
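Answer 1 (score: 0)
I ran into the same situation with a SAX event-based xlsx parser. I simply extended Gagravarr's approach to fit my requirement: in this model you keep track of the last processed column, so that you can handle the skipped columns before the current one however you wish, knowing that those cells are blank.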
@Override
public void cell(String cellReference, String formattedValue) {
    int currentColumnIndex = (new CellReference(cellReference)).getCol();
    for (int columnIndex = previousCol + 1; columnIndex <= currentColumnIndex; ++columnIndex) {
        if (columnIndex != currentColumnIndex) {
            setValueInRowObject(columnIndex, "");
        } else {
            setValueInRowObject(columnIndex, formattedValue);
        }
    }
    previousCol = currentColumnIndex;
}
where previousCol is an instance field, initialized to -1 and reset on every call to the startRow method:
@Override
public void startRow(int rowNum) {
    previousCol = -1;
}
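In my case I have certain amount and date columns, and I need to record a warning message whenever one of them holds an invalid value; a blank is not a valid amount or date. But because the cell method is not called for empty cells, I could not otherwise record that warning for the particular cell.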
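The loop above only fills gaps that come before a later cell, so a blank in the last column of a row would still slip through. One way around that, sketched here as a different technique from the loop, is to pre-fill the row with blanks in startRow and validate in endRow. Everything in this sketch is an assumption made for illustration: the class name `BlankAwareRowCollector`, a fixed known sheet width, a single amount column, and a `cell` method that takes a column index directly (in a real POI handler that index would come from `CellReference.getCol()`).

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, POI-free sketch of a row collector that sees blanks in any position.
final class BlankAwareRowCollector {
    private final int expectedColumns; // assumption: the sheet width is known up front
    private final int amountColumn;    // assumption: 0-based index of a column that must hold a number
    private final List<String> warnings = new ArrayList<>();
    private String[] currentRow;
    private int currentRowNum;

    BlankAwareRowCollector(int expectedColumns, int amountColumn) {
        this.expectedColumns = expectedColumns;
        this.amountColumn = amountColumn;
    }

    void startRow(int rowNum) {
        currentRowNum = rowNum;
        currentRow = new String[expectedColumns];
        Arrays.fill(currentRow, ""); // every cell starts out blank
    }

    void cell(int columnIndex, String formattedValue) {
        // Only cells present in the file arrive here; the rest keep their blank default.
        if (columnIndex >= 0 && columnIndex < expectedColumns) {
            currentRow[columnIndex] = formattedValue == null ? "" : formattedValue;
        }
    }

    void endRow() {
        String amount = currentRow[amountColumn].trim();
        String where = "row " + currentRowNum + ", col " + amountColumn;
        if (amount.isEmpty()) {
            warnings.add(where + ": amount is blank");
            return;
        }
        try {
            new BigDecimal(amount); // parse only to check validity
        } catch (NumberFormatException e) {
            warnings.add(where + ": amount is not a number: " + amount);
        }
    }

    List<String> getWarnings() {
        return warnings;
    }

    public static void main(String[] args) {
        BlankAwareRowCollector collector = new BlankAwareRowCollector(4, 2);
        collector.startRow(0);
        collector.cell(0, "text");
        collector.cell(1, "text");
        collector.cell(3, "text"); // column 2 never arrives
        collector.endRow();
        System.out.println(collector.getWarnings());
    }
}
```

This still relies on startRow and endRow being called for the row; a row that is missing from the file altogether would probably need separate handling.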