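I am parsing Excel (xlsx) sheets in a web application. I use the Apache POI streaming API because the files are large. Excel stores dates internally as numbers, so I get 42035 instead of January 31, 2015. Because I parse the XML myself, I cannot use the date formatting methods that POI provides. Any ideas?
Here is the parser (taken from this example, https://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api, but modified):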
/**
 * See org.xml.sax.helpers.DefaultHandler javadocs
 */
private static class SheetHandler extends DefaultHandler {
    private SharedStringsTable table;
    private String lastContents;
    private boolean nextIsString;
    boolean isFirstRow = true;
    private int quantityOfColumns;
    private int currentColumnNumber = 1;
    int currentRowNumber = 1;
    private int rowNumberOfLastCell = 1;
    private DataSet data = new DataSet();
    private Tuple tuple;
    static final Logger LOG = LoggerFactory.getLogger(SheetHandler.class);

    private SheetHandler(SharedStringsTable sst) {
        this.table = sst;
        LOG.debug("Sheethandler created");
    }

    @Override
    public void startElement(String uri, String localName, String name,
            Attributes attributes) throws SAXException {
        // c => cell
        LOG.debug("Reading element of type {} ", name);
        if (name.equals("c")) {
            rowNumberOfLastCell = currentRowNumber;
            String cellType = attributes.getValue("t");
            LOG.debug("Extracting cell {} with type {}", attributes.getValue("r"), cellType);
            currentRowNumber = extractIntFromString(attributes.getValue("r"));
            if (isFinishedRowHeaderRow()) {
                extractHeaders();
            }
            // Figure out if the value is an index in the SST (Shared Strings Table)
            if (cellType != null && cellType.equals("s")) {
                LOG.debug("Cell holds a shared-string index");
                nextIsString = true;
            } else {
                nextIsString = false;
            }
        }
        // Clear contents cache
        lastContents = "";
    }

    // True once the first cell of the second row is reached, i.e. the header row is complete
    private boolean isFinishedRowHeaderRow() {
        return (rowNumberOfLastCell != currentRowNumber) && isFirstRow;
    }

    private void extractHeaders() {
        quantityOfColumns = data.getHeaders().size();
        LOG.debug("{} columns detected", quantityOfColumns);
        LOG.debug("{} headers parsed", data.getHeaders().size());
        isFirstRow = false;
        currentColumnNumber = 1;
        tuple = new Tuple(quantityOfColumns);
    }

    @Override
    public void characters(char[] ch, int start, int length)
            throws SAXException {
        lastContents += new String(ch, start, length);
        LOG.debug("Extracted value \"{}\"", lastContents);
    }

    @Override
    public void endElement(String uri, String localName, String name)
            throws SAXException {
        // Process the last contents as required.
        // Do now, as characters() may be called more than once
        if (nextIsString) {
            extractStringFromCell();
        }
        // v => contents of a cell
        // Output after we've seen the string contents
        if (name.equals("v")) {
            addExtractedValueToDataSet();
            LOG.debug("Ended parsing of column {}", currentColumnNumber);
            if (currentColumnNumber == quantityOfColumns) {
                concludeRow();
            } else {
                currentColumnNumber++;
            }
        }
    }

    private void concludeRow() {
        data.addRow(tuple);
        LOG.debug("Row added to DataSet");
        tuple = new Tuple(quantityOfColumns);
        currentColumnNumber = 1;
        LOG.debug("Row {} parsed", rowNumberOfLastCell);
    }

    private void addExtractedValueToDataSet() {
        if (isFirstRow) {
            data.getHeaders().add(lastContents);
        } else {
            tuple.getRowEntries()[currentColumnNumber - 1] = lastContents;
        }
    }

    // Replaces the shared-string index held in lastContents with the string it refers to
    private void extractStringFromCell() throws NumberFormatException {
        int idx = Integer.parseInt(lastContents);
        lastContents = new XSSFRichTextString(table.getEntryAt(idx)).toString();
        LOG.debug("Extracted value \"{}\"", lastContents);
        nextIsString = false;
    }

    // Extracts the row number from a cell reference such as "B12" by dropping
    // leading characters until the remainder parses as an integer
    public int extractIntFromString(String original) {
        for (int i = 0; i < original.length(); i++) {
            try {
                return Integer.parseInt(original.substring(i));
            } catch (NumberFormatException ex) {
                // Remainder still starts with a column letter; try the next position
            }
        }
        return 0;
    }

    public DataSet getData() {
        return data;
    }
}
Answer 0 (score: 1)
@Gagravarr thank you for your reply. For some reason I had overlooked this, and it helped me solve my problem:
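As background, the number-to-date arithmetic itself can be sketched without POI. This is a minimal illustration, not the poster's code: the class `ExcelSerialDate` and its methods are hypothetical names, and the sketch assumes the workbook uses Excel's default 1900 date system (not the 1904 one) and that the handler already knows the cell is date-formatted. POI's own `DateUtil` class offers comparable conversions for anyone who prefers to rely on the library.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;

/** Hypothetical helper for illustration; not part of Apache POI. */
final class ExcelSerialDate {
    // Day zero of the 1900 date system. Using 1899-12-30 absorbs the
    // nonexistent 1900-02-29 that Excel counts as serial 60.
    private static final LocalDate EPOCH_1900 = LocalDate.of(1899, 12, 30);
    private static final long SECONDS_PER_DAY = 86_400L;

    private ExcelSerialDate() {
    }

    /** Converts a serial number; the fractional part is the time of day. */
    static LocalDateTime toLocalDateTime(double serial) {
        if (serial < 61) {
            // Serials up to 60 are affected by the phantom leap day, so the
            // fixed epoch above would be off by one for them.
            throw new IllegalArgumentException("Unsupported serial: " + serial);
        }
        long wholeDays = (long) Math.floor(serial);
        long seconds = Math.round((serial - wholeDays) * SECONDS_PER_DAY);
        return EPOCH_1900.atStartOfDay().plusDays(wholeDays).plusSeconds(seconds);
    }

    /** Converts the raw text of a <v> element, such as "42035". */
    static LocalDate toLocalDate(String rawCellValue) {
        return toLocalDateTime(Double.parseDouble(rawCellValue)).toLocalDate();
    }
}
```

With this sketch, `ExcelSerialDate.toLocalDate("42035")` yields 2015-01-31, the value from the question. Inside the handler, the conversion would only apply to cells whose style marks them as dates; a plain number such as a quantity looks identical in the XML.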