Apache POI-读取由TEXT()公式格式化的单元格

时间:2019-01-11 08:36:33

标签: java excel apache-poi

我有一个带有日期的EXCEL文件。它们的格式为TEXT,例如:=TEXT(TODAY(); "yyyy-MM-dd")

在EXCEL中,日期已正确设置为文本格式,但是当我使用Apache POI读取单元格时,它将返回数字值。 为什么?为什么POI无法读取格式化的文本值?

我不想在JAVA应用程序中格式化日期,因为EXCEL文件应定义格式(每个值可能不同)。

这是我读取单元格值的代码:

private static String getString(Cell cell) {
 if (cell == null) return null; 

 if (cell.getCellTypeEnum() != CellType.FORMULA) { 
  switch (cell.getCellTypeEnum()) { 
   case STRING: 
    return cell.getStringCellValue().trim(); 
   case BOOLEAN: 
    return String.valueOf(cell.getBooleanCellValue());
   case NUMERIC: 
    return String.valueOf(cell.getNumericCellValue()); 
   case BLANK: 
    return null; 
   case ERROR: 
    throw new RuntimeException(ErrorEval.getText(cell.getErrorCellValue())); 
   default: 
    throw new RuntimeException("unexpected cell type " + cell.getCellTypeEnum());
  }
 } 
 FormulaEvaluator evaluator = cell.getSheet().getWorkbook().getCreationHelper().createFormulaEvaluator();
 try { 
  CellValue cellValue = evaluator.evaluate(cell); 
  switch (cellValue.getCellTypeEnum()) { 
   case NUMERIC: 
    return String.valueOf(cellValue.getNumberValue());
   case STRING: 
    return cellValue.getStringValue().trim(); 
   case BOOLEAN: 
    return String.valueOf(cellValue.getBooleanValue()); 
   case ERROR: 
    throw new RuntimeException(ErrorEval.getText(cellValue.getErrorValue())); 
   default: 
    throw new RuntimeException("unexpected
cell type " + cellValue.getCellTypeEnum()); 
  } 
 } catch (RuntimeException e) { 
  throw new RuntimeException("Could not evaluate the value of " + cell.getAddress() + " in sheet " + cell.getSheet().getSheetName(), e);
 }
}

2 个答案:

答案 0 :(得分:2)

仅当使用的Excel不是英语时,才会出现此问题。那么该公式实际上不是=TEXT(A2,"yyyy-MM-dd"),而是例如我的德语=TEXT(A2,"JJJJ-MM-TT")中的Excel

如您所见,TEXT函数中的格式部分始终取决于语言环境,尽管所有其他公式部分始终都是en_US语言环境。这是因为格式部分位于公式内的字符串中,不会更改。因此,在德语中为=TEXT(A2,"JJJJ-MM-TT")(年份= Jahr,Day =标记),在法语中为{{11}}(年份=Année,Day = Jour)。

并且由于=TEXT(A2,"AAAA-MM-JJ")的{​​{1}}到目前为止没有语言环境设置,因此该公式无法正确求值。

那么我们有两种可能性。

首先,我们希望存储的单元格值应该是所需的字符串。因此,如果单元格公式以“ TEXT”开头并且包含“ JJJJ-MM-TT”,则不要进行评估,因为这样做不合适。取而代之的是从apache poi的最后一次评估中获取字符串单元格值。

第二,我们可以在公式中用en_US替换语言环境相关的格式部分,然后让FormulaEvaluator求值。至少如果我们只想读取而不重写Excel文件,则不会破坏apache poi文件中的某些内容。


代码优先方法:

Excel

德语Excel:

enter image description here

结果:

Excel

英语计算器:

enter image description here

结果:

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.usermodel.CellType;
import org.apache.poi.ss.util.*;

import org.apache.poi.ss.formula.eval.ErrorEval;

import java.io.FileInputStream;

class ReadExcelExample {

 private static String getString(Cell cell, FormulaEvaluator evaluator) {
  if (cell == null) return "null";
  String text = "";
  switch (cell.getCellType()) {
  //switch (cell.getCellTypeEnum()) {
   case STRING:
    text = cell.getRichStringCellValue().getString();
   break;
   case NUMERIC:
    if (DateUtil.isCellDateFormatted(cell)) {
     text = String.valueOf(cell.getDateCellValue());
    } else {
     text = String.valueOf(cell.getNumericCellValue());
    }
   break;
   case BOOLEAN:
    text = String.valueOf(cell.getBooleanCellValue());
   break;
   case FORMULA:
    text = cell.getCellFormula();

    //if formula is TEXT(...,"JJJJ-MM-TT") then do not evaluating:
    if (cell.getCellFormula().startsWith("TEXT") && cell.getCellFormula().contains("JJJJ-MM-TT")) {
     text = text + ": value got from cell = " + cell.getRichStringCellValue().getString();

    } else {
     CellValue cellValue = evaluator.evaluate(cell); 
     switch (cellValue.getCellType()) {
     //switch (cellValue.getCellTypeEnum()) {
      case STRING:
       text = text + ": " + cellValue.getStringValue();
      break;
      case NUMERIC:
       if (DateUtil.isCellDateFormatted(cell)) {
        text = text + ": " + String.valueOf(DateUtil.getJavaDate(cellValue.getNumberValue()));
       } else {
        text = text + ": " + String.valueOf(cellValue.getNumberValue());
       }
      break;
      case BOOLEAN:
       text = text + ": " + String.valueOf(cellValue.getBooleanValue());
      break;
      case ERROR:
       throw new RuntimeException("from CellValue: " + ErrorEval.getText(cellValue.getErrorValue()));
      default:
       throw new RuntimeException("unexpected cellValue type " + cellValue.getCellType()); 
     }
    }
   break;
   case ERROR:
    throw new RuntimeException("from Cell: " + ErrorEval.getText(cell.getErrorCellValue())); 
   case BLANK:
    text = "";
   break;
   default:
    throw new RuntimeException("unexpected cell type " + cell.getCellType());
  }

  return text;
 }

 public static void main(String[] args) throws Exception {

  //Workbook wb  = WorkbookFactory.create(new FileInputStream("SAMPLE.xls"));
  Workbook wb  = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx"));

  DataFormatter formatter = new DataFormatter(new java.util.Locale("en", "US"));
  FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator();

  Sheet sheet = wb.getSheetAt(0);
  for (Row row : sheet) {
   for (Cell cell : row) {
    CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
    System.out.print(cellRef.formatAsString());
    System.out.print(" - ");

    String text = "";
    try {
    text = getString(cell, evaluator);
    } catch (Exception ex) {
     text = ex.toString();
    }
    System.out.println(text);

   }
  }

  wb.close();

 }
}

代码第二种方法(用en_US替换语言环境相关的格式部分):

A1 - Value
B1 - Formula
A2 - Fri Jan 11 00:00:00 CET 2019
B2 - TEXT(A2,"JJJJ-MM-TT"): value got from cell = 2019-01-11
A3 - 123.45
B3 - A3*2: 246.9
B4 - java.lang.RuntimeException: from CellValue: #DIV/0!
B5 - TODAY(): Fri Jan 11 00:00:00 CET 2019
B6 - B5=A2: true
A7 - java.lang.RuntimeException: from CellValue: #N/A
B8 - TEXT(TODAY(),"JJJJ-MM-TT"): value got from cell = 2019-01-11

法语Calc:

enter image description here

结果:

A1 - Value
B1 - Formula
A2 - Fri Jan 11 00:00:00 CET 2019
B2 - TEXT(A2,"yyyy-MM-dd"): 2019-01-11
A3 - 123.45
B3 - A3*2: 246.9
B4 - java.lang.RuntimeException: from CellValue: #DIV/0!
B5 - TODAY(): Fri Jan 11 00:00:00 CET 2019
B6 - B5=A2: true
A7 - java.lang.RuntimeException: from CellValue: #N/A
B8 - TEXT(TODAY(),"yyyy-MM-dd"): 2019-01-11

提示:这里使用的import org.apache.poi.ss.usermodel.*; import org.apache.poi.ss.usermodel.CellType; import org.apache.poi.ss.util.*; import java.io.FileInputStream; import java.util.Locale; class ExcelEvaluateTEXTDiffLocales { private static String getString(Cell cell, DataFormatter formatter, FormulaEvaluator evaluator, Locale locale) { String text = ""; if (cell.getCellType() == CellType.FORMULA) { String cellFormula = cell.getCellFormula(); text += cellFormula + ":= "; if (cellFormula.startsWith("TEXT")) { int startFormatPart = cellFormula.indexOf('"'); int endFormatPart = cellFormula.lastIndexOf('"') + 1; String formatPartOld = cellFormula.substring(startFormatPart, endFormatPart); String formatPartNew = formatPartOld; if ("de".equals(locale.getLanguage())) { formatPartNew = formatPartNew.replace("T", "D"); // Tag = Day // Monat = Month formatPartNew = formatPartNew.replace("J", "Y"); // Jahr = Year //... } else if ("fr".equals(locale.getLanguage())) { formatPartNew = formatPartNew.replace("J", "D"); // Jour = Day // Mois = Month formatPartNew = formatPartNew.replace("A", "Y"); // Année = Year //... } //... cellFormula = cellFormula.replace(formatPartOld, formatPartNew); cell.setCellFormula(cellFormula); } } try { text += formatter.formatCellValue(cell, evaluator); } catch (org.apache.poi.ss.formula.eval.NotImplementedException ex) { text += ex.toString(); } return text; } public static void main(String[] args) throws Exception { //Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xls")); Workbook wb = WorkbookFactory.create(new FileInputStream("SAMPLE.xlsx")); Locale locale = new Locale("fr", "CH"); DataFormatter formatter = new DataFormatter(locale); FormulaEvaluator evaluator = wb.getCreationHelper().createFormulaEvaluator(); Sheet sheet = wb.getSheetAt(0); for (Row row : sheet) { for (Cell cell : row) { CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex()); System.out.print(cellRef.formatAsString()); System.out.print(" - "); String text = ""; text = getString(cell, formatter, evaluator, locale); System.out.println(text); } } wb.close(); } } 版本是A1 - Value B1 - Formula A2 - 1/11/2019 B2 - TEXT(A2,"AAAA-MM-JJ"):= 2019-01-11 A3 - 123.45 B3 - A3*2:= 246.9 B4 - 1/A4:= #DIV/0! B5 - TODAY():= 1/12/2019 B6 - B5=A2:= FALSE A7 - NA():= #N/A B8 - TEXT(TODAY(),"AAAA-MM-JJ"):= 2019-01-12 。较低的版本可能会有进一步的评估问题。

答案 1 :(得分:0)

org/apache/poi/ss/formula/functions/TextFunction.java提供补丁

当然,我的第一个答案只是纠正症状。最终的解决方案显然应该是评估NSPredicate函数时应考虑不同的语言环境。

工作草案:

更改了TEXT如下:

org/apache/poi/ss/formula/functions/TextFunction.java

然后获取内容就这么简单:

...
    /**
     * An implementation of the TEXT function<br>
     * TEXT returns a number value formatted with the given number formatting string. 
     * This function is not a complete implementation of the Excel function, but
     *  handles most of the common cases. All work is passed down to 
     *  {@link DataFormatter} to be done, as this works much the same as the
     *  display focused work that that does. 
     *
     * <b>Syntax<b>:<br> <b>TEXT</b>(<b>value</b>, <b>format_text</b>)<br>
     */
    public static final Function TEXT = new Fixed2ArgFunction() {

        public ValueEval evaluate(int srcRowIndex, int srcColumnIndex, ValueEval arg0, ValueEval arg1) {
            double s0;
            String s1;
            try {
                s0 = evaluateDoubleArg(arg0, srcRowIndex, srcColumnIndex);
                s1 = evaluateStringArg(arg1, srcRowIndex, srcColumnIndex);
            } catch (EvaluationException e) {
                return e.getErrorEval();
            }

            try {
            // Correct locale dependent format strings
                Locale locale = org.apache.poi.util.LocaleUtil.getUserLocale();
                if ("de".equals(locale.getLanguage())) {
                    s1 = s1.replace("T", "D"); // Tag = Day
                    // Monat = Month
                    s1 = s1.replace("J", "Y"); // Jahr = Year
                    //... further replacements
                } else if ("fr".equals(locale.getLanguage())) {
                    s1 = s1.replace("J", "D"); // Jour = Day
                    // Mois = Month
                    s1 = s1.replace("A", "Y"); // Année = Year
                    //... further replacements
                } //... further languages

            // Ask DataFormatter to handle the String for us
                String formattedStr = formatter.formatRawCellContents(s0, -1, s1);
                return new StringEval(formattedStr);
            } catch (Exception e) {
                return ErrorEval.VALUE_INVALID;
            }
        }
    };
...