Apache POI cannot detect hash-formatted numbers

Time: 2017-10-19 16:21:53

Tags: java excel apache-poi

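I need to read phone numbers uploaded via xls / xlsx into a Java String variable, as close as possible to what is displayed in the Excel file.

So I filled in this data: [screenshot of the Excel cell]

As you can see, the actual value in the cell is 166609999 and its format is 60#############, so in the end the cell displays 60166609999.

I want to capture the cell content as 60166609999 in a String, but so far I have only managed to capture 166609999. Can anyone tell me what is wrong?

Note: if I change the format from 60############ to 60000000000, I can capture 60166609999 without any problem, but the Excel file is uploaded through a public website, so I cannot enforce that.

The code is as simple as this: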

Cell cell = getTheCell(); // Got this after reading the sheets and rows
DataFormatter df = new DataFormatter();
String value = df.formatCellValue(cell);
// Here in value
// If format is 600000000, I can get 60166609999 (right)
// If format is 60#######, I get 166609999 (wrong)

Libraries I am using:

  • poi(poi)3.17
  • poi(poi-ooxml)3.17
  • poi(poi-ooxml-schemas)3.17
  • Java 7

Does anyone know what I need to do to get this right?

Thanks.

1 Answer:

Answer 0: (score: 1)

The problem is multidimensional.

First, the number format 60############ cannot be applied using Java. With DecimalFormat it leads to java.lang.IllegalArgumentException: Malformed pattern "60############".
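The exception can be reproduced without POI. The following is a minimal sketch using only the JDK; the class name DecimalFormatPatternCheck and the helper tryFormat are made up for illustration. It also suggests why the all-zeros format from the question happens to work: DecimalFormat treats the leading 6 as a literal prefix and the ten zeros as mandatory digits, whereas a # after a 0 in the integer part is rejected.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class DecimalFormatPatternCheck {

    /** Hypothetical helper: returns the formatted value, or null if DecimalFormat rejects the pattern. */
    static String tryFormat(String pattern, long value) {
        try {
            // Fixed symbols so the result does not depend on the default locale.
            DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
            return new DecimalFormat(pattern, symbols).format(value);
        } catch (IllegalArgumentException e) {
            // Thrown by the DecimalFormat constructor for a malformed pattern.
            return null;
        }
    }

    public static void main(String[] args) {
        // '#' after '0' in the integer part: malformed, so tryFormat returns null.
        System.out.println(tryFormat("60############", 166609999L));
        // '6' is a literal prefix, followed by ten mandatory digits.
        System.out.println(tryFormat("60000000000", 166609999L));
    }
}
```
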

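But if the need is to have each number prefixed with "60", then the Excel number format \6\0# or "60"# should be possible and should be translated to the DecimalFormat pattern '60'#. But apache poi's DataFormatter does not do that, because it simply removes all quoting from Excel's format string, which leads to 60#, which is also malformed.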

The problem is in DataFormatter.java:671ff.

I have patched this in my MyDataFormatter:

...
        // Now, handle the other aspects like
        //  quoting and scientific notation
        for(int i = 0; i < sb.length(); i++) {
           char c = sb.charAt(i);
/*
            // remove quotes and back slashes
            if (c == '\\' || c == '"') {
                sb.deleteCharAt(i);
                i--;
*/
            // handle quotes and back slashes
            if (c == '\\') {
                sb.setCharAt(i, '\'');
                sb.insert(i+2, '\'');
                i+=2;
            } else if (c == '"') {
                sb.setCharAt(i, '\'');
            // for scientific/engineering notation
            } else if (c == '+' && i > 0 && sb.charAt(i - 1) == 'E') {
                sb.deleteCharAt(i);
                i--;
            }
        }
        formatStr = sb.toString();
        formatStr = formatStr.replace("''", "");
        return formatStr;
    }
...

The following example compares both formatters:

import org.apache.poi.ss.usermodel.*;
import org.apache.poi.ss.util.*;

import java.io.FileInputStream;

import java.lang.reflect.Method;

class ExcelDataformatterExample {

 public static void main(String[] args) throws Exception {

  Workbook wb  = WorkbookFactory.create(new FileInputStream("ExcelExample.xlsx"));

  DataFormatter df = new DataFormatter();
  MyDataFormatter mydf = new MyDataFormatter();

  Sheet sheet = wb.getSheetAt(0);
  for (Row row : sheet) {
   for (Cell cell : row) {
    if (cell.getCellTypeEnum() == CellType.NUMERIC) {
     CellReference cellRef = new CellReference(row.getRowNum(), cell.getColumnIndex());
     System.out.println("Cell " + cellRef.formatAsString());

     System.out.print("Excel's data format string: ");
     String formatStr = cell.getCellStyle().getDataFormatString();
     System.out.println(formatStr);

     System.out.print("Value using poi's data formatter: ");
     Method cleanFormatForNumber = DataFormatter.class.getDeclaredMethod("cleanFormatForNumber", String.class); 
     cleanFormatForNumber.setAccessible(true); 
     String cleanFormatStr = (String)cleanFormatForNumber.invoke(df, formatStr);
     System.out.print("using poi's cleanFormatStr: ");
     System.out.print(cleanFormatStr + " result: ");
     String value = df.formatCellValue(cell);
     System.out.println(value);

     System.out.print("Value using my data formatter: ");
     cleanFormatForNumber = MyDataFormatter.class.getDeclaredMethod("cleanFormatForNumber", String.class); 
     cleanFormatForNumber.setAccessible(true); 
     cleanFormatStr = (String)cleanFormatForNumber.invoke(mydf, formatStr);
     System.out.print("using my cleanFormatStr: ");
     System.out.print(cleanFormatStr + " result: ");
     value = mydf.formatCellValue(cell);
     System.out.println(value);

    }
   }
  }
  wb.close();

 }

}

Running this example leads to the following output if the value 199901234 is in cells A1 to A4 of the Excel sheet, formatted as shown:

Cell A1
Excel's data format string: \60##########
Value using poi's data formatter: using poi's cleanFormatStr: 60########## result: 199901234
Value using my data formatter: using my cleanFormatStr: '6'0########## result: 199901234
Cell A2
Excel's data format string: \60000000000
Value using poi's data formatter: using poi's cleanFormatStr: 60000000000 result: 60199901234
Value using my data formatter: using my cleanFormatStr: '6'0000000000 result: 60199901234
Cell A3
Excel's data format string: "60"#
Value using poi's data formatter: using poi's cleanFormatStr: 60# result: 199901234
Value using my data formatter: using my cleanFormatStr: '60'# result: 60199901234
Cell A4
Excel's data format string: \6\0#
Value using poi's data formatter: using poi's cleanFormatStr: 60# result: 199901234
Value using my data formatter: using my cleanFormatStr: '60'# result: 60199901234
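The quoting translation can also be tried in isolation, without POI. The sketch below is a simplified, JDK-only illustration and not POI's actual code: the class ExcelPrefixFormatSketch and its two methods are hypothetical names, it handles only backslash and double-quote literals, and the fallback to plain digits for a still-malformed pattern is an assumption made for this example.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class ExcelPrefixFormatSketch {

    /** Rewrites Excel literals (\x and "text") as DecimalFormat quotes ('x' and 'text'). */
    static String toDecimalFormatPattern(String excelFormat) {
        StringBuilder sb = new StringBuilder(excelFormat);
        for (int i = 0; i < sb.length(); i++) {
            char c = sb.charAt(i);
            if (c == '\\' && i + 1 < sb.length()) {
                // \x becomes 'x': replace the backslash and close the quote after x
                sb.setCharAt(i, '\'');
                sb.insert(i + 2, '\'');
                i += 2;
            } else if (c == '"') {
                sb.setCharAt(i, '\'');
            }
        }
        // Merge adjacent quoted literals, e.g. '6''0' becomes '60'
        return sb.toString().replace("''", "");
    }

    /** Formats value with the translated pattern; falls back to plain digits if it is malformed. */
    static String format(String excelFormat, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.ROOT);
        try {
            return new DecimalFormat(toDecimalFormatPattern(excelFormat), symbols).format(value);
        } catch (IllegalArgumentException e) {
            // Assumption for this sketch: show the bare integer digits instead
            return new DecimalFormat("#", symbols).format(value);
        }
    }

    public static void main(String[] args) {
        String[] formats = { "\\60##########", "\\60000000000", "\"60\"#", "\\6\\0#" };
        for (String f : formats) {
            System.out.println(f + " -> " + toDecimalFormatPattern(f) + " -> " + format(f, 199901234));
        }
    }
}
```

For the four format strings above, this should give the same results as the "my data formatter" lines in the output. Note that \60########## still falls back to the bare number, because '6'0########## remains a malformed DecimalFormat pattern.
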
