Exception when reading an xlsx file with Apache POI (org.apache.poi.openxml4j.exceptions.InvalidFormatException: Date not well formated, ....)?

Time: 2016-04-09 20:26:11

Tags: java excel apache

I am using the following jar files:

poi-3.14-20160307.jar
poi-ooxml-3.14-20160307.jar
poi-ooxml-schemas-3.14-20160307.jar
xmlbeans-2.6.0.jar

Code:

package firstExcel;

import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.FormulaEvaluator;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class Test {

     public static void main( String[] args ) throws IOException {

        // try-with-resources closes the input stream when the block exits.
        try (FileInputStream fis = new FileInputStream(new File("excel1.xlsx"))) {

            // The exception shown below is thrown by this constructor.
            XSSFWorkbook wb = new XSSFWorkbook(fis);

            XSSFSheet sheet = wb.getSheetAt(0);

            FormulaEvaluator formulaEvaluator = wb.getCreationHelper().createFormulaEvaluator();

            for (Row row : sheet) {
                for (Cell cell : row) {
                    // Evaluate formulas in place, then print by the resulting cell type.
                    switch (formulaEvaluator.evaluateInCell(cell).getCellType()) {
                    case Cell.CELL_TYPE_NUMERIC:
                        System.out.print(cell.getNumericCellValue() + " \t\t");
                        break;
                    case Cell.CELL_TYPE_STRING:
                        System.out.print(cell.getStringCellValue() + " \t\t");
                        break;
                    }
                }
            }
        }
   }
}

Error message:

Exception in thread "main" java.lang.IllegalArgumentException: Date for created could not be parsed: 2016-04-05T07:13:50+03:00
at org.apache.poi.openxml4j.opc.internal.PackagePropertiesPart.setCreatedProperty(PackagePropertiesPart.java:393)
at org.apache.poi.openxml4j.opc.internal.unmarshallers.PackagePropertiesUnmarshaller.unmarshall(PackagePropertiesUnmarshaller.java:124)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:726)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:280)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:37)
at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:274)
at firstExcel.Test.main(Test.java:45)
Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Date 2016-04-05T07:13:50+03:00Z not well formated, expected format yyyy-MM-dd'T'HH:mm:ss'Z' or yyyy-MM-dd'T'HH:mm:ss.SS'Z'
at org.apache.poi.openxml4j.opc.internal.PackagePropertiesPart.setDateValue(PackagePropertiesPart.java:575)
at org.apache.poi.openxml4j.opc.internal.PackagePropertiesPart.setCreatedProperty(PackagePropertiesPart.java:391)
... 6 more

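The Excel file is generated automatically by a web provider and I cannot change it. It opens fine in Excel on several different systems. All cells are formatted as "General"; none is set to a date or time format, even though the error says a date is not well formatted. Everything should simply be read as a string. The file also contains a lot of Hebrew text, in case that could be causing the problem.

Does anyone have an idea how to solve this? Thanks for your help!

1 answer:

Answer 0 (score: 2):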

Open excel1.xlsx with a ZIP utility and look at /docProps/core.xml inside the ZIP archive. You will find something like:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/package/2006/metadata/core-properties" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:dcterms="http://purl.org/dc/terms/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<dcterms:created xsi:type="dcterms:W3CDTF">2016-04-05T07:13:50+03:00</dcterms:created>
...
</cp:coreProperties>
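If no ZIP utility is at hand, the same entry can be printed from Java with the standard java.util.zip classes. This is only a minimal sketch: the class and method names (CorePropsDump, readEntry) are made up for illustration and are not part of Apache POI, and it assumes the entry is UTF-8 encoded as declared in the sample above.

```java
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache POI.
final class CorePropsDump {

    // Returns the text of the named ZIP entry, or null if the archive has no such entry.
    // Assumption: the entry is UTF-8 encoded.
    static String readEntry(InputStream zipStream, String entryName) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(entryName)) {
                    ByteArrayOutputStream buf = new ByteArrayOutputStream();
                    byte[] chunk = new byte[4096];
                    int n;
                    while ((n = zin.read(chunk)) != -1) {
                        buf.write(chunk, 0, n);
                    }
                    return new String(buf.toByteArray(), StandardCharsets.UTF_8);
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        // Entry names inside the archive carry no leading slash.
        try (InputStream in = new FileInputStream("excel1.xlsx")) {
            System.out.println(readEntry(in, "docProps/core.xml"));
        }
    }
}
```

Reading the package this way does not involve POI at all, so it works even on a file that POI refuses to open.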

The problem is the value 2016-04-05T07:13:50+03:00. Excel accepts this as a time in GMT+03:00, but Apache POI does not. Apache POI only accepts 2016-04-05T07:13:50Z.
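The two patterns quoted in the exception message can be tried out directly to see why the offset form is rejected. The sketch below uses java.text.SimpleDateFormat with those patterns; whether POI parses the value in exactly this way internally is an assumption, and ExpectedFormatDemo and matchesExpectedFormat are illustrative names only.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Illustration only; this is not POI code.
final class ExpectedFormatDemo {

    // The two patterns quoted in the InvalidFormatException message.
    private static final String[] PATTERNS = {
        "yyyy-MM-dd'T'HH:mm:ss'Z'",
        "yyyy-MM-dd'T'HH:mm:ss.SS'Z'"
    };

    // True if the whole text parses with at least one of the patterns.
    static boolean matchesExpectedFormat(String text) {
        for (String pattern : PATTERNS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.ROOT);
            format.setTimeZone(TimeZone.getTimeZone("UTC"));
            format.setLenient(false);
            ParsePosition pos = new ParsePosition(0);
            Date parsed = format.parse(text, pos);
            if (parsed != null && pos.getIndex() == text.length()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesExpectedFormat("2016-04-05T07:13:50Z"));      // true
        System.out.println(matchesExpectedFormat("2016-04-05T07:13:50+03:00")); // false
    }
}
```

In both patterns the 'Z' is a quoted literal rather than a time zone field, so a numeric offset such as +03:00 has nothing to match against.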

It may not be <dcterms:created but <dcterms:modified or some other date. The problem is the same.

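Since this exception is thrown while the workbook is being created, you do not have many options. You could ask the web provider not to use such dates there. Or you could change the date in this XML file by hand. Or you could file a bug report with Apache POI.

Why is this a bug?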
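The manual edit could also be scripted. The following sketch copies the package entry by entry and, in docProps/core.xml only, rewrites date-times that carry a numeric offset into the equivalent UTC value ending in Z. Everything here is an assumption made for illustration: the names CorePropsDateFixer, normalizeDates and fixPackage are hypothetical, the entry is assumed to be UTF-8, fractional seconds are dropped, and the archive is re-zipped with default settings.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.time.OffsetDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical workaround helper, not part of Apache POI.
final class CorePropsDateFixer {

    // A full date-time with a numeric offset, e.g. 2016-04-05T07:13:50+03:00
    private static final Pattern OFFSET_DATE = Pattern.compile(
        "\\d{4}-\\d{2}-\\d{2}T\\d{2}:\\d{2}:\\d{2}(?:\\.\\d+)?[+-]\\d{2}:\\d{2}");

    // The first format named in the exception message; fractional seconds are dropped.
    private static final DateTimeFormatter UTC_SECONDS =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'");

    // Replaces every offset date-time in the text with the same instant in UTC.
    static String normalizeDates(String xml) {
        Matcher m = OFFSET_DATE.matcher(xml);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            OffsetDateTime utc = OffsetDateTime.parse(m.group())
                .withOffsetSameInstant(ZoneOffset.UTC);
            m.appendReplacement(out, utc.format(UTC_SECONDS));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Copies the ZIP package from in to out, fixing dates in docProps/core.xml.
    // Both streams are closed when the copy finishes.
    static void fixPackage(InputStream in, OutputStream out) throws IOException {
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            byte[] chunk = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                ByteArrayOutputStream buf = new ByteArrayOutputStream();
                int n;
                while ((n = zin.read(chunk)) != -1) {
                    buf.write(chunk, 0, n);
                }
                byte[] data = buf.toByteArray();
                if (entry.getName().equals("docProps/core.xml")) {
                    // Assumption: core.xml is UTF-8 encoded.
                    String fixed = normalizeDates(new String(data, StandardCharsets.UTF_8));
                    data = fixed.getBytes(StandardCharsets.UTF_8);
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                zout.write(data);
                zout.closeEntry();
            }
        }
    }
}
```

One way to use it would be to call fixPackage with a FileInputStream on excel1.xlsx and a ByteArrayOutputStream, then pass a ByteArrayInputStream over the resulting bytes to new XSSFWorkbook(...), the same constructor the question already uses. The original file stays untouched.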

http://dublincore.org/documents/dcmi-terms/ -> http://dublincore.org/documents/dcmi-terms/#terms-created -> http://dublincore.org/documents/2012/06/14/dcmi-terms/?v=elements#date -> http://www.w3.org/TR/NOTE-datetime

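The formats are as follows. Exactly the components shown here must be present, with exactly this punctuation. Note that the "T" appears literally in the string, to indicate the beginning of the time element, as specified in ISO 8601.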

   Year:
      YYYY (eg 1997)
   Year and month:
      YYYY-MM (eg 1997-07)
   Complete date:
      YYYY-MM-DD (eg 1997-07-16)
   Complete date plus hours and minutes:
      YYYY-MM-DDThh:mmTZD (eg 1997-07-16T19:20+01:00)
   Complete date plus hours, minutes and seconds:
      YYYY-MM-DDThh:mm:ssTZD (eg 1997-07-16T19:20:30+01:00)
   Complete date plus hours, minutes, seconds and a decimal fraction of a second
      YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45+01:00)

where:

 YYYY = four-digit year
 MM   = two-digit month (01=January, etc.)
 DD   = two-digit day of month (01 through 31)
 hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
 mm   = two digits of minute (00 through 59)
 ss   = two digits of second (00 through 59)
 s    = one or more digits representing a decimal fraction of a second
 TZD  = time zone designator (Z or +hh:mm or -hh:mm)
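The six formats listed above can be folded into a single regular expression, which makes it easy to confirm that the value from core.xml is valid under this profile. This is a syntactic sketch only: it does not check value ranges such as month 01 to 12, and W3cDtfCheck and isW3cDtf are names invented for the example.

```java
import java.util.regex.Pattern;

// Illustrative syntax check for the W3C date and time profile quoted above.
final class W3cDtfCheck {

    // YYYY[-MM[-DD[Thh:mm[:ss[.s]]TZD]]], where TZD is Z, +hh:mm or -hh:mm.
    // The TZD is required whenever a time part is present.
    private static final Pattern W3C_DTF = Pattern.compile(
        "\\d{4}(-\\d{2}(-\\d{2}(T\\d{2}:\\d{2}(:\\d{2}(\\.\\d+)?)?(Z|[+-]\\d{2}:\\d{2}))?)?)?");

    static boolean isW3cDtf(String text) {
        return W3C_DTF.matcher(text).matches();
    }

    public static void main(String[] args) {
        System.out.println(isW3cDtf("2016-04-05T07:13:50+03:00"));  // true
        System.out.println(isW3cDtf("2016-04-05T07:13:50Z"));       // true
        System.out.println(isW3cDtf("2016-04-05T07:13:50+03:00Z")); // false
    }
}
```

Both the offset form written by the web provider and the Z form match, while the string shown in the exception message, with a Z appended after the offset, does not.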