Getting [InvalidFormatException] when trying to read a docx file

Time: 2014-12-08 01:31:45

Tags: java apache apache-poi docx

Packages used: Apache POI 3.11, poi-ooxml-schemas 3.11, poi-ooxml 3.11, xmlbeans 2.6.0

I am trying to read a docx file and convert it to text. Here is the code:

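// `is` and `bodyText` are declared elsewhere in the method.
XWPFDocument wd;
try {
    // Parse the stream as an OOXML (.docx) package.
    wd = new XWPFDocument(is);
    @SuppressWarnings("resource")
    XWPFWordExtractor wde = new XWPFWordExtractor(wd);
    // Extract the document body as plain text.
    bodyText = wde.getText();
    System.out.println(bodyText);
} catch (IOException e) {
    e.printStackTrace();
}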

Here is the error message:

org.apache.poi.POIXMLException: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:122)
at project.fileupload.TagParsing.fileParseForTags(TagParsing.java:33)
at project.fileupload.FileUploadServlet.doPost(FileUploadServlet.java:105)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:503)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Unknown Source)

Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:203)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:673)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:274)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:37)
... 22 more

I have searched a lot for a solution, but to no avail. I am absolutely sure the file is a docx, and it opens fine in Word. Any help is appreciated.
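One way to narrow this down is to check what bytes actually reach the parser, since a file that opens in Word can still arrive at the servlet as an empty, already-consumed, or still-encoded stream. The following is only a diagnostic sketch using the JDK alone. `DocxSniffer` and its methods are hypothetical names, not part of Apache POI, and the case-insensitive match on `[Content_Types].xml` is an assumption about how the package is checked.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical diagnostic helper (not part of Apache POI).
final class DocxSniffer {

    /** Buffers the whole stream in memory so it can be inspected and re-read. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Classifies the payload by its leading magic bytes. */
    static String describe(byte[] data) {
        if (data.length == 0) {
            return "EMPTY";   // nothing arrived, e.g. the stream was already consumed
        }
        if (data.length >= 4 && data[0] == 'P' && data[1] == 'K'
                && data[2] == 3 && data[3] == 4) {
            return "ZIP";     // container used by .docx
        }
        if (data.length >= 4 && (data[0] & 0xFF) == 0xD0 && (data[1] & 0xFF) == 0xCF
                && (data[2] & 0xFF) == 0x11 && (data[3] & 0xFF) == 0xE0) {
            return "OLE2";    // container used by the older binary .doc
        }
        return "UNKNOWN";     // e.g. raw multipart request body or some other format
    }

    /** Returns true if the ZIP payload has a [Content_Types].xml entry. */
    static boolean hasContentTypesPart(byte[] data) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the part name is matched case-insensitively.
                if ("[Content_Types].xml".equalsIgnoreCase(entry.getName())) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

In the code above, this could be used as `byte[] bytes = DocxSniffer.readAll(is);`, logging `DocxSniffer.describe(bytes)` and `DocxSniffer.hasContentTypesPart(bytes)`, and then passing `new ByteArrayInputStream(bytes)` to `new XWPFDocument(...)`. A result other than `ZIP` with a content types part would suggest the problem lies in how the upload is read rather than in the file itself.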

0 answers:

No answers yet.