Packages used: Apache POI 3.11, poi-ooxml-schemas 3.11, poi-ooxml 3.11, xmlbeans 2.6.0
I am trying to read a docx file and convert it to text. Here is the code:
XWPFDocument wd;
try {
    // "is" is the InputStream of the uploaded file
    wd = new XWPFDocument(is);
    @SuppressWarnings("resource")
    XWPFWordExtractor wde = new XWPFWordExtractor(wd);
    // Extract the document body as plain text
    bodyText = wde.getText();
    System.out.println(bodyText);
} catch (IOException e) {
    e.printStackTrace();
}
Here is the error message:
org.apache.poi.POIXMLException: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:39)
at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:122)
at project.fileupload.TagParsing.fileParseForTags(TagParsing.java:33)
at project.fileupload.FileUploadServlet.doPost(FileUploadServlet.java:105)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:646)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:303)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:503)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:170)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:421)
at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1070)
at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:314)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
at java.lang.Thread.run(Unknown Source)
Caused by: org.apache.poi.openxml4j.exceptions.InvalidFormatException: Package should contain a content type part [M1.13]
at org.apache.poi.openxml4j.opc.ZipPackage.getPartsImpl(ZipPackage.java:203)
at org.apache.poi.openxml4j.opc.OPCPackage.getParts(OPCPackage.java:673)
at org.apache.poi.openxml4j.opc.OPCPackage.open(OPCPackage.java:274)
at org.apache.poi.util.PackageHelper.open(PackageHelper.java:37)
... 22 more
I have searched a lot for a solution, but to no avail. I am absolutely sure the file is a docx, and it opens fine in Word. Any help is appreciated.
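For reference, as far as I understand it, the [M1.13] message means POI could not find a [Content_Types].xml entry in the ZIP package it was handed. So one thing worth checking is whether the bytes arriving in the stream are really the same bytes as the file on disk. Below is a minimal sketch of such a check using only java.util.zip. The class and method names (OoxmlStreamCheck, readAll, hasContentTypesPart) are hypothetical helpers made up for illustration and are not part of POI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical diagnostic helper, not part of Apache POI.
class OoxmlStreamCheck {

    // Copies the whole stream into memory so the bytes can be inspected
    // first and then wrapped in a fresh stream for POI afterwards.
    public static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    // Returns true if the bytes can be read as a ZIP archive that contains
    // a [Content_Types].xml entry; returns false for non-ZIP data.
    public static boolean hasContentTypesPart(byte[] data) throws IOException {
        ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data));
        try {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.getName().equalsIgnoreCase("[Content_Types].xml")) {
                    return true;
                }
            }
            return false;
        } finally {
            zip.close();
        }
    }
}
```

In the code above this could be used by calling readAll(is) once, logging the result of hasContentTypesPart(data), and then passing new ByteArrayInputStream(data) to the XWPFDocument constructor. If the check returns false, the stream is probably not the raw docx content (for example, it might be the whole upload request body rather than just the file part), but that is only a guess on my side.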