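I am implementing an indexer that reads .docx documents using the Apache POI library. My code is below:
XWPFDocument doc = new XWPFDocument(new ByteArrayInputStream(fileData.data));
XWPFWordExtractor msWord2007Extractor = new XWPFWordExtractor(doc);
wordText = msWord2007Extractor.getText();
When I run this, the first line (the XWPFDocument constructor) throws the following exception.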
org.apache.xmlbeans.SchemaTypeLoaderException: Cannot resolve type for handle _XY_Q=space|R=space@http://www.w3.org/XML/1998/namespace (schemaorg_apache_xmlbeans.system.sE130CAA0A01A7CDE5A2B4FEB8B311707.cttext7f5btype) - code 13
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.readHandle(SchemaTypeSystemImpl.java:2021)
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.readTypeRef(SchemaTypeSystemImpl.java:2095)
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.loadAttribute(SchemaTypeSystemImpl.java:2922)
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.readAttributeData(SchemaTypeSystemImpl.java:2914)
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl$XsbReader.finishLoadingType(SchemaTypeSystemImpl.java:2531)
    at org.apache.xmlbeans.impl.schema.SchemaTypeSystemImpl.resolveHandle(SchemaTypeSystemImpl.java:3507)
    at org.apache.xmlbeans.SchemaComponent$Ref.getComponent(SchemaComponent.java:104)
    at org.apache.xmlbeans.SchemaType$Ref.get(SchemaType.java:872)
    at org.apache.xmlbeans.impl.schema.SchemaPropertyImpl.getType(SchemaPropertyImpl.java:92)
    at org.apache.xmlbeans.impl.schema.SchemaTypeImpl.createElementType(SchemaTypeImpl.java:965)
    at org.apache.xmlbeans.impl.values.XmlObjectBase.create_element_user(XmlObjectBase.java:893)
    at org.apache.xmlbeans.impl.store.Xobj.getUser(Xobj.java:1657)
    at org.apache.xmlbeans.impl.store.Cur.getUser(Cur.java:2654)
    at org.apache.xmlbeans.impl.store.Cur.getObject(Cur.java:2647)
    at org.apache.xmlbeans.impl.store.Cursor._getObject(Cursor.java:995)
    at org.apache.xmlbeans.impl.store.Cursor.getObject(Cursor.java:2904)
    at org.apache.poi.xwpf.usermodel.XWPFParagraph.<init>(XWPFParagraph.java:90)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.onDocumentRead(XWPFDocument.java:146)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:159)
    at org.apache.poi.xwpf.usermodel.XWPFDocument.<init>(XWPFDocument.java:123)
    at org.wso2.carbon.registry.samples.handler.MSWordIndexer.getIndexedDocument(MSWordIndexer.java:42)
    at org.wso2.carbon.registry.indexing.solr.SolrClient.indexDocument(SolrClient.java:178)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.doWork(AsyncIndexer.java:203)
    at org.wso2.carbon.registry.indexing.AsyncIndexer$IndexingTask.run(AsyncIndexer.java:189)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
This only happens when the document has some content. Empty documents are indexed without any error.
Answer 0 (score: 0)
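I hit the same exception when running an application in WebLogic. poi-ooxml-schemas was packaged in the EAR as an application library, but the problem still persisted.
I came across the solution in the Alfresco Jira and am posting it here for future reference.
The fix is to give the classloader a hint so that it prefers the application's own copies of these packages (for an EAR this typically goes in META-INF/weblogic-application.xml):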
<prefer-application-packages>
<package-name>schemaorg_apache_xmlbeans.system.sXMLCONFIG.*</package-name>
<package-name>schemaorg_apache_xmlbeans.system.sXMLLANG.*</package-name>
<package-name>schemaorg_apache_xmlbeans.system.sXMLSCHEMA.*</package-name>
<package-name>schemaorg_apache_xmlbeans.system.sXMLTOOLS.*</package-name>
</prefer-application-packages>
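To check whether the hint took effect, it can help to list every copy of an XMLBeans class or schema resource that the current classloader can see. The following is only a minimal diagnostic sketch using plain JDK calls; the class name ClasspathCopies and the method copiesOf are hypothetical names introduced here, and the default lookup path org/apache/xmlbeans/XmlObject.class is just one convenient XMLBeans class to probe, not something taken from the original answer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical diagnostic helper (not part of POI, XMLBeans, or WebLogic).
class ClasspathCopies {

    // Returns every URL under which the given resource path is visible to the
    // thread context classloader (falling back to the system classloader).
    static List<URL> copiesOf(String resourcePath) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        try {
            return new ArrayList<>(Collections.list(cl.getResources(resourcePath)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: XmlObject is used only as a well-known XMLBeans class to probe.
        String path = args.length > 0 ? args[0] : "org/apache/xmlbeans/XmlObject.class";
        List<URL> copies = copiesOf(path);
        System.out.println(copies.size() + " copy/copies of " + path);
        for (URL url : copies) {
            System.out.println("  " + url);
        }
    }
}
```

For the result to be meaningful, copiesOf should be called from inside the deployed application (for example from the indexer itself), so that the context classloader is the application's. If more than one URL is printed, or the first one points at a server library rather than the jar bundled in the EAR, that suggests the server's copy is still being picked up.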