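I'm using POI to extract data from an Excel file. (Column 5 of the Excel sheet contains the names of files that exist on my file system.) I loop over the rows of the sheet, extracting the cell contents with POI, and for each row I create a Tika instance and parse the file named in column 5 with Tika's "parseToString(file)". When the file is an Office document (Excel, PowerPoint, Word), I get this error: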
Exception in thread "AWT-EventQueue-0" java.lang.NoSuchFieldError: filesystem
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:185)
at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:131)
at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:61)
at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:182)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:197)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:135)
at org.apache.tika.Tika.parseToString(Tika.java:357)
at org.apache.tika.Tika.parseToString(Tika.java:423)
at org.apache.tika.Tika.parseToString(Tika.java:403)
at HP.BuildMailExcelDoc.getTextFromTika(BuildMailExcelDoc.java:355)
at HP.BuildMailExcelDoc.addExcelDoc(BuildMailExcelDoc.java:314)
at HP.BuildMailExcelDoc.buildDoc(BuildMailExcelDoc.java:196)
at HP.BuildMailExcelDoc.buildMailDoc(BuildMailExcelDoc.java:102)
at HP.BuildMailExcelDoc.indexDirectory(BuildMailExcelDoc.java:69)
at HP.BuildMailExcelDoc.indexDirectory(BuildMailExcelDoc.java:78)
at HP.BuildMailExcelDoc.buildDoc(BuildMailExcelDoc.java:63)
at HP.IndexGUI$1.mouseClicked(IndexGUI.java:281)
at java.awt.AWTEventMulticaster.mouseClicked(Unknown Source)
at java.awt.Component.processMouseEvent(Unknown Source)
at javax.swing.JComponent.processMouseEvent(Unknown Source)
at java.awt.Component.processEvent(Unknown Source)
at java.awt.Container.processEvent(Unknown Source)
at java.awt.Component.dispatchEventImpl(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.LightweightDispatcher.retargetMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.processMouseEvent(Unknown Source)
at java.awt.LightweightDispatcher.dispatchEvent(Unknown Source)
at java.awt.Container.dispatchEventImpl(Unknown Source)
at java.awt.Window.dispatchEventImpl(Unknown Source)
at java.awt.Component.dispatchEvent(Unknown Source)
at java.awt.EventQueue.dispatchEvent(Unknown Source)
at java.awt.EventDispatchThread.pumpOneEventForFilters(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForFilter(Unknown Source)
at java.awt.EventDispatchThread.pumpEventsForHierarchy(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.pumpEvents(Unknown Source)
at java.awt.EventDispatchThread.run(Unknown Source)
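I think the problem is caused by the nested use of POI: once for the Excel sheet, and then again inside the Tika parse call.
Does that sound reasonable? How can I deal with this?
Thanks :-) Reuth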
Answer 0 (score: 4)
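It looks like you have two copies of POI on your classpath. My guess is that you have the newer version that ships with Tika, plus an older one. Java loads whichever copy it finds first on the classpath, and here that is the old one, so the newer code that Tika relies on ends up referring to a field the old classes do not have. That is what the NoSuchFieldError is telling you.
The fix is to remove the older version from your classpath. See this POI FAQ Entry for how to identify where the old copy is coming from.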
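As a rough sketch of that kind of check: the small program below asks the class loader where a given class was actually loaded from. The class name `WhichJar` and the helper `locate` are made up for this illustration and are not part of POI or Tika; only standard JDK calls are used.

```java
import java.net.URL;

public class WhichJar {

    /** Returns the URL of the .class file backing the given class, or null if it cannot be resolved. */
    public static URL locate(Class<?> type) {
        // org.example.Foo -> org/example/Foo.class
        String resource = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        // Classes loaded by the bootstrap loader (e.g. java.lang.String) report a null loader.
        return loader != null
                ? loader.getResource(resource)
                : ClassLoader.getSystemResource(resource);
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // With no argument, the tool simply reports its own location.
        String name = args.length > 0 ? args[0] : WhichJar.class.getName();
        // initialize=false: load the class without running its static initializers.
        Class<?> type = Class.forName(name, false, WhichJar.class.getClassLoader());
        System.out.println(name + " -> " + locate(type));
    }
}
```

Run it with exactly the classpath your application uses and pass a POI class name from the stack trace, for example `org.apache.poi.hwpf.HWPFDocument`, then repeat for a core class such as `org.apache.poi.poifs.filesystem.POIFSFileSystem`. If the printed jar paths point at different POI versions, or at a jar you did not expect, that jar is the one to remove.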