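I am writing an application to display and edit .doc files, using POI and HWPF. So far I can read text from a file and write it to a .doc file. However, my reader can only read .doc files created by MS Office. It cannot read files created by my writer, even though MS Office opens those files and displays everything correctly. It always fails with this error: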
Exception in thread "main" java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at org.apache.poi.hwpf.extractor.WordExtractor.getText(WordExtractor.java:322)
at ReadPOI.main(ReadPOI.java:18)
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.poi.hwpf.usermodel.Range.binarySearchStart(Range.java:1016)
at org.apache.poi.hwpf.usermodel.Range.findRange(Range.java:1095)
at org.apache.poi.hwpf.usermodel.Range.initParagraphs(Range.java:982)
at org.apache.poi.hwpf.usermodel.Range.numParagraphs(Range.java:311)
at org.apache.poi.hwpf.converter.AbstractWordConverter.processParagraphes(AbstractWordConverter.java:1058)
at org.apache.poi.hwpf.converter.WordToTextConverter.processSection(WordToTextConverter.java:435)
at org.apache.poi.hwpf.converter.AbstractWordConverter.processSingleSection(AbstractWordConverter.java:1126)
at org.apache.poi.hwpf.converter.AbstractWordConverter.processDocument(AbstractWordConverter.java:722)
at org.apache.poi.hwpf.extractor.WordExtractor.getText(WordExtractor.java:304)
... 1 more
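Is there any difference between a file created by MS Office and one created by my writer, and how can I fix it? Please help. My demo code in Java follows. Thanks.
My reader: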
import java.io.File;
import java.io.FileInputStream;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.hwpf.usermodel.Range;

public class ReadPOI
{
    public static void main(String args[]) throws Exception
    {
        File file = new File("Test.doc");
        FileInputStream fin = new FileInputStream(file);
        HWPFDocument doc = new HWPFDocument(fin);
        Range range = doc.getRange();
        WordExtractor extractor = new WordExtractor(doc);
        System.out.println("starting\n" + extractor.getText() + "end\n");
        fin.close();
    }
}
My writer:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import org.apache.poi.hwpf.HWPFDocument;

public class WritePOI
{
    public static void main(String args[]) throws Exception
    {
        File file = new File("Template.doc");
        FileInputStream fin = new FileInputStream(file);
        HWPFDocument doc = new HWPFDocument(fin);
        doc.getRange().replaceText("Haha\n", false);
        FileOutputStream fout = new FileOutputStream("Test.doc");
        doc.write(fout);
        fout.close();
        fin.close();
    }
}
Answer 0 (score: 1)
This is a bug in WordExtractor.getText() that persists even in version 3.10-FINAL. It should not give you:
Caused by: java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
at java.util.ArrayList.rangeCheck(ArrayList.java:571)
at java.util.ArrayList.get(ArrayList.java:349)
at org.apache.poi.hwpf.usermodel.Range.binarySearchStart(Range.java:1016)
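getText() is not marked as deprecated in the API documentation, but the documentation does say that getTextFromPieces() is faster. I double-checked getTextFromPieces() with your example, and it works fine.
So in ReadPOI, use: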
System.out.println(extractor.getTextFromPieces());
Or:
String[] dataArray = extractor.getParagraphText();
for (int i = 0; i < dataArray.length; i++)
{
    System.out.println("\n–" + dataArray[i]);
}
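If you would rather keep calling getText() where it works and fall back only when it throws, the idea can be sketched as follows. This is a minimal sketch, not part of POI: the class ExtractFallback and the method firstSuccessful are hypothetical names, and the two lambdas in main are stand-ins for the real extractor calls so that the example runs without POI on the classpath.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of Apache POI.
class ExtractFallback
{
    // Runs each extraction strategy in order and returns the first result
    // that does not throw a RuntimeException. If every strategy fails,
    // the last failure is rethrown.
    @SafeVarargs
    static String firstSuccessful(Supplier<String>... strategies)
    {
        RuntimeException last = null;
        for (Supplier<String> strategy : strategies)
        {
            try
            {
                return strategy.get();
            }
            catch (RuntimeException e)
            {
                last = e; // remember the failure and try the next strategy
            }
        }
        if (last == null)
        {
            throw new IllegalArgumentException("no strategies given");
        }
        throw last;
    }

    public static void main(String[] args)
    {
        // Stand-in for extractor.getText(), which fails on the generated file.
        Supplier<String> failing = () -> {
            throw new RuntimeException(new IndexOutOfBoundsException("Index: 0, Size: 0"));
        };
        // Stand-in for extractor.getTextFromPieces().
        Supplier<String> working = () -> "Haha\n";

        System.out.print(firstSuccessful(failing, working));
    }
}
```

In ReadPOI the call would presumably look like firstSuccessful(extractor::getText, extractor::getTextFromPieces), assuming your POI version exposes both methods as shown in the question. Note that swallowing the exception only works around the bug and hides the underlying problem, so it may be worth logging the first failure.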