如何从microsoft word文档中读取单词注释(注释)?
请尽可能提供一些示例代码......
感谢你......
答案 0 :(得分:3)
最后,我找到了答案
这是代码段...
File file = null;
FileInputStream fis = null;
HWPFDocument document = null;
Range commentRange = null;
try {
file = new File(fileName);
fis = new FileInputStream(file);
document = new HWPFDocument(fis);
commentRange = document.getCommentsRange();
int numComments = commentRange.numParagraphs();
for (int i = 0; i < numComments; i++) {
String comments = commentRange.getParagraph(i).text();
comments = comments.replaceAll("\\cM?\r?\n", "").trim();
if (!comments.equals("")) {
System.out.println("comment :- " + comments);
}
}
} catch (Exception e) {
e.printStackTrace();
}
我正在使用Poi poi-3.5-beta7-20090719.jar,poi-scratchpad-3.5-beta7-20090717.jar。如果您希望使用基于OpenXML的文件格式,则需要其他档案 - poi-ooxml-3.5-beta7-20090717.jar和poi-dependencies-3.5-beta7-20090717.zip。
我很感谢马克B的帮助,他实际上找到了这个解决方案......
答案 1 :(得分:0)
获取HWPFDocument对象(例如,通过在输入流中传递Word文档)。
然后,您可以通过getSummaryInformation()获取摘要,这将通过getSummary()
答案 2 :(得分:0)
请参考以下链接,它可能符合您的要求......
http://bihag.wordpress.com/2009/11/04/how-to-read-comments-from-word-with-poi-jav/#comment-13
答案 3 :(得分:0)
我也是apache poi的新手。听到我的程序工作正常这个程序提取word格式文档到文本...我希望这个程序将帮助你在你运行这个程序之前你可以在你的类路径中设置相应的lib文件。
/*
* FileExtract.java
*
* Created on April 12, 2010, 9:46 AM
*
* To change this template, choose Tools | Template Manager
* and open the template in the editor.
*/
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import javax.swing.text.BadLocationException;
import javax.swing.text.DefaultStyledDocument;
import javax.swing.text.rtf.RTFEditorKit;
import java.io.*;
import org.apache.poi.POIOLE2TextExtractor.*;
import org.apache.poi.POIOLE2TextExtractor;
import org.apache.poi.POITextExtractor;
import org.apache.poi.extractor.ExtractorFactory;
import org.apache.poi.hdgf.extractor.VisioTextExtractor;
import org.apache.poi.hslf.extractor.PowerPointExtractor;
import org.apache.poi.hssf.usermodel.HSSFWorkbook;
import org.apache.poi.hwpf.extractor.WordExtractor;
import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.ss.extractor.ExcelExtractor;
import org.apache.poi.xwpf.extractor.XWPFWordExtractor;
import javax.swing.text.Document;
/**
*
* @author ChandraMouil V
*/
public class RtfDocTextExtract {
/** Creates a new instance of FileExtract */
static String filePath;
static String rtfFile;
static FileInputStream fis;
static int x=0;
public RtfDocTextExtract() {
}
//This function for .DOC File
public static void meth(String filePath) {
try {
if(x!=0){
fis = new FileInputStream("D:/DummyRichTextFormat.doc");
POIFSFileSystem fileSystem = new POIFSFileSystem(fis);
WordExtractor oleTextExtractor = (WordExtractor) ExtractorFactory.createExtractor(fileSystem);
String[] paragraphText = oleTextExtractor.getParagraphText();
FileWriter fw = new FileWriter("E:/resume-template.txt");
for (String paragraph : paragraphText) {
fw.write(paragraph);
}
fw.flush();
}
}catch(Exception e){
e.printStackTrace();
}
}
}