Graphics or shapes are not converted when converting a doc file to HTML

Asked: 2018-07-16 10:21:52

Tags: html apache-poi doc

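We want to display a doc file in a dialog in the browser, which is why I convert it to an HTML file. The conversion itself succeeds, but if the doc file contains graphics or shapes, they are not converted to any HTML tag such as img, so they do not appear in the file shown in the UI.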

So how can we convert a doc file that contains graphics or shapes to HTML?


Please help me display the document file in the browser.

1 answer:

Answer 0 (score: 0)

AbstractWordConverter.setPicturesManager must be called before AbstractWordConverter.processDocument. And of course the method PicturesManager.savePicture of the interface PicturesManager has to actually save the picture in the class that implements this interface.

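The following example takes a WordDocument.doc from my home directory, converts it to HTML including the pictures, and puts the resulting files (the HTML file and the image files) into a newly created directory html. Note that the pictures contained in WordDocument.doc must be *.gif, *.png or *.jpg, because the approach used for Writing/Saving an Image supports only these types.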

import org.apache.poi.hwpf.converter.WordToHtmlConverter;
import org.apache.poi.hwpf.converter.PicturesManager;

import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.usermodel.PictureType;
import org.apache.poi.util.XMLHelper;
import org.w3c.dom.Document;

import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import java.io.FileInputStream;
import java.io.ByteArrayInputStream;
import java.io.File;

import java.awt.image.BufferedImage;
import javax.imageio.ImageIO;

public class TestWordToHtmlConverter {

 private static void convertDocToHTML(String docFilePathAndName, String htmlPath, String htmlFileName) throws Exception {

  // create the output directory, including any missing parent directories
  new File(htmlPath).mkdirs();

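  // the constructor reads the whole file into memory, so the stream can be closed right away
  HWPFDocument hwpfDocument;
  try (FileInputStream in = new FileInputStream(docFilePathAndName)) {
   hwpfDocument = new HWPFDocument(in);
  }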

  Document newDocument = XMLHelper.getDocumentBuilderFactory().newDocumentBuilder().newDocument();
  WordToHtmlConverter wordToHtmlConverter = new WordToHtmlConverter(newDocument);

  wordToHtmlConverter.setPicturesManager(
   new PicturesManager() {
    @Override
    public String savePicture(byte[] content, PictureType pictureType, String suggestedName, float widthInches, float heightInches) {
     /*
     System.out.println(content);
     System.out.println(pictureType);
     System.out.println(suggestedName);
     System.out.println(widthInches);
     System.out.println(heightInches);
     */
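     try {
      // ImageIO.read returns null if no registered reader can decode the bytes
      BufferedImage image = ImageIO.read(new ByteArrayInputStream(content));
      if (image == null) {
       System.err.println("Unsupported picture type, not saved: " + suggestedName);
      } else {
       ImageIO.write(image, pictureType.getExtension(), new File(htmlPath, suggestedName));
      }
     } catch (Exception e) {
      e.printStackTrace();
     }
     // the returned name is what the converter puts into the img element
     return suggestedName;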
    }
   }
  );

  wordToHtmlConverter.processDocument(hwpfDocument);

  Transformer transformer = TransformerFactory.newInstance().newTransformer();
  transformer.setOutputProperty(OutputKeys.INDENT, "yes");
  transformer.setOutputProperty(OutputKeys.ENCODING, "utf-8");
  transformer.setOutputProperty(OutputKeys.METHOD, "html");
  transformer.transform(new DOMSource(wordToHtmlConverter.getDocument()),
                        new StreamResult(new File(htmlPath, htmlFileName)));

 }

 public static void main(String[] args) throws Exception {

  convertDocToHTML("/home/axel/Dokumente/WordDocument.doc", "/home/axel/Dokumente/html", "WordDocument.html");

 }

}
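
Regarding the note about supported picture types: the restriction comes from decoding and re-encoding the picture with ImageIO. The following is only a minimal sketch, using nothing but the JDK, of two things that might help here: checking whether ImageIO has a writer for a given file suffix, and writing the picture bytes to disk unchanged instead of re-encoding them. The class PictureSaveSketch and the helpers imageIoCanWrite and saveRawPicture are hypothetical names of mine, not part of Apache POI. Saving the raw bytes does not make a browser render formats it does not understand (for example EMF or WMF); it only avoids losing the file.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import javax.imageio.ImageIO;

public class PictureSaveSketch {

 // Hypothetical helper: true if the JDK's ImageIO has a writer registered
 // for the given file suffix, for example "png".
 public static boolean imageIoCanWrite(String extension) {
  return ImageIO.getImageWritersBySuffix(extension).hasNext();
 }

 // Hypothetical helper: writes the picture bytes unchanged, without decoding them.
 public static Path saveRawPicture(byte[] content, Path directory, String suggestedName) throws IOException {
  Files.createDirectories(directory);
  // keep only the file name part, so the name cannot point outside the directory
  Path target = directory.resolve(Paths.get(suggestedName).getFileName());
  return Files.write(target, content);
 }

 public static void main(String[] args) throws IOException {
  System.out.println("png writable by ImageIO: " + imageIoCanWrite("png"));
  System.out.println("emf writable by ImageIO: " + imageIoCanWrite("emf"));

  // dummy bytes stand in for the content array passed to savePicture
  Path dir = Files.createTempDirectory("html");
  Path saved = saveRawPicture(new byte[] {1, 2, 3}, dir, "picture.emf");
  System.out.println(saved + " has " + Files.size(saved) + " bytes");
 }

}
```

Inside savePicture one could then call something like saveRawPicture(content, Paths.get(htmlPath), suggestedName) in place of the ImageIO.read / ImageIO.write pair, assuming the pictures are already in a format the browser can show.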