Getting images from a document using Apache POI

Date: 2014-01-03 05:49:09

Tags: java image apache-poi

I am using Apache POI to read images from a docx file.

Here is my code:

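import java.awt.Image;
import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.util.List;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public Image ReadImg(int imageid) throws IOException {
    // try-with-resources closes the file stream even if reading fails
    try (FileInputStream in = new FileInputStream("import.docx")) {
        XWPFDocument doc = new XWPFDocument(in);
        List<XWPFPictureData> pic = doc.getAllPictures();
        XWPFPictureData pict = pic.get(imageid);
        byte[] data = pict.getData();
        // decode the image bytes using javax.imageio.* (JDK 1.4+);
        // ImageIO.read returns null if no registered reader understands the format
        return ImageIO.read(new ByteArrayInputStream(data));
    }
}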

It reads the images correctly, but not in the order in which they appear in the document.

For example, if the document contains

image1.jpeg image2.jpeg image3.jpeg image4.jpeg image5.jpeg

it reads them as

image4 image3 image1 image5 image2

Can you help me solve this?

I want to read the images in order.

Thanks, Sithik

1 answer:

Answer 0 (score: 1)

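import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.File;
import javax.imageio.ImageIO;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

public static void extractImages(XWPFDocument docx) {
    try {
        // traverse the picture list and write each image to a file
        for (XWPFPictureData pic : docx.getAllPictures()) {
            byte[] bytepic = pic.getData();
            BufferedImage imag = ImageIO.read(new ByteArrayInputStream(bytepic));
            if (imag == null) {
                // ImageIO could not decode this picture (for example EMF or WMF); skip it
                continue;
            }
            // re-encode as JPEG and save under the picture's file name inside the docx
            ImageIO.write(imag, "jpg", new File("D:/imagefromword/" + pic.getFileName()));
        }
    } catch (Exception e) {
        // report the failure before exiting instead of swallowing it
        e.printStackTrace();
        System.exit(-1);
    }
}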
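
The code above still goes through getAllPictures(), so it does not by itself address the ordering problem from the question. That list does not appear to be tied to where each picture sits in the text. One way to get document order is to walk the paragraphs and runs and collect the pictures embedded in each run. The following is a minimal sketch under that assumption, not a definitive implementation. The class name DocumentOrderPictures and the method name picturesInDocumentOrder are made up for illustration and are not part of Apache POI. It only visits top-level body paragraphs, so pictures inside tables, headers, or footers would need extra traversal.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFPicture;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;
import org.apache.poi.xwpf.usermodel.XWPFRun;

// Hypothetical helper class; the name is illustrative and not part of Apache POI.
final class DocumentOrderPictures {

    private DocumentOrderPictures() {
    }

    // Collects picture data by walking body paragraphs and their runs in order,
    // so the result follows the position of each picture in the text.
    // Assumption: only top-level paragraphs are visited (no tables, headers, footers).
    static List<XWPFPictureData> picturesInDocumentOrder(XWPFDocument doc) {
        List<XWPFPictureData> ordered = new ArrayList<>();
        for (XWPFParagraph paragraph : doc.getParagraphs()) {
            for (XWPFRun run : paragraph.getRuns()) {
                for (XWPFPicture picture : run.getEmbeddedPictures()) {
                    XWPFPictureData data = picture.getPictureData();
                    if (data != null) {
                        // skip pictures whose relationship cannot be resolved
                        ordered.add(data);
                    }
                }
            }
        }
        return ordered;
    }
}
```

With this helper, ReadImg from the question could call DocumentOrderPictures.picturesInDocumentOrder(doc).get(imageid) instead of doc.getAllPictures().get(imageid). Note that if the same image is inserted twice, the same picture data may appear twice in the returned list.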