Image type UNKNOWN with PdfBox and a JPEG2000 sample

Time: 2018-11-26 12:37:29

Tags: java pdfbox jpeg2000

I took a sample JPEG 2000 file from the FNordware examples page.

However, when I try to add that image to a PDF:

    // pageWidth, pageHeight and matrix are defined elsewhere
    PDDocument document = new PDDocument();
    PDImageXObject pdImage = PDImageXObject.createFromFileByContent(
            new File("samples/relax.jp2"), document);
    PDPage page = new PDPage(new PDRectangle(pageWidth, pageHeight));
    PDPageContentStream contentStream = new PDPageContentStream(document, page);
    contentStream.drawImage(pdImage, matrix);
    contentStream.close();

I get this exception:

Caused by: java.lang.IllegalArgumentException: Image type UNKNOWN not supported: relax.jp2
    at org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.createFromFileByContent(PDImageXObject.java:313)

I have the following pdfbox dependencies in Maven:

    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>pdfbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>fontbox</artifactId>
        <version>2.0.12</version>
    </dependency>
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jempbox</artifactId>
        <version>1.8.16</version>
    </dependency>       
    <dependency>
        <groupId>org.apache.pdfbox</groupId>
        <artifactId>jbig2-imageio</artifactId>
        <version>3.0.2</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-core</artifactId>
        <version>1.4.0</version>
    </dependency>
    <dependency>
        <groupId>com.github.jai-imageio</groupId>
        <artifactId>jai-imageio-jpeg2000</artifactId>
        <version>1.3.0</version>
    </dependency>

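Am I doing something wrong here? Or is there a problem with PdfBox and/or the sample I am using?

Another Apache library, Tika, detects the MIME type of this sample file as "image/jp2":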

    TikaConfig tika = new TikaConfig();
    Metadata metadata = new Metadata();
    MediaType mimetype = tika.getDetector().detect(
            TikaInputStream.get(new FileInputStream("samples/relax.jp2")), metadata);

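1 Answer:

Answer 0: (score: 1)

From the pdfbox documentation, createFromFileByContent "supports the following file types: jpg, jpeg, tif, tiff, gif, bmp and png."

Looking at the source, createFromFileByContent calls PDFBox's own check for known file types, which is independent of the underlying imaging libraries. The detection code is here: https://jar-download.com/artifacts/org.apache.pdfbox/pdfbox/2.0.3/source-code/org/apache/pdfbox/util/filetypedetector/FileTypeDetector.java

This check does not recognize JPEG 2000.

createFromFileByExtension may actually be a better choice, since it contains this branch: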
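To make concrete what such a content check would have to look for, here is a minimal, self-contained sketch of a JPEG 2000 signature test. It is not PDFBox code: the class name `Jp2Sniffer` and its methods are hypothetical. The byte patterns are the standard ones: a `.jp2` file begins with a 12-byte signature box, and a raw codestream begins with the SOC marker followed by the SIZ marker.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox: recognizes JPEG 2000 by its leading bytes.
final class Jp2Sniffer {
    // JP2 signature box: box length 12, box type "jP  ", then 0D 0A 87 0A.
    private static final byte[] JP2_SIGNATURE = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, (byte) 0x87, 0x0A
    };
    // Raw codestream: SOC marker (FF 4F) followed by SIZ marker (FF 51).
    private static final byte[] J2K_CODESTREAM = {(byte) 0xFF, 0x4F, (byte) 0xFF, 0x51};

    static boolean looksLikeJpeg2000(byte[] header) {
        return startsWith(header, JP2_SIGNATURE) || startsWith(header, J2K_CODESTREAM);
    }

    // Reads at most the first 12 bytes of the file and tests them.
    static boolean looksLikeJpeg2000(Path file) throws IOException {
        byte[] buf = new byte[JP2_SIGNATURE.length];
        int total = 0;
        try (InputStream in = Files.newInputStream(file)) {
            while (total < buf.length) {
                int n = in.read(buf, total, buf.length - total);
                if (n < 0) {
                    break; // file is shorter than the signature
                }
                total += n;
            }
        }
        return looksLikeJpeg2000(Arrays.copyOf(buf, total));
    }

    private static boolean startsWith(byte[] data, byte[] prefix) {
        return data.length >= prefix.length
                && Arrays.equals(Arrays.copyOf(data, prefix.length), prefix);
    }
}
```

Calling `Jp2Sniffer.looksLikeJpeg2000(Paths.get("samples/relax.jp2"))` before handing the file to PDFBox would let you route JPEG 2000 files to a different code path instead of hitting the IllegalArgumentException.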

    if ("gif".equals(ext) || "bmp".equals(ext) || "png".equals(ext))
    {
        BufferedImage bim = ImageIO.read(file);
        return LosslessFactory.createFromImage(doc, bim);
    }

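As long as you pretend the file is a gif, bmp or png (that is, give it one of those extensions) and ImageIO supports j2k, this might work. (Untested)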
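Rather than renaming the file, the same two calls can be made directly. The sketch below covers only the ImageIO half, using the standard library alone: it checks whether any registered ImageIO plugin claims the `jp2` suffix and decodes the file. The class `Jp2ImageIo` and its methods are hypothetical names, and that the jai-imageio-jpeg2000 plugin registers itself for the `jp2` suffix is an assumption worth verifying on your classpath.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical helper: the ImageIO side of the workaround, no PDFBox required.
final class Jp2ImageIo {
    // True if some registered ImageIO reader claims the given file suffix.
    static boolean hasReaderForSuffix(String suffix) {
        return ImageIO.getImageReadersBySuffix(suffix).hasNext();
    }

    // Decodes the file, failing loudly instead of returning null.
    static BufferedImage decode(File file) throws IOException {
        BufferedImage image = ImageIO.read(file);
        if (image == null) {
            // ImageIO.read returns null when no registered reader accepts the input.
            throw new IOException("No ImageIO reader could decode " + file.getName());
        }
        return image;
    }
}
```

If `Jp2ImageIo.hasReaderForSuffix("jp2")` returns true, the result of `Jp2ImageIo.decode(new File("samples/relax.jp2"))` can be passed to `LosslessFactory.createFromImage(document, image)`, the same call the branch above makes. Like the renaming trick, this is untested against the sample file. Note that it stores the decoded pixels losslessly re-encoded rather than embedding the original JPEG 2000 stream, so the resulting PDF may be larger than the source image.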