如何使用iText和XMLWorker生成有效的PDF / A文件(HTML到PDF / A流程)

时间:2014-09-01 10:41:56

标签: pdf fonts itext pdfa xmlworker

我目前正在开发一种接受HTML输入并将其转换为有效PDF / A文件的方法。我知道如何使用iText以编程方式构建有效的PDF / A文件(参考:http://itextsupport.com/download/pdfa3.html),但是我无法使用HTML作为输入生成有效的PDF / A文件,并使用XMLWorker将此输入转换为一个PDF文件。我现在遇到的问题是由于PDF / A格式的嵌入字体要求。我总是得到这个例外:

线程中的异常" main" com.itextpdf.text.pdf.PdfAConformanceException:必须嵌入所有字体。这个不是:Helvetica

我尝试通过CSS文件强制HTML输入使用哪些字体,并通过XMLWorkerFontProvider类在输出PDF文件中注册我想要使用的字体,但似乎我做错了,因为上面注释的异常总是被抛出。

为了使XMLWorker使用通过XMLWorkerFontProvider类注册的字体,还需要什么?我想避免在输入中出现的每个HTML元素中使用默认字体Helvetica。

以下是我用于测试的代码:

style.css(只有1行):

* { font: normal 100% Arial, sans-serif !important; }

Main.java:

package com.itextpdf;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class Main {

    /**
     * @param args
     */
    public static void main(String[] args) {

        StringBuffer buf = new StringBuffer();

        buf.append("<!DOCTYPE html>");
        buf.append("<html>");
        buf.append("<head>");
        buf.append("<title>Test</title>");
        buf.append("</head>");
        buf.append("<body>");
        buf.append("<p>This is a test</p>");
        buf.append("</body>");
        buf.append("</html>");

        OutputStream file = null;
        Document document = null;
        PdfAWriter writer = null;

        try {

            file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
            document = new Document();
            writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);

            // Create XMP metadata. It's a PDF/A requirement.
            writer.createXmpMetadata();

            document.open();

            // Set output intent. PDF/A requirement.
            ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
            writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);

            // CSS
            CSSResolver cssResolver = new StyleAttrCSSResolver();
            CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
            cssResolver.addCss(cssFile);

            XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
            fontProvider.register("./fonts/arial.ttf");
            fontProvider.register("./fonts/sans-serif.ttf");
            fontProvider.addFontSubstitute("lowagie", "garamond");

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

            // Pipelines
            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);

            Reader reader = new StringReader(buf.toString());
            p.parse(reader);

        } catch (Exception e) {

            e.printStackTrace();

        } finally {

            if (document != null && document.isOpen())
                document.close();

            try {

                if (file != null)
                    file.close();

            } catch (IOException e) {}

            if (writer != null && !writer.isCloseStream())
                writer.close();

        }

    }

}

编辑:

回答Bruno,我扩展了FontFactoryImp类,重写了getFont()方法(包含所有参数的方法)。它调用System.out.println函数,如下所示:

System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color)

然后使用相同的参数调用parent.getFont()方法。我看到的唯一输出是:

  

= fontname:null = encoding:Cp1252 = embedded:true = size:-1.0 = style:-1 = BaseColor:null   = fontname:null = encoding:Cp1252 = embedded:true = size:-1.0 = style:-1 = BaseColor:null

并抛出异常,粘贴在此代码之前。

2 个答案:

答案 0 :(得分:3)

根据您发送给System.out的反馈,似乎XML Worker没有选择您要使用的字体系列。

请指定如下字体系列:

font-family: "Arial"

在CSS中使用'font'可能会有效,但这很棘手。我认为iText看到normal并将其解释为使用默认字体

答案 1 :(得分:3)

使此示例有效的完整代码如下:

的style.css:

* {
    font-family: "Arial";
    font-style: normal;
}

Main.java:

package com.itextpdf;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.Reader;
import java.io.StringReader;

import com.itextpdf.text.Document;
import com.itextpdf.text.pdf.ICC_Profile;
import com.itextpdf.text.pdf.PdfAConformanceLevel;
import com.itextpdf.text.pdf.PdfAWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class Main {

    public static void main(String[] args) {

        StringBuffer buf = new StringBuffer();

        String title = "Test";

        // Sample HTML content.
        buf.append("<!DOCTYPE html>");
        buf.append("<html>");
        buf.append("<head>");
        buf.append("<title>" + title + "</title>");
        buf.append("</head>");
        buf.append("<body>");
        buf.append("<p>This is a test</p>");
        buf.append("</body>");
        buf.append("</html>");

        OutputStream file = null;
        Document document = null;
        PdfAWriter writer = null;

        try {

            file = new FileOutputStream(new File("C:\\Users\\amartin\\Desktop\\Test.pdf"));
            document = new Document();
            writer = PdfAWriter.getInstance(document, file, PdfAConformanceLevel.PDF_A_1B);

            // Avoid discrepances between document title and XMP metadata information.
            document.addTitle(title);

            // Create XMP metadata. It's a PDF/A requirement.
            writer.createXmpMetadata();

            document.open();

            // Set output intent. PDF/A requirement.
            ICC_Profile icc = ICC_Profile.getInstance(new FileInputStream("./src/main/resources/com/itextpdf/sRGB Color Space Profile.icm"));
            writer.setOutputIntents("Custom", "", "http://www.color.org", "sRGB IEC61966-2.1", icc);

            // CSS stylesheet.
            CSSResolver cssResolver = new StyleAttrCSSResolver();
            CssFile cssFile = XMLWorkerHelper.getCSS(new FileInputStream("./css/style.css"));
            cssResolver.addCss(cssFile);

            MyFontProvider fontProvider = new MyFontProvider();
            fontProvider.register("./fonts/arial.ttf");

            /* DEBUG
            System.out.println("Fonts present in " + fontProvider.getClass().getName());
            Set<String> registeredFonts = fontProvider.getRegisteredFonts();
            for (String font : registeredFonts)
                System.out.println(font);
            */

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

            // Pipelines.
            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);

            Reader reader = new StringReader(buf.toString());
            p.parse(reader);

        } catch (Exception e) {

            e.printStackTrace();

        } finally {

            if (document != null && document.isOpen())
                document.close();

            try {

                if (file != null)
                    file.close();

            } catch (IOException e) {}

            if (writer != null && !writer.isCloseStream())
                writer.close();

        }

    }

}

MyFontProvider.java:

package com.itextpdf;

import com.itextpdf.text.BaseColor;
import com.itextpdf.text.Font;
import com.itextpdf.text.FontFactoryImp;

public class MyFontProvider extends FontFactoryImp {

    @Override
    public Font getFont(String fontname, String encoding, boolean embedded,
            float size, int style, BaseColor color) {

        System.out.println("=fontname: " + fontname + " =encoding: " + encoding + " =embedded : " + embedded + " =size: " + size + " =style: " + style + " =BaseColor: " + color);

        return super.getFont(fontname, encoding, embedded, size, style, color);

    }

}

再次,谢谢你,布鲁诺。我很高兴在这里得到你的帮助:)。