Java iText HTML to PDF <pre> block formatting

Date: 2015-12-17 13:17:36

Tags: java html pdf itext

I'm using iText to convert a set of HTML files to PDF. My HTML files contain code snippets in <pre> blocks, but iText does not preserve their formatting.

Example of my <pre> block:

<pre>
  <something>
    <somethingelse>
      some content
    </somethingelse>
  </something>
</pre>

This is what iText outputs to PDF:

<something>  <somethingelse> some content  </somethingelse>  </something>

Is there a way to configure iText to format this correctly?

My iText code snippet:

FileOutputStream os = new FileOutputStream(...);
Document doc = new Document(PageSize.A4);
PdfWriter writer = PdfWriter.getInstance(doc, os);
doc.open(); // the document must be open before content is written to it

CSSResolver cssResolver = XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
HtmlPipelineContext htmlContext = new HtmlPipelineContext();

htmlContext.setImageProvider(new AbstractImageProvider() {
    public String getImageRootPath() {
        return ...; // image root path not shown in the original post
    }
});

// CSS resolution -> HTML tag processing -> writing to the PDF document
Pipeline<?> pipeline = new CssResolverPipeline(cssResolver,
                       new HtmlPipeline(htmlContext,
                       new PdfWriterPipeline(doc, writer)));
XMLWorker worker = new XMLWorker(pipeline, true);
XMLParser parser = new XMLParser(worker);

for (String inputFile : inputFiles) {
    parser.parse(new FileInputStream(inputFile), StandardCharsets.UTF_8);
}
doc.close();

2 answers:

Answer 0 (score: 1)


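HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);

TagProcessorFactory tagFactory = Tags.getHtmlTagProcessorFactory();
// Register a custom processor for <pre>; the method bodies below are stubs.
tagFactory.addProcessor(new TagProcessor() {

    // Called for the opening <pre> tag.
    public List<Element> startElement(WorkerContext ctx, Tag tag) {
        return null;
    }

    // Called with the text found inside the tag.
    public List<Element> content(WorkerContext ctx, Tag tag, String content) {
        return null;
    }

    // Called for the closing </pre> tag.
    public List<Element> endElement(WorkerContext ctx, Tag tag, List<Element> currentContent) {
        return null;
    }

    public boolean isStackOwner() {
        return false;
    }
}, "pre");

// The factory has to be set on the context for the custom processor to be used.
htmlContext.setTagFactory(tagFactory);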
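One thing a custom <pre> processor like the skeleton above might do in its content callback is keep line breaks and indentation intact. The following is a minimal, library-free sketch of that idea and not iText API: the class PreText and its method preserveLines are hypothetical names, and replacing leading spaces with non-breaking spaces (U+00A0) is only an assumption about how to stop a renderer from collapsing them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iText or XMLWorker.
public class PreText {

    /**
     * Splits preformatted text into lines and replaces each line's leading
     * spaces and tabs with non-breaking spaces (U+00A0).
     */
    public static List<String> preserveLines(String content) {
        List<String> lines = new ArrayList<>();
        for (String line : content.split("\r?\n", -1)) {
            StringBuilder sb = new StringBuilder();
            int i = 0;
            while (i < line.length() && (line.charAt(i) == ' ' || line.charAt(i) == '\t')) {
                // Assumption: a tab counts as four spaces of indentation.
                sb.append(line.charAt(i) == '\t' ? "\u00A0\u00A0\u00A0\u00A0" : "\u00A0");
                i++;
            }
            sb.append(line.substring(i)); // rest of the line is left untouched
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String pre = "<something>\n  <somethingelse>\n    some content";
        for (String line : preserveLines(pre)) {
            System.out.println(line);
        }
    }
}
```

Each returned line could then be turned into its own paragraph or chunk by the processor; how that is done depends on the iText version in use.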



Answer 1 (score: 1)

The following snippet (based on your snippet and the XMLWorker documentation) creates a PDF containing the <pre> block.

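import java.io.FileOutputStream;
import java.io.StringReader;

import com.itextpdf.text.Document;
import com.itextpdf.text.PageSize;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.Pipeline;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class HtmlToPdf {

    // proper exception handling needs to be implemented
    public static void main(String[] args) throws Exception {
        Document document = new Document(PageSize.A4);
        PdfWriter pdfWriter = PdfWriter.getInstance(document,
                new FileOutputStream("r:/temp/testpdf.pdf"));
        document.open(); // must be open before the parser writes to it

        CSSResolver cssResolver = XMLWorkerHelper.getInstance()
                .getDefaultCssResolver(true);
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
        // register the default HTML tag processors (including the one for <pre>)
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

        Pipeline<?> pipeline = new CssResolverPipeline(cssResolver,
                new HtmlPipeline(htmlContext,
                        new PdfWriterPipeline(document, pdfWriter)));
        XMLWorker worker = new XMLWorker(pipeline, true);
        XMLParser parser = new XMLParser(worker);

        // markup inside <pre> is written as entities so the input stays well-formed
        String str = "<!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \n"
                + "   \"\">\n"
                + "<html xmlns=\"\" lang=\"en\" xml:lang=\"en\">\n"
                + "  <head>\n"
                + "    <title>sample html</title>\n"
                + "  </head>\n"
                + "  <body>\n"
                + "    <h2>sample text</h2>\n"
                + "    <pre>\n"
                + "      &lt;something&gt;\n"
                + "        &lt;somethingelse&gt;\n"
                + "          some content\n"
                + "        &lt;/somethingelse&gt;\n"
                + "      &lt;/something&gt;\n"
                + "    </pre>\n"
                + "  </body>\n"
                + "</html>";
        parser.parse(new StringReader(str));
        document.close();
    }
}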
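The sample string above writes the markup inside <pre> as entities (&lt;something&gt;), since the parser expects well-formed input. When the snippet text comes from elsewhere, it would need the same escaping before being embedded. A minimal sketch using only the JDK; the names PreEscaper, escape and wrapInPre are hypothetical and not part of iText:

```java
// Hypothetical helper, not part of iText or XMLWorker.
public class PreEscaper {

    /** Escapes the characters that would otherwise be parsed as markup. */
    public static String escape(String raw) {
        StringBuilder sb = new StringBuilder(raw.length() + 16);
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            switch (c) {
                case '&': sb.append("&amp;"); break;
                case '<': sb.append("&lt;"); break;
                case '>': sb.append("&gt;"); break;
                default:  sb.append(c);
            }
        }
        return sb.toString();
    }

    /** Wraps the escaped text in a <pre> element, keeping its line breaks. */
    public static String wrapInPre(String raw) {
        return "<pre>\n" + escape(raw) + "\n</pre>";
    }

    public static void main(String[] args) {
        System.out.println(wrapInPre("<something>\n  some content\n</something>"));
    }
}
```

The result of wrapInPre could be concatenated into the HTML string in place of the hand-written <pre> section.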