iText does not render HTML heading tags correctly when converting to PDF

Asked: 2017-10-27 17:17:05

Tags: javascript html css pdf itext

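This new question is related to this issue. Basically, I am able to convert an HTML file with images and CSS styling to PDF. However, I just noticed that the heading tags (h1 - h6) are not rendered correctly. I know there is an example of this on the iText website, but that code does not work with what I currently have. Could someone help me fix this? Thanks.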

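import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;

import com.itextpdf.text.Chunk;
import com.itextpdf.text.Document;
import com.itextpdf.text.DocumentException;
import com.itextpdf.text.Element;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.ElementList;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.net.FileRetrieve;
import com.itextpdf.tool.xml.net.FileRetrieveImpl;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.ElementHandlerPipeline;
import com.itextpdf.tool.xml.pipeline.html.AbstractImageProvider;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;
import com.itextpdf.tool.xml.pipeline.html.LinkProvider;

public class HtmlToPdfConverter {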

    public static final String HTML = "C:\\Users\\APPS\\Desktop\\helppages_COPY\\Help_SiteMap.htm";
    public static final String DEST = "C:\\Users\\APPS\\Desktop\\output.pdf";

    //public static final String HTML = "C:\\Users\\APPS\\Desktop\\helppages_COPY\\general\\Training1.htm";
    //public static final String DEST = "C:\\Users\\APPS\\Desktop\\Training1.pdf";

    public static final String IMG_PATH = "C:\\Users\\APPS\\Desktop\\helppages_COPY\\images\\";
    public static final String RELATIVE_PATH = "C:\\Users\\APPS\\Desktop\\helppages_COPY\\images\\";
    public static final String CSS_DIR = "C:\\Users\\APPS\\Desktop\\helppages_COPY\\css\\";

    public void createPdf(String file) throws IOException, DocumentException {
        Document document = new Document();
        PdfWriter.getInstance(document, new FileOutputStream(file));
        document.open();

        // CSS
        CSSResolver cssResolver = XMLWorkerHelper.getInstance().getDefaultCssResolver(false);
        FileRetrieve fileRetrieve = new FileRetrieveImpl(CSS_DIR);
        cssResolver.setFileRetrieve(fileRetrieve);

        // HTML
        HtmlPipelineContext htmlPipelineContext = new HtmlPipelineContext(null);
        htmlPipelineContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlPipelineContext.setImageProvider(new AbstractImageProvider() {

            @Override
            public String getImageRootPath() {
                return IMG_PATH;
            }
        });

        htmlPipelineContext.setLinkProvider(new LinkProvider() {

            @Override
            public String getLinkRoot() {
                return RELATIVE_PATH;
            }
        });

        // Pipelines
        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        //PdfWriterPipeline pdfWriterPipeline = new PdfWriterPipeline(document, pdfWriter);
        //HtmlPipeline hhHtmlPipeline = new HtmlPipeline(htmlPipelineContext, pdfWriterPipeline);
        HtmlPipeline hhHtmlPipeline = new HtmlPipeline(htmlPipelineContext, end);
        CssResolverPipeline cssResolverPipeline = new CssResolverPipeline(cssResolver, hhHtmlPipeline);

        // XML Worker
        XMLWorker xmlWorker = new XMLWorker(cssResolverPipeline, true);
        XMLParser xmlParser = new XMLParser(xmlWorker);
        xmlParser.parse(new FileInputStream(HTML));

        // Add the parsed elements to the document, which was already opened above
        for (Element e : elements) {
            document.add(e);
        }
        document.add(Chunk.NEWLINE);


        document.close();

        System.out.println("**** Conversion Complete ****");
    }

    public static void main(String[] args) throws IOException, DocumentException {
        new HtmlToPdfConverter().createPdf(DEST);
    }

}
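
One thing that may be worth checking in the code above: as far as I know, the `false` passed to `getDefaultCssResolver` tells XML Worker not to load its built-in default stylesheet, in which case heading sizes would have to come entirely from `helppages.css`. Assuming that is relevant here, the sketch below generates explicit h1 - h6 rules that could be appended to the stylesheet. It uses only the JDK, not iText; the class name, method name, and point sizes are illustrative assumptions, not values taken from the question.

```java
import java.util.Locale;

public class HeadingCssSketch {

    // Illustrative font sizes in points for h1..h6 (assumed values,
    // roughly what browsers show with a 12pt base font).
    private static final int[] SIZES_PT = {24, 18, 14, 12, 10, 8};

    /** Builds one CSS rule per heading level, one rule per line. */
    public static String buildHeadingCss() {
        StringBuilder css = new StringBuilder();
        for (int level = 1; level <= SIZES_PT.length; level++) {
            css.append(String.format(Locale.ROOT,
                    "h%d { font-size: %dpt; font-weight: bold; }%n",
                    level, SIZES_PT[level - 1]));
        }
        return css.toString();
    }

    public static void main(String[] args) {
        // Print the rules so they can be pasted into the stylesheet.
        System.out.print(buildHeadingCss());
    }
}
```

The printed rules could be pasted at the end of `helppages.css`. Passing `true` instead of `false` to `getDefaultCssResolver` might also be enough on its own, but that is untested here.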

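Here is a sample of one of my HTML pages. Note that these are just static HTML pages. They are used with a JavaFX-based help-page application that runs on the user's desktop. It lets users view a page and then, if they choose, print it to PDF. The pages are served by a Tomcat server and use an external CSS stylesheet and images. The HTML files are written in HTML5. Everything in the HTML file renders correctly in the PDF except the heading tags. For example, I added the word "Tutorial" to the page. It should be an h2 heading, but as you can see it is not. I have included how my page looks in the browser and how my PDF looks, so you can see the problem. I hope this explains it well; I don't know how to make it any clearer.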

HTML file:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="UTF-8" />
    <title>Alert Notification</title>
    <link rel="stylesheet" type="text/css" href="../css/helppages.css" />
</head>

<body>

    <div class="page-padding">

        <table class="header">
            <tr class="header-footer-tr">
                <td class="header-td-left">
                    <a href="#LINKS">Topic References</a>
                </td>
                <td class="header-td-center">
                    <a id="TOP">Create Alert</a>
                </td>
                <td class="header-td-right">
                    <a href="../Help_SiteMap.htm">Topics Map</a>
                </td>
            </tr>
        </table>

        <p>The Create Alert function provides an Administrator a means to send a &#8220;System&#8221; Alert that immediately displays
            a message window to all users who are currently logged on and to the rest when they do log on until a specified time.
            A system Alert might be used to notify users of changes that affect system access or availability such as scheduled maintenance.
        </p>

        <p>This function is executed as an Admin menu option.</p>

        <div class="image-padding center-content">
            <!-- <img src="../images/Create_Alert.png" alt="Create Alert" /> -->
            <img src="http://via.placeholder.com/300x100" alt="dummy image" />
        </div>

        <p>This window has two text fields, two selection lists, and two buttons.</p>

        <ul>
            <li>Subject - text input to accept a title of the message to all users</li>
            <li>Expire time hour - selection list to specify the hour of the day until which users will see this message at login</li>
            <li>Expire time minute - selection list to specify the minute of the hour until which users will see this message at login
            </li>
            <li>Multi-line text input field to accept message body</li>
            <li>OK - button to accept and broadcast the Alert then close the window</li>
            <li>Close - button to close the window discarding the message</li>
        </ul>

        <br />

        <div class="thin-horizontal-line"></div>

        <h2 id="TUTORIAL">Tutorial</h2>

        <p>
            <b>Send an Alert</b>
        </p>

        <ul>
            <li>Enter an Alert title in the Subject text input field and information for the Alert body (both required fields)</li>
            <li>Optionally, change the expiration time (must be at least 5 minutes in future, 10 is safer)</li>
            <li>Click the OK button</li>
        </ul>

        <br />

        <table class="footer">
            <tr class="header-footer-tr">
                <td style="width: 50%; text-align: center">
                    <a id="LINKS"></a>Local Topics</td>
                <td style="width: 50%; text-align: center">Related Topics</td>
            </tr>
            <tr class="verticle-align-top">
                <td class="local-topics-padding">
                    <p>
                        <a href="#TOP">Top of this Page</a>
                    </p>
                    <p>
                        <a href="#TUTORIAL">Tutorial</a>
                    </p>
                </td>
                <td class="text-align-right related-topics-padding">
                    <p>
                        <a href="../admin/Admin_Main.htm">Admin Functions</a>
                    </p>
                </td>
            </tr>
        </table>
    </div>
</body>

</html>
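
To narrow down which tags are affected, it can help to list every heading in a page and compare that list against the generated PDF. A minimal sketch, using only the JDK and a simple regular expression rather than a real HTML parser (the class and method names are hypothetical, and the regex assumes headings are not nested inside each other):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HeadingLister {

    // Matches <h1>..<h6> with optional attributes; the backreference \1
    // requires the closing tag to have the same level as the opening tag.
    private static final Pattern HEADING = Pattern.compile(
            "<h([1-6])\\b[^>]*>(.*?)</h\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    /** Returns entries such as "h2: Tutorial" in document order. */
    public static List<String> listHeadings(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = HEADING.matcher(html);
        while (m.find()) {
            // Drop any tags nested inside the heading and trim whitespace.
            String text = m.group(2).replaceAll("<[^>]+>", "").trim();
            found.add("h" + m.group(1) + ": " + text);
        }
        return found;
    }

    public static void main(String[] args) {
        String sample = "<p>Intro</p><h2 id=\"TUTORIAL\">Tutorial</h2><p><b>Send an Alert</b></p>";
        System.out.println(listHeadings(sample)); // prints [h2: Tutorial]
    }
}
```

For the sample page above, the only heading this should report is the `h2` with the text "Tutorial", which is the element that shows up as plain text in the PDF.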

What the page should look like:

[image: the page as rendered in the browser]

What my converted PDF looks like:

[image: the converted PDF]

Thanks, I appreciate all the help given so far.

0 answers:

No answers yet