Converting Arabic text from HTML to PDF with iText 5.5 and XMLWorker

Time: 2014-05-21 12:04:36

Tags: java html pdf itext xmlworker

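I am trying to convert an HTML string containing Arabic text to PDF using iText 5.5 and XMLWorker. After conversion, the Arabic characters show up as blanks.

The code snippet I am using is as follows: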

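import java.io.ByteArrayInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.nio.charset.Charset;

import com.itextpdf.text.Document;
import com.itextpdf.text.Font;
import com.itextpdf.text.FontFactory;
import com.itextpdf.text.Rectangle;
import com.itextpdf.text.pdf.BaseFont;
import com.itextpdf.text.pdf.PdfWriter;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.PdfWriterPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class CreateArabic {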

    public static void main(String args[]) {

        try {

            Rectangle pagesize = new Rectangle(8.5f * 72, 11 * 72);

            Document document = new Document(pagesize, 72, 72, 72, 72);

            PdfWriter writer = PdfWriter.getInstance(document,
                    new FileOutputStream("c:\\report.pdf"));

            writer.getAcroForm().setNeedAppearances(true);

            document.open();

            FontFactory.registerDirectories();
            Font font = FontFactory.getFont("C:\\damase.ttf",
                    BaseFont.IDENTITY_H, true, 22, Font.BOLD);


            // CSS
            XMLWorkerHelper helper = XMLWorkerHelper.getInstance();
            CSSResolver cssResolver = new StyleAttrCSSResolver();
            CssFile cssFile = helper.getCSS(new FileInputStream(
                    "D:\\Itext_Test\\Test\\src\\test.css"));
            cssResolver.addCss(cssFile);

            // HTML
            XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
            fontProvider.getFont("C:\\damase.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
            fontProvider.register("C:\\damase.ttf");

            CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
            HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
            htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());

            // Pipelines
            PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
            HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
            CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);

            XMLWorker worker = new XMLWorker(css, true);
            XMLParser p = new XMLParser(worker);

            String htmlString = "<html><head></head><body>" + "اب" + "</body></html>";
            ByteArrayInputStream is = new ByteArrayInputStream(htmlString.getBytes("UTF-8"));
            p.parse(is, Charset.forName("UTF-8"));

            document.close();
        } catch (Exception ex) {
            ex.printStackTrace();

        }

    }
}

1 Answer:

Answer 0 (score: 1)

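I had the same problem; the difference is that I was using a Turkish font and characters were missing.

See the solution in my answer for details.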
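
Blank output often points at the font, but it may be worth first ruling out that the Arabic literal was mangled before it ever reached XMLWorker (for example, by a source file compiled with the wrong encoding). Below is a minimal sketch that uses only the JDK. The class name `ArabicTextCheck` and its helper methods are hypothetical and not part of iText.

```java
import java.text.Bidi;

// Hypothetical diagnostic helper; not part of iText or XMLWorker.
class ArabicTextCheck {

    /** Returns true if at least one code point in the text belongs to the Arabic script. */
    static boolean containsArabic(String text) {
        return text.codePoints()
                .anyMatch(cp -> Character.UnicodeScript.of(cp) == Character.UnicodeScript.ARABIC);
    }

    /** Returns true if the text contains right-to-left characters that need bidi handling. */
    static boolean needsRtlLayout(String text) {
        char[] chars = text.toCharArray();
        return Bidi.requiresBidi(chars, 0, chars.length);
    }

    public static void main(String[] args) {
        // Unicode escapes (alef, beh) keep this check independent of the source file encoding.
        String htmlString = "<html><head></head><body>\u0627\u0628</body></html>";
        System.out.println("contains Arabic: " + containsArabic(htmlString));
        System.out.println("needs RTL layout: " + needsRtlLayout(htmlString));
    }
}
```

If `containsArabic` returns false for the exact string passed to `p.parse(...)`, the text was probably damaged upstream of iText. If it returns true, the registered font and the run direction are the more likely places to look.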

Hope this helps. Regards,