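I am converting Arabic text from HTML to PDF using iText 5.5 and XML Worker.
This works perfectly when run as a standalone Java program (the text is printed RTL, as expected), but the same code always prints LTR when run in Tomcat. (I even tried a hard-coded string, and a file, inside the Tomcat application code.)
Below is the sample code, taken from http://developers.itextpdf.com/question/how-convert-arabic-html-pdf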
public void createPdf(File file)
throws IOException, DocumentException {
// step 1
Document document = new Document();
// step 2
PdfWriter writer =
PdfWriter.getInstance(document, new FileOutputStream(file));
// step 3
document.open();
// step 4
// Styles
CSSResolver cssResolver = new StyleAttrCSSResolver();
XMLWorkerFontProvider fontProvider =
new XMLWorkerFontProvider(XMLWorkerFontProvider.DONTLOOKFORFONTS);
fontProvider.register("/Users/ashish/Downloads/NotoNaskhArabicRegular.ttf");
CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
// Pipelines
PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
// XML Worker
XMLWorker worker = new XMLWorker(css, true);
XMLParser p = new XMLParser(worker);
String htmlContentAr ="<table><tr><td>String of Arabia</td><td dir=\"rtl\" style=\"font-family: Noto Naskh Arabic\">لورانس العرب</td></tr></table>";
//p.parse(new FileInputStream(HTML), Charset.forName("UTF-8"));
p.parse(new ByteArrayInputStream(htmlContentAr.getBytes(StandardCharsets.UTF_8)), StandardCharsets.UTF_8);
// step 5
document.close();
}
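Answer 0 (score: 0)
Sorry, silly question. The problem was that two versions of iText were being copied into my WAR deployment, and that caused the issue.
This works with the 5.5.5 JARs of iText and XML Worker.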
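For anyone who hits the same symptom, one way to check for duplicate JARs is to ask the class loader how many copies of a known class file it can see. The following is only a minimal sketch: the class name DuplicateJarCheck and the helper locate are hypothetical names made up for illustration. The resource paths correspond to the com.itextpdf.text.Document and com.itextpdf.tool.xml.XMLWorker classes used in the question. It relies only on the standard ClassLoader.getResources call.

```java
import java.io.IOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper (not part of iText): lists every location from which
// a given class file is visible to a class loader.
class DuplicateJarCheck {

    // Returns all URLs where resourcePath can be found by the given loader.
    static List<URL> locate(ClassLoader loader, String resourcePath) throws IOException {
        return new ArrayList<>(Collections.list(loader.getResources(resourcePath)));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = DuplicateJarCheck.class.getClassLoader();
        String[] paths = {
            "com/itextpdf/text/Document.class",    // iText 5 core
            "com/itextpdf/tool/xml/XMLWorker.class" // XML Worker
        };
        for (String path : paths) {
            List<URL> hits = locate(loader, path);
            System.out.println(path + " found " + hits.size() + " time(s)");
            for (URL url : hits) {
                System.out.println("  " + url);
            }
            if (hits.size() > 1) {
                System.out.println("  more than one copy is visible; check WEB-INF/lib");
            }
        }
    }
}
```

To be meaningful under Tomcat, the locate call would need to run inside the web application itself (for example from a servlet), so that the web application's own class loader is the one being queried. If a class shows up more than once, the JAR names in the printed URLs should show which versions ended up in the WAR.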