I am running this code block to convert an HTML page to a PDF document, but the Turkish characters do not show up in 'result.pdf'. Here is what I have:
try {
    Rectangle pagesize = new Rectangle(800, 1200);
    final Document document = new Document(pagesize);
    OutputStream os = new FileOutputStream("deneme.pdf"); // ByteArrayOutputStream();
    PdfWriter writer = PdfWriter.getInstance(document, os);
    document.open();

    HtmlCleaner cleaner = new HtmlCleaner();
    CleanerProperties props = cleaner.getProperties();
    TagNode rootNode = cleaner.clean("Source Html");
    XmlSerializer serial = new PrettyXmlSerializer(props);
    String htmlClean = serial.getAsString(rootNode);
    System.out.println(htmlClean); // Tidy Html

    CSSResolver cssResolver = XMLWorkerHelper.getInstance().getDefaultCssResolver(true);
    /*
    XMLWorkerFontProvider fontProvider = new XMLWorkerFontProvider();
    // fontProvider.setUseUnicode(true);
    fontProvider.isRegistered("Helvetica");
    fontProvider.addFontSubstitute("Helvetica", "Arial");
    CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
    */

    // HTML
    HtmlPipelineContext htmlContext = new HtmlPipelineContext(null);
    htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
    htmlContext.setImageProvider(new ImageProvider());
    PdfWriterPipeline pdf = new PdfWriterPipeline(document, writer);
    HtmlPipeline html = new HtmlPipeline(htmlContext, pdf);
    CssResolverPipeline css = new CssResolverPipeline(cssResolver, html);
    /*
    BaseFont courier = BaseFont.createFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.EMBEDDED);
    Font font = new Font(courier, 12, Font.NORMAL);
    Chunk chunk = new Chunk("", font);
    document.add(chunk);
    */

    // XML Worker
    XMLWorker worker = new XMLWorker(css, true);
    XMLParser p = new XMLParser(worker);
    p.parse(new ByteArrayInputStream(htmlClean.getBytes("utf-8")));
    document.close();
} catch (Exception e) {
    e.printStackTrace();
}
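I also tried the code in the commented-out lines, but the result is the same: still wrong.
How can I get the Turkish characters to appear in the result?
When I try this code block instead: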
BaseFont freeSans = BaseFont.createFont("FreeSans.ttf","Cp1254", true);
Font font = new Font(freeSans,12, Font.NORMAL);
Chunk chunk = new Chunk("ŞşĞğİıÖö",font);
document.add(chunk);
I do see 'ŞşĞğİıÖö' in 'result.pdf'.
But how can I set up the XMLParser before parsing so that the HTML content is rendered the same way?
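
One difference between the two font snippets above that may matter: the commented-out attempt creates the font with BaseFont.CP1252, while the version that works uses "Cp1254". As a rough check that does not involve iText at all, the sketch below uses only the JDK's own charsets to list which of the test characters each code page can encode. The class name TurkishEncodingCheck and the helper unencodable are made up for this illustration, and the JDK names "windows-1252" and "windows-1254" are assumed to correspond to the iText encoding names used above.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public class TurkishEncodingCheck {

    // Hypothetical helper: returns the characters of `text` that the
    // named charset has no mapping for.
    static String unencodable(String text, String charsetName) {
        CharsetEncoder encoder = Charset.forName(charsetName).newEncoder();
        StringBuilder missing = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (!encoder.canEncode(c)) {
                missing.append(c);
            }
        }
        return missing.toString();
    }

    public static void main(String[] args) {
        // The test string from the question, written with Unicode escapes
        // so the result does not depend on the source file encoding.
        String sample = "\u015E\u015F\u011E\u011F\u0130\u0131\u00D6\u00F6";

        System.out.println("windows-1252 cannot encode: "
                + unencodable(sample, "windows-1252"));
        System.out.println("windows-1254 cannot encode: "
                + unencodable(sample, "windows-1254"));
    }
}
```

If the Western code page turns out to lack several of these letters, that would be consistent with the commented-out Helvetica/CP1252 attempt making no difference. It would also suggest that whatever font the parser ends up using needs an encoding that covers Turkish, such as Cp1254 or BaseFont.IDENTITY_H.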