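I want to write Java code that converts an .html file to PDF. I am using the iText API for the HTML-to-PDF conversion, but the conversion fails when I pass a malformed HTML file as input (the HTML tags are not well-formed). So I used the HtmlCleaner parser to clean up the bad HTML, but I cannot find code that rebuilds a new HTML document from the result. Does anyone know how to build new HTML from the parsed HTML tag nodes?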
Answer 0 (score: 0)
HtmlCleaner ships with a set of serializers that you can use to turn the cleaned node tree back into markup, for example:
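import java.io.StringWriter;
import org.htmlcleaner.*;

// Parse the (possibly malformed) HTML into a cleaned-up node tree.
final HtmlCleaner cleaner = new HtmlCleaner();
final CleanerProperties properties = cleaner.getProperties();
final Serializer serializer = new SimpleHtmlSerializer(properties);
TagNode node = cleaner.clean("hello world");
// Serialize the tree back to HTML; write(...) can throw IOException.
StringWriter writer = new StringWriter();
serializer.write(node, writer, "UTF-8");
System.out.println(writer.toString());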
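If it helps to see what a serializer does conceptually, here is a minimal, library-free sketch of rebuilding markup from a node tree. The `Node` class, `escape`, `serialize`, and `toHtml` are hypothetical names made up for illustration; this is not HtmlCleaner's `TagNode` or its serializer implementation, only the general idea of walking the tree and emitting a matching close tag for every element.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NodeSerializerSketch {

    /** Hypothetical stand-in for a parsed tag node; not HtmlCleaner's TagNode. */
    static class Node {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final List<Object> children = new ArrayList<>(); // each child is a Node or a String (text)

        Node(String name) { this.name = name; }

        Node attr(String key, String value) { attributes.put(key, value); return this; }

        Node add(Object child) { children.add(child); return this; }
    }

    /** Escapes the characters that would otherwise break the markup. */
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /** Walks the tree depth-first and emits an explicit close tag for every element. */
    static void serialize(Node node, StringBuilder out) {
        out.append('<').append(node.name);
        for (Map.Entry<String, String> e : node.attributes.entrySet()) {
            out.append(' ').append(e.getKey())
               .append("=\"").append(escape(e.getValue())).append('"');
        }
        out.append('>');
        for (Object child : node.children) {
            if (child instanceof Node) {
                serialize((Node) child, out);
            } else {
                out.append(escape(child.toString()));
            }
        }
        out.append("</").append(node.name).append('>');
    }

    static String toHtml(Node root) {
        StringBuilder out = new StringBuilder();
        serialize(root, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Node body = new Node("body")
            .add(new Node("p").attr("class", "note").add("hello & world"));
        Node html = new Node("html").add(new Node("head")).add(body);
        // prints <html><head></head><body><p class="note">hello &amp; world</p></body></html>
        System.out.println(toHtml(html));
    }
}
```

In practice you would let the library's own serializer do this work, since it also has to deal with details this sketch ignores, such as void elements, comments, and doctype handling.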