How do I build new HTML from the parsed tag nodes produced by an HTML parser in Java?

Asked: 2015-09-19 13:42:03

Tags: java htmlcleaner html-generation

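I want to write Java code that converts .html files to PDF. I use the iText API for the HTML-to-PDF conversion, but the conversion fails when I give it a malformed HTML file as input (the HTML tags are not well formed). I therefore used the HtmlCleaner parser to clean up the bad HTML, but I could not find code that rebuilds new HTML from the result. Does anyone know how to build new HTML from the parsed HTML tag nodes?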

1 answer:

Answer 0 (score: 0)

HtmlCleaner ships with a set of serializers that turn a cleaned TagNode tree back into markup. You can use one of them, for example:

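    import java.io.StringWriter;
    import org.htmlcleaner.*;

    // Parse the (possibly malformed) input into a cleaned-up node tree.
    final HtmlCleaner cleaner = new HtmlCleaner();
    final CleanerProperties properties = cleaner.getProperties();
    final Serializer serializer = new SimpleHtmlSerializer(properties);

    TagNode node = cleaner.clean("hello world");

    // Serialize the node tree back to HTML text; write() may throw IOException.
    StringWriter writer = new StringWriter();
    serializer.write(node, writer, "UTF-8");

    System.out.println(writer.toString());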
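
To make clearer what "building new HTML from tag nodes" involves, here is a minimal, library-free sketch of the idea behind a serializer: walk the node tree depth-first and emit a matching close tag for every open tag. The `Node` class, `escape`, `write` and `toHtml` are hypothetical names made up for this illustration. They are not HtmlCleaner's `TagNode` or its serializer implementation, and the sketch ignores void elements such as `br`, comments and doctypes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class NodeToHtml {

    /** Hypothetical stand-in for a parsed tag node (not HtmlCleaner's TagNode). */
    static final class Node {
        final String name;                                          // element name, e.g. "p"
        final Map<String, String> attributes = new LinkedHashMap<>(); // keeps insertion order
        final List<Object> children = new ArrayList<>();            // Node or String (text)

        Node(String name) { this.name = name; }

        Node attr(String key, String value) { attributes.put(key, value); return this; }
        Node add(Object child) { children.add(child); return this; }
    }

    /** Escapes the characters that would otherwise break the markup. */
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;")
                   .replace("\"", "&quot;");
    }

    /** Walks the tree depth-first and closes every tag it opens. */
    static void write(Node node, StringBuilder out) {
        out.append('<').append(node.name);
        for (Map.Entry<String, String> e : node.attributes.entrySet()) {
            out.append(' ').append(e.getKey())
               .append("=\"").append(escape(e.getValue())).append('"');
        }
        out.append('>');
        for (Object child : node.children) {
            if (child instanceof Node) {
                write((Node) child, out);               // nested element
            } else {
                out.append(escape(child.toString()));   // text content
            }
        }
        out.append("</").append(node.name).append('>');
    }

    static String toHtml(Node root) {
        StringBuilder out = new StringBuilder();
        write(root, out);
        return out.toString();
    }

    public static void main(String[] args) {
        Node body = new Node("body")
                .add(new Node("p").attr("class", "note").add("hello & world"));
        Node html = new Node("html").add(new Node("head")).add(body);
        System.out.println(toHtml(html));
    }
}
```

Because every opened tag is closed and text is escaped, the output is well formed even when the original input was not, which is the property a strict HTML-to-PDF converter needs. In practice the serializers bundled with HtmlCleaner are likely the better choice, since they also handle the cases this sketch leaves out.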