HTML DOM Tree to String - Transformer NullPointerException

时间:2012-07-05 08:46:05

标签: java xml dom transform transformer

我正在尝试将org.w3c.dom.Document对象的内容转换为字符串。我得到了JBrowser组件中显示的当前页面的Document对象。将文档dom树转换为字符串的最常见方法似乎是使用javax.xml.transform.Transformer。所以我实现了这个:

ByteArrayOutputStream baos = new ByteArrayOutputStream();

TransformerFactory.newInstance().newTransformer().transform(
            new DOMSource(aDocument), new StreamResult(baos));

return baos.toString();

这适用于简单的网站,但它们越复杂,我获得此异常的可能性就越大:

    ERROR:  ''
05.07.2012 10:17:09 com.de.test.Demonstrator$1 run
FATAL: null
javax.xml.transform.TransformerException: java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:717)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at com.de.test.DocumentUtils.toHTML(DocumentUtils.java:47)
at com.de.test.Demonstrator$1.run(Demonstrator.java:172)
Caused by: java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:178)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:132)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:94)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:662)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:708)
... 3 more
---------
java.lang.NullPointerException
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:178)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:226)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:132)
at com.sun.org.apache.xalan.internal.xsltc.trax.DOM2TO.parse(DOM2TO.java:94)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transformIdentity(TransformerImpl.java:662)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:708)
at com.sun.org.apache.xalan.internal.xsltc.trax.TransformerImpl.transform(TransformerImpl.java:313)
at com.de.test.DocumentUtils.toHTML(DocumentUtils.java:47)
at com.de.test.Demonstrator$1.run(Demonstrator.java:172)

经过一些研究发现有关提示后,某些文本元素可能为null,这会导致Transformer崩溃。所以我做到了这一点:

    public static final void traverseLevel(TreeWalker walker, Document aDocument, String indent)
{
    // describe current node:
    Node parent = walker.getCurrentNode();

    if (parent != null && parent.getNodeValue() == null)
        parent.setNodeValue(" ");

    System.out.println(indent + "- <" + ((Element) parent).getTagName() + ">" + parent.getNodeValue());

    // traverse children:
    for (Node n = walker.firstChild(); n != null; n = walker.nextSibling())
    {
        if(n != null)
            traverseLevel(walker, aDocument, indent + '\t');
    }

    System.out.println("</"+ ((Element) parent).getTagName() + ">");

    // return position to the current (level up):
    walker.setCurrentNode(parent);
}

这是我发现“parent.getNodeValue()”始终返回null的地方。有趣的是,问题也发生在简单的网站上,但变压器仍然输出树的值。我知道替换空文本节点有什么问题吗?还有其他可能导致此问题的潜在问题吗?

谢谢!

1 个答案:

答案 0 :(得分:0)

好的,我找到了解决问题的方法。我已经将JBrowser更改为具有convertToHtml()函数的DJ Project浏览器。我无法解决变压器问题,这就是我采用这条路线的原因。