Question

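I have the DOM element structure shown above. Using htmlunit, I want to extract only the value "hello", given that I have an HtmlElement object referencing the "td" node. I tried getTextContent(), but it returns "hirehello", which is not what I want.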

Answer 1

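Looking at the documentation, getTextContent clearly says that it returns the text of the element and of its descendants, and I don't see any other method that returns only the element's own text nodes, so I think you need a loop. For example, assuming element refers to the td element: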

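// Node here is org.w3c.dom.Node, which defines the TEXT_NODE constant.
StringBuilder sb = new StringBuilder(/*some appropriate size*/);
for (DomNode n : element.getChildNodes()) {
    // Keep only the text nodes that are direct children of the td.
    if (n.getNodeType() == Node.TEXT_NODE) {
        sb.append(n.getTextContent());
    }
}
String text = sb.toString();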

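Note that the sum of the text nodes in the structure you quoted is not just "hello": it has whitespace before and after it. If you want only "hello", you will need to trim that whitespace off.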
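To make the loop-and-trim idea concrete without needing htmlunit on the classpath, here is a minimal sketch using the JDK's own org.w3c.dom API, which exposes the same getChildNodes / getNodeType / TEXT_NODE concepts. The markup in main, the class name DirectTextExample, and the helpers directText and parseRoot are all assumptions made for illustration; the markup is only a guess at the kind of structure being described, not the original.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DirectTextExample {

    // Concatenates only the text nodes that are direct children of the
    // given element, then trims leading and trailing whitespace.
    static String directText(Node element) {
        StringBuilder sb = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n.getNodeType() == Node.TEXT_NODE) {
                sb.append(n.getNodeValue());
            }
        }
        return sb.toString().trim();
    }

    // Hypothetical helper: parses a well-formed fragment with the JDK XML
    // parser and returns its root element.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse fragment", e);
        }
    }

    public static void main(String[] args) {
        // Assumed markup: a td with one child element plus its own text.
        Element td = parseRoot("<td> <a>hire</a> hello </td>");
        // Text of the element and all of its descendants.
        System.out.println("[" + td.getTextContent() + "]");
        // Only the td's own text, trimmed.
        System.out.println("[" + directText(td) + "]");
    }
}
```

The trim() call only removes whitespace at the two ends of the combined string. If the td had text on both sides of a child element, the whitespace between those pieces would remain and might need separate normalization.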

Getting the value of a td element in htmlunit
