I'm parsing definitions from a dictionary API. I have this line of XML:
<dt>:any of a small genus (<it>Apteryx</it>) of flightless New Zealand birds with rudimentary wings, stout legs, a long bill, and grayish brown hairlike plumage</dt>
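How do I get the complete content of the dt element? My problem is that it stops working when it reaches the (Apteryx) part, because the element contains other tags. How can I get the whole dt element as one complete string? Here is my code so far: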
Element def = (Element) element.getElementsByTagName("def").item(0);
System.out.println(getValue("dt",def).replaceAll("[^\\p{L}\\p{N} ]", ""));
where def is the element that holds the dt element.
Here is my getValue code:
private static String getValue(String tag, Element element)
{
NodeList nodes = element.getElementsByTagName(tag).item(0).getChildNodes();
Node node = (Node) nodes.item(0);
return node.getNodeValue();
}
Sometimes there are several nested tags inside the dt element.
Answer 0 (score: 0)
Combining https://stackoverflow.com/a/5948326/145757 with "Get a node's inner XML as String in Java DOM", we get:
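import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;

public static String getInnerXml(Node node)
{
    // Obtain the DOM Load and Save (LS) implementation from the node's document
    DOMImplementationLS lsImpl = (DOMImplementationLS)node.getOwnerDocument().getImplementation().getFeature("LS", "3.0");
    LSSerializer lsSerializer = lsImpl.createLSSerializer();
    // Do not emit an <?xml ...?> declaration in front of each serialized child
    lsSerializer.getDomConfig().setParameter("xml-declaration", false);
    NodeList childNodes = node.getChildNodes();
    StringBuilder sb = new StringBuilder();
    // Serialize every child (text nodes and nested elements alike) and concatenate
    for (int i = 0; i < childNodes.getLength(); i++)
    {
        sb.append(lsSerializer.writeToString(childNodes.item(i)));
    }
    return sb.toString();
}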
Together with my comment, this gives:
getInnerXml(document.getElementsByTagName("dt").item(0));
Result:
:any of a small genus (<it>Apteryx</it>) of flightless New Zealand birds...
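If you want to try this end to end, here is a minimal, self-contained sketch. The class name DtDemo, the parse helper, and the shortened sample string are my own assumptions for illustration and are not part of the dictionary API. It repeats the getInnerXml approach from above and also prints dt.getTextContent(), which may be enough if you only need the text and do not care about the nested tags (your regex strips the markup characters anyway).

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSSerializer;
import org.xml.sax.InputSource;

public class DtDemo {

    // Hypothetical helper: parses an XML string into a DOM Document.
    public static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes all children of the node (text and nested elements) into one string.
    public static String getInnerXml(Node node) {
        DOMImplementationLS lsImpl = (DOMImplementationLS) node.getOwnerDocument()
                .getImplementation().getFeature("LS", "3.0");
        LSSerializer serializer = lsImpl.createLSSerializer();
        // Suppress the <?xml ...?> declaration for each serialized child
        serializer.getDomConfig().setParameter("xml-declaration", false);
        NodeList children = node.getChildNodes();
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < children.getLength(); i++) {
            sb.append(serializer.writeToString(children.item(i)));
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Shortened sample, assumed to be wrapped in a def element as in the question
        String xml = "<def><dt>:any of a small genus (<it>Apteryx</it>) "
                + "of flightless New Zealand birds</dt></def>";
        Node dt = parse(xml).getElementsByTagName("dt").item(0);

        // Inner XML: nested tags such as <it> are kept
        System.out.println(getInnerXml(dt));
        // Text only: nested tags are dropped, their text is kept
        System.out.println(dt.getTextContent());
    }
}
```

The reason your getValue stops at "(" is that it only reads the first child node of dt, which is the text node before the it element. Both calls in the sketch walk all children instead.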
Hope this helps...