I am trying to read the following items from an XML file using Java.
The XML is:
<?xml version="1.0" encoding="utf-8"?>
<Item1>
<Field>content1</Field>
<Row>
<Item>3</Item>
<id>33</id>
<content><script type="text/javascript" xml:space="preserve">
</script><span style="FONT-WEIGHT: bold">Access to $data</span><br />
The $data is&#160;&#160;the$word oneas, $companyname's our website - $link.<br />
<br />
recommend on your $name ase or visit one of our $name dealer - $dealer.<br />
<br />
<span style="FONT-WEIGHT: bold">Meters</span><br />accurate. earlier..<br />
</content>
</Row>
</Item1>
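I am able to read the content, but it comes back with special characters in it.
While reading the content, I want to get rid of these special characters: for example, &amp;lt; should become &lt;.
Can anyone suggest how I should proceed?
I am reading the XML as follows: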
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class ReadXml {
    public static void main(String argv[]) {
        try {
            // Parse test.xml into a DOM tree
            File fXmlFile = new File("test.xml");
            DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
            /*InputSource is = new InputSource(fXmlFile);
            is.setEncoding("UTF-8");*/
            Document doc = dBuilder.parse(fXmlFile);
            doc.getDocumentElement().normalize();
            System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
            // Walk every <Row> element and print its <id> text
            NodeList nList = doc.getElementsByTagName("Row");
            System.out.println("----------------------------");
            for (int temp = 0; temp < nList.getLength(); temp++) {
                Node nNode = nList.item(temp);
                System.out.println("\nCurrent Element :" + nNode.getNodeName());
                if (nNode.getNodeType() == Node.ELEMENT_NODE) {
                    Element eElement = (Element) nNode;
                    //System.out.println("Staff id : " + eElement.getAttribute("item"));
                    System.out.println("Id : " + eElement.getElementsByTagName("id").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}
Answer 0 (score: 0)
You can use the Java Xerces library. I used its SAX parser, which automatically converts &amp;lt; to &lt; and &amp;gt; to &gt;. http://xerces.apache.org/xerces-j/
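To illustrate the SAX approach, here is a minimal sketch. It makes several assumptions: it uses the SAX parser bundled with the JDK (obtained through SAXParserFactory) rather than a separately downloaded Xerces jar, the class name ReadContentSax and its helper ContentCollector are hypothetical, and the inline sample string is a shortened stand-in for test.xml. The parser decodes exactly one level of entity references in character data, so &lt; arrives in the handler as a literal angle bracket, while a double-escaped &amp;lt; arrives as &lt; and would need a further decoding step if plain markup is wanted.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class ReadContentSax {

    // Hypothetical handler: collects the character data found inside <content>.
    static class ContentCollector extends DefaultHandler {
        private final StringBuilder text = new StringBuilder();
        private boolean inContent = false;

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            if ("content".equals(qName)) {
                inContent = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("content".equals(qName)) {
                inContent = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks, so append each one.
            if (inContent) {
                text.append(ch, start, length);
            }
        }

        String getText() {
            return text.toString();
        }
    }

    // Parses the given XML string and returns the decoded text of <content>.
    static String readContent(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        ContentCollector handler = new ContentCollector();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return handler.getText();
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample: escaped markup plus one double-escaped entity.
        String xml = "<Row><id>33</id><content>&lt;br /&gt; a &amp;lt; b</content></Row>";
        // Prints: <br /> a &lt; b
        System.out.println(readContent(xml));
    }
}
```

To read the real file instead, the same handler can be passed to parser.parse(new File("test.xml"), handler).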