I have the following XML entry. I want to extract everything from just after the d:index tag closes up to the end of the entry.
<d:entry id="some_id" d:title="some_title">
<d:index d:value="some_value"/>
<h1>headlines</h1>
<p>paragraphs</p>
<div>
<ul>
<li>lists</li>
</ul>
</div>
text like that
</d:entry>
I tried using
dBuilder = dbFactory.newDocumentBuilder();
Document doc = dBuilder.parse(file);
doc.getDocumentElement().normalize();
eList = doc.getElementsByTagName("d:entry");
for (int i = 0; i < eList.getLength(); i++){
Node nNode = eList.item(i);
textList[i] = nNode.getTextContent();
}
However, .getTextContent() only gives me 'text like that' rather than
<h1>headlines</h1>
<p>paragraphs</p>
<div>
<ul>
<li>lists</li>
</ul>
</div>
text like that
Answer 0 (score: 0)
Depending on exactly what you want to do, you could do something like this:
import java.io.File;
import java.io.IOException;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class Arbeiter {

    // Parses the file and prints the text found under the root element.
    public void arbeiten(File datei) {
        Document doc = getDoc(datei);
        if (doc != null) {
            print(doc.getDocumentElement());
        }
    }

    // Returns the parsed document, or null if parsing fails.
    private Document getDoc(File datei) {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        Document doc = null;
        try {
            DocumentBuilder db = dbf.newDocumentBuilder();
            doc = db.parse(datei);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            e.printStackTrace();
        }
        return doc;
    }

    // Walks the tree depth-first and prints each non-blank text node once.
    private void print(Node node) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getTextContent().trim();
            if (!text.isEmpty()) {
                System.out.println(text);
            }
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            print(children.item(i)); // visit every child, not only the first one
        }
    }
}
The output is:
headlines
paragraphs
lists
text like that
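
The output above contains only the text, whereas the question asks for the markup that follows d:index. One possible way to keep the tags is to serialize each sibling after d:index with an identity Transformer. This is only a sketch: the class name EntryMarkup and the methods parse and markupAfterIndex are made up for illustration, the entry is read from a string instead of a file, and the exact whitespace of the result may depend on the XML implementation in use.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class EntryMarkup {

    // Hypothetical helper: parses an XML string and returns its root element.
    // The default factory is not namespace-aware, so "d:entry" is treated as a plain name.
    public static Element parse(String xml) throws Exception {
        byte[] bytes = xml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes))
                .getDocumentElement();
    }

    // Serializes every child node of the entry that comes after the first d:index child.
    public static String markupAfterIndex(Element entry) throws Exception {
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.METHOD, "xml");
        // Without this, each serialized node would start with an <?xml ...?> declaration.
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        boolean pastIndex = false;
        for (Node child = entry.getFirstChild(); child != null; child = child.getNextSibling()) {
            if (pastIndex) {
                // Elements keep their tags; text nodes are written as plain text.
                transformer.transform(new DOMSource(child), new StreamResult(out));
            } else if ("d:index".equals(child.getNodeName())) {
                pastIndex = true;
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<d:entry id=\"some_id\" d:title=\"some_title\">\n"
                + "<d:index d:value=\"some_value\"/>\n"
                + "<h1>headlines</h1>\n"
                + "<p>paragraphs</p>\n"
                + "<div>\n<ul>\n<li>lists</li>\n</ul>\n</div>\n"
                + "text like that\n"
                + "</d:entry>";
        System.out.println(markupAfterIndex(parse(xml)));
    }
}
```

For a file with several entries, the same method could be called on each element returned by doc.getElementsByTagName("d:entry"), as in the loop from the question.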