Suppose the XML file looks like this:
<!DOCTYPE html [
<!ENTITY ldquo "♥">
]>
<DATA>
<ROW>
<Id>29855</Id>
<content><p>Did the summer fly as fast “</p>
<a href="https://www.ex.com/" target="_blank"></content>
<ROW>
<ROW>
<Id>11223</Id>
<content><p>Fly as fast “</p>
<a href="https://www.ex.com/" target="_blank"></content>
<ROW>
</DATA>
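The requirement is to get the "Id" and the "content" from the XML. The content should be returned as markup, exactly as it is nested in the XML, like this: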
<p>Fly as fast “</p>
<a href="https://www.ex.com/" target="_blank">
I tried, but I am getting the content as plain text instead, for example: Fly as fast “
Here is the code I use to parse the XML:
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

try {
    File fXmlFile = new File("D:\\customer_connect_posts.xml");
    DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
    DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
    Document doc = dBuilder.parse(fXmlFile);
    doc.getDocumentElement().normalize();
    System.out.println("Root element :" + doc.getDocumentElement().getNodeName());
    NodeList nList = doc.getElementsByTagName("ROW");
    System.out.println("----------------------------");
    for (int temp = 0; temp < nList.getLength(); temp++) {
        Node nNode = nList.item(temp);
        System.out.println("\nCurrent Element :" + nNode.getNodeName());
        if (nNode.getNodeType() == Node.ELEMENT_NODE) {
            Element eElement = (Element) nNode;
            /*System.out.println("Staff id : "
                    + eElement.getAttribute("Name"));*/
            System.out.println("First Name : "
                    + eElement.getElementsByTagName("Id")
                    .item(0).getTextContent());
            // getTextContent() returns only the text and drops the child markup
            System.out.println("Last Name : "
                    + eElement.getElementsByTagName("content").item(0).getTextContent());
        }
    }
} catch (Exception e) {
    e.printStackTrace();
}
The problem is that I am calling getTextContent(), which returns only the text. Is there another way to do this? Any help is appreciated.
Answer 0 (score: 0)
To get the markup of a DOM Node rather than its text, you should serialize the node as HTML. You can use Saxon with its default Transformer (see "Similar problem").
Node content = eElement.getElementsByTagName("content").item(0);
StringWriter sw = new StringWriter();
Result result = new StreamResult(sw);
// TransformerFactoryImpl is Saxon's factory (net.sf.saxon.TransformerFactoryImpl)
TransformerFactory factory = new TransformerFactoryImpl();
Transformer proc = factory.newTransformer();
proc.setOutputProperty(OutputKeys.METHOD, "html");
// Serialize each child of <content> so the wrapper element itself is not printed
for (int i = 0; i < content.getChildNodes().getLength(); i++) {
    proc.transform(new DOMSource(content.getChildNodes().item(i)), result);
}
System.out.println("Content:" + sw.toString().trim());
You should see the following output:
Current Element :ROW
First Name : 29855
Content:<p>Did the summer fly as fast</p>
<a href="https://www.ex.com/" target="_blank"></a>
Current Element :ROW
First Name : 11223
Content:<p>Fly as fast</p>
<a href="https://www.ex.com/" target="_blank"></a>
Also, in the document each <ROW> tag must be closed with </ROW>. The same applies to <a>, although there you can use the self-closing shorthand (<a ... />).
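For readers who would rather not add Saxon as a dependency, the same idea can be sketched with the Transformer that ships with the JDK. This is only an illustration under some assumptions: the class name InnerMarkupDemo and the helpers parse and innerXml are hypothetical, the input is a well-formed variant of the sample (closed ROW and a elements), and the default "xml" output method is used instead of "html", so empty elements may be printed differently from the Saxon output.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of the original answer.
class InnerMarkupDemo {

    // Parses an XML string into a DOM document.
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }

    // Serializes every child of 'parent' back to markup, skipping the parent tag itself.
    static String innerXml(Node parent) throws Exception {
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // Without this, each serialized fragment would start with an XML declaration.
        identity.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter sw = new StringWriter();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            identity.transform(new DOMSource(children.item(i)), new StreamResult(sw));
        }
        return sw.toString().trim();
    }

    public static void main(String[] args) throws Exception {
        // Well-formed variant of the sample data (assumption: ROW and a are closed).
        String xml = "<DATA><ROW><Id>11223</Id>"
                + "<content><p>Fly as fast</p>"
                + "<a href=\"https://www.ex.com/\" target=\"_blank\"></a></content>"
                + "</ROW></DATA>";
        Document doc = parse(xml);
        Node content = doc.getElementsByTagName("content").item(0);
        System.out.println("Content:" + innerXml(content));
    }
}
```

The loop over the child nodes is the important part: transforming the content element itself would also print the surrounding content tags.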
Answer 1 (score: 0)
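You need to either wrap the HTML in CDATA or escape it in order to store HTML inside XML; otherwise the HTML elements are interpreted as XML elements. Also, your ROW elements do not appear to be closed.
I suggest using CDATA, like this: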
<DATA>
<ROW>
<Id>29855</Id>
<content><![CDATA[<p>Did the summer fly as fast “</p>
<a href="https://www.ex.com/" target="_blank">]]>
</content>
</ROW>
<ROW>
<Id>11223</Id>
<content><![CDATA[<p>Fly as fast “</p>
<a href="https://www.ex.com/" target="_blank">]]>
</content>
</ROW>
</DATA>
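With the content stored this way, the getTextContent() call from the question should already return the markup as a string, because the parser treats everything inside a CDATA section as character data. The following is a minimal sketch of that behaviour; the class name CdataContentDemo and the helper firstContent are hypothetical, and the second input shows the escaping alternative mentioned above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical helper class, used only to illustrate the CDATA approach.
class CdataContentDemo {

    // Returns the trimmed text of the first <content> element in the given XML.
    static String firstContent(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        return doc.getElementsByTagName("content").item(0).getTextContent().trim();
    }

    public static void main(String[] args) throws Exception {
        // HTML wrapped in CDATA: the unclosed <a> is fine because it is never parsed as XML.
        String cdata = "<DATA><ROW><Id>11223</Id>"
                + "<content><![CDATA[<p>Fly as fast</p>\n"
                + "<a href=\"https://www.ex.com/\" target=\"_blank\">]]>\n"
                + "</content></ROW></DATA>";
        // Same idea with escaped markup instead of CDATA.
        String escaped = "<DATA><ROW><Id>11223</Id>"
                + "<content>&lt;p&gt;Fly as fast&lt;/p&gt;</content></ROW></DATA>";

        System.out.println(firstContent(cdata));
        System.out.println(firstContent(escaped));
    }
}
```

The trade-off is that the content is then an opaque string rather than a DOM subtree, so it has to be parsed again (with an HTML parser, since it need not be well-formed XML) if its structure matters.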