I have a string in this format:
<params text="To apply for and obtain a loan from Cash4Rent, Inc., you must agree to receive all information and disclosures regarding your loan electronically prior to submitting your loan application.
The following information will be provided by electronic communication."
/>
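As you can see, the value of the "text" attribute of the XML node contains a specific line break and whitespace. I am trying to parse the XML string and print it on my website, but when I parse it the whitespace is lost and everything is printed on one line. How can I preserve the formatting after parsing? My code is below.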
Java class:
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NamedNodeMap;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

public class XMLParser {
    public static void main(String[] args) throws SAXException, ParserConfigurationException, IOException {
        // The attribute value contains a literal line break between the two sentences.
        String inputXml = "<params text=\"To apply for and obtain a loan from Cash4Rent, Inc., you must agree to receive all information and disclosures regarding your loan electronically prior to submitting your loan application.\n"
                + "The following information will be provided by electronic communication.\" />";
        System.out.println(inputXml);
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setIgnoringElementContentWhitespace(true);
        DocumentBuilder db = dbf.newDocumentBuilder();
        InputSource is = new InputSource();
        is.setCharacterStream(new StringReader(inputXml));
        ArrayList<String> nodeNameList = new ArrayList<>();
        ArrayList<String> nodeValueList = new ArrayList<>();
        try {
            Document doc = db.parse(is);
            // Collect the name and value of every attribute on the root element.
            NamedNodeMap attrs = doc.getDocumentElement().getAttributes();
            for (int i = 0; i < attrs.getLength(); i++) {
                nodeNameList.add(attrs.item(i).getNodeName());
                nodeValueList.add(attrs.item(i).getNodeValue());
            }
        } catch (SAXException e) {
            // handle SAXException
        }
        System.out.println(nodeNameList);
        System.out.println(nodeValueList);
    }
}
Result:
[text]
[To apply for and obtain a loan from Cash4Rent, Inc., you must agree to receive all information and disclosures regarding your loan electronically prior to submitting your loan application. The following information will be provided by electronic communication.]
Expected result:
[text]
[To apply for and obtain a loan from Cash4Rent, Inc., you must agree to receive all information and disclosures regarding your loan electronically prior to submitting your loan application.
The following information will be provided by electronic communication.]
Answer 0 (score: 0)
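3.3.1 Attribute Types
XML attribute types are of three kinds: a string type, a set of tokenized types, and enumerated types. The string type may take any literal string as a value; the tokenized types are more constrained. The validity constraints noted in the grammar are applied after the attribute value has been normalized as described in 3.3.3 Attribute-Value Normalization.
and Attribute-Value Normalization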
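3.3.3 Attribute-Value Normalization
Before the value of an attribute is passed to the application or checked for validity, the XML processor must normalize the attribute value by applying the algorithm below, or by using some other method such that the value passed to the application is the same as that produced by the algorithm.
1. All line breaks must have been normalized on input to #xA as described in 2.11 End-of-Line Handling, so the rest of this algorithm operates on text normalized in this way.
2. Begin with a normalized value consisting of the empty string.
3. For each character, entity reference, or character reference in the unnormalized attribute value, beginning with the first and continuing to the last, do the following:
   For a character reference, append the referenced character to the normalized value.
   For an entity reference, recursively apply step 3 of this algorithm to the replacement text of the entity.
   For a white space character (#x20, #xD, #xA, #x9), append a space character (#x20) to the normalized value.
   For another character, append the character to the normalized value.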
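In practice, the rules quoted above mean that a literal line break inside an attribute value is replaced by a space, while a line break written as the character reference &#10; is appended unchanged. The following is a minimal sketch of that difference, not a definitive fix. The class name AttributeNewlineDemo, the helper parseTextAttribute, and the shortened sample strings are hypothetical and chosen only for illustration. It also assumes that whoever produces the XML is able to emit &#10; in place of the raw line break.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; the names here are not from the original question.
class AttributeNewlineDemo {

    // Parses the given XML and returns the "text" attribute of the root element.
    static String parseTextAttribute(String xml) {
        try {
            DocumentBuilder db = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = db.parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getAttribute("text");
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Literal line break in the attribute: normalized to a single space.
        String literal = "<params text=\"line one\nline two\" />";
        // Line break written as a character reference: kept as a line break.
        String escaped = "<params text=\"line one&#10;line two\" />";

        System.out.println(parseTextAttribute(literal).contains("\n")); // prints false
        System.out.println(parseTextAttribute(escaped).contains("\n")); // prints true
    }
}
```

If the producer of the XML cannot be changed, another option is to carry the text as element content instead of an attribute, since attribute-value normalization applies only to attributes.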