My problem is parsing XML in which the string values contain HTML markup:
def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b> <big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''
(This is an Android resource file.)
When I use XmlSlurper, the HTML tags are stripped. This code:
def resources = new XmlSlurper().parseText(xmlString )
resources.string.each { string ->
println "string name = " + string.@name + ", string value = " + string.text()
}
produces
string name = my_test, string value = No problem here!
string name = my_text, string value = My bold and big title
Rest of the text
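I could use CDATA to keep the HTML tags from being parsed, but then those tags are not processed when the string my_text is used.
I also tried StreamingMarkupBuilder, as described in this answer: How to extract HTML Code from a XML File using groovy, but it shows only the HTML tags and the text between them: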
<b><big>My bold and big title</big></b>
and the first string is not shown at all. Thanks in advance!
Answer 0 (score: 1)
def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b><big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''
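// Collect the body of every <string> element with its inner markup preserved
def result = []
def resources = new XmlSlurper().parseText(xmlString).string
resources.each { resource ->
    // getBody() returns a closure that re-emits the element's children; mkp.yield writes them out
    result << new groovy.xml.StreamingMarkupBuilder().bind { mkp.yield resource.getBody() }
}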
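The snippet above only collects the serialized bodies. As a minimal sketch of how they could be tied back to each name attribute and printed in the format used in the question, something like the following might work. The `bodies` map and the `markup` variable are names introduced here for illustration only, and the explicit `groovy.xml.XmlSlurper` import assumes Groovy 3 or later (on Groovy 2.x that import would be dropped, since XmlSlurper lives in groovy.util there).

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlSlurper   // assumption: Groovy 3+; remove this line on Groovy 2.x

def xmlString = '''
<resource>
<string name="my_test">No problem here!</string>
<string name="my_text">
<b><big>My bold and big title</big></b>
Rest of the text
</string>
</resource>
'''

// Hypothetical helper map: name attribute -> element body with inner markup kept
def bodies = [:]
new XmlSlurper().parseText(xmlString).string.each { node ->
    // Serialize the children of this <string> element, tags included
    def markup = new StreamingMarkupBuilder().bind { mkp.yield node.getBody() }
    // toString() renders the Writable; trim() drops surrounding whitespace
    bodies[node.@name.text()] = markup.toString().trim()
}

bodies.each { name, value ->
    println "string name = ${name}, string value = ${value}"
}
```

Because each element is bound separately, plain strings such as my_test and mixed-content strings such as my_text go through the same path.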