Question

My problem is parsing XML in which the string values contain HTML markup:

def xmlString = '''
<resource>
   <string name="my_test">No problem here!</string>
   <string name="my_text">
<b> <big>My bold and big title</big></b>
   Rest of the text
  </string>
</resource>
'''

(This is an Android resource file.)

When I use XmlSlurper, the HTML tags are stripped. This code:

def resources = new XmlSlurper().parseText(xmlString)
resources.string.each { string ->
    println "string name = " + string.@name + ", string value = " + string.text()
}

produces

string name = my_test, string value = No problem here!
string name = my_text, string value = My bold and big title
   Rest of the text

I could use CDATA to keep the HTML tags from being parsed, but then those tags are not processed when the string my_text is used.
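For illustration, a minimal sketch of the CDATA variant mentioned above, assuming Groovy 3 or later for the `groovy.xml.XmlSlurper` import. The names `cdataXml` and `value` are made up for this example. It shows that XmlSlurper hands the markup back as plain character data rather than as child elements:

```groovy
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this import, groovy.util.XmlSlurper is imported by default

// Hypothetical variant of the resource file with the HTML wrapped in CDATA.
def cdataXml = '''
<resource>
   <string name="my_text"><![CDATA[<b><big>My bold and big title</big></b> Rest of the text]]></string>
</resource>
'''

// The CDATA section is delivered as ordinary text, so text() returns the tags verbatim.
def value = new XmlSlurper().parseText(cdataXml).string[0].text()
println value
```

This keeps the tags on the Groovy side, but the file itself now carries them as literal text, which is the drawback described above.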

I also tried StreamingMarkupBuilder, as described in this answer: How to extract HTML Code from a XML File using groovy, but it only shows the HTML tags and the text between them:

<b><big>My bold and big title</big></b>

and it does not show the first string at all. Thanks in advance!

Answer 1

def xmlString = '''
<resource>
    <string name="my_test">No problem here!</string>
    <string name="my_text">
        <b><big>My bold and big title</big></b>
        Rest of the text
    </string>
</resource>
'''

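// Collect the body of each <string> element with its inline markup intact.
def result = []
def resources = new XmlSlurper().parseText(xmlString).string

resources.each { resource ->
    // getBody() returns a closure that replays the element's children (text and
    // nested tags) into the builder, so <b> and <big> survive serialization.
    result << new groovy.xml.StreamingMarkupBuilder().bind { mkp.yield resource.getBody() }
}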
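The snippet above fills `result` but never prints it. A minimal, self-contained sketch of how each string's name could be paired with its serialized body is shown here, assuming Groovy 3 or later for the `groovy.xml.XmlSlurper` import. The map name `markupByName` is chosen only for this example:

```groovy
import groovy.xml.StreamingMarkupBuilder
import groovy.xml.XmlSlurper  // Groovy 3+; on Groovy 2.x drop this import, groovy.util.XmlSlurper is imported by default

def xmlString = '''
<resource>
    <string name="my_test">No problem here!</string>
    <string name="my_text">
        <b><big>My bold and big title</big></b>
        Rest of the text
    </string>
</resource>
'''

// Map each string's name attribute to its body, with nested tags preserved.
def markupByName = [:]
new XmlSlurper().parseText(xmlString).string.each { node ->
    // bind() returns a Writable; toString() renders it to markup text.
    def body = new StreamingMarkupBuilder().bind { mkp.yield node.getBody() }
    markupByName[node.@name.text()] = body.toString()
}

markupByName.each { name, value ->
    println "string name = ${name}, string value = ${value}"
}
```

Both strings should appear this way: the plain one as ordinary text, and `my_text` with its `<b>` and `<big>` tags still in place.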

Groovy: parsing XML with HTML tags inside
