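Say I have the following XML:
<a><b><c>Text for c</c>Text for b</b></a>
How can I get only a node's own text, excluding the text of its child nodes? That is:
a: ""
b: "Text for b"
c: "Text for c"
The Node.text method will not do, because it also includes the text of any child nodes.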
Answer 0 (score: 0)
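In XML, a node can only contain child nodes, and the "text" you are referring to is a sequence of child nodes of type Text. The solution is therefore to filter the children that are of type Text and concatenate them into a single string, as the extractNodeText method below does: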
scala> import scala.xml._
import scala.xml._
scala> def extractNodeText(node: Node) =
| node.child.filter(_.isInstanceOf[Text]).map(_.text).mkString("")
extractNodeText: (node: scala.xml.Node)String
scala> val a = XML.loadString("<a><b><c>Text for c</c>Text for b</b></a>")
a: scala.xml.Elem = <a><b><c>Text for c</c>Text for b</b></a>
scala> val aStr = extractNodeText(a)
aStr: String = ""
scala> val b = XML.loadString("<b><c>Text for c</c>Text for b</b>")
b: scala.xml.Elem = <b><c>Text for c</c>Text for b</b>
scala> val bStr = extractNodeText(b)
bStr: String = "Text for b"
scala> val c = XML.loadString("<c>Text for c</c>")
c: scala.xml.Elem = <c>Text for c</c>
scala> val cStr = extractNodeText(c)
cStr: String = "Text for c"
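The session above parses each element as its own document. The same filter can also be applied to every element of a single parsed tree in one pass. The following is a minimal sketch, assuming scala-xml is on the classpath; ownText and ownTextByLabel are hypothetical helper names chosen for illustration, not part of the library:

```scala
import scala.xml._

// Hypothetical helper: concatenates only the direct Text children of a node.
def ownText(node: Node): String =
  node.child.collect { case t: Text => t.text }.mkString

// Hypothetical helper: visits the root and all of its descendants and pairs
// each element's label with that element's own text.
def ownTextByLabel(root: Node): Seq[(String, String)] =
  root.descendant_or_self.collect { case e: Elem => e.label -> ownText(e) }

val doc = XML.loadString("<a><b><c>Text for c</c>Text for b</b></a>")

// Print each element label with its own text in quotes.
ownTextByLabel(doc).foreach { case (label, text) =>
  println(label + ": \"" + text + "\"")
}
```

For the document in the question this should print a: "", b: "Text for b" and c: "Text for c", in that order.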
Answer 1 (score: 0)
You can check whether a child node is an atom node and extract the text from that node only:
import scala.xml._

val xml = <a><b><c>Text for c</c>Text for b</b></a>
val a = xml \\ "a"
val b = xml \\ "b"
val c = xml \\ "c"

// Prints the text of the first atom child of the first node in nodeSeq,
// or an empty string if there is no such child.
def text(name: String, nodeSeq: NodeSeq): Unit = {
  val text = (for {
    n <- nodeSeq.headOption
    atomNodeText <- n.child.find(_.isAtom)
  } yield atomNodeText.text).getOrElse("")
  println(name + ": " + text)
}

text("a", a)
text("b", b)
text("c", c)
This produces:
a:
b: Text for b
c: Text for c
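Note that the text helper above takes only the first atom child. If an element has text both before and after a child element, the later text is dropped. A hedged variant that joins all atom children could look like the sketch below; atomText is a hypothetical name, and the sample document is made up for illustration:

```scala
import scala.xml._

// Hypothetical helper: concatenates the text of every atom child of a node,
// ignoring child elements and their contents.
def atomText(node: Node): String =
  node.child.filter(_.isAtom).map(_.text).mkString

// Illustrative mixed content: text on both sides of a child element.
val mixed = XML.loadString("<b>before<c>inner</c>after</b>")

println(atomText(mixed)) // the text of <c> is not included
```

Here atomText(mixed) should yield "beforeafter", whereas taking only the first atom child would yield "before".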