My XML looks like this (it is a NodeSeq):
<first>...</first>
<second>...</second>
<third>
  <foo>
    <keepattr> ... </keepattr>
    <otherattr1> ... </otherattr1>
  </foo>
  <otherattr2> ... </otherattr2>
</third>
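I need to keep <first>, remove <second> and everything inside it, and keep only <keepattr> inside <third>, while preserving the data structure (that is, keeping the enclosing foo tag).
How can I do this in Scala?
I tried the following, but I am stuck at a single level: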
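import scala.xml.{Elem, Node, NodeSeq}
import scala.xml.transform.RewriteRule

val removeJunk = new RewriteRule {
  override def transform(node: Node): NodeSeq = node match {
    case e: Elem => e.label match {
      case "second" => NodeSeq.Empty
      case "third" => // ? (this is where I am stuck)
      case _ => e // leave every other element unchanged
    }
    case o => o
  }
}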
I may also need to go several levels deeper in the schema.
Edit: I want to keep the data without breaking the data model:
<third>
  <foo>
    <keepattr> ... </keepattr>
    <otherattr1> ... </otherattr1>
  </foo>
  <otherattr2> ... </otherattr2>
</third>
should become
<third>
  <foo>
    <keepattr> ... </keepattr>
  </foo>
</third>
Answer 0 (score: 2)
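You can use a combination of filterNot and RewriteRule. Because it calls the \\ operator at every step, which searches the whole subtree each time, this may be inefficient, but I can't think of any other solution right now: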
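import scala.xml.{Elem, Node, NodeBuffer, NodeSeq}
import scala.xml.transform.RewriteRule

// Adjacent XML literals produce a NodeBuffer.
val input: NodeBuffer = <first>foo</first>
  <second>remove me</second>
  <third>
    <foo>
      <keepattr>meh</keepattr>
      <otherattr1>bar</otherattr1>
    </foo>
    <otherattr2>quux</otherattr2>
  </third>

val extractKeepAttr = new RewriteRule {
  override def transform(node: Node): NodeSeq = node match {
    case e: Elem => e.label match {
      // The element we want: keep it as is.
      case "keepattr" => e
      // An ancestor of keepattr: keep only the children that lead to it, and recurse.
      case _ if (e \\ "keepattr").nonEmpty =>
        e.copy(child = e.child.filter(c => (c \\ "keepattr").nonEmpty).flatMap(c => transform(c)))
      // No keepattr below this element: leave it unchanged.
      case _ => e
    }
    // Non-element nodes (text, comments) pass through instead of causing a MatchError.
    case other => other
  }
}

// returns <first>foo</first>, <third><foo><keepattr>meh</keepattr></foo></third>
val updatedXml = input.filterNot(_.label == "second").transform(extractKeepAttr)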
Edit: updated the answer.
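As a possible way around the repeated \\ lookups mentioned above, the pruning can be sketched as a single recursive pass that decides bottom-up whether a subtree contains the wanted element. This is not part of the original answer: the object name PruneExample and the helper prune are hypothetical, and the sketch assumes that anything outside the kept element (other elements, text, whitespace) should be dropped.

```scala
import scala.xml.{Elem, Node}

object PruneExample {
  // Hypothetical helper: returns Some(pruned node) when `node` is the element
  // labelled `keep` or contains one somewhere below it, and None otherwise.
  def prune(node: Node, keep: String): Option[Node] = node match {
    // The wanted element itself is kept untouched, including its content.
    case e: Elem if e.label == keep => Some(e)
    // Any other element survives only if at least one child survives.
    case e: Elem =>
      val kept = e.child.flatMap(c => prune(c, keep))
      if (kept.isEmpty) None else Some(e.copy(child = kept))
    // Text, comments, etc. outside the wanted element are dropped.
    case _ => None
  }
}
```

Unlike the RewriteRule above, this helper removes elements that contain no keepattr at all, so it would be applied to <third> only and <first> would be passed through separately. Each node is visited once, at the cost of writing the recursion by hand.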
Answer 1 (score: 0)
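I'd like to point out another approach that removes a lot of the complexity, although it isn't as pretty: extract all the information you need from the XML, store it in vals, and rebuild the XML structure by hand, provided you know that structure in advance.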
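A minimal sketch of that idea, assuming the layout from the question (a top-level <first>, and <keepattr> nested as third/foo/keepattr). The names RebuildExample and rebuild are made up for illustration, and the target structure is hard-coded, which is exactly the trade-off described above.

```scala
import scala.xml.{Elem, NodeSeq}

object RebuildExample {
  // Hypothetical helper: pulls out the parts to keep and rebuilds the
  // known structure by hand. <second> is simply never copied over.
  def rebuild(input: NodeSeq): NodeSeq = {
    // Top-level <first> elements are kept as they are.
    val first = input.filter(_.label == "first")
    // Select third/foo/keepattr with the child-projection operator \.
    val keepAttr = input.filter(_.label == "third") \ "foo" \ "keepattr"
    // Rebuild the wrapper elements around the extracted nodes.
    val third: Elem = <third><foo>{keepAttr}</foo></third>
    NodeSeq.fromSeq(first ++ third)
  }
}
```

Note that this version always emits a <third><foo> wrapper, even when no keepattr was found, and it has to be edited whenever the schema changes.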