In Scala, how do I use a transformer / RewriteRule to add incrementing IDs to XML elements?

Time: 2014-08-13 02:05:04

Tags: xml scala scala-xml

I want to read in an XML file and add an incrementing ID to specific elements. Here is some test code I wrote to work out how to do this:

import scala.xml._
import scala.xml.transform._

val testXML =
 <document>
    <authors>
      <author>
        <first-name>Firstname</first-name>
        <last-name>Lastname</last-name>
      </author>
    </authors>
 </document>


def addIDs(node : Node) : Node = {

  object addIDs extends RewriteRule {
    var authorID = -1
    var emailID = -1
    var instID = -1

    override def transform(elem: Node): Seq[Node] = {
      elem match {

        case Elem(prefix, "author", attribs, scope, _*) =>
          //println("element author: " + elem.text)
          if ((elem \ "@id").isEmpty) {
            println("element id is empty:" + elem\"@id")
            authorID += 1
            println("authorID is " + authorID)
            elem.asInstanceOf[Elem] % Attribute(None, "id", Text(authorID.toString), Null)
          } else {
            elem
          }

        case Elem(prefix, "email", attribs, scope, _*) =>
          println("EMAIL")
          elem.asInstanceOf[Elem] % Attribute(None, "id", Text(authorID.toString), Null)

        case Elem(prefix, "institution", attribs, scope, _*) =>
          println("INST")
          elem.asInstanceOf[Elem] % Attribute(None, "id", Text(instID.toString), Null)

        case other =>
          other
      }
    }
  }

  object transform extends RuleTransformer(addIDs)
  transform(node)
}


val newXML = addIDs(testXML)

This code runs, but the IDs do not come out as expected:

element id is empty:
authorID is 0
element id is empty:
authorID is 1
element id is empty:
authorID is 2
element id is empty:
authorID is 3
element id is empty:
authorID is 4
element id is empty:
authorID is 5
element id is empty:
authorID is 6
element id is empty:
authorID is 7
newXML:scala.xml.Node=<document>
    <authors>
        <author id="7">
           <first-name>Firstname</first-name>
           <last-name>Lastname</last-name>
        </author>
    </authors>
  </document>

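It looks like the transformer hits each node multiple times, incrementing the ID on every visit, and only stops once the ID has reached 7. Why does it touch the node several times before it finally finishes? Is there something I can do differently to tell it that it is done with that node?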

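I thought it might be traversing the newly modified node again, so I check whether the element already has an attribute named 'id'. But that does not seem to help. Maybe doing it this way is a bad idea in the first place?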

Thanks for any help with this.

1 Answer:

Answer 0: (score: 0)

It looks like I ran into this Scala bug: https://issues.scala-lang.org/browse/SI-3689 - BasicTransformer has exponential complexity

My workaround was: https://stackoverflow.com/a/1089519/3935595
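For anyone who wants to avoid RuleTransformer entirely, here is a minimal sketch of a single-pass recursive rewrite. It is not necessarily what the linked answer does; the helper name addAuthorIds and the counter nextId are made up for illustration, and it assumes the scala-xml module is on the classpath. Each element is visited exactly once, so the counter is incremented once per author.

```scala
import scala.xml._

// Hypothetical helper (not from the original post): walks the tree once,
// top-down, and gives every <author> that lacks an id the next counter value.
def addAuthorIds(root: Node): Node = {
  var nextId = 0 // mutable counter, local to a single call

  def walk(node: Node): Node = node match {
    case e: Elem =>
      // Tag this element before descending so ids follow document order.
      val tagged =
        if (e.label == "author" && e.attribute("id").isEmpty) {
          val withId = e % Attribute(None, "id", Text(nextId.toString), Null)
          nextId += 1
          withId
        } else e
      // Rebuild the element with rewritten children; each child is visited once.
      tagged.copy(child = tagged.child.map(walk))
    case other =>
      other // text, comments, etc. pass through unchanged
  }

  walk(root)
}
```

With the testXML from the question, addAuthorIds(testXML) should give the single author id="0". The same pattern could be extended with separate counters for email and institution elements.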