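How to count elements in an XML file with Scala

Asked: 2018-10-17 13:06:17

Tags: xml scala

I'm using the assert function in Scala to compare XML files. My problem is that I'd like to be able to count the elements, for example: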

<recording>
      <mousemove y="53" x="300" t="9031"/>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
      <execextern streamID="18" t="9833"/>
      <keydown kc="s" t="10135"/>
      <keypress cc="s" t="10135"/>
      <keyup kc="s" t="10207"/>
      <execextern streamID="19" t="10207"/>
      <keydown kc="s" t="10934"/>
      <keypress cc="s" t="10934"/>
      <keyup kc="s" t="10989"/>
      <execextern streamID="20" t="10989"/>
      <keydown kc="s" t="11362"/>
      <keypress cc="s" t="11366"/>
</recording>

I'd like to be able to count the number of keydown elements, keypress elements, and so on.

2 answers:

Answer 0: (score: 4)

Put the children of your recording tag into a Seq[Node] and count the nodes with each label:

scala> :paste
// Entering paste mode (ctrl-D to finish)

val xml = <recording>
      <mousemove y="53" x="300" t="9031"/>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
      <execextern streamID="18" t="9833"/>
      <keydown kc="s" t="10135"/>
      <keypress cc="s" t="10135"/>
      <keyup kc="s" t="10207"/>
      <execextern streamID="19" t="10207"/>
      <keydown kc="s" t="10934"/>
      <keypress cc="s" t="10934"/>
      <keyup kc="s" t="10989"/>
      <execextern streamID="20" t="10989"/>
      <keydown kc="s" t="11362"/>
      <keypress cc="s" t="11366"/>
</recording>

// xml.child also returns the whitespace between tags as text nodes; drop them.
// The per-label counts below work without this filter, but children.size would not.
val children = xml.child.filterNot(_.toString().trim.isEmpty)

val mousemoveCount = children.count(_.label == "mousemove")
val keydownCount = children.count(_.label == "keydown")
val keypressCount = children.count(_.label == "keypress")
val keyupCount = children.count(_.label == "keyup")
val execexternCount = children.count(_.label == "execextern")

println(s"number of mousemove events: $mousemoveCount")
println(s"number of keydown events: $keydownCount")
println(s"number of keypress events: $keypressCount")
println(s"number of keyup events: $keyupCount")
println(s"number of execextern events: $execexternCount")


// Exiting paste mode, now interpreting.

number of mousemove events: 1
number of keydown events: 4
number of keypress events: 4
number of keyup events: 3
number of execextern events: 3
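
The filter above is needed because xml.child returns the whitespace between tags as text nodes alongside the elements. A minimal sketch of an alternative, assuming the scala-xml library is on the classpath: keep only element nodes by pattern matching on scala.xml.Elem. The names sample, rawCount and elems are illustrative and do not come from the answer.

```scala
import scala.xml.Elem

// Shortened version of the recording, for illustration only.
val sample = <recording>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
</recording>

// The raw child list interleaves whitespace text nodes with the elements.
val rawCount = sample.child.size

// Keep element nodes only; text nodes, comments, etc. are dropped.
val elems = sample.child.collect { case e: Elem => e }

println(s"raw children: $rawCount, elements: ${elems.size}")
```

Here elems.size is 3, while rawCount is larger because of the whitespace nodes. Matching on the node type avoids converting every node to a string just to test whether it is blank.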

Edit

To count all the XML nodes inside recording, keep the .filterNot(...) part and use val allCount = children.size, i.e.:

val children = xml.child.filterNot(_.toString().trim.isEmpty)
val allCount = children.size

Furthermore, to turn this into a generic function, just make the name of the child node you are searching for a parameter:

scala> :paste
// Entering paste mode (ctrl-D to finish)

val xml = <recording>
      <mousemove y="53" x="300" t="9031"/>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
      <execextern streamID="18" t="9833"/>
      <keydown kc="s" t="10135"/>
      <keypress cc="s" t="10135"/>
      <keyup kc="s" t="10207"/>
      <execextern streamID="19" t="10207"/>
      <keydown kc="s" t="10934"/>
      <keypress cc="s" t="10934"/>
      <keyup kc="s" t="10989"/>
      <execextern streamID="20" t="10989"/>
      <keydown kc="s" t="11362"/>
      <keypress cc="s" t="11366"/>
</recording>

val children = xml.child.filterNot(_.toString().trim.isEmpty)

def countNodes(nodeName: String): Int = children.count(_.label == nodeName)

val allCount = children.size

println(s"number of mousemove events: ${countNodes("mousemove")}")
println(s"number of keydown events: ${countNodes("keydown")}")
println(s"number of keypress events: ${countNodes("keypress")}")
println(s"number of keyup events: ${countNodes("keyup")}")
println(s"number of execextern events: ${countNodes("execextern")}")

println(s"total number of events: $allCount")


// Exiting paste mode, now interpreting.

number of mousemove events: 1
number of keydown events: 4
number of keypress events: 4
number of keyup events: 3
number of execextern events: 3
total number of events: 15
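
The countNodes function above closes over the children value, so it only works for that one document. As a rough sketch (countChildren is a hypothetical helper, not part of the answer), the parent node can be passed in as well, using the \ selector to pick direct children by name:

```scala
import scala.xml.Node

// Hypothetical helper: counts the direct child elements of `parent` named `label`.
def countChildren(parent: Node, label: String): Int =
  (parent \ label).size

// Shortened recording, for illustration only.
val recording = <recording>
      <keydown kc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
      <keydown kc="s" t="10135"/>
</recording>

println(countChildren(recording, "keydown"))   // 2
println(countChildren(recording, "mousemove")) // 0
```

Because \ only selects elements, no whitespace filtering is needed in this variant.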

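Edit 2

If you want to make it generic, I suggest grouping the nodes by label and putting them into a Map.

For example, if you only need the node names and their counts, you can do the following: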

children.groupBy(_.label).map {
      case(k, v) => (k, v.size)
}
// Map(mousemove -> 1, keydown -> 4, execextern -> 3, keypress -> 4, keyup -> 3)

If you need the whole nodes, just drop the .map:

import scala.xml.Node
val nodeSizeMap: Map[String, Seq[Node]] = children.groupBy(_.label)
// Map(
//   mousemove -> ArrayBuffer(<mousemove y="53" x="300" t="9031"/>),
//   keydown -> ArrayBuffer(<keydown kc="s" t="9759"/>, <keydown kc="s" t="10135"/>, <keydown kc="s" t="10934"/>, <keydown kc="s" t="11362"/>),
//   execextern -> ArrayBuffer(<execextern streamID="18" t="9833"/>, <execextern streamID="19" t="10207"/>, <execextern streamID="20" t="10989"/>),
//   keypress -> ArrayBuffer(<keypress cc="s" t="9759"/>, <keypress cc="s" t="10135"/>, <keypress cc="s" t="10934"/>, <keypress cc="s" t="11366"/>),
//   keyup -> ArrayBuffer(<keyup kc="s" t="9829"/>, <keyup kc="s" t="10207"/>, <keyup kc="s" t="10989"/>)
// )

In context:

scala> :paste
// Entering paste mode (ctrl-D to finish)

val xml = <recording>
      <mousemove y="53" x="300" t="9031"/>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keyup kc="s" t="9829"/>
      <execextern streamID="18" t="9833"/>
      <keydown kc="s" t="10135"/>
      <keypress cc="s" t="10135"/>
      <keyup kc="s" t="10207"/>
      <execextern streamID="19" t="10207"/>
      <keydown kc="s" t="10934"/>
      <keypress cc="s" t="10934"/>
      <keyup kc="s" t="10989"/>
      <execextern streamID="20" t="10989"/>
      <keydown kc="s" t="11362"/>
      <keypress cc="s" t="11366"/>
</recording>

val children = xml.child.filterNot(_.toString().trim.isEmpty)

def countNodes(nodeName: String): Int = children.count(_.label == nodeName)

val allCount = children.size

// if you just want to print
children.groupBy(_.label).foreach {
      case (k, v) => println(s"number of $k events: ${v.size}")
}

println()

// if you want to do something with the values
val nodeSizeMap: Map[String, Int] = children.groupBy(_.label).map {
      case(k, v) => (k, v.size)
}

// ... do something with nodeSizeMap

nodeSizeMap.foreach {
      case (k, v) => println(s"number of $k events: $v")
}


// Exiting paste mode, now interpreting.

number of mousemove events: 1
number of keydown events: 4
number of execextern events: 3
number of keypress events: 4
number of keyup events: 3

number of mousemove events: 1
number of keydown events: 4
number of execextern events: 3
number of keypress events: 4
number of keyup events: 3
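
Note that the iteration order of the Map returned by groupBy is not guaranteed, so the lines above may print in a different order on another Scala version. If a stable order matters, for instance when the output is compared in a test, one option is to sort the entries first. This is only a sketch; the names counts and sorted are illustrative.

```scala
import scala.xml.Elem

// Shortened recording, for illustration only.
val recording = <recording>
      <keyup kc="s" t="9829"/>
      <keydown kc="s" t="9759"/>
      <keypress cc="s" t="9759"/>
      <keydown kc="s" t="10135"/>
</recording>

// Count element children per label (text nodes are skipped by the collect).
val counts = recording.child
  .collect { case e: Elem => e }
  .groupBy(_.label)
  .map { case (label, nodes) => (label, nodes.size) }

// Sort by label so the output order is deterministic.
val sorted = counts.toSeq.sortBy { case (label, _) => label }

sorted.foreach { case (label, n) => println(s"number of $label events: $n") }
// number of keydown events: 2
// number of keypress events: 1
// number of keyup events: 1
```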

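Edit 3

To make this even more generic and allow searching in nested tags, you can use the XML wildcard "_" with the \\ operator. Here is an example (forgive the silliness of the XML):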

scala> :paste
// Entering paste mode (ctrl-D to finish)

val xml = <family>
    <mother name="julie" />
    <father name="harold" />
    <child name="billy" status="good child" />
    <child name="charlie" status="good child" />
    <child name="mandy" status="bad child" />
    <child name="nigel" status="bad child" />
    <extendedfamily>
        <uncle name="jeff" />
        <auntie name="vicky" />
        <cousin name="little boy 1" />
        <cousin name="little boy 2" />
    </extendedfamily>
</family>

val familyMap = (xml \\ "_").groupBy(_.label).map { case (k, v) => (k, v.size) }

familyMap foreach {
    case (k, v) => println(s"$k count: $v")
}


// Exiting paste mode, now interpreting.

mother count: 1
auntie count: 1
uncle count: 1
child count: 4
extendedfamily count: 1
father count: 1
cousin count: 2
family count: 1
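
As the "family count: 1" line shows, \\ "_" includes the root element itself as well as every descendant. If only the direct children are wanted, the single-backslash selector \ can be used instead. A small sketch of the difference, using a cut-down and purely illustrative family document:

```scala
val family = <family>
    <child name="billy" />
    <extendedfamily>
        <cousin name="little boy 1" />
        <child name="distant" />
    </extendedfamily>
</family>

// \\ "_" : the root element plus every descendant element.
val allElements = (family \\ "_").size      // family, child, extendedfamily, cousin, child

// \ "_" : direct child elements only.
val directChildren = (family \ "_").size    // child, extendedfamily

// The same distinction applies to a search by name.
val anyChild = (family \\ "child").size     // 2
val topLevelChild = (family \ "child").size // 1

println(s"$allElements $directChildren $anyChild $topLevelChild")
```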

Answer 1: (score: 3)

Let's say you have read the XML file into a variable:

val xmlParam = <recording>
  <mousemove y="53" x="300" t="9031"/>
  <keydown kc="s" t="9759"/>
  <keypress cc="s" t="9759"/>
  <keyup kc="s" t="9829"/>
  <execextern streamID="18" t="9833"/>
  <keydown kc="s" t="10135"/>
  <keypress cc="s" t="10135"/>
  <keyup kc="s" t="10207"/>
  <execextern streamID="19" t="10207"/>
  <keydown kc="s" t="10934"/>
  <keypress cc="s" t="10934"/>
  <keyup kc="s" t="10989"/>
  <execextern streamID="20" t="10989"/>
  <keydown kc="s" t="11362"/>
  <keypress cc="s" t="11366"/>
</recording>

You can count the elements of the XML file with:

(xmlParam \\ "keydown").size
(xmlParam \\ "keypress").size
(xmlParam \\ "keyup").size

This tells you how many of each of these elements the file contains. It gives you the following output:

res0: Int = 4
res1: Int = 4
res2: Int = 3
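
Since the question is about using assert, the counts can be checked directly. The following is a minimal sketch, assuming scala-xml is available; the XML is parsed from a string here so the example is self-contained, and the file path in the comment is a placeholder.

```scala
import scala.xml.XML

// XML.loadString parses XML text; XML.loadFile("path/to/file.xml") does the same for a file.
val recording = XML.loadString(
  """<recording>
    |  <keydown kc="s" t="9759"/>
    |  <keypress cc="s" t="9759"/>
    |  <keyup kc="s" t="9829"/>
    |  <keydown kc="s" t="10135"/>
    |</recording>""".stripMargin)

val keydownCount = (recording \\ "keydown").size

// assert throws java.lang.AssertionError if the condition is false.
assert(keydownCount == 2, s"expected 2 keydown elements, found $keydownCount")
assert((recording \\ "keyup").size == 1)
```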

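You can also refer to https://dzone.com/articles/basic-xml-processing-scala for more on XML processing. Scala has XML support through the scala.xml library (a separate scala-xml module since Scala 2.11), and it's best to use what Scala gives us.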