How do I count the occurrences of a node in XML using XQuery?

Time: 2017-09-19 21:01:05

Tags: xml xpath xquery

Input file:

<?xml version="1.0" encoding="UTF-8"?> 
        <books>
            <book id="6636551">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">72771KAM3</xref>
                        <xref type="Non_Fiction" type_id="2">US72771KAM36</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>24.95</price>
                    <publish_date>2000-10-01</publish_date>
                    <description>An in-depth look at creating applications with XML.</description>
                </book_details>
            </book>
            <book id="119818569">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UL5</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UL50</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>19.25</price>
                    <publish_date>2002-11-01</publish_date>
                    <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
                </book_details>
            </book>
            <book id="119818568">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UK7</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UK77</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>5.95</price>
                    <publish_date>2004-05-01</publish_date>
                    <description>After the collapse of a nanotechnology society in England, the young survivors lay the foundation for a new society.</description>
                </book_details>
            </book>
            <book id="119818567">
                <master_information>
                    <book_xref>
                        <xref type="Fiction" type_id="1">070185UJ0</xref>
                        <xref type="Non_Fiction" type_id="2">US070185UJ05</xref>
                    </book_xref>
                </master_information>
                <book_details>
                    <price>4.95</price>
                    <publish_date>2000-09-02</publish_date>
                    <description>When Carla meets Paul at an ornithology conference, tempers fly as feathers get ruffled.</description>
                </book_details>
            </book>
        </books>

I wrote an XQuery to display the count of a particular element, as follows:

for $x in //book_xref
let $c := string-join(('name of element:', count($x)), '&#10;')
return $c

Expected output:

name of element: 4

But the actual output is:

name of element:
1
name of element:
1
name of element:
1
name of element:
1

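I later understood why this happens. I tried to aggregate the count values, but without success. I also could not find any function that returns the element name automatically so that it can be included in the string.

Ideally, the target output is

book_xref:4

What do I need to do? What am I missing?

Thanks! I appreciate your replies.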

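2 answers:

Answer 0: (score: 3)

How about concat('book_xref:', count(//book_xref))?

You get 4 separate results in your output because you are iterating over every occurrence of book_xref. Inside the loop, $x is bound to a single element each time, so count($x) is always 1.

Also, you can get the name with $x/name(), but since you already know what you selected, that is not necessary.

A simple, though not very efficient, way to get all element names and their occurrence counts is: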
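If the label should still come from the data rather than being hard-coded, one option is to bind the selection once, outside any loop, and read the name from its first item. This is a minimal sketch, assuming every selected node has the same name; $nodes is only an illustrative variable name.

```xquery
(: Bind the whole sequence once, so count() sees all matches together. :)
let $nodes := //book_xref
(: name() of the first match supplies the label; an empty selection gives ":0". :)
return concat(name($nodes[1]), ':', count($nodes))
```

Run against the input file from the question, this returns book_xref:4.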

let $names := distinct-values(//*/name())
for $x in $names
let $c := concat($x, ':', count(//*[name()=$x]), '&#10;')
return $c

This produces:

books:1
book:4
master_information:4
book_xref:4
xref:8
book_details:4
price:4
publish_date:4
description:4

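Answer 1: (score: 2)

For your original goal, @daniel-haley has already given a concise solution.

If you want to count the occurrences of all element names in a document efficiently, you can use a map together with the fn:fold-left(...) function to process all elements one at a time while keeping a running count (fn:fold-left is available from XQuery 3.0; maps were standardized in XQuery 3.1). This approach (at least in XQuery processors that support iterative evaluation) never holds all elements in memory at once, not even all those with the same name: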

fold-left(
  //*,
  map{},
  function($map, $node) {
    let $name := $node/local-name()
    return map:merge((map { $name: ($map($name), 0)[1] + 1 }, $map))
  }
)
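The map built by the fold can be called like a function to look up a single count. The sketch below assumes an XQuery 3.1 processor, binds the result to a hypothetical variable $counts, and uses map:put in place of map:merge, which avoids relying on how map:merge resolves duplicate keys.

```xquery
(: Declared explicitly in case the processor does not predeclare the prefix. :)
declare namespace map = "http://www.w3.org/2005/xpath-functions/map";

let $counts := fold-left(
  //*,
  map {},
  function($map, $node) {
    let $name := local-name($node)
    (: Start from 0 when the name has not been seen yet, then add 1. :)
    return map:put($map, $name, ($map($name), 0)[1] + 1)
  }
)
(: A map is a function from key to value, so $counts('book_xref') is the count. :)
return concat('book_xref:', $counts('book_xref'))
```

On the input file from the question, this returns book_xref:4.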

A simpler solution using group by is more readable, but may be less memory-efficient:

map:merge(
  for $node in //*
  group by $name := $node/node-name()
  return map { $name: count($node) }
)

Both queries return the same result:

map {
  'price': 4,
  'book': 4,
  'books': 1,
  'book_details': 4,
  'master_information': 4,
  'book_xref': 4,
  'xref': 8,
  'description': 4,
  'publish_date': 4
}
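If plain text in the name:count form from the question is wanted instead of a map, the same grouping can feed string-join. This is a sketch that assumes XQuery 3.0 or later for group by; the order by clause is an addition here, used only to make the line order predictable.

```xquery
string-join(
  for $node in //*
  (: After grouping, $node holds every element that shares this name. :)
  group by $name := name($node)
  order by $name
  return concat($name, ':', count($node)),
  '&#10;'  (: one name:count pair per line :)
)
```

For the input file in the question, the result contains one line per element name, including book_xref:4 and xref:8.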