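I'm trying to extract multiple elements from an XML file using the string-join function. It works for a single element, but when I add another one to my code, the data I see is incorrect. I suspect I'm missing something simple somewhere, but I can't seem to find it...
Sample XML data: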
<books>
  <book id="6636551">
    <master_information>
      <book_xref>
        <xref type="Fiction" type_id="1">72771KAM3</xref>
        <xref type="Non_Fiction" type_id="2">US72771KAM36</xref>
      </book_xref>
    </master_information>
    <book_details>
      <price>24.95</price>
      <publish_date>2000-10-01</publish_date>
      <description>An in-depth look at creating applications with XML.</description>
    </book_details>
    <global_information>
      <ratings>
        <rating agency="ABC Agency" type="Author Rating">A++</rating>
        <rating agency="DEF Agency" type="Author Rating">A+</rating>
        <rating agency="DEF Agency" type="Book Rating">A</rating>
      </ratings>
    </global_information>
    <country_info>
      <country_code>US</country_code>
    </country_info>
  </book>
  <book id="119818569">
    <master_information>
      <book_xref>
        <xref type="Fiction" type_id="1">070185UL5</xref>
        <xref type="Non_Fiction" type_id="2">US070185UL50</xref>
      </book_xref>
    </master_information>
    <book_details>
      <price>19.25</price>
      <publish_date>2002-11-01</publish_date>
      <description>A former architect battles corporate zombies, an evil sorceress, and her own childhood to become queen of the world.</description>
    </book_details>
    <global_information>
      <ratings>
        <rating agency="ABC Agency" type="Author Rating">A+</rating>
        <rating agency="ABC Agency" type="Book Rating">A</rating>
        <rating agency="DEF Agency" type="Author Rating">A</rating>
        <rating agency="DEF Agency" type="Book Rating">B+</rating>
      </ratings>
    </global_information>
    <country_info>
      <country_code>CA</country_code>
    </country_info>
  </book>
</books>
XQuery used to extract a single element:
for $x in string-join(('book_id,book_price', //book/book_details/price/string-join((ancestor::book/@id, .), ',')), '&#10;')
return $x
This works fine and produces the following sample output:
book_id,book_price
6636551,24.95
119818569,19.25
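The question is: how can I extract multiple elements, or a combination of elements and attributes, from a single XML file, ideally still using string-join?
I tried the following, which mostly works, but I noticed that with larger data sets the values seem to land in the wrong columns at random. For example, with the code below, if ./publish_date is empty in the data, the ./description value ends up in the ./publish_date column.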
for $x in string-join(('book_id,book_price,book_pub_date,book_desc', //book/book_details/string-join((ancestor::book/@id, ./price, ./publish_date, ./description), ',')), '&#10;')
return $x
FYI, I'm still learning XQuery. Thanks for any insights, input, or help!
Answer 0 (score: 4)
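Sequences in XQuery are flattened: the expressions (1, (2, 3), ((4)), (), 5) and (1, 2, 3, 4, 5) are equivalent. This means that the length of the sequence (ancestor::book/@id, ./price, ./publish_date, ./description) varies whenever one of the XPath subexpressions returns no result. Because fn:string-join($strings, $sep) simply places the separator between each pair of adjacent items in the (flattened) $strings, the resulting rows can contain different numbers of commas.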
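A small sketch of this effect, using literal strings borrowed from the sample data rather than real path expressions: the empty sequence disappears during flattening, so one comma goes missing from the joined row.

```xquery
(: count() sees four items; the nested parentheses and the () vanish :)
count((1, (2, 3), (), 5)),

(: three strings are joined with two commas :)
string-join(('6636551', '24.95', '2000-10-01'), ','),

(: the empty middle item is dropped, so only one comma remains :)
string-join(('6636551', (), '2000-10-01'), ',')
```

The three results are 4, "6636551,24.95,2000-10-01" and "6636551,2000-10-01". The last row has lost a column, which is exactly the misalignment described in the question.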
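To preserve the alignment of the CSV table, you can insert an empty string whenever a value is missing. A simple way to do this takes advantage of flattening: ($possibly-empty, '')[1]
If $possibly-empty contains one item (e.g. 'foo'), this evaluates to ('foo', '')[1] -> 'foo'.
If $possibly-empty is the empty sequence (), the expression evaluates to ((), '')[1] -> ('')[1] (flattening) -> ''.
Working example (your enclosing FLWOR expression (for / return) is entirely redundant, since it only iterates over a single string item, so I omitted it):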
string-join(
  (
    'book_id,book_price,book_pub_date,book_desc',
    //book/book_details/string-join(
      (
        (ancestor::book/@id, '')[1],
        (./price, '')[1],
        (./publish_date, '')[1],
        (./description, '')[1]
      ),
      ','
    )
  ),
  '&#10;'
)
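To check the padding without the full file, here is a self-contained sketch. The inline two-book document is a hypothetical, cut-down variant of the sample (the ids, the descriptions and the missing publish_date in the first book are made up for illustration), and the header row is left out to keep it short.

```xquery
(: Hypothetical test data: book 1 has no publish_date element :)
let $doc :=
  <books>
    <book id="1">
      <book_details>
        <price>24.95</price>
        <description>First</description>
      </book_details>
    </book>
    <book id="2">
      <book_details>
        <price>19.25</price>
        <publish_date>2002-11-01</publish_date>
        <description>Second</description>
      </book_details>
    </book>
  </books>
return
  string-join(
    $doc/book/book_details/string-join(
      (
        (ancestor::book/@id, '')[1],
        (price, '')[1],
        (publish_date, '')[1],  (: falls back to '' for book 1 :)
        (description, '')[1]
      ),
      ','
    ),
    '&#10;'
  )
```

This should return two lines, "1,24.95,,First" and "2,19.25,2002-11-01,Second". The first row keeps its empty third column, so the description stays in the fourth.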
You can also abstract this into its own function:
declare function local:non-empty($possibly-empty) {
  ($possibly-empty, '')[1]
};
string-join(
  (
    'book_id,book_price,book_pub_date,book_desc',
    //book/book_details/string-join(
      (
        local:non-empty(ancestor::book/@id),
        local:non-empty(./price),
        local:non-empty(./publish_date),
        local:non-empty(./description)
      ),
      ','
    )
  ),
  '&#10;'
)
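A possible alternative, offered as a sketch rather than as part of the approach above: fn:string() returns the zero-length string when given the empty sequence, so wrapping each field in string() should also keep the columns aligned. This assumes each field occurs at most once per book_details; if a field matched more than one node, string() would raise a type error, whereas the [1] idiom silently keeps the first item.

```xquery
string-join(
  (
    'book_id,book_price,book_pub_date,book_desc',
    //book/book_details/string-join(
      (
        (: string(()) is '', so a missing element still yields a column :)
        string(ancestor::book/@id),
        string(price),
        string(publish_date),
        string(description)
      ),
      ','
    )
  ),
  '&#10;'
)
```

Which variant to prefer depends on whether a duplicated field should be reported as an error or quietly truncated to its first occurrence.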