Question

I have two XML records whose values are all identical except for the ID.

<Record ID="2006-06-01">
  <author>sam</author>
  <Year>2006</Year>
  <Month>6</Month>
</Record>


<Record ID="2006-06-02">
  <author>sam</author>
  <Year>2006</Year>
  <Month>6</Month>
</Record>

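I want to suppress the duplicate, i.e. display only one record when I search for 'sam' in the author element, using XQuery and MarkLogic. Is this possible? If so, could someone explain how in detail?

Thanks.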

Answer 1

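I think you can do this very simply with the expression below. It finds all records whose author contains the string "sam" and then returns only the first one.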

(//Record[contains(author, "sam")])[1]
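To make the behaviour of that expression concrete, here is a minimal sketch run against inline sample data. The <Records> wrapper and the third record are assumptions added for illustration; they are not part of the original question.

```xquery
(: Sample data in a hypothetical <Records> wrapper element. :)
let $doc :=
  <Records>
    <Record ID="2006-06-01"><author>sam</author><Year>2006</Year><Month>6</Month></Record>
    <Record ID="2006-06-02"><author>sam</author><Year>2006</Year><Month>6</Month></Record>
    <Record ID="2007-01-15"><author>samantha</author><Year>2007</Year><Month>1</Month></Record>
  </Records>
(: The outer parentheses matter: [1] is applied to the whole result
   sequence, so exactly one Record (the first in document order) is returned. :)
return ($doc//Record[contains(author, "sam")])[1]
```

Note that contains() is a substring test, so it would also match "samantha"; author = "sam" would be the exact comparison. Also, this returns the first match whether or not the remaining matches are duplicates of it.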

Answer 2

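If the attribute were not there, you could simply use deep-equal($node1, $node2) on the records. Since it is, apply it to each child node of the records instead: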

let $record1 :=
  <Record ID="2006-06-01">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
let $record2 :=
  <Record ID="2006-06-02">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>

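(: Return $record1 only if at least one of its children differs from the
   same-named child of $record2; a duplicate yields the empty sequence. :)
return $record1[not(
  every $node in $record1/*
  satisfies deep-equal($node, $record2/*[local-name() = $node/local-name()])
)]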

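If quantified expressions are not supported, you would have to rewrite them as FLWOR expressions, but MarkLogic should support them in all newer versions. Also, this only tests the records' child nodes; if you also want to test the attributes (other than @ID), you have to add tests for them.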
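One way the attribute test might be added is to compare copies of the records with the ID attribute stripped, since deep-equal on elements also compares attributes, regardless of their order. The following is only a sketch: the helper names local:without-id and local:same-content, the third sample record, and the pairwise loop are assumptions for illustration, not part of the answer above.

```xquery
(: Hypothetical helper: copy a record without its ID attribute.
   The name test is case-insensitive, so ID and Id are both dropped. :)
declare function local:without-id($r as element()) as element() {
  element { node-name($r) } {
    $r/@*[lower-case(local-name(.)) ne "id"],
    $r/node()
  }
};

(: Hypothetical helper: true when two records match in everything but the ID. :)
declare function local:same-content($a as element(), $b as element()) as xs:boolean {
  deep-equal(local:without-id($a), local:without-id($b))
};

let $records := (
  <Record ID="2006-06-01"><author>sam</author><Year>2006</Year><Month>6</Month></Record>,
  <Record ID="2006-06-02"><author>sam</author><Year>2006</Year><Month>6</Month></Record>,
  <Record ID="2006-07-01"><author>sam</author><Year>2006</Year><Month>7</Month></Record>
)
(: Keep a record only if no earlier record has the same content. :)
for $r at $i in $records
where not(
  some $earlier in subsequence($records, 1, $i - 1)
  satisfies local:same-content($earlier, $r)
)
return $r
```

With this sample data the first and third records are returned and the second is suppressed. Each record is compared against all earlier ones, so the cost grows quadratically with the number of records.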

Answer 3

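It is possible to eliminate these duplicates, but deduplication does not scale well.

As Jens Erat outlined, you can use fn:deep-equal or some other equality test (but not fn:distinct-nodes). Alternatively, I might use a map:map to keep track of the distinct keys, building those keys in a deterministic way. That might look something like this: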

let $m := map:map()
for $n in $results
let $key := $n/author||'/'||$n/Year||'/'||$n/Month
where not(map:contains($m, $key))
return (
  map:put($m, $key, true()),
  $n)

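But as you can see, these approaches need to look at every node, which is bad for performance. If you care about performance, you should restructure the database so that the URIs are inherently unique. For example, if your URIs look like /records/{ $author }/{ $year }/{ $month }, this kind of duplication cannot occur.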
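As a rough sketch of that idea, the URI could be derived from the fields that define uniqueness at load time. The helper name local:record-uri and the use of encode-for-uri are assumptions for illustration; xdmp:document-insert is MarkLogic's standard document insert function.

```xquery
xquery version "1.0-ml";

(: Hypothetical helper: build the document URI from the key fields,
   following the /records/{author}/{year}/{month} pattern. :)
declare function local:record-uri($r as element(Record)) as xs:string {
  concat(
    "/records/",
    encode-for-uri(string($r/author)), "/",
    string($r/Year), "/",
    string($r/Month)
  )
};

let $record :=
  <Record ID="2006-06-02">
    <author>sam</author>
    <Year>2006</Year>
    <Month>6</Month>
  </Record>
(: Inserting at an existing URI replaces the document already stored there,
   so two records with the same author, year and month cannot coexist. :)
return xdmp:document-insert(local:record-uri($record), $record)
```

Under this scheme both sample records map to /records/sam/2006/6, so loading the second simply overwrites the first. Whether the later or the earlier record should win is a design decision the loader would have to make.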

How to suppress identical XML records that have different IDs?
