Normalizing whitespace in XML using XQuery

Date: 2016-06-07 04:56:00

Tags: xquery marklogic marklogic-8

I have XML like this:

<a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
    <c:id>
        http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
    <c:type>series-item</c:type>
    <f:assessment-low>8.946586935</f:assessment-low>
    <f:assessment-high>9.946586935</f:assessment-high>
    <f:mid>9.44658693500000000000</f:mid>
    <f:period-label>
        <c:l10n xml:lang="en"/>
    </f:period-label>
</a:price-range>

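I want to normalize the whitespace in the XML. In the example above, the c:id element contains leading whitespace (a line break and indentation before the URL). After normalizing whitespace, the XML above should look like this: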

<a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
    <c:id>http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
    <c:type>series-item</c:type>
    <f:assessment-low>8.946586935</f:assessment-low>
    <f:assessment-high>9.946586935</f:assessment-high>
    <f:mid>9.44658693500000000000</f:mid>
    <f:period-label>
        <c:l10n xml:lang="en"/>
    </f:period-label>
</a:price-range>

I looked at fn:normalize-space, but it only works on strings.
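
To make the limitation concrete, here is a minimal sketch (the `<id>` element and its content are made up for illustration). Passing an element to fn:normalize-space atomizes it, so the result is a plain string and the element has to be rebuilt by hand:

```xquery
xquery version "1.0";

(: Hypothetical sample element with leading, trailing and doubled whitespace. :)
let $id := <id>
    abc  def </id>
return (
  (: The element is atomized: the result is an xs:string, not an element. :)
  normalize-space($id),
  (: Rebuilding the element around the normalized string by hand. :)
  element { node-name($id) } { normalize-space($id) }
)
```

Doing this for every element in a document is what calls for a recursive function or a stylesheet.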

4 answers:

Answer 0 (score: 2)

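I don't think this can be done by applying serialization options; you have to walk the tree using the transformation pattern. Here is an example, slightly adapted from that page, that normalizes whitespace and supports namespaces: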

declare function local:copy($node as node()) as node() {
  typeswitch($node)
    case $text as text()
      return text { normalize-space($text) }
    case $element as element()
      return
        element { QName(namespace-uri($element), name($element)) } {
                  $element/@*,
                  for $child in $element/(* | text()) return local:copy($child)
                }
    default return $node
 };


local:copy(
  <a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
    <c:id>
        http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
    <c:type>series-item</c:type>
    <f:assessment-low>8.946586935</f:assessment-low>
    <f:assessment-high>9.946586935</f:assessment-high>
    <f:mid>9.44658693500000000000</f:mid>
    <f:period-label>
        <c:l10n xml:lang="en"/>
    </f:period-label>
  </a:price-range>
)

MarkLogic also lets you apply an XSLT stylesheet, which is probably the more elegant route, using <xsl:strip-space elements="*"/> as proposed by @Raj.
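
Note that the function above normalizes every text node, so whitespace-only text nodes (the indentation between elements) disappear as well. If the indentation should survive, as in the desired output in the question, one possible variation is sketched below. The name local:trim-text and the sample input are hypothetical, and `declare boundary-space preserve` is only there so that the inline sample keeps its indentation; documents loaded from a database would not need it.

```xquery
xquery version "1.0";

(: Keep whitespace-only text in the inline sample below (stripped by default). :)
declare boundary-space preserve;

(: Hypothetical helper: trims text nodes that have real content and
   leaves whitespace-only text nodes (indentation) untouched. :)
declare function local:trim-text($node as node()) as node()* {
  typeswitch ($node)
    case element() return
      element { node-name($node) } {
        $node/@*,
        for $child in $node/node() return local:trim-text($child)
      }
    case text() return
      if (normalize-space($node) eq "")
      then $node                            (: indentation only: keep as-is :)
      else text { normalize-space($node) }  (: real content: trim and collapse :)
    default return $node                    (: comments, PIs: copy unchanged :)
};

local:trim-text(
  <root xmlns:c="http://iddn.icis.com/ns/core">
    <c:id>
        4021090-pricehistory</c:id>
  </root>
)
```

Using node-name() instead of QName(namespace-uri(), name()) is a small simplification; both carry the namespace of the original element.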

Answer 1 (score: 0)

I guess <xsl:strip-space elements="*"/> works perfectly; you first need to transform the XML to XML via XSLT.
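
A rough sketch of what that could look like in MarkLogic, assuming the stylesheet is passed inline to xdmp:xslt-eval (the sample input is made up). One caveat: xsl:strip-space only removes whitespace-only text nodes, so on its own it would not trim the leading whitespace inside c:id. The sketch therefore adds a text() template that calls normalize-space:

```xquery
xquery version "1.0-ml";

(: Identity transform plus whitespace normalization of every text node. :)
let $stylesheet :=
  <xsl:stylesheet version="2.0"
                  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:strip-space elements="*"/>
    <xsl:template match="@* | node()">
      <xsl:copy>
        <xsl:apply-templates select="@* | node()"/>
      </xsl:copy>
    </xsl:template>
    <xsl:template match="text()">
      <xsl:value-of select="normalize-space(.)"/>
    </xsl:template>
  </xsl:stylesheet>

(: Hypothetical sample input with leading whitespace inside c:id. :)
let $input :=
  <doc xmlns:c="http://iddn.icis.com/ns/core"><c:id>
      4021090-pricehistory</c:id></doc>

return xdmp:xslt-eval($stylesheet, $input)
```

xdmp:xslt-eval returns a document node, so the result may need a `/*` step if an element is expected.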

Answer 2 (score: 0)

This function works well for me:

(:
  The rules/assumptions are:
  #1 Retain one leading space if the node isn't first, has non-space content, and has leading space.
  #2 Retain one trailing space if the node isn't last, isn't first, and has trailing space. 
  #3 Retain one trailing space if the node isn't last, is first, has trailing space, and has non-space content.
  #4 Retain a single space if the node is an only child and only has space content.
  :)
  declare function local:normalize-space-in-xml($input)
  {
     (: position() and last() evaluated on $child alone are always 1,
        so the child's index and the sibling count are tracked explicitly :)
     let $children := $input/node()
     let $count := count($children)
     return
       element {node-name($input)}
         {$input/@*,
           for $child at $pos in $children
           let $norm := normalize-space($child)
           return
             if ($child instance of element())
             then local:normalize-space-in-xml($child)
             else
               if ($child instance of text())
               then
                 (:#1 Retain one leading space if node isn't first, has non-space content, and has leading space:)
                 if ($pos ne 1 and matches($child,'^\s') and $norm ne '')
                 (: concat() avoids the extra separator space that a sequence of strings would get :)
                 then text { concat(' ', $norm) }
                 else
                   (:#4 retain one space, if the node is an only child, and has content but it's all space:)
                   if ($count eq 1 and string-length($child) ne 0 and $norm eq '')
                   (: this overrules standard normalization:)
                   then text { ' ' }
                   else
                     (:#2 if the node isn't last, isn't first, and has trailing space, retain trailing space and collapse and trim the rest:)
                     if ($pos ne 1 and $pos ne $count and matches($child,'\s$'))
                     then text { concat($norm, ' ') }
                     else
                       (:#3 if the node isn't last, is first, has trailing space, and has non-space content, then keep trailing space:)
                       if ($pos eq 1 and $pos ne $count and matches($child,'\s$') and $norm ne '')
                       then text { concat($norm, ' ') }
                       (:if the node is an only child, and has content which is not all space, then trim and collapse, that is, apply standard normalization:)
                       else text { $norm }
               else $child
         }
  };

Answer 3 (score: -1)

Someone is going to slap me for this, and I'm risking downvotes, but WTH...

MarkLogic, XQuery, done.

let  $xml := <a:price-range xmlns:c="http://iddn.icis.com/ns/core" xmlns:f="http://iddn.icis.com/ns/fields" xmlns:a="http://iddn.icis.com/ns/assets" xmlns:r="http://iddn.icis.com/ns/refdata">
<c:id>
    http://iddn.icis.com/series-item/petchem/4021090-pricehistory-19990730000000</c:id>
<c:type>series-item</c:type>
<f:assessment-low>8.946586935</f:assessment-low>
<f:assessment-high>9.946586935</f:assessment-high>
<f:mid>9.44658693500000000000</f:mid>
<f:period-label>
    <c:l10n xml:lang="en"/>
</f:period-label>
</a:price-range>

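(: \s+ strips only whitespace after a tag; \W+ would also eat leading punctuation in element text, such as a minus sign :)
return xdmp:unquote(fn:replace(xdmp:quote($xml), "(<[^<]+>)\s+", "$1"))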