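How do I generate a correct playOrder in an NCX file (for EPUB) with XQuery?

Asked: 2016-06-24 14:37:07

Tags: xpath xquery

I am struggling with a very specific thing: generating an NCX file for EPUBs. The problem is the playOrder attribute of each navPoint element, because the number simply increases regardless of nesting. The file, on the other hand, is naturally generated by iterating over nested elements, which rules out a simple at $count positional counter. I tried generating it by iterating directly over the array of chapters, and I also tried generating it from a prepared TOC file (probably easier, since I then iterate over nodes rather than an array). The problem is the same.

Sample portion of the NCX file: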

<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
    <head>
        <meta name="dtb:uid" content=""/>
        <meta name="dtb:depth" content="1"/>
        <meta name="dtb:totalPageCount" content="0"/>
        <meta name="dtb:maxPageNumber" content="0"/>
    </head>
    <docTitle>
        <text/>
    </docTitle>
    <navMap>
        <navPoint id="title-page" playOrder="1">
            <navLabel>
                <text>Title Page</text>
            </navLabel>
            <content src="title-page.xhtml"/>
        </navPoint>
        <navPoint id="chapter-1.xhtml#anch1lev1" playOrder="@@@">
            <navLabel>
                <text>ÚVOD</text>
            </navLabel>
            <content src="chapter-1.xhtml#anch1lev1"/>
            <navPoint id="chapter-1.xhtml#anch2lev1" playOrder="@@@">
                <navLabel>
                    <text>Přehled bádání nad nálezy terry sigillaty v Čechách</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev1"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch2lev2" playOrder="@@@">
                <navLabel>
                    <text>Poválečné bádání nad nálezy terry sigillaty v evropském barbariku</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev2"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch2lev3" playOrder="@@@">
                <navLabel>
                    <text>Terminologie a tvarová klasifikace</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev3"/>
            </navPoint>
        </navPoint>
        <navPoint id="chapter-2.xhtml#anch1lev1" playOrder="@@@">
            <navLabel>
                <text>KATALOG</text>
            </navLabel>
            <content src="chapter-2.xhtml#anch1lev1"/>
            <navPoint id="chapter-2.xhtml#anch2lev1" playOrder="@@@">
                <navLabel>
                    <text>Struktura a metodické pojetí katalogu</text>
                </navLabel>
                <content src="chapter-2.xhtml#anch2lev1"/>
            </navPoint>
            <navPoint id="chapter-2.xhtml#anch2lev2" playOrder="@@@">
                <navLabel>

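Here I have left the playOrder attributes as placeholders only. Is there a simple way to replace each @@@ with a simply incrementing counter, one per navPoint? I have tried typeswitch (I could not get it to work) and exhaustively counting the preceding heading levels. The latter works, but it is clumsy, gets slower as the number of headings grows, and is fragile across documents because the XPath axes differ slightly. I need a simple, bulletproof way. I suspect that counting many preceding levels is not the right approach.

2 answers:

Answer 0 (score: 1)

If I were solving this myself, I would probably try to generate the correct values when first creating the NCX file. But if the challenge is how to fix the playOrder attributes in a file whose values are empty, dummy, or otherwise incorrect, I can think of two techniques: use an XQuery typeswitch expression to walk all nodes in the document and swap in the desired values, or use XQuery Update to update the values in place. Both examples below take the same approach: they use the ancestor and preceding XPath axes to compute the value of the playOrder attribute. Note: the only change I made to your sample XML was closing the final elements to make it well-formed.

Update: in my first version I mistakenly left out the ancestor axis count, which produced incorrect values. I had forgotten that the preceding axis does not include the ancestor axis. This is clear from my favorite XPath axis diagram, https://our.umbraco.org/media/upload/0562fd58-c6db-4fa8-a432-68b28f11c3f2/rs/7x1B0.gif.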
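
The point about the two axes can be checked on a tiny tree. The following is a minimal sketch; the element names a, b, c, and d are made up for illustration and have nothing to do with NCX.

```xquery
xquery version "3.0";

(: Hypothetical four-element tree, used only to compare the two axes. :)
let $doc :=
    <a>
        <b/>
        <c>
            <d/>
        </c>
    </a>
let $d := $doc//d
return
    (: preceding::* sees only b, because a and c are ancestors of d;
       ancestor::* sees c and a. :)
    <counts preceding="{count($d/preceding::*)}" ancestors="{count($d/ancestor::*)}"/>
```

Here d is the fourth element in document order, yet preceding::* finds only one element. Adding the two ancestors and 1 for the node itself gives 4, which is why the playOrder expression needs both counts.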

xquery version "3.0";

declare namespace ncx="http://www.daisy.org/z3986/2005/ncx/";

declare function local:fix-playorder($nodes as item()*) {
    for $node in $nodes
    return
        typeswitch ($node)
            case element(ncx:navPoint) return
                <navPoint xmlns="http://www.daisy.org/z3986/2005/ncx/">{
                    $node/@*[not(name(.) = 'playOrder')],
                    attribute playOrder { count($node/ancestor::ncx:navPoint) + count($node/preceding::ncx:navPoint) + 1 },
                    local:fix-playorder($node/node())
                }</navPoint>
            case element() return
                element {node-name($node)} {$node/@*, local:fix-playorder($node/node())}
            default return
                $node
};

let $ncx := 
    <ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
        <head>
            <meta name="dtb:uid" content=""/>
            <meta name="dtb:depth" content="1"/>
            <meta name="dtb:totalPageCount" content="0"/>
            <meta name="dtb:maxPageNumber" content="0"/>
        </head>
        <docTitle>
            <text/>
        </docTitle>
        <navMap>
            <navPoint id="title-page" playOrder="1">
                <navLabel>
                    <text>Title Page</text>
                </navLabel>
                <content src="title-page.xhtml"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch1lev1" playOrder="@@@">
                <navLabel>
                    <text>ÚVOD</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch1lev1"/>
                <navPoint id="chapter-1.xhtml#anch2lev1" playOrder="@@@">
                    <navLabel>
                        <text>Přehled bádání nad nálezy terry sigillaty v Čechách</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev1"/>
                </navPoint>
                <navPoint id="chapter-1.xhtml#anch2lev2" playOrder="@@@">
                    <navLabel>
                        <text>Poválečné bádání nad nálezy terry sigillaty v evropském barbariku</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev2"/>
                </navPoint>
                <navPoint id="chapter-1.xhtml#anch2lev3" playOrder="@@@">
                    <navLabel>
                        <text>Terminologie a tvarová klasifikace</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev3"/>
                </navPoint>
            </navPoint>
            <navPoint id="chapter-2.xhtml#anch1lev1" playOrder="@@@">
                <navLabel>
                    <text>KATALOG</text>
                </navLabel>
                <content src="chapter-2.xhtml#anch1lev1"/>
                <navPoint id="chapter-2.xhtml#anch2lev1" playOrder="@@@">
                    <navLabel>
                        <text>Struktura a metodické pojetí katalogu</text>
                    </navLabel>
                    <content src="chapter-2.xhtml#anch2lev1"/>
                </navPoint>
                <navPoint id="chapter-2.xhtml#anch2lev2" playOrder="@@@">
                    <navLabel/>
                </navPoint>
            </navPoint>
        </navMap>
    </ncx>
return
    local:fix-playorder($ncx)

Result:

<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
    <head>
        <meta name="dtb:uid" content=""/>
        <meta name="dtb:depth" content="1"/>
        <meta name="dtb:totalPageCount" content="0"/>
        <meta name="dtb:maxPageNumber" content="0"/>
    </head>
    <docTitle>
        <text/>
    </docTitle>
    <navMap>
        <navPoint id="title-page" playOrder="1">
            <navLabel>
                <text>Title Page</text>
            </navLabel>
            <content src="title-page.xhtml"/>
        </navPoint>
        <navPoint id="chapter-1.xhtml#anch1lev1" playOrder="2">
            <navLabel>
                <text>ÚVOD</text>
            </navLabel>
            <content src="chapter-1.xhtml#anch1lev1"/>
            <navPoint id="chapter-1.xhtml#anch2lev1" playOrder="3">
                <navLabel>
                    <text>Přehled bádání nad nálezy terry sigillaty v Čechách</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev1"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch2lev2" playOrder="4">
                <navLabel>
                    <text>Poválečné bádání nad nálezy terry sigillaty v evropském barbariku</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev2"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch2lev3" playOrder="5">
                <navLabel>
                    <text>Terminologie a tvarová klasifikace</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch2lev3"/>
            </navPoint>
        </navPoint>
        <navPoint id="chapter-2.xhtml#anch1lev1" playOrder="6">
            <navLabel>
                <text>KATALOG</text>
            </navLabel>
            <content src="chapter-2.xhtml#anch1lev1"/>
            <navPoint id="chapter-2.xhtml#anch2lev1" playOrder="7">
                <navLabel>
                    <text>Struktura a metodické pojetí katalogu</text>
                </navLabel>
                <content src="chapter-2.xhtml#anch2lev1"/>
            </navPoint>
            <navPoint id="chapter-2.xhtml#anch2lev2" playOrder="8">
                <navLabel/>
            </navPoint>
        </navPoint>
    </navMap>
</ncx>

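The XQuery Update approach uses the same ancestor and preceding axis technique. My example uses eXist's XQuery Update implementation, which requires the file to be stored in the database. The resulting file is identical to the result above.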

xquery version "3.0";

declare namespace ncx="http://www.daisy.org/z3986/2005/ncx/";

let $ncx := 
    <ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
        <head>
            <meta name="dtb:uid" content=""/>
            <meta name="dtb:depth" content="1"/>
            <meta name="dtb:totalPageCount" content="0"/>
            <meta name="dtb:maxPageNumber" content="0"/>
        </head>
        <docTitle>
            <text/>
        </docTitle>
        <navMap>
            <navPoint id="title-page" playOrder="1">
                <navLabel>
                    <text>Title Page</text>
                </navLabel>
                <content src="title-page.xhtml"/>
            </navPoint>
            <navPoint id="chapter-1.xhtml#anch1lev1" playOrder="@@@">
                <navLabel>
                    <text>ÚVOD</text>
                </navLabel>
                <content src="chapter-1.xhtml#anch1lev1"/>
                <navPoint id="chapter-1.xhtml#anch2lev1" playOrder="@@@">
                    <navLabel>
                        <text>Přehled bádání nad nálezy terry sigillaty v Čechách</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev1"/>
                </navPoint>
                <navPoint id="chapter-1.xhtml#anch2lev2" playOrder="@@@">
                    <navLabel>
                        <text>Poválečné bádání nad nálezy terry sigillaty v evropském barbariku</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev2"/>
                </navPoint>
                <navPoint id="chapter-1.xhtml#anch2lev3" playOrder="@@@">
                    <navLabel>
                        <text>Terminologie a tvarová klasifikace</text>
                    </navLabel>
                    <content src="chapter-1.xhtml#anch2lev3"/>
                </navPoint>
            </navPoint>
            <navPoint id="chapter-2.xhtml#anch1lev1" playOrder="@@@">
                <navLabel>
                    <text>KATALOG</text>
                </navLabel>
                <content src="chapter-2.xhtml#anch1lev1"/>
                <navPoint id="chapter-2.xhtml#anch2lev1" playOrder="@@@">
                    <navLabel>
                        <text>Struktura a metodické pojetí katalogu</text>
                    </navLabel>
                    <content src="chapter-2.xhtml#anch2lev1"/>
                </navPoint>
                <navPoint id="chapter-2.xhtml#anch2lev2" playOrder="@@@">
                    <navLabel/>
                </navPoint>
            </navPoint>
        </navMap>
    </ncx>
let $store := xmldb:store('/db', 'test.ncx', $ncx)
let $doc := doc('/db/test.ncx')
for $navPoint in $doc//ncx:navPoint
return
    update value $navPoint/@playOrder with (count($navPoint/ancestor::ncx:navPoint) + count($navPoint/preceding::ncx:navPoint) + 1)
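
A path expression such as //ncx:navPoint returns its nodes in document order, so the position of a navPoint in that flat sequence is the same number that the ancestor and preceding counts add up to. The sketch below is an assumption-laden illustration rather than part of the original answer: the cut-down navMap and the entry output element are hypothetical, and it only lists the numbers instead of rewriting the file.

```xquery
xquery version "3.0";

declare namespace ncx = "http://www.daisy.org/z3986/2005/ncx/";

(: Hypothetical, cut-down navMap used only for illustration. :)
let $navMap :=
    <navMap xmlns="http://www.daisy.org/z3986/2005/ncx/">
        <navPoint id="a">
            <navPoint id="a1"/>
            <navPoint id="a2"/>
        </navPoint>
        <navPoint id="b">
            <navPoint id="b1"/>
        </navPoint>
    </navMap>
(: The descendant step yields navPoints in document order, so the
   positional variable $pos already is the playOrder. :)
for $navPoint at $pos in $navMap//ncx:navPoint
return
    <entry id="{$navPoint/@id}" playOrder="{$pos}"/>
```

This numbers a, a1, a2, b, b1 as 1 through 5, regardless of nesting depth. The positional variable works here because the loop runs over the flattened descendant sequence rather than over one nesting level at a time.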

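Answer 1 (score: 1)

In the end, the solution in my case was quite simple. I generate the NCX file while collecting all the entries of the EPUB (the file does not exist before that operation). For me, it works to generate it from the TOC file, which is produced in an earlier step of the process.

To generate the nav for the TOC I use: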

<nav epub:type='toc'>
  <ol>{
      for $sect in $chaps
      let $name := replace($sect/@name, "OEBPS/", "")
      return
        for $one in $sect//xhtml:h1
        return
          <li>
            <a href='{$name || '#' || $one/@id}'>{$one/text()[not(ancestor-or-self::xhtml:sup)]}</a>{
              if ($one/parent::*//xhtml:h2 and xs:integer($head-level) ge 2) then
                <ol>{
                    for $two in $one/parent::*//xhtml:h2
                    return
                      <li>
                        <a href='{$name || '#' || $two/@id}'>{$two//text()[not(ancestor-or-self::xhtml:sup)]}</a>{
                          if ($two/parent::*//xhtml:h3 and xs:integer($head-level) ge 3) then
                            <ol>{
                                for $three in $two/parent::*//xhtml:h3
                                return
                                  <li>
                                  … and so on.

For the NCX navMap, I use:

<navMap>
  <navPoint id='title-page' playOrder='1'>
    <navLabel>
      <text>Title Page</text>
    </navLabel>
    <content src='title-page.xhtml'/>
  </navPoint>{
    (: For every item in the $toc, I count the preceding items and add the number of
    parents, which are items too (I guess so; it simply works!) :)
    for $h1 in $toc//xhtml:nav/xhtml:ol/xhtml:li
    return
      <navPoint id='{$h1/xhtml:a/@href}' playOrder='{count($h1/preceding::xhtml:li) + 2}'>
        <navLabel>
          <text>{$h1/xhtml:a/text()[not(ancestor-or-self::xhtml:sup)]}</text>
        </navLabel>
        <content src='{$h1/xhtml:a/@href}'/>{
          for $h2 in $h1/xhtml:ol/xhtml:li
          return
            <navPoint id='{$h2/xhtml:a/@href}' playOrder='{count($h2/preceding::xhtml:li) + 3}'>
              <navLabel>
                <text>{$h2/xhtml:a/text()[not(ancestor-or-self::xhtml:sup)]}</text>
              </navLabel>
              <content src='{$h2/xhtml:a/@href}'/>
              … and so on.

The specific solution for playOrder is:

playOrder='{count($h1/preceding::xhtml:li) + 2}'

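...which means I count all the preceding headings contained in the TOC and add a constant specific to the level I am counting. It could probably be replaced by a proper count of the parents, but this always gives me correct results. I could not get counting the parents/ancestors to work.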
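
The per-level constants (+2, +3, ...) appear to equal the number of ancestor li elements plus 2: one for the hard-coded title page and one because playOrder starts at 1. A minimal sketch of that reading, assuming a TOC shaped like the nav above; the sample entries and the entry output element are hypothetical:

```xquery
xquery version "3.0";

declare namespace xhtml = "http://www.w3.org/1999/xhtml";

(: Hypothetical, cut-down TOC used only for illustration. :)
let $toc :=
    <nav xmlns="http://www.w3.org/1999/xhtml">
        <ol>
            <li><a href="chapter-1.xhtml#anch1lev1">One</a>
                <ol>
                    <li><a href="chapter-1.xhtml#anch2lev1">One.1</a></li>
                    <li><a href="chapter-1.xhtml#anch2lev2">One.2</a></li>
                </ol>
            </li>
            <li><a href="chapter-2.xhtml#anch1lev1">Two</a></li>
        </ol>
    </nav>
for $li in $toc//xhtml:li
(: preceding::li skips ancestors, so they are counted separately;
   + 2 = title page (playOrder 1) + 1-based numbering. :)
let $order := count($li/preceding::xhtml:li) + count($li/ancestor::xhtml:li) + 2
return
    <entry href="{$li/xhtml:a/@href}" playOrder="{$order}"/>
```

The four entries come out as 2, 3, 4, and 5. With a single expression that is valid at every depth, the nested for loops could in principle be folded into one recursive function, although that refactoring is not shown here.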