XSLT 2.0: trimming text length in the final HTML output

Date: 2018-11-17 16:42:59

Tags: xslt xslt-2.0

XSLT fiddle here: https://xsltfiddle.liberty-development.net/bFDb2Dh/2

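In XSLT 2.0, I receive a small set of nodes from the eXist-db Lucene search function, which returns the original XML but wraps the search term in <exist:match/>. So I search on tei:seg and get back the following (I wrap the output in an extra element for later processing):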

<doc>
  <url>http://localhost:8081/exist/apps/deheresi/doc/MS609-0454</url>
  <seg xmlns="http://www.tei-c.org/ns/1.0" type="dep_event" subtype="event" xml:id="MS609-0454-2" corresp="#MS609-0453-7">Item. Dixit<lb break="y" n="11"/>quod 
    <persName nymRef="#abbot_of_Saint_Papoul" role="npar">abbas de 
        <placeName nymRef="#Saint-Papoul_Aude">Sancto Papulo</placeName>
    </persName> ceperat 
    <persName nymRef="#heretics_not_named" role="par">duos hereticos</persName> et 
    <persName nymRef="#Arnald_Savauza_SML-AU" ana="#pFreeHer" role="par">Arnaldus de Savauza</persName> volebat manulevare dictos hereticos. Et rogavit 
    ipsum<lb break="y" n="12"/>testim et 
    <persName nymRef="#Arnald_Forner_SML-AU" ana="#pFreeHer" role="par">Arnaldum Fornier</persName> et 
    <persName nymRef="#Raimund_Forner_SML-AU" ana="#pFreeHer" role="par">Raimundum Fornier</persName>, fratres,  
    quod irent cum eo 
    ad abbatem de <placeName type="event_loc" nymRef="#Saint-Papoul_Abbey">Sancto<lb break="y" n="14"/>Papulo</placeName> 
    et manulevarent hereticos. Et dictus 
    <persName nymRef="#Arnald_Savauza_SML-AU" ana="#pFreeHer" role="ref">Arnaldus de Savauza</persName> 
    dixit quod dictus abbas promiserat ei quod redderet sibi dictos<lb break="y" n="15"/>
    hereticos pro mille <exist:match xmlns:exist="http://exist.sourceforge.net/NS/exist">solidis</exist:match> tholosanis. Et 
    <persName nymRef="#Bernard_Alzeu_SML-AU" ana="#pFreeHer" role="ref">Bernardus Alzeus</persName> et 
    <persName nymRef="#Ysarn_de_Gibel_SML-AU" ana="#pFreeHer" role="ref">Ysarnus de Gibel</persName> portabant illos denarios. 
    Sed non potuerunt dictos hereticos ma<lb break="n" n="16"/>nulevare. 
    <date type="event_date" when="1237">Et sunt anni VIIIor vel circa.</date>
  </seg>
</doc>

In XSLT, I output this to HTML with a few transformations. However, the output looks like this:

<td>Item. Dixit quod 
   abbas de 
   Sancto Papulo
   ceperat 
   duos hereticos et 
   Arnaldus de Savauza volebat manulevare dictos hereticos. Et rogavit 
   ipsum testim et 
   Arnaldum Fornier et 
   Raimundum Fornier, fratres,  
   quod irent cum eo 
   ad abbatem de Sancto Papulo 
   et manulevarent hereticos. Et dictus 
   Arnaldus de Savauza 
   dixit quod dictus abbas promiserat ei quod redderet sibi dictos 
   hereticos pro mille <span class="search-hit">
   <a href="http://localhost:8081/exist/apps/deheresi/doc/MS609-0454"> 
   solidis</a></span> tholosanis. Et 
   Bernardus Alzeus et 
   Ysarnus de Gibel portabant illos denarios. 
   Sed non potuerunt dictos hereticos manulevare. 
   Et sunt anni VIIIor vel circa.
</td>

But I would like the final output to be shortened, with ellipses:

<td>...dictus abbas 
  promiserat ei quod redderet sibi dictos 
  hereticos pro mille <span class="search-hit"><a 
  href="http://localhost:8081/exist/apps/deheresi/doc/MS609-0454"> 
  solidis</a></span> tholosanis. Et 
  Bernardus Alzeus et 
  Ysarnus de Gibel portabant illos...
</td>

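<span class="search-hit"/> would keep its content, and the text output on either side of it would be limited to x characters. (If possible, normalize-space() should also be applied to clean up the whitespace problems inherited from the original document.)

So far the only approaches I can think of involve post-processing; I have not found a way to do this inside the current XSL transformation.

Many thanks.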

1 answer:

Answer 0 (score: 1)

You can store the result your existing code produces for the content of tei:seg (that is, the content of the td) in a variable:

<xsl:template match="tei:seg">
    <xsl:variable name="search-hit">
        <xsl:apply-templates/>
    </xsl:variable>
    <td>
        <xsl:apply-templates select="$search-hit" mode="trim"/>
    </td>
</xsl:template>

Then you can push that content through another mode that has templates for the text nodes, which do the trimming:

<xsl:param name="trim-to" as="xs:integer" select="60"/>

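<!-- Text before the hit: keep only the last $trim-to characters -->
<xsl:template match="text()[1]" mode="trim">
    <xsl:variable name="normalized" as="xs:string" select="normalize-space(.)"/>
    <!-- "+ 1" so that exactly $trim-to characters are kept rather than $trim-to + 1 -->
    <xsl:value-of select="concat('...', substring($normalized, string-length($normalized) - $trim-to + 1))"/>
</xsl:template>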

<xsl:template match="text()[last()]" mode="trim">
    <xsl:variable name="normalized" as="xs:string" select="normalize-space(.)"/>
    <xsl:value-of select="concat(substring($normalized, 1, $trim-to), '...')"/>
</xsl:template>

<xsl:template match="span[@class = 'search-hit']" mode="trim">
    <xsl:copy-of select="."/>
</xsl:template>

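The code that does the trimming/normalization in the text-node templates can be fine-tuned with replace and/or tokenize and/or xsl:analyze-string, but that only makes sense once the algorithm required for the trimming is clearly defined.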

Adapted fiddle at https://xsltfiddle.liberty-development.net/bFDb2Dh/3
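
As one possible refinement of the tokenize idea mentioned above, here is a minimal sketch that trims by whole words instead of characters, so no word is cut in half. It is not part of the fiddle. The mf prefix and its namespace URI, the functions mf:last-words and mf:first-words, the parameter trim-words and the mode name trim-words are all hypothetical names made up for this illustration. The sketch assumes, like the templates above, that the span is in no namespace and that each tei:seg contains exactly one search hit. It always adds the ellipsis, even when nothing was cut off.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:mf="urn:example:trim-functions"
    exclude-result-prefixes="xs mf">

    <!-- Hypothetical parameter: number of words kept on each side of the hit -->
    <xsl:param name="trim-words" as="xs:integer" select="8"/>

    <!-- Hypothetical helper: the last $count words of $text, separated by single spaces -->
    <xsl:function name="mf:last-words" as="xs:string">
        <xsl:param name="text" as="xs:string"/>
        <xsl:param name="count" as="xs:integer"/>
        <xsl:variable name="words" as="xs:string*"
            select="tokenize(normalize-space($text), ' ')"/>
        <xsl:sequence select="string-join($words[position() gt last() - $count], ' ')"/>
    </xsl:function>

    <!-- Hypothetical helper: the first $count words of $text, separated by single spaces -->
    <xsl:function name="mf:first-words" as="xs:string">
        <xsl:param name="text" as="xs:string"/>
        <xsl:param name="count" as="xs:integer"/>
        <xsl:variable name="words" as="xs:string*"
            select="tokenize(normalize-space($text), ' ')"/>
        <xsl:sequence select="string-join($words[position() le $count], ' ')"/>
    </xsl:function>

    <!-- Text before the hit: keep the trailing words; keep one space next to
         the hit only if the original text ended in whitespace -->
    <xsl:template match="text()[following-sibling::span[@class = 'search-hit']]"
        mode="trim-words">
        <xsl:value-of select="concat('...', mf:last-words(., $trim-words),
            if (matches(., '\s$')) then ' ' else '')"/>
    </xsl:template>

    <!-- Text after the hit: keep the leading words; keep one space next to
         the hit only if the original text started with whitespace -->
    <xsl:template match="text()[preceding-sibling::span[@class = 'search-hit']]"
        mode="trim-words">
        <xsl:value-of select="concat(if (matches(., '^\s')) then ' ' else '',
            mf:first-words(., $trim-words), '...')"/>
    </xsl:template>

    <!-- The hit itself is copied through unchanged -->
    <xsl:template match="span[@class = 'search-hit']" mode="trim-words">
        <xsl:copy-of select="."/>
    </xsl:template>

</xsl:stylesheet>
```

To try it, merge these declarations into the existing stylesheet and change the xsl:apply-templates inside the td to use mode="trim-words". Selecting the text nodes by their position relative to the span, rather than with text()[1] and text()[last()], also avoids a single text node matching both templates when the hit sits at the very start or end of the segment. With more than one hit per segment, a text node between two hits would match both text templates, so that case would need an additional rule.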