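I am trying to remove duplicate entries that follow the § entity. When an entry contains a comma and, after tokenizing, a token starts with a round bracket (, that token should take its prefix from the first token. For example, 17200(b)(2), (4)–(6) should become <p>17200(b)(2)</p><p>17200(b)(4)–(6)</p>.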
Input XML
<root>
  <p>CC §1(a), (b), (c)</p>
  <p>Civil Code §1(a), (b)</p>
  <p>CC §§2(a)</p>
  <p>Civil Code §3(a)</p>
  <p>CC §1(c)</p>
  <p>Civil Code §1(a), (b), (c)</p>
  <p>Civil Code §17200(b)(2), (4)–(6), (8), (12), (16), (20), and (21)</p>
</root>
Expected output
<root>
  <sec specific-use="CC">
    <title content-type="Sta_Head3">CIVIL CODE</title>
    <p>1(a)</p>
    <p>1(b)</p>
    <p>1(c)</p>
    <p>2(a)</p>
    <p>3(a)</p>
    <p>17200(b)(2)</p>
    <p>17200(b)(4)–(6)</p>
    <p>17200(b)(8)</p>
    <p>17200(b)(12)</p>
    <p>17200(b)(16)</p>
    <p>17200(b)(20)</p>
    <p>17200(b)(21)</p>
  </sec>
</root>
XSLT code
<xsl:template match="root">
  <xsl:copy>
    <xsl:for-each-group select="p[(starts-with(., 'CC ') or starts-with(., 'Civil Code'))]" group-by="replace(substring-before(., ' §'), 'Civil Code', 'CC')">
      <xsl:text>&#10;</xsl:text>
      <sec specific-use="{current-grouping-key()}">
        <xsl:text>&#10;</xsl:text>
        <title content-type="Sta_Head3">CIVIL CODE</title>
        <xsl:for-each-group select="current-group()" group-by="replace(substring-after(., '§'), '§', '')">
          <xsl:sort select="replace(current-grouping-key(), '[^0-9.].*$', '')" data-type="number" order="ascending"/>
          <xsl:for-each
              select="distinct-values(
                current-grouping-key() !
                (let $tokens := tokenize(current-grouping-key(), ', and |, | and ')
                 return (head($tokens), tail($tokens) ! (substring-before(head($tokens), '(') || .)))
              )" expand-text="yes">
            <p>{.}</p>
          </xsl:for-each>
        </xsl:for-each-group>
      </sec>
    </xsl:for-each-group>
  </xsl:copy>
</xsl:template>
Answer 0 (score: 0)
You can do this with a two-step approach: first compute the expanded list of p elements, then use for-each-group to remove the duplicates.
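<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="3.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <!-- Step 1: expand every source <p> into one <p> per section reference. -->
    <xsl:variable name="listP">
      <xsl:apply-templates select="root/p"/>
    </xsl:variable>
    <!-- Step 2: group the expanded <p> elements by value to drop duplicates. -->
    <xsl:for-each-group select="$listP/p" group-by=".">
      <p><xsl:value-of select="current-grouping-key()"/></p>
    </xsl:for-each-group>
  </xsl:template>

  <xsl:template match="p">
    <!-- Text after the first '§', with any further '§' removed. -->
    <xsl:variable name="input" select="replace(substring-after(., '§'), '§', '')"/>
    <!-- Split on ', ' / ', and ' / ' and ' so that 'and' never ends up inside a token. -->
    <xsl:variable name="tokens" select="tokenize($input, ',\s*(and\s+)?|\s+and\s+')"/>
    <!-- Prefix = first token minus its last parenthesized group, e.g. '17200(b)' from '17200(b)(2)'. -->
    <xsl:variable name="prefix" select="replace($tokens[1], '\([^()]*\)$', '')"/>
    <xsl:for-each select="$tokens">
      <!-- Tokens that start with '(' are continuations and inherit the prefix. -->
      <p><xsl:value-of select="if (starts-with(., '(')) then concat($prefix, .) else ."/></p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>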
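To get closer to the exact expected output from the question (the root and sec wrappers, the title, and numeric ordering), the same two-step idea could be extended roughly as follows. This is only a sketch under assumptions, not a tested solution: the mode name expand and the variable names expanded, tokens, and prefix are made up for illustration, every section reference is assumed to start with digits, and a continuation token such as (4)–(6) is assumed to inherit everything in the first token except its last parenthesized group.

```xslt
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    exclude-result-prefixes="#all"
    version="3.0">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="root">
    <xsl:copy>
      <!-- Normalize 'Civil Code' to 'CC' so both spellings fall into one group. -->
      <xsl:for-each-group
          select="p[starts-with(., 'CC ') or starts-with(., 'Civil Code ')]"
          group-by="replace(substring-before(., ' §'), '^Civil Code$', 'CC')">
        <sec specific-use="{current-grouping-key()}">
          <title content-type="Sta_Head3">CIVIL CODE</title>
          <!-- Step 1: expand each source <p> into one <p> per reference. -->
          <xsl:variable name="expanded" as="element(p)*">
            <xsl:apply-templates select="current-group()" mode="expand"/>
          </xsl:variable>
          <!-- Step 2: group by string value to drop duplicates. -->
          <xsl:for-each-group select="$expanded" group-by="string(.)">
            <!-- Sort on the leading section number only; the sort is stable, so
                 entries within one section keep their order of first appearance. -->
            <xsl:sort select="number(replace(current-grouping-key(), '\D.*$', ''))"/>
            <p><xsl:value-of select="current-grouping-key()"/></p>
          </xsl:for-each-group>
        </sec>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

  <!-- Hypothetical 'expand' mode: one output <p> per comma/'and'-separated token. -->
  <xsl:template match="p" mode="expand">
    <xsl:variable name="input" select="replace(substring-after(., '§'), '§', '')"/>
    <xsl:variable name="tokens" select="tokenize($input, ',\s*(and\s+)?|\s+and\s+')"/>
    <!-- Assumption: prefix = first token minus its last parenthesized group. -->
    <xsl:variable name="prefix" select="replace($tokens[1], '\([^()]*\)$', '')"/>
    <xsl:for-each select="$tokens">
      <p><xsl:value-of select="if (starts-with(., '(')) then concat($prefix, .) else ."/></p>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Because the sort key is only the leading number, subsections come out in the order they first appear in the input rather than in strict alphabetical or numeric order. That happens to match the sample data; if the real data lists subsections out of order, a second sort key would be needed.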