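Related to this question: Creating New Functions
I am still after an elegant function that searches for particular snippets of text and, if it finds one of several triggers, returns the substring after the trigger up to an end trigger. Example: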
<Data>Moby Dick [videorecording] / United Artists ; A Moulin Picture ; screenplay by Ray Bradbury and John Huston ; directed by John Huston.</Data>
<Data>Oliver Twist [videorecording] / Independent Producers ; screen play by David Lean and Stanley Haynes ; produced by Ronald Neame ; directed by David Lean.</Data>
<Data>Romeo + Juliet [videorecording] / Twentieth Century Fox presents a Bazmark production ; producers, Gabriella Martinelli, Baz Luhrmann ; screenplay, Craig Pearce, Baz Luhrmann.</Data>
Desired result:
...
<writer>Ray Bradbury</writer>
<writer>John Huston</writer>
...
...
<writer>David Lean</writer>
<writer>Stanley Haynes</writer>
...
...
<writer>Craig Pearce</writer>
<writer>Baz Luhrmann</writer>
...
My attempt:
<xsl:function name="foo:personSep">
  <xsl:param name="string"/>
  <xsl:param name="delim"/>
  <xsl:choose>
    <xsl:when test="not(contains($string,$delim))">
      <writer>
        <xsl:value-of select="$string"/>
      </writer>
    </xsl:when>
    <xsl:when test="contains($string,$delim)">
      <writer>
        <xsl:value-of select="substring-before($string, $delim)"/>
      </writer>
      <!-- Recurse on the text after the delimiter. -->
      <xsl:sequence select="foo:personSep(substring-after($string, $delim), $delim)"/>
    </xsl:when>
    <xsl:otherwise>
      <writer>
      </writer>
    </xsl:otherwise>
  </xsl:choose>
</xsl:function>
<xsl:template match="ss:Cell[3]/ss:Data" mode="writer">
  <xsl:variable name="cell3Data" select="normalize-space(.)"/>
  <xsl:variable name="writerFind" as="xs:string*"
    select="('screenplay by ','screen play by ','screenplay, ')"/>
  <xsl:for-each select="1 to count($writerFind)">
    <xsl:variable name="x" select="."/>
    <xsl:variable name="writer" select="substring-after($cell3Data, $writerFind[$x])"/>
    <xsl:if test="$writer != ''">
      <xsl:if test="contains($writer, ' and ')">
        <xsl:sequence
          select="foo:personSep(functx:right-trim(replace($writer, '[;\.].*$', '')),' and ')"
        />
      </xsl:if>
      <xsl:if test="contains($writer, ', ')">
        <xsl:sequence
          select="foo:personSep(functx:right-trim(replace($writer, '[;\.].*$', '')),', ')"
        />
      </xsl:if>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
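My boot-strappy, kludgeriffic version mostly works, but I am sure there is a leaner, cleaner solution. It also fails to catch any credit that combines commas AND "and", such as
"screenplay by John Smith, Ed Jones and Robert Danvers"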
Answer 0 (score: 1)
Here is a template that matches Data elements and extracts the writer elements:
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  exclude-result-prefixes="xs"
  version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:template match="Data">
    <xsl:analyze-string select="." regex="(screenplay by |screen play by |screenplay, )([^.;]+)(;|\.|$)">
      <xsl:matching-substring>
        <xsl:analyze-string select="regex-group(2)" regex="(\w+(\s*\w*))(\s*(,|and|$))">
          <xsl:matching-substring>
            <writer><xsl:value-of select="normalize-space(regex-group(1))"/></writer>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
When I run it with Saxon 9.5 against the input
<Root>
<Data>Moby Dick [videorecording] / United Artists ; A Moulin Picture ; screenplay by Ray Bradbury and John Huston ; directed by John Huston.</Data>
<Data>Oliver Twist [videorecording] / Independent Producers ; screen play by David Lean and Stanley Haynes ; produced by Ronald Neame ; directed by David Lean.</Data>
<Data>Romeo + Juliet [videorecording] / Twentieth Century Fox presents a Bazmark production ; producers, Gabriella Martinelli, Baz Luhrmann ; screenplay, Craig Pearce, Baz Luhrmann.</Data>
</Root>
I get the result
<writer>Ray Bradbury</writer>
<writer>John Huston</writer>
<writer>David Lean</writer>
<writer>Stanley Haynes</writer>
<writer>Craig Pearce</writer>
<writer>Baz Luhrmann</writer>
If you would rather write a function, that works as well:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf"
  version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <xsl:function name="mf:extract" as="element()*">
    <xsl:param name="input" as="xs:string"/>
    <xsl:param name="markers" as="xs:string*"/>
    <xsl:param name="element-name" as="xs:string"/>
    <xsl:analyze-string select="$input" regex="({string-join($markers, '|')})([^.;]+)(;|\.|$)">
      <xsl:matching-substring>
        <xsl:analyze-string select="regex-group(2)" regex="(\w+(\s*\w*))(\s*(,|and|$))">
          <xsl:matching-substring>
            <xsl:element name="{$element-name}"><xsl:value-of select="normalize-space(regex-group(1))"/></xsl:element>
          </xsl:matching-substring>
        </xsl:analyze-string>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:function>
  <xsl:template match="Data">
    <xsl:sequence select="mf:extract(., ('screenplay by ', 'screen play by ', 'screenplay, '), 'writer')"/>
  </xsl:template>
</xsl:stylesheet>
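The inner regex above assumes each name is one or two words. For credits that mix commas and "and" (the case raised in the question), or names with more than two words, one possible variation is to split the captured list with tokenize() instead of matching each name. The following is only a sketch under those assumptions: the helper name mf:split-names is hypothetical and not part of the original answer, the split is case-sensitive, and names containing a period (such as a middle initial) would still be cut short by the [^.;]+ capture.
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:mf="http://example.com/mf"
  exclude-result-prefixes="xs mf"
  version="2.0">
  <xsl:output method="xml" indent="yes"/>
  <!-- Hypothetical helper: split a credit list on commas (optionally followed by "and") or on a standalone "and". -->
  <xsl:function name="mf:split-names" as="xs:string*">
    <xsl:param name="list" as="xs:string"/>
    <!-- The predicate drops any empty tokens left by a leading or trailing separator. -->
    <xsl:sequence select="tokenize(normalize-space($list), '\s*,\s*(and\s+)?|\s+and\s+')[. != '']"/>
  </xsl:function>
  <xsl:template match="Data">
    <!-- Capture everything after a marker up to the next ';' or '.'. -->
    <xsl:analyze-string select="." regex="(screenplay by |screen play by |screenplay, )([^.;]+)">
      <xsl:matching-substring>
        <xsl:for-each select="mf:split-names(regex-group(2))">
          <writer><xsl:value-of select="."/></writer>
        </xsl:for-each>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:template>
</xsl:stylesheet>
With this variation, a credit such as "screenplay by John Smith, Ed Jones and Robert Danvers" should yield three writer elements, and the three sample Data elements should still produce the same six writers as before.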