Question

Here is an XML document:

<book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
</book>

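I was asked to use XPath to find the authors whose surname begins with a capital "C". That is easy when a name has exactly two parts: I can take substring-after() at the space and check whether the result starts with "C". But an author may have a longer name that includes middle names, such as Kurt Van Persie Cagle. How can I extract exactly the substring after the last space?

Please explain, using XPath functions.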

Answer 1

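"I was asked to use XPath to find the authors whose surname begins with a capital 'C'."

In general, this selection cannot be made with a single XPath 1.0 expression. It can, of course, be done with XSLT 1.0.

Using XPath 2.0: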

/*/author[starts-with(tokenize(., ' ')[last()], 'C')]

XSLT 2.0-based verification:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:template match="/">
     <xsl:sequence select="/*/author[starts-with(tokenize(., ' ')[last()], 'C')]"/>
 </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document:

<book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt van Persy Cantor Bagle</author>
    <author>Kurt van Persy Cantor Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
</book>

the XPath expression is evaluated and the selected node is copied to the output:

<author>Kurt van Persy Cantor Cagle</author>
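For readers without an XPath 2.0 processor at hand, the same selection logic can be cross-checked outside XPath. The following is only a minimal sketch in Python using the standard library's xml.etree.ElementTree; the function name authors_with_surname_prefix is hypothetical and not part of the answer above. It assumes, like the XPath expression, that name parts are separated by single spaces.

```python
import xml.etree.ElementTree as ET

XML = """<book category="WEB">
    <title lang="en">XQuery Kick Start</title>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt van Persy Cantor Bagle</author>
    <author>Kurt van Persy Cantor Cagle</author>
    <author>James Linn</author>
    <author>Vaidyanathan Nagarajan</author>
    <year>2003</year>
    <price>49.99</price>
</book>"""


def authors_with_surname_prefix(xml_text, prefix):
    """Return author names whose last space-separated token starts with prefix.

    Hypothetical helper: it mimics the XPath 2.0 predicate
    starts-with(tokenize(., ' ')[last()], prefix) for illustration only.
    """
    root = ET.fromstring(xml_text)
    selected = []
    for author in root.findall("author"):
        name = author.text or ""
        # split(" ") plays the role of tokenize(., ' '); [-1] stands in for [last()].
        last_token = name.split(" ")[-1]
        if last_token.startswith(prefix):
            selected.append(name)
    return selected


print(authors_with_surname_prefix(XML, "C"))
```

On the modified document this should print a list containing only "Kurt van Persy Cantor Cagle". "Kurt van Persy Cantor Bagle" is rejected even though one of its middle tokens starts with "C", because only the last token is tested.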

Answer 2

You can use a "messy" XPath such as the following, which assumes an author name has at most 4 words:

//author[
    (starts-with(substring-after(., ' '), 'C') and not(contains(substring-after(., ' '), ' ')))
    or
    (starts-with(substring-after(substring-after(., ' '), ' '), 'C') and not(contains(substring-after(substring-after(., ' '), ' '), ' ')))
    or
    (starts-with(substring-after(substring-after(substring-after(., ' '), ' '), ' '), 'C') and not(contains(substring-after(substring-after(substring-after(., ' '), ' '), ' '), ' ')))
]

Input:

<book>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>James Linn</author>
    <author>Kurt Van Persie Cagle</author>
</book>

The XPath above selects 2 authors: Kurt Cagle and Kurt Van Persie Cagle. You can extend this XPath to match authors with 5 words, and so on... :)
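Each branch of that expression is one more nested substring-after() call, which is why the XPath 1.0 form needs a separate branch per word count. As a rough illustration only, the Python sketch below writes the same idea as a loop; substring_after and last_word are hypothetical helpers, not XPath functions. One difference from the XPath: the loop would also accept a single-word name such as "Cher", which the expression above does not match.

```python
import xml.etree.ElementTree as ET

XML = """<book>
    <author>James McGovern</author>
    <author>Per Bothner</author>
    <author>Kurt Cagle</author>
    <author>James Linn</author>
    <author>James Linn</author>
    <author>Kurt Van Persie Cagle</author>
</book>"""


def substring_after(text, sep):
    """Mimic XPath substring-after(): text after the first sep, or '' if sep is absent."""
    _head, found, tail = text.partition(sep)
    return tail if found else ""


def last_word(name):
    """Apply substring_after repeatedly until no space remains (hypothetical helper)."""
    while " " in name:
        name = substring_after(name, " ")
    return name


root = ET.fromstring(XML)
# Keep the authors whose final word starts with "C".
matches = [a.text for a in root.iter("author") if last_word(a.text or "").startswith("C")]
print(matches)
```

The loop stops as soon as the remaining text contains no space, which corresponds to the not(contains(..., ' ')) guard in each branch of the XPath.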

Answer 3

Following up on @DimitreNovatchev's excellent solution: note that if your processor supports EXSLT's string extension functions, you can use the same tokenize idea in XSLT 1.0.

For example, this EXSLT-enabled XSLT 1.0 solution:

<?xml version="1.0"?>
<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  xmlns:str="http://exslt.org/strings"
  exclude-result-prefixes="str"
  version="1.0">
  <xsl:output method="xml" omit-xml-declaration="no" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:copy-of
      select="/*/author[starts-with(str:tokenize(., ' ')[last()], 'C')]" />
  </xsl:template>

</xsl:stylesheet>

...produces the same expected result when applied to @Dimitre's modified input XML:

<author>Kurt van Persy Cantor Cagle</author>

How to select delimited text from a long string in XPath
