Question

Given the following XML-compliant HTML:

<div>
 <a>a1</a>
 <b>b1</b>
</div>

<div>
 <b>b2</b>
</div>

<div>
 <a>a3</a>
 <b>b3</b>
 <c>c3</c>
</div>

Executing //a will return:

[a1,a3]

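The problem with the above is that the data for the third column now sits in second position: when an a is not found, it is skipped completely.

How can the XPath be written so that it gets all the a elements and returns: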

[a1, null, a3]

The same applies to //c. I would like to know whether it is possible to get

[null, null, c3]

Update: Consider another scenario, in which there is no common parent <div>.

<h1>heading1</h1>
 <a>a1</a>
 <b>b1</b>


<h1>heading2</h1>
 <b>b2</b>


<h1>heading3</h1>
 <a>a3</a>
 <b>b3</b>
 <c>c3</c>

Update: I am now able to use XSLT as well.

Answer 1

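There is no null value in XPath. Here is a semi-related question that also explains this: http://www.velocityreviews.com/forums/t686805-xpath-query-to-return-null-values.html

Effectively, you have three options:

Don't use XPath at all.
Use //a | //div[not(a)], which returns the div element itself wherever a div contains no a element, and have your Java code treat any returned div as "no a element present". Depending on the context, this could even let you output something more useful if required, since you have access to the entire contents of the div, for example an error such as "no a element found in div (some identifier)".
Pre-process your XML with an XSLT that inserts an a element, with a suitable default value, into any div element that does not have one.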
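
The host-side handling that these options rely on can be sketched as follows. This is only an illustration under some assumptions: it is written in Python rather than Java, the wrapper element t and the helper name column are made up for the example, and because Python's xml.etree.ElementTree supports only a subset of XPath (no union operator and no not()), the sketch visits each div and looks up the child directly instead of evaluating //a | //div[not(a)].

```python
import xml.etree.ElementTree as ET

# The three div elements from the question, wrapped in an assumed root <t>.
XML = """<t>
  <div><a>a1</a><b>b1</b></div>
  <div><b>b2</b></div>
  <div><a>a3</a><b>b3</b><c>c3</c></div>
</t>"""


def column(root, name):
    """Return one entry per div: the text of its first `name` child, or None."""
    values = []
    for div in root.findall("div"):
        child = div.find(name)  # None when the div has no such child
        values.append(child.text if child is not None else None)
    return values


root = ET.fromstring(XML)
print(column(root, "a"))  # ['a1', None, 'a3']
print(column(root, "c"))  # [None, None, 'c3']
```

The point of the sketch is that the "null" is produced by the host language, once per div, which is what keeps the columns aligned.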

Your second case is a little trickier and, to be honest, I would actually recommend not using XPath for it at all, but it can be done:

//a | //h1[not(following-sibling::a) or generate-id(.) != generate-id(following-sibling::a[1]/preceding-sibling::h1[1])]

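This will match any a element, or any h1 element for which no following a element exists before the next h1 element or the end of the document. As Dimitre pointed out, this only works when used from within XSLT, because generate-id is an XSLT function.

If you are not using it from within XSLT, you can use this rather contrived formula: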

//a | //h1[not(following-sibling::a) or count(. | preceding-sibling::h1) != count(following-sibling::a[1]/preceding-sibling::h1)]

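It works by matching h1 elements for which the count of the element itself plus all preceding h1 elements differs from the count of all h1 elements that precede the next a. There may be a more efficient way of doing this in XPath, but if it gets any more contrived than this, I would definitely recommend not using XPath at all.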
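
For comparison, the "don't use XPath" route for the flat h1 layout can be sketched as a single pass over the siblings. This is a hedged illustration in Python rather than Java; the wrapper element t and the function name column_by_heading are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The flat layout from the question's update, wrapped in an assumed root <t>.
XML = """<t>
  <h1>heading1</h1><a>a1</a><b>b1</b>
  <h1>heading2</h1><b>b2</b>
  <h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c>
</t>"""


def column_by_heading(root, name):
    """One entry per h1: text of the first `name` sibling before the next h1, else None."""
    values = []
    filled = True  # nothing can be filled before the first h1 is seen
    for child in root:  # children are visited in document order
        if child.tag == "h1":
            values.append(None)  # open a new group with a placeholder
            filled = False
        elif child.tag == name and not filled:
            values[-1] = child.text  # first match in the current group wins
            filled = True
    return values


root = ET.fromstring(XML)
print(column_by_heading(root, "a"))  # ['a1', None, 'a3']
print(column_by_heading(root, "c"))  # [None, None, 'c3']
```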

Answer 2

Solution for the first problem:

This XPath expression:

    /*/div/a
|
    /*/div[not(a)]

when evaluated against the following XML document:

<t>
    <div>
        <a>a1</a>
        <b>b1</b>
    </div>
    <div>
        <b>b2</b>
    </div>
    <div>
        <a>a3</a>
        <b>b3</b>
        <c>c3</c>
    </div>
</t>

selects the following three nodes (a, div, a):

<a>a1</a>
<div>
    <b>b2</b>
</div>
<a>a3</a>

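In your Java array, any selected non-a element should be treated as (or replaced with) null.

Here is a solution for the second problem:

Use these XPath expressions to select the a elements of each group:

For the first group: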

/*/h1[1]
   /following-sibling::a
      [not(/*/h1[2])
     or
       count(.|/*/h1[2]/preceding-sibling::a)
      =
       count(/*/h1[2]/preceding-sibling::a)
      ]

For the second group:

/*/h1[2]
   /following-sibling::a
      [not(/*/h1[3])
     or
       count(.|/*/h1[3]/preceding-sibling::a)
      =
       count(/*/h1[3]/preceding-sibling::a)
      ]

For the third group:

/*/h1[3]
   /following-sibling::a
      [not(/*/h1[4])
     or
      count(.|/*/h1[4]/preceding-sibling::a)
      =
       count(/*/h1[4]/preceding-sibling::a)
      ]

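If:

count(/*/h1)

is $cnt,

generate $cnt such expressions (for i = 1 to $cnt) and evaluate all of them. The node-set selected by each of them either contains an a element or does not. If the $k-th group (the node-set selected by evaluating the $k-th expression) contains an a, use its string value to generate the $k-th item of the wanted array; otherwise generate null for the $k-th item of the wanted array.

Here is an XSLT-based verification of the above XPath expressions:

<xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
         <xsl:output omit-xml-declaration="yes" indent="yes"/>

         <xsl:template match="/">
           <xsl:variable name="vGroup1" select=
           "/*/h1[1]
               /following-sibling::a
                  [not(/*/h1[2])
                 or
                   count(.|/*/h1[2]/preceding-sibling::a)
                  =
                   count(/*/h1[2]/preceding-sibling::a)
                  ]
           "/>

           <xsl:variable name="vGroup2" select=
           "/*/h1[2]
               /following-sibling::a
                  [not(/*/h1[3])
                 or
                   count(.|/*/h1[3]/preceding-sibling::a)
                  =
                   count(/*/h1[3]/preceding-sibling::a)
                  ]
           "/>

           <xsl:variable name="vGroup3" select=
           "/*/h1[3]
               /following-sibling::a
                  [not(/*/h1[4])
                 or
                   count(.|/*/h1[4]/preceding-sibling::a)
                  =
                   count(/*/h1[4]/preceding-sibling::a)
                  ]
           "/>

         Group1:  "<xsl:copy-of select="$vGroup1"/>"

         Group2:  "<xsl:copy-of select="$vGroup2"/>"

         Group3:  "<xsl:copy-of select="$vGroup3"/>"
         </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the following XML document (the OP did not provide a complete, well-formed XML document):

<t>
    <h1>heading1</h1>
    <a>a1</a>
    <b>b1</b>

    <h1>heading2</h1>
    <b>b2</b>

    <h1>heading3</h1>
    <a>a3</a>
    <b>b3</b>
    <c>c3</c>
</t>

the three XPath expressions are evaluated and the nodes selected by each of them are output:

 Group1:  "<a>a1</a>"

 Group2:  ""

 Group3:  "<a>a3</a>"

Explanation:

We use the well-known Kayessian formula for the intersection of two node-sets:

$ns1[count(. | $ns2) = count($ns2)]

The result of evaluating this expression contains exactly the nodes that belong both to the node-set $ns1 and to the node-set $ns2.

What remains is to substitute $ns1 and $ns2 with expressions that are relevant to the problem.

We substitute $ns1 with:

/*/h1[1]/following-sibling::a

and we substitute $ns2 with:

/*/h1[2]/preceding-sibling::a

In other words, the a elements between the first and the second /*/h1 are the intersection of the a elements that are following siblings of /*/h1[1] and the a elements that are preceding siblings of /*/h1[2].

This expression is problematic only for the a elements that follow the last of the /*/h1 elements. That is why we add an extra test that checks for the non-existence of a next /*/h1 element and combine it with the intersection test using or.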
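
The intersection test can also be mirrored outside XPath. The following is a minimal sketch in Python, not a translation of any particular Java API: the function names kayessian_intersection and group are hypothetical, the wrapper element t is assumed, and node identity is modelled with id() on ElementTree elements.

```python
import xml.etree.ElementTree as ET

XML = """<t>
  <h1>heading1</h1><a>a1</a><b>b1</b>
  <h1>heading2</h1><b>b2</b>
  <h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c>
</t>"""


def kayessian_intersection(ns1, ns2):
    """Mirror $ns1[count(. | $ns2) = count($ns2)] using node identity."""
    ids2 = {id(n) for n in ns2}
    # A node is in ns2 exactly when adding it to ns2 does not change the size.
    return [n for n in ns1 if len({id(n)} | ids2) == len(ids2)]


def group(root, k, name="a"):
    """`name` elements between the k-th h1 (1-based) and the next h1, if any."""
    children = list(root)
    h1_positions = [i for i, c in enumerate(children) if c.tag == "h1"]
    start = h1_positions[k - 1]
    following = [c for c in children[start + 1:] if c.tag == name]
    if k == len(h1_positions):  # corresponds to not(/*/h1[k+1])
        return following
    end = h1_positions[k]
    preceding = [c for c in children[:end] if c.tag == name]
    return kayessian_intersection(following, preceding)


root = ET.fromstring(XML)
groups = [[n.text for n in group(root, k)] for k in (1, 2, 3)]
print(groups)  # [['a1'], [], ['a3']]
array = [g[0] if g else None for g in groups]
print(array)  # ['a1', None, 'a3']
```

In host code a plain slice between two h1 positions would be simpler; the sketch keeps the count-based test only to show why the formula selects the intersection.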

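Finally, as a guiding example for a Java implementation, here is a complete XSLT transformation that does something similar: it produces the serialized array and can be mechanically translated into a corresponding Java solution: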

<xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
         xmlns:my="my:my">
         <xsl:output method="text"/>

         <my:null>null</my:null>
         <my:Q>"</my:Q>

         <xsl:variable name="vNull" select="document('')/*/my:null"/>
         <xsl:variable name="vQ" select="document('')/*/my:Q"/>

         <xsl:template match="/">
           <xsl:variable name="vGroup1" select=
           "/*/h1[1]
               /following-sibling::a
                  [not(/*/h1[2])
                 or
                   count(.|/*/h1[2]/preceding-sibling::a)
                  =
                   count(/*/h1[2]/preceding-sibling::a)
                  ]
           "/>

           <xsl:variable name="vGroup2" select=
           "/*/h1[2]
               /following-sibling::a
                  [not(/*/h1[3])
                 or
                   count(.|/*/h1[3]/preceding-sibling::a)
                  =
                   count(/*/h1[3]/preceding-sibling::a)
                  ]
           "/>

           <xsl:variable name="vGroup3" select=
           "/*/h1[3]
               /following-sibling::a
                  [not(/*/h1[4])
                 or
                  count(.|/*/h1[4]/preceding-sibling::a)
                  =
                   count(/*/h1[4]/preceding-sibling::a)
                  ]
           "/>

         [<xsl:value-of select=
          "concat($vQ[$vGroup1/self::a[1]],
                  $vGroup1/self::a[1],
                  $vQ[$vGroup1/self::a[1]],
                  $vNull[not($vGroup1/self::a[1])])"/>
          <xsl:text>,</xsl:text>

         <xsl:value-of select=
          "concat($vQ[$vGroup2/self::a[1]],
                  $vGroup2/self::a[1],
                  $vQ[$vGroup2/self::a[1]],
                  $vNull[not($vGroup2/self::a[1])])"/>
          <xsl:text>,</xsl:text>

         <xsl:value-of select=
          "concat($vQ[$vGroup3/self::a[1]],
                  $vGroup3/self::a[1],
                  $vQ[$vGroup3/self::a[1]],
                  $vNull[not($vGroup3/self::a[1])])"/>]
         </xsl:template>
</xsl:stylesheet>

When this transformation is applied to the same XML document (above), the wanted, correct result is produced:

     ["a1",null,"a3"]

UPDATE2:

The OP has now added that he can use an XSLT solution. Here is one:

<xsl:stylesheet version="1.0"
         xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
         xmlns:my="my:my" exclude-result-prefixes="xsl">
         <xsl:output omit-xml-declaration="yes" indent="yes"/>

         <xsl:key name="kFollowing" match="a"
              use="generate-id(preceding-sibling::h1[1])"/>

         <my:null/>
         <xsl:variable name="vNull" select="document('')/*/my:null"/>

         <xsl:template match="/*">
           <xsl:copy-of select=
           "h1/following-sibling::a[1]
          |
            h1[not(key('kFollowing', generate-id()))]"/>

           =============================================

           <xsl:apply-templates select="h1"/>

         </xsl:template>

         <xsl:template match="h1">
           <xsl:variable name="vAsInGroup" select=
               "key('kFollowing', generate-id())"/>
           <xsl:copy-of select="$vAsInGroup[1] | $vNull[not($vAsInGroup)]"/>
         </xsl:template>
</xsl:stylesheet>

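This transformation implements two different solutions. The difference lies in which element is used to represent "null". In the first case, it is the h1 element. This is not recommended, because any h1 already has a meaning of its own, which is different from "representing null". The second solution uses a special my:null element to represent null.

When this transformation is applied to the same XML document as above: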

<t>
        <h1>heading1</h1>
        <a>a1</a>
        <b>b1</b>

        <h1>heading2</h1>
        <b>b2</b>

        <h1>heading3</h1>
        <a>a3</a>
        <b>b3</b>
        <c>c3</c>
</t>

each of the two XPath expressions (which contain XSLT key() references) is evaluated and the selected nodes are output (above and below the "========" line, respectively):

<a>a1</a>
<h1>heading2</h1>
<a>a3</a>

=============================================

<a>a1</a>
<my:null xmlns:my="my:my" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"/>
<a>a3</a>

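Note on performance:

Because keys are used, this solution will be more efficient when more than one search is performed, for example when the corresponding arrays for a, b and c all need to be produced.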
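
The same "index once, look up many times" idea can be sketched in host code. This is only an analogy to xsl:key, written in Python under assumptions: the names build_index and column are hypothetical, the wrapper element t is assumed, and unlike the key above (which matches only a elements) the dictionary indexes every element that follows an h1.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML = """<t>
  <h1>heading1</h1><a>a1</a><b>b1</b>
  <h1>heading2</h1><b>b2</b>
  <h1>heading3</h1><a>a3</a><b>b3</b><c>c3</c>
</t>"""


def build_index(root):
    """Single pass: map (h1 number, tag) to the elements in that h1's group."""
    index = defaultdict(list)
    count = 0  # number of h1 elements seen so far
    for child in root:
        if child.tag == "h1":
            count += 1
        elif count > 0:  # ignore anything before the first h1
            index[(count, child.tag)].append(child)
    return index, count


def column(index, count, name):
    """One entry per h1 group: first `name` element's text, or None if absent."""
    return [
        index[(k, name)][0].text if (k, name) in index else None
        for k in range(1, count + 1)
    ]


index, count = build_index(ET.fromstring(XML))
for name in ("a", "b", "c"):
    print(name, column(index, count, name))
```

The document is scanned once, after which each of the a, b and c columns costs only dictionary lookups.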

Answer 3

I suggest you use the following, possibly rewritten as an xsl:function in which the parent node name (here: div) is parameterized.

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>

<xsl:template match="/">
    <root>
        <aList><xsl:copy-of select="$divIncludingNulls//a"/></aList>
        <bList><xsl:copy-of select="$divIncludingNulls//b"/></bList>
        <cList><xsl:copy-of select="$divIncludingNulls//c"/></cList>
    </root>
</xsl:template>

<xsl:variable name="divChild" select="distinct-values(//div/*/name())"/>

<xsl:variable name="divIncludingNulls">
    <xsl:for-each select="//div">
        <xsl:variable name="divElt" select="."/>
        <div>
            <xsl:for-each select="$divChild">
                <xsl:variable name="divEltvalue" select="$divElt/*[name()=current()]"/>
                <xsl:element name="{.}">
                    <xsl:choose>
                        <xsl:when test="$divEltvalue"><xsl:value-of select="$divEltvalue"/></xsl:when>
                        <xsl:otherwise>null</xsl:otherwise>
                    </xsl:choose>
                </xsl:element>
            </xsl:for-each>
       </div>
    </xsl:for-each>
</xsl:variable>

</xsl:stylesheet>

Applied to

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <div>
     <a>a1</a>
     <b>b1</b>
    </div>

    <div>
     <b>b2</b>
    </div>

    <div>
     <a>a3</a>
     <b>b3</b>
     <c>c3</c>
    </div>
</root>

it outputs

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <aList>
        <a>a1</a>
        <a>null</a>
        <a>a3</a>
    </aList>
    <bList>
        <b>b1</b>
        <b>b2</b>
        <b>b3</b>
    </bList>
    <cList>
        <c>null</c>
        <c>null</c>
        <c>c3</c>
    </cList>
</root>
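
The padding step this stylesheet performs (collect the distinct child names, then give every div an entry for each name) can be sketched in host code as well. This is an illustration in Python under assumptions: the function name pad_divs is hypothetical, and it returns a dictionary of lists rather than building a new XML tree.

```python
import xml.etree.ElementTree as ET

XML = """<root>
  <div><a>a1</a><b>b1</b></div>
  <div><b>b2</b></div>
  <div><a>a3</a><b>b3</b><c>c3</c></div>
</root>"""


def pad_divs(root):
    """Map each distinct child name to one value per div, using "null" for gaps."""
    divs = root.findall("div")
    names = []  # distinct child names, in order of first appearance
    for div in divs:
        for child in div:
            if child.tag not in names:
                names.append(child.tag)
    # findtext returns the default only when the div has no such child
    return {name: [div.findtext(name, "null") for div in divs] for name in names}


columns = pad_divs(ET.fromstring(XML))
for name, values in columns.items():
    print(name, values)
```

As in the stylesheet, the placeholder is the literal string "null", so a genuine value "null" in the input could not be told apart from a missing element.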

How to get an array of elements, including missing elements, using XPath in XSLT?

3 answers: