Question

I have an XML document based on what Excel produces when saving as "XML Spreadsheet 2003 (*.xml)".

The spreadsheet itself contains a header section with a hierarchy of labels:

 | A     B     C     D     E     F     G     H     I
-+-----------------------------------------------------
1| a1                                  a2
2| a11         a12         a13         a21   a22
3| a111  a112  a121  a122  a131  a132        a221  a222

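This hierarchy is present on all sheets in the workbook and looks more or less the same everywhere.

Excel XML works exactly the same way normal HTML tables do (<row> elements contain <cell> elements). I have been able to transform everything into a tree structure like this: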

<node title="a1" col="1">
  <node title="a11" col="1">
    <node title="a111" col="1"/>
    <node title="a112" col="2"/>
  </node>
  <node title="a12" col="3">
    <node title="a121" col="3" />
    <node title="a122" col="4" />
  </node>
  <!-- and so on -->
</node>

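But here is the complication:

- There is more than one sheet, so there is one tree per sheet.
- The hierarchy may differ slightly from sheet to sheet, so the trees are not equal (for example, sheet 2 may have "a113" while the others do not).
- Tree depth is not explicitly limited.
- The labels, however, are the same across all sheets, which means they can be used for grouping.

I would like to merge these separate trees into a single tree that looks like this: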

<node title="a1">
  <col on="sheet1">1</col>
  <col on="sheet2">1</col>
  <node title="a11">
    <col on="sheet1">1</col>
    <col on="sheet2">1</col>
    <node title="a111">
      <col on="sheet1">1</col>
      <col on="sheet2">1</col>
    </node>
    <node title="a112">
      <col on="sheet1">2</col>
      <col on="sheet2">2</col>
    </node>
    <node title="a113"><!-- different here -->
      <col on="sheet2">3</col>
    </node>
  </node>
  <node title="a12">
    <col on="sheet1">3</col>
    <col on="sheet2">4</col>
    <node title="a121">
      <col on="sheet1">3</col>
      <col on="sheet2">4</col>
    </node>
    <node title="a122">
      <col on="sheet1">4</col>
      <col on="sheet2">5</col>
    </node>
  </node>
  <!-- and so on -->
</node>

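Ideally, I would like to do the merging before I even build the tree structure from the Excel XML (if you can get me started on this, that would be great). But since I have no idea how to do that, merging after the trees have been built (that is, the situation described above) will be fine.

Thanks for your time. :)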

Answer 1

Here is one possible solution in XSLT 1.0:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="/*">
      <t>
        <xsl:apply-templates
           select="node[@title='a1'][1]">
          <xsl:with-param name="pOther"
            select="node[@title='a1'][2]"/>
        </xsl:apply-templates>
      </t>
    </xsl:template>

    <xsl:template match="node">
      <!-- $pOther: the node with the same title in the other tree (may be empty) -->
      <xsl:param name="pOther"/>

      <node title="{@title}">
        <col on="sheet1">
          <xsl:value-of select="@col"/>
        </col>
        <xsl:choose>
          <xsl:when test="not($pOther)">
            <!-- No counterpart: copy the sheet1 subtree as it is -->
            <xsl:apply-templates select="node" mode="copy">
              <xsl:with-param name="pSheet" select="'sheet1'"/>
            </xsl:apply-templates>
          </xsl:when>
          <xsl:otherwise>
            <col on="sheet2">
              <xsl:value-of select="$pOther/@col"/>
            </col>
            <!-- Children present in both trees: merge them recursively -->
            <xsl:for-each select=
              "node[@title = $pOther/node/@title]">

              <xsl:apply-templates select=".">
                <xsl:with-param name="pOther" select=
                 "$pOther/node[@title = current()/@title]"/>
              </xsl:apply-templates>
            </xsl:for-each>

            <!-- Children present only in this tree (sheet1) -->
            <xsl:apply-templates mode="copy" select=
              "node[not(@title = $pOther/node/@title)]">
              <xsl:with-param name="pSheet" select="'sheet1'"/>
            </xsl:apply-templates>

            <!-- Children present only in the other tree (sheet2) -->
            <xsl:apply-templates mode="copy" select=
              "$pOther/node[not(@title = current()/node/@title)]">
              <xsl:with-param name="pSheet" select="'sheet2'"/>
            </xsl:apply-templates>
          </xsl:otherwise>
        </xsl:choose>
      </node>
    </xsl:template>

    <xsl:template match="node" mode="copy">
      <xsl:param name="pSheet"/>

      <node title="{@title}">
        <col on="{$pSheet}">
          <xsl:value-of select="@col"/>
        </col>

        <xsl:apply-templates select="node" mode="copy">
          <xsl:with-param name="pSheet" select="$pSheet"/>
        </xsl:apply-templates>
      </node>
    </xsl:template>
</xsl:stylesheet>

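When the above transformation is applied to this XML document (the concatenation of the two XML documents under a common top-level node, which is left as an exercise for the reader :) ):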

<t>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
        </node>
        <node title="a12" col="3">
            <node title="a121" col="3" />
            <node title="a122" col="4" />
        </node>
        <!-- and so on -->
    </node>
    <node title="a1" col="1">
        <node title="a11" col="1">
            <node title="a111" col="1"/>
            <node title="a112" col="2"/>
            <node title="a113" col="3"/>
        </node>
        <node title="a12" col="4">
            <node title="a121" col="4" />
            <node title="a122" col="5" />
        </node>
        <!-- and so on -->
    </node>
</t>

the wanted result is produced:

<t>
    <node title="a1">
        <col on="sheet1">1</col>
        <col on="sheet2">1</col>
        <node title="a11">
            <col on="sheet1">1</col>
            <col on="sheet2">1</col>
            <node title="a111">
                <col on="sheet1">1</col>
                <col on="sheet2">1</col>
            </node>
            <node title="a112">
                <col on="sheet1">2</col>
                <col on="sheet2">2</col>
            </node>
            <node title="a113">
                <col on="sheet2">3</col>
            </node>
        </node>
        <node title="a12">
            <col on="sheet1">3</col>
            <col on="sheet2">4</col>
            <node title="a121">
                <col on="sheet1">3</col>
                <col on="sheet2">4</col>
            </node>
            <node title="a122">
                <col on="sheet1">4</col>
                <col on="sheet2">5</col>
            </node>
        </node>
    </node>
</t>

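Do note the following:

- We assume that both top-level node elements have "a1" as the value of their title attribute. This can easily be generalized.
- The template matching node has a parameter named pOther, which is the corresponding element named node from the other document. During the recursion, this template is applied only when $pOther exists.
- When no corresponding element named node exists, another template matching node, but in mode copy, is applied. This template has a parameter named pSheet, whose value is the name (a string) of the sheet to which the element belongs.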

Answer 2

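How about a callable template that takes the sheet number as a parameter, checks the input, and returns the correct "col" node if the current node appears in the XML for that sheet, or nothing if it does not? At each node, call it once for each sheet.

To merge the trees, maybe use a template that finds all children of the current node in any sheet and recurses for each of them.

Sorry for the lack of example code. I find XSLT quite slow to write, perhaps because I don't do it often, so I may have missed something crucial. But putting those together would give something like:

Take the title of "/node". With this title:
- search all sheets for this title, emitting a "col" node for each sheet
- search all sheets for children of nodes with this title (discarding duplicates)
- recurse for each title.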
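
The steps above can be sketched outside XSLT as well. What follows is only an illustration in Python (not XSLT), using the standard xml.etree.ElementTree module. The function names merge_nodes and merge_sheets, the wrapper element <t>, and the idea of passing each sheet as a (sheet name, element) pair are assumptions made for this sketch; none of them come from the question.

```python
import xml.etree.ElementTree as ET


def merge_nodes(title, tagged_nodes):
    """Merge same-titled <node> elements given as (sheet_name, element) pairs."""
    merged = ET.Element("node", title=title)

    # Step 1: emit one <col> per sheet on which this title occurs.
    for sheet, node in tagged_nodes:
        col = ET.SubElement(merged, "col", on=sheet)
        col.text = node.get("col")

    # Step 2: gather the children from every sheet, grouped by title.
    # Keying a dict by title discards duplicates and keeps first-seen order.
    children = {}
    for sheet, node in tagged_nodes:
        for child in node.findall("node"):
            children.setdefault(child.get("title"), []).append((sheet, child))

    # Step 3: recurse once per distinct child title.
    for child_title, group in children.items():
        merged.append(merge_nodes(child_title, group))
    return merged


def merge_sheets(sheets):
    """sheets maps a sheet name to an element whose children are that sheet's top-level <node>s."""
    top = {}
    for sheet, root in sheets.items():
        for node in root.findall("node"):
            top.setdefault(node.get("title"), []).append((sheet, node))

    result = ET.Element("t")  # hypothetical wrapper element
    for title, group in top.items():
        result.append(merge_nodes(title, group))
    return result


# Small demo with two sheets; only sheet2 has "a113".
sheet1 = ET.fromstring(
    '<t><node title="a1" col="1"><node title="a11" col="1">'
    '<node title="a111" col="1"/><node title="a112" col="2"/>'
    '</node></node></t>'
)
sheet2 = ET.fromstring(
    '<t><node title="a1" col="1"><node title="a11" col="1">'
    '<node title="a111" col="1"/><node title="a112" col="2"/>'
    '<node title="a113" col="3"/></node></node></t>'
)
merged = merge_sheets({"sheet1": sheet1, "sheet2": sheet2})
print(ET.tostring(merged, encoding="unicode"))
```

Unlike the pairwise approach, this handles any number of sheets in one pass. One caveat of the sketch: a title that first shows up on a later sheet is appended after the titles already seen, so it may not land in its original column order.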

Here are some code snippets for removing duplicates in various ways:

http://www.dpawson.co.uk/xsl/sect2/N2696.html

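Reading multiple documents is processor-dependent, but if all else fails, any old scripting language will probably do to cut and paste them together, provided you know they all have the same encoding, don't use conflicting IDs, and so on.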
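
As a rough illustration of the scripting route, here is a minimal Python sketch that places several per-sheet documents under one common top-level element. The function name combine_documents and the wrapper tag "t" are assumptions chosen for the example. It parses each document instead of pasting raw text, so each one is decoded by the XML parser on its own.

```python
import xml.etree.ElementTree as ET


def combine_documents(xml_texts, wrapper_tag="t"):
    """Parse each XML string and append its root element to a new wrapper element."""
    wrapper = ET.Element(wrapper_tag)
    for text in xml_texts:
        wrapper.append(ET.fromstring(text))
    return wrapper


# Two tiny stand-ins for the per-sheet trees.
sheet1 = '<node title="a1" col="1"><node title="a11" col="1"/></node>'
sheet2 = '<node title="a1" col="1"><node title="a11" col="1"/></node>'

combined = combine_documents([sheet1, sheet2])
print(ET.tostring(combined, encoding="unicode"))
```

For documents stored in files, ET.parse(path).getroot() could supply the root elements instead of ET.fromstring. The result can then be serialized and fed to a stylesheet that expects both trees under a single top-level element.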

XSLT: merging a set of tree hierarchies
