Merge nodes only when all attributes are equal

Time: 2012-06-21 09:05:26

Tags: xslt xslt-2.0

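I'm struggling to get the following to work: I am trying to merge translated nodes, but because there are sometimes subtle differences between the node sets, I can't simply merge them blindly; those cases need a manual review. At the same time, I like to make my life easy, so I want to automate as much as possible. Take the following example: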

<root>
<chapter>
<string class="l1"><local xml:lang="en">Some English here</local></string>
<string class="p"><local xml:lang="en">Some other English here</local></string>
<string class="p"><local xml:lang="en">and some English here</local></string>
<string class="p"><local xml:lang="en">Some English here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="fr">Some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some other English translated to French here</local></string>
<string class="p"><local xml:lang="fr">and some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some English translated to French here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>

The actual files can contain 30 languages and hundreds of nodes, so the example above is heavily simplified.

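What I want to achieve with this example is to merge English and French, because they have the same number of elements and all their attributes are equal as well. German should stay as it is because not all attributes match, and Dutch should stay as it is because the number of elements doesn't match.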

So the output should look like this:

<root>
<!-- French has the same number of elements and a full sequential match of attributes, so we can merge -->
<chapter>
<string class="l1">
    <local xml:lang="en">Some English here</local>
    <local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">Some other English here</local>
    <local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">and some English here</local>
    <local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
    <local xml:lang="en">Some English here</local>
    <local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<!-- German has the same number of elements but a different attribute sequence, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<!-- Dutch has the same attribute sequence but fewer elements, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>

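English is always the master reference, so I can already exclude node sets of a different size by comparing against the English node count. I just don't know how to check whether all attribute values are equal as well.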

Any suggestions? (Using XSLT 2.0)

Thanks!

2 answers:

Answer 0: (score: 1)

Here is a sample XSLT 2.0 stylesheet:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

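<!-- The English chapter is the master that every other chapter is compared against -->
<xsl:variable 
  name="master" 
  select="root/chapter[string/local/@xml:lang = 'en']"/>

<!-- Non-English chapters that have the same number of string elements as the
     master and the same class value at every position -->
<xsl:variable 
  name="matches" 
  select="root/chapter[not(string/local/@xml:lang = 'en')]
    [count(string) eq count($master/string)
     and 
      (every $i in (1 to count($master/string))
       satisfies $master/string[$i]/@class eq string[$i]/@class)]"/>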

<xsl:template match="@* | node()">
  <xsl:copy>
    <xsl:apply-templates select="@* , node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="chapter[. intersect $master]">
  <xsl:copy>
    <xsl:apply-templates select="string"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="string[local/@xml:lang = 'en']">
  <xsl:variable name="pos" select="position()"/>
  <xsl:copy>
    <xsl:apply-templates select="@* | local | $matches/string[$pos]/local"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="chapter[. intersect $matches]"/>

</xsl:stylesheet>

When I apply it with Saxon 9.4 to the input you posted, I get the result

<root>
   <chapter>
      <string class="l1">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some other English here</local>
         <local xml:lang="fr">Some other English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">and some English here</local>
         <local xml:lang="fr">and some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
   </chapter>
   <chapter>
      <string class="l1">
         <local xml:lang="de">Some English translated to German here</local>
      </string>
      <string class="p">
         <local xml:lang="de">Some other English translated to German here</local>
      </string>
      <string class="another_class">
         <local xml:lang="de">and some English translated to German here</local>
      </string>
      <string class="p">
         <local xml:lang="de">Some English translated to German here</local>
      </string>
   </chapter>
   <chapter>
      <string class="l1">
         <local xml:lang="nl">Some English translated to Dutch here</local>
      </string>
      <string class="p">
         <local xml:lang="nl">Some other English translated to Dutch here</local>
      </string>
      <string class="p">
         <local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local>
      </string>
   </chapter>
</root>
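
As a possible variation, the count comparison and the every ... satisfies test in $matches could probably be collapsed into a single deep-equal() call on the two sequences of class values. The following is only a sketch under that assumption: the variable name matches-alt is made up for illustration, and the stylesheet merely reports which languages would qualify for merging instead of performing the merge. It also assumes that every string element carries a class attribute; unlike the eq comparison above, a missing attribute would simply shorten the sequence being compared.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<!-- The English chapter is the master, as in the stylesheet above -->
<xsl:variable
  name="master"
  select="root/chapter[string/local/@xml:lang = 'en']"/>

<!-- Hypothetical alternative to $matches: deep-equal() compares the two
     sequences of class values item by item, so it also returns false
     when the sequences differ in length -->
<xsl:variable
  name="matches-alt"
  select="root/chapter[not(string/local/@xml:lang = 'en')]
    [deep-equal(data(string/@class), data($master/string/@class))]"/>

<!-- Report the languages of the chapters that would be merged -->
<xsl:template match="/">
  <xsl:value-of select="distinct-values($matches-alt/string/local/@xml:lang)"/>
</xsl:template>

</xsl:stylesheet>
```

For the sample input from the question this should report only fr, since German differs in one class value and Dutch has one string element fewer.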

Answer 1: (score: 0)

This transformation:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>

 <xsl:variable name="vENSignature" select="string-join(/*/*[1]/*/@class, '+')"/>
 <xsl:template match="node()|@*">
     <xsl:copy>
       <xsl:apply-templates select="node()|@*"/>
     </xsl:copy>
 </xsl:template>

 <xsl:template match="/*">
  <root>
   <xsl:for-each-group select="chapter"
    group-adjacent="string-join(*/@class, '+') eq $vENSignature">
     <xsl:choose>
       <xsl:when test="current-grouping-key() eq true()">
             <chapter>
              <xsl:apply-templates select="*"/>
            </chapter>
        </xsl:when>
        <xsl:otherwise>
          <xsl:sequence select="current-group()"/>
        </xsl:otherwise>
    </xsl:choose>
   </xsl:for-each-group>
  </root>
 </xsl:template>

 <xsl:template match="chapter/*" >
  <xsl:variable name="vPos" select="position()"/>
  <xsl:copy>
    <xsl:sequence select="@*, current-group()/*[position() = $vPos]/*"/>
  </xsl:copy>
 </xsl:template>
</xsl:stylesheet>

when applied to the provided XML document:

<root>
    <chapter>
        <string class="l1">
            <local xml:lang="en">Some English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">Some other English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">and some English here</local>
        </string>
        <string class="p">
            <local xml:lang="en">Some English here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="fr">Some English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">Some other English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">and some English translated to French here</local>
        </string>
        <string class="p">
            <local xml:lang="fr">Some English translated to French here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="de">Some English translated to German here</local>
        </string>
        <string class="p">
            <local xml:lang="de">Some other English translated to German here</local>
        </string>
        <string class="another_class">
            <local xml:lang="de">and some English translated to German here</local>
        </string>
        <string class="p">
            <local xml:lang="de">Some English translated to German here</local>
        </string>
    </chapter>
    <chapter>
        <string class="l1">
            <local xml:lang="nl">Some English translated to Dutch here</local>
        </string>
        <string class="p">
            <local xml:lang="nl">Some other English translated to Dutch here</local>
        </string>
        <string class="p">
            <local xml:lang="nl">and some English translated to Dutch here
                <br/>Some English translated to Dutch here
            </local>
        </string>
    </chapter>
</root>

produces the wanted, correct result:

<root>
   <chapter>
      <string class="l1">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some other English here</local>
         <local xml:lang="fr">Some other English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">and some English here</local>
         <local xml:lang="fr">and some English translated to French here</local>
      </string>
      <string class="p">
         <local xml:lang="en">Some English here</local>
         <local xml:lang="fr">Some English translated to French here</local>
      </string>
   </chapter>
   <chapter>
            <string class="l1">
                  <local xml:lang="de">Some English translated to German here</local>
            </string>
            <string class="p">
                  <local xml:lang="de">Some other English translated to German here</local>
            </string>
            <string class="another_class">
                  <local xml:lang="de">and some English translated to German here</local>
            </string>
            <string class="p">
                  <local xml:lang="de">Some English translated to German here</local>
            </string>
      </chapter>
   <chapter>
            <string class="l1">
                  <local xml:lang="nl">Some English translated to Dutch here</local>
            </string>
            <string class="p">
                  <local xml:lang="nl">Some other English translated to Dutch here</local>
            </string>
            <string class="p">
                  <local xml:lang="nl">and some English translated to Dutch here
                <br/>Some English translated to Dutch here
            </local>
            </string>
      </chapter>
</root>

Explanation:

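  1. We define and use the "signature" of a chapter: the sequence of the class attribute values of its children.

  2. We group all chapter elements by whether or not their signature is the same as the "English signature".

  3. We merge the chapter elements in the group whose signature equals the "English signature".

  4. We copy the chapter elements in the other group unchanged.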
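
To make the notion of a signature more tangible, here is a minimal sketch that only prints, for each chapter, its language, its signature, and whether that signature equals the English one. The function name my:signature and the namespace urn:example:signature are hypothetical and not part of the transformation above. It assumes, as the transformation does, that the first chapter is the English one, and that class values never contain the '+' separator.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:signature"
    exclude-result-prefixes="xs my">

 <xsl:output method="text"/>

 <!-- Hypothetical helper: the signature of a chapter is the '+'-joined
      list of the class values of its children -->
 <xsl:function name="my:signature" as="xs:string">
  <xsl:param name="pChapter" as="element()"/>
  <xsl:sequence select="string-join($pChapter/*/@class, '+')"/>
 </xsl:function>

 <!-- One line per chapter: language, signature, and whether the signature
      equals that of the first (English) chapter -->
 <xsl:template match="/*">
  <xsl:variable name="vENSignature" select="my:signature(chapter[1])"/>
  <xsl:for-each select="chapter">
   <xsl:value-of select="(*/*/@xml:lang)[1], my:signature(.),
                         my:signature(.) eq $vENSignature"/>
   <xsl:text>&#10;</xsl:text>
  </xsl:for-each>
 </xsl:template>
</xsl:stylesheet>
```

Because the signature lists one class value per child, a chapter with fewer children (Dutch in the example) gets a shorter signature and so fails the comparison without a separate count check.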