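I'm struggling to get the following to work. I'm trying to merge translated nodes, but because the node sets sometimes differ slightly, I can't merge them blindly and some manual review is needed. At the same time I like to make my life easy, so I'd like to automate as much as possible. Take the following example: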
<root>
<chapter>
<string class="l1"><local xml:lang="en">Some English here</local></string>
<string class="p"><local xml:lang="en">Some other English here</local></string>
<string class="p"><local xml:lang="en">and some English here</local></string>
<string class="p"><local xml:lang="en">Some English here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="fr">Some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some other English translated to French here</local></string>
<string class="p"><local xml:lang="fr">and some English translated to French here</local></string>
<string class="p"><local xml:lang="fr">Some English translated to French here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>
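The real file can contain 30 languages and hundreds of nodes, so the example above is heavily simplified.
What I want to achieve with this example is to merge English and French, because they have the same number of elements and all their attributes are equal as well. German should be left as it is because not all attributes match, and Dutch should be left as it is because the number of elements doesn't match.
So the output should look like this: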
<root>
<!-- French has the same number of elements and a full sequential match of attributes, so we can merge -->
<chapter>
<string class="l1">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some other English here</local>
<local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">and some English here</local>
<local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<!-- German has the same number of elements but a different attribute sequence, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="de">Some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some other English translated to German here</local></string>
<string class="another_class"><local xml:lang="de">and some English translated to German here</local></string>
<string class="p"><local xml:lang="de">Some English translated to German here</local></string>
</chapter>
<!-- Dutch has the same attribute sequence but fewer elements, so we leave it for manual review -->
<chapter>
<string class="l1"><local xml:lang="nl">Some English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">Some other English translated to Dutch here</local></string>
<string class="p"><local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local></string>
</chapter>
</root>
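English is always the master reference, so I can already exclude node sets of a different size by comparing against the English node count. I just don't know how to check whether all the attribute values are equal as well.
Any suggestions? (Using XSLT 2.0.)
Thanks!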
Answer 0 (score: 1)
Here is a sample XSLT 2.0 stylesheet:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>
<xsl:variable
name="master"
select="root/chapter[string/local/@xml:lang = 'en']"/>
<xsl:variable
name="matches"
select="root/chapter[not(string/local/@xml:lang = 'en')]
[count(string) eq count($master/string)
and
(every $i in (1 to count($master/string))
satisfies $master/string[$i]/@class eq string[$i]/@class)]"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* , node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="chapter[. intersect $master]">
<xsl:copy>
<xsl:apply-templates select="string"/>
</xsl:copy>
</xsl:template>
<xsl:template match="string[local/@xml:lang = 'en']">
<xsl:variable name="pos" select="position()"/>
<xsl:copy>
<xsl:apply-templates select="@* | local | $matches/string[$pos]/local"/>
</xsl:copy>
</xsl:template>
<xsl:template match="chapter[. intersect $matches]"/>
</xsl:stylesheet>
When I apply it with Saxon 9.4 to the input you posted, I get this result:
<root>
<chapter>
<string class="l1">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some other English here</local>
<local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">and some English here</local>
<local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="de">Some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some other English translated to German here</local>
</string>
<string class="another_class">
<local xml:lang="de">and some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some English translated to German here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="nl">Some English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">Some other English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">and some English translated to Dutch here<br/>Some English translated to Dutch here</local>
</string>
</chapter>
</root>
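As a side note, the count-plus-`every` test used for `$matches` can probably be condensed with `deep-equal()`, which compares two sequences item by item and is false when their lengths differ. The following is only a minimal sketch, not part of the stylesheet above: it assumes every `string` element carries a `class` attribute (a missing attribute would shorten the sequence and shift the comparison), and it just reports which chapters look safe to merge, which may be handy before the manual review.

```xml
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The English chapter is the master reference, as in the stylesheet above -->
  <xsl:variable
    name="master"
    select="root/chapter[string/local/@xml:lang = 'en']"/>

  <xsl:template match="/">
    <!-- Visit every chapter except the master one -->
    <xsl:for-each select="root/chapter except $master">
      <!-- deep-equal() is true only if both class sequences have the same
           length and the same values in the same order -->
      <xsl:value-of
        select="(string/local/@xml:lang)[1],
                ': ',
                if (deep-equal(data(string/@class), data($master/string/@class)))
                then 'merge'
                else 'review',
                '&#10;'"
        separator=""/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With the input from the question this should flag `fr` as `merge` and both `de` and `nl` as `review`. The same `deep-equal(...)` call could replace the two conditions inside the predicate of `$matches`, under the assumption stated above.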
Answer 1 (score: 0)
This transformation:
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:variable name="vENSignature" select="string-join(/*/*[1]/*/@class, '+')"/>
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>
<xsl:template match="/*">
<root>
<xsl:for-each-group select="chapter"
group-adjacent="string-join(*/@class, '+') eq $vENSignature">
<xsl:choose>
<xsl:when test="current-grouping-key() eq true()">
<chapter>
<xsl:apply-templates select="*"/>
</chapter>
</xsl:when>
<xsl:otherwise>
<xsl:sequence select="current-group()"/>
</xsl:otherwise>
</xsl:choose>
</xsl:for-each-group>
</root>
</xsl:template>
<xsl:template match="chapter/*" >
<xsl:variable name="vPos" select="position()"/>
<xsl:copy>
<xsl:sequence select="@*, current-group()/*[position() = $vPos]/*"/>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
when applied to the provided XML document:
<root>
<chapter>
<string class="l1">
<local xml:lang="en">Some English here</local>
</string>
<string class="p">
<local xml:lang="en">Some other English here</local>
</string>
<string class="p">
<local xml:lang="en">and some English here</local>
</string>
<string class="p">
<local xml:lang="en">Some English here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
<local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="de">Some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some other English translated to German here</local>
</string>
<string class="another_class">
<local xml:lang="de">and some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some English translated to German here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="nl">Some English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">Some other English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">and some English translated to Dutch here
<br/>Some English translated to Dutch here
</local>
</string>
</chapter>
</root>
produces the wanted, correct result:
<root>
<chapter>
<string class="l1">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some other English here</local>
<local xml:lang="fr">Some other English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">and some English here</local>
<local xml:lang="fr">and some English translated to French here</local>
</string>
<string class="p">
<local xml:lang="en">Some English here</local>
<local xml:lang="fr">Some English translated to French here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="de">Some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some other English translated to German here</local>
</string>
<string class="another_class">
<local xml:lang="de">and some English translated to German here</local>
</string>
<string class="p">
<local xml:lang="de">Some English translated to German here</local>
</string>
</chapter>
<chapter>
<string class="l1">
<local xml:lang="nl">Some English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">Some other English translated to Dutch here</local>
</string>
<string class="p">
<local xml:lang="nl">and some English translated to Dutch here
<br/>Some English translated to Dutch here
</local>
</string>
</chapter>
</root>
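Explanation:
We define and use the "signature" of a chapter: the sequence of the class attribute values of its children.
We group all chapter elements according to whether their signature is the same as the "English signature".
We merge the chapter elements in the group whose signature equals the "English signature".
We copy the chapter elements in the other group unchanged.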
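To make the signature idea easier to inspect, here is a small sketch that wraps it in a function and prints the signature of every chapter. The function name `my:signature` and the namespace `urn:example:my` are hypothetical, introduced only for this illustration; the `string-join(*/@class, '+')` expression itself is the one used in the transformation above.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:my="urn:example:my"
    exclude-result-prefixes="xs my">
  <xsl:output method="text"/>

  <!-- Hypothetical helper: the '+'-joined class values of a chapter's children -->
  <xsl:function name="my:signature" as="xs:string">
    <xsl:param name="chapter" as="element()"/>
    <xsl:sequence select="string-join($chapter/*/@class, '+')"/>
  </xsl:function>

  <xsl:template match="/">
    <!-- Print one signature per chapter, in document order -->
    <xsl:for-each select="/*/chapter">
      <xsl:value-of select="my:signature(.), '&#10;'" separator=""/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

For the document above, the English and French chapters should both yield `l1+p+p+p`, while German and Dutch yield different strings. One caveat with a joined string: a class value that itself contains `+` could make two different sequences produce the same signature, so pick a separator that cannot occur in the class values.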