XSLT: Moving sibling text nodes into selected nodes to repair XLIFF

Time: 2013-01-09 00:11:22

Tags: xml xslt xpath xliff

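After several hours of researching XSLT, I admit defeat! I need to fix a large number of .xlf XLIFF translation files that were returned to us from a translation tool that shall remain unnamed. Ideally, I would apply an XSL transformation to them with a batch-processing tool.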

Here is a snippet from one of the XLIFF files:

<body>
    <trans-unit id="1" phase-name="pretrans" restype="x-h3">
        <source>Adding, Deleting or Modifying Notes in the Call Description</source>
        <seg-source>Adding, Deleting or Modifying Notes in the Call Description</seg-source>
        <target state="final">Добавление, удаление и изменение примечаний в описании звонка</target>
    </trans-unit>
    <trans-unit id="2" phase-name="pretrans" restype="x-p">
        <source>Description of Fields on RHS</source>
        <seg-source>Description of Fields on RHS</seg-source>
        <target state="final">Поле описания в правой части</target>
    </trans-unit>
    <trans-unit id="3" phase-name="pretrans" restype="x-p">
        <source>You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so. These notes are visible to all users who have access to the call recording. It is recommended that each user add their initials to the notes to avoid potential confusion.</source>
        <seg-source>
            <mrk mtype="seg" mid="1">You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so.</mrk>
            <mrk mtype="seg" mid="2">These notes are visible to all users who have access to the call recording.</mrk>
            <mrk mtype="seg" mid="3">It is recommended that each user add their initials to the notes to avoid potential confusion.</mrk>
        </seg-source>
        <target state="final">
          <mrk mtype="seg" mid="1" /><ph ctype="" id="1">&lt;MadCap:variable name="zoom_userdocs_variables.var_product_name" xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd" /&gt;</ph> позволяет находить телефонные взаимодействия, содержащие или не содержащие определенные фразы.
          <mrk mtype="seg" mid="2" />Каждая речевая метка содержит одну или несколько таких фраз.
          <mrk mtype="seg" mid="3" />Ядро <ph ctype="" id="3">&lt;MadCap:variable name="zoom_userdocs_variables.var_product_name" xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd" /&gt;</ph> индексирует медиафайлы и помечает места вхождения фразы (добавляет к ним метки).
          <mrk mtype="seg" mid="4" />Затем нужные медиафайлы можно искать по связанным с ними меткам.
        </target>
    </trans-unit>
    <trans-unit id="4" phase-name="pretrans" restype="x-p">
        <source>To add, delete, or modify text in the description field, click inside the description field.</source>
        <seg-source>To add, delete, or modify text in the description field, click inside the description field.</seg-source>
        <target state="final">Чтобы добавить, удалить или изменить текст в поле описания, щелкните это поле.</target>
    </trans-unit>
</body>

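Note the target node in the third trans-unit. Each mrk tag should contain the text nodes that have instead ended up as its following siblings (compare the seg-source element just before it, which is still correct), and this breaks the structure.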

So I am trying to identify every mrk tag that contains no text node and move the text nodes that follow it back inside it.

This is the desired result:

<body>
    <trans-unit id="1" phase-name="pretrans" restype="x-h3">
        <source>Adding, Deleting or Modifying Notes in the Call Description</source>
        <seg-source>Adding, Deleting or Modifying Notes in the Call Description</seg-source>
        <target state="final">Добавление, удаление и изменение примечаний в описании звонка</target>
    </trans-unit>
    <trans-unit id="2" phase-name="pretrans" restype="x-p">
        <source>Description of Fields on RHS</source>
        <seg-source>Description of Fields on RHS</seg-source>
        <target state="final">Поле описания в правой части</target>
    </trans-unit>
    <trans-unit id="3" phase-name="pretrans" restype="x-p">
        <source>You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so. These notes are visible to all users who have access to the call recording. It is recommended that each user add their initials to the notes to avoid potential confusion.</source>
        <seg-source>
            <mrk mtype="seg" mid="1">You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so.</mrk>
            <mrk mtype="seg" mid="2">These notes are visible to all users who have access to the call recording.</mrk>
            <mrk mtype="seg" mid="3">It is recommended that each user add their initials to the notes to avoid potential confusion.</mrk>
        </seg-source>
        <target state="final">
            <mrk mtype="seg" mid="1"><ph ctype="" id="1">&lt;MadCap:variable name="zoom_userdocs_variables.var_product_name" xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd" /&gt;</ph> позволяет находить телефонные взаимодействия, содержащие или не содержащие определенные фразы.</mrk>
            <mrk mtype="seg" mid="2">Каждая речевая метка содержит одну или несколько таких фраз.</mrk>
            <mrk mtype="seg" mid="3">Ядро <ph ctype="" id="3">&lt;MadCap:variable name="zoom_userdocs_variables.var_product_name" xmlns:MadCap="http://www.madcapsoftware.com/Schemas/MadCap.xsd" /&gt;</ph> индексирует медиафайлы и помечает места вхождения фразы (добавляет к ним метки).</mrk>
            <mrk mtype="seg" mid="4">Затем нужные медиафайлы можно искать по связанным с ними меткам.</mrk>
        </target>
    </trans-unit>
    <trans-unit id="4" phase-name="pretrans" restype="x-p">
        <source>To add, delete, or modify text in the description field, click inside the description field.</source>
        <seg-source>To add, delete, or modify text in the description field, click inside the description field.</seg-source>
        <target state="final">Чтобы добавить, удалить или изменить текст в поле описания, щелкните это поле.</target>
    </trans-unit>
</body>

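I would normally do this in Perl with LibXML or something similar, but I am sure this is a simple task for XSLT. I have searched for similar solutions but could not find anything that works.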

One more thing to note: although the snippet is pretty-printed here, in the actual files the whole body node is on a single line.

Thanks! I look forward to learning something new!

EDIT: Updated the source above to show other child tags inside the <target> element that must be preserved. EDIT 2: Added the desired result.

1 Answer:

Answer 0 (score: 2)

Try this XSLT:

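<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Identity template: copy every node and attribute unchanged by default -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- An mrk that has following sibling text: copy it and pull that text inside.
       In XSLT 1.0, xsl:value-of outputs only the first following text node. -->
  <xsl:template match="trans-unit/target/mrk[following-sibling::text()]">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
      <xsl:value-of select="following-sibling::text()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop every text node that sits directly under target
       (this includes the text of targets that contain no mrk) -->
  <xsl:template match="trans-unit/target/text()"/>

</xsl:stylesheet>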

This produces the following output:

<body>
    <trans-unit id="1" phase-name="pretrans" restype="x-h3">
        <source>Adding, Deleting or Modifying Notes in the Call Description</source>
        <seg-source>Adding, Deleting or Modifying Notes in the Call Description</seg-source>
        <target state="final" />
    </trans-unit>
    <trans-unit id="2" phase-name="pretrans" restype="x-p">
        <source>Description of Fields on RHS</source>
        <seg-source>Description of Fields on RHS</seg-source>
        <target state="final" />
    </trans-unit>
    <trans-unit id="3" phase-name="pretrans" restype="x-p">
        <source>You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so. These notes are visible to all users who have access to the call recording. It is recommended that each user add their initials to the notes to avoid potential confusion.</source>
        <seg-source>
            <mrk mtype="seg" mid="1">You can add descriptive text notes to a call recording, if you have the appropriate privileges to do so.</mrk>
            <mrk mtype="seg" mid="2">These notes are visible to all users who have access to the call recording.</mrk>
            <mrk mtype="seg" mid="3">It is recommended that each user add their initials to the notes to avoid potential confusion.</mrk>
        </seg-source>
        <target state="final"><mrk mtype="seg" mid="1">При наличии соответствующих прав можно добавить описательные текстовые примечания к записи звонка.
            </mrk><mrk mtype="seg" mid="2">Эти примечания видны для всех пользователей, которые имеют доступ к записи звонка.
            </mrk><mrk mtype="seg" mid="3">Во избежание возможной путаницы каждому пользователю рекомендуется к примечаниям добавлять свои инициалы.
        </mrk></target>
    </trans-unit>
    <trans-unit id="4" phase-name="pretrans" restype="x-p">
        <source>To add, delete, or modify text in the description field, click inside the description field.</source>
        <seg-source>To add, delete, or modify text in the description field, click inside the description field.</seg-source>
        <target state="final" />
    </trans-unit>
</body>
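
For the updated sample in the question, the stylesheet above leaves two gaps. Its template for trans-unit/target/text() also empties every target that has no mrk children, as the output above shows for trans-units 1, 2 and 4, and xsl:value-of copies only the string value of the first following text node, so the ph elements would be lost. A possible XSLT 1.0 variant is sketched below; treat it as a starting point rather than a tested fix. It assumes, as in the snippet, that the elements are in no namespace (a complete XLIFF 1.2 file normally declares urn:oasis:names:tc:xliff:document:1.2, which would need a prefix bound in the stylesheet and used in every pattern), and that only mrk elements with no children need repair. The key name following-empty-mrk is made up for this sketch.

```xml
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- indent="yes" is left out on purpose: re-indenting may add
       whitespace inside the mixed content of target -->
  <xsl:output method="xml"/>

  <!-- Index every non-mrk child of target whose nearest preceding mrk
       is empty, keyed by the generated id of that mrk.
       The key name is hypothetical. -->
  <xsl:key name="following-empty-mrk"
           match="target/node()[not(self::mrk)][preceding-sibling::mrk[1][not(node())]]"
           use="generate-id(preceding-sibling::mrk[1])"/>

  <!-- Identity template: copy everything unchanged by default -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Empty mrk: copy it with its attributes, then deep-copy the text
       and elements (such as ph) that were indexed under it -->
  <xsl:template match="target/mrk[not(node())]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <xsl:copy-of select="key('following-empty-mrk', generate-id())"/>
    </xsl:copy>
  </xsl:template>

  <!-- Suppress the moved nodes at their original position -->
  <xsl:template match="target/node()[not(self::mrk)][preceding-sibling::mrk[1][not(node())]]"/>

</xsl:stylesheet>
```

Targets without mrk children never match the two specific templates, so the identity template copies them unchanged. In a pretty-printed file the line break and indentation that trail each moved text node end up inside the mrk as well; with the single-line files described in the question that should not matter.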