Need help with an XSL transformation

Date: 2014-10-08 18:54:49

Tags: xml xslt transform

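I am trying to transform an XML file into another XML file using XSLT 1.0.

The source file contains unwanted emphasis elements that I need to remove in the output. In the following example, you will see that <emphasis> is used to mark up text occurring inside a <p>. In the first instance, an <emphasis type="i"> contains another <emphasis type="i">. In the second instance, an <emphasis type="i"> contains an <emphasis>, which in turn contains another <emphasis type="i">.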

<!-- 1st instance --><p>Text text text text text <emphasis type="i">italicized text 
<emphasis type="i">more italicized text</emphasis>, text text text text text text</emphasis>. 
Text text text text text<!-- 2nd instance --><emphasis type="i"><emphasis>
<emphasis type="i">italicized text</emphasis>, text text text text text </emphasis> text text
text text text text</emphasis> text text text text text text text text text text text text text
text text text text text text text text text text text text text text text text text text text
text text text .</p>

In the first instance, I need to keep the wrapper, strip the second <emphasis type="i">, and keep its text, so that it appears as:

<p>Text text text text text <emphasis type="i">italicized text more italicized text, text text text text text text</emphasis>.

In the second instance, I need to keep the wrapper <emphasis type="i"> but strip the <emphasis> and the next <emphasis type="i">, so that it appears as:

<emphasis type="i">italicized text, text text text text text text text text text text
text</emphasis> text text text text text text text text text text text text text text text 
text text text text text text text text text text text text text text text text text text text
text .</p>

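I have tried matching emphasis and using choose/when on @type='i' and child::emphasis[type='i'].

I have tried matching emphasis/emphasis and choosing when @type='i'.

I have tried matching p and using for-each with select emphasis/emphasis[type='i'] and outputting the selected value.

I have tried matching p with for-each select emphasis[type='i']/emphasis/emphasis[type='i'].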

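I don't know how to output the text without the unwanted <emphasis> elements. I can post the code I have used if that would help, but I really doubt it would do any good :)

I sincerely appreciate any help anyone can offer.

Besides the italic type, the same situation can occur with bold, underline, caps, and quotes. In all instances, I need to find the type attribute of the wrapper <emphasis>, and if nested <emphasis> elements carry the same type, I need to remove them. The nested, unwanted <emphasis> elements are wreaking havoc in my publishing system.

Here is an example of the actual sentence structure. I hope this helps: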

<!-- 1st instance --><p>This is the text that is in the first paragraph that is not 
affected by the emphasis . . .; <emphasis type="i">however, this information is 
italicized<emphasis type="i">, and this information is also italicized text</emphasis>, 
and this is a continuation of the italicized information marked by the first 
emphasis</emphasis>.  The paragraph continues and this text is not affected by the 
emphasis . . . <!-- 2nd instance --><emphasis type="i">Although this is marked as italics,
in the original example, there wasn’t any text appearing here, but<emphasis>
<emphasis type="i">the text that follows is italicized and is contained within three sets of
emphasis</emphasis>, while this text is contained in two,</emphasis> this is contained in 
one,</emphasis> and finally, this is the remainder of the paragraph.</p>

3 answers:

Answer 0 (score: 0)

Mostly a guess, but try:

XSLT 1.0

<xsl:stylesheet version="1.0" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" encoding="UTF-8" indent="yes"/>
<xsl:strip-space elements="*"/>

<!-- identity transform -->
<xsl:template match="@*|node()">
    <xsl:copy>
        <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
</xsl:template>


<!--  first instance -->
<xsl:template match="emphasis[@type=parent::emphasis/@type]">
    <xsl:apply-templates select="node()"/>
</xsl:template>


<!--  second instance -->
<xsl:template match="emphasis[parent::emphasis[not(@type)] and @type=ancestor::emphasis[2]/@type]">
    <xsl:apply-templates select="node()"/>
</xsl:template>

<xsl:template match="emphasis[not(@type)]">
    <xsl:apply-templates select="node()"/>
</xsl:template>


</xsl:stylesheet>

Applied to your new example, the result will be:

<?xml version="1.0" encoding="UTF-8"?>
<!-- 1st instance -->
<p>This is the text that is in the first paragraph that is not 
affected by the emphasis . . .; <emphasis type="i">however, this information is 
italicized, and this information is also italicized text, 
and this is a continuation of the italicized information marked by the first 
emphasis</emphasis>.  The paragraph continues and this text is not affected by the 
emphasis . . . <!-- 2nd instance --><emphasis type="i">Although this is marked as italics,
in the original example, there wasn’t any text appearing here, butthe text that follows is italicized and is contained within three sets of
emphasis, while this text is contained in two, this is contained in 
one,</emphasis> and finally, this is the remainder of the paragraph.</p>
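
One detail in the output above: "but" and "the" run together ("butthe"). That is because xsl:strip-space elements="*" drops the whitespace-only text node between <emphasis> and the nested <emphasis type="i">. If that whitespace matters, one option is to keep stripping in general but preserve it inside mixed-content elements. A minimal sketch, assuming p and emphasis are the only mixed-content elements in the source (that list is an assumption, not something the question states), and reduced to a single unwrapping template for brevity:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- indent is left out here: indenting mixed content can itself add whitespace -->
  <xsl:output method="xml" version="1.0" encoding="UTF-8"/>

  <!-- strip whitespace-only text nodes everywhere ... -->
  <xsl:strip-space elements="*"/>
  <!-- ... except inside p and emphasis (assumed mixed-content elements) -->
  <xsl:preserve-space elements="p emphasis"/>

  <!-- identity transform -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- unwrap an emphasis that repeats its parent's type -->
  <xsl:template match="emphasis[@type=parent::emphasis/@type]">
    <xsl:apply-templates select="node()"/>
  </xsl:template>
</xsl:stylesheet>
```

The specific names in xsl:preserve-space take priority over the "*" wildcard in xsl:strip-space, so the two declarations can coexist. Simply removing xsl:strip-space would have a similar effect on this input.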

Answer 1 (score: 0)

You could try the following:

<xsl:stylesheet version="1.0" 
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>

    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*" />
        </xsl:copy>
    </xsl:template>

    <xsl:template match="emphasis[@type=ancestor::emphasis/@type]">
        <xsl:apply-templates/>
    </xsl:template>
</xsl:stylesheet>

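It removes any emphasis element whose type attribute value equals the type attribute value of one of its ancestor emphasis elements.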
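
The template above leaves the untyped <emphasis> from the second instance in place, since an element without a type attribute can never satisfy @type = ancestor::emphasis/@type. If that wrapper should go as well, as the question asks, a second template could handle it. A minimal sketch, assuming an untyped <emphasis> is unwanted only when it is nested inside another <emphasis>:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

    <xsl:output method="xml" omit-xml-declaration="yes"/>

    <!-- identity transform: copy everything not matched below -->
    <xsl:template match="node()|@*">
        <xsl:copy>
            <xsl:apply-templates select="node()|@*"/>
        </xsl:copy>
    </xsl:template>

    <!-- unwrap an emphasis whose type repeats that of any ancestor emphasis -->
    <xsl:template match="emphasis[@type = ancestor::emphasis/@type]">
        <xsl:apply-templates/>
    </xsl:template>

    <!-- assumption: an untyped emphasis nested in another emphasis is also unwanted -->
    <xsl:template match="emphasis[not(@type)][ancestor::emphasis]">
        <xsl:apply-templates/>
    </xsl:template>
</xsl:stylesheet>
```

The two unwrapping templates cannot match the same element (one requires a type attribute, the other requires its absence), so there is no template conflict. A nested emphasis with a different type, such as bold inside italic, is still copied through.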

Answer 2 (score: 0)

This solution was provided by a colleague and appears to solve my problem:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 <xsl:output method="xml" omit-xml-declaration="yes" indent="yes"/>
 <xsl:template match="@* | node()">
  <xsl:copy>
   <xsl:apply-templates select="@* | node()"/>
  </xsl:copy>
 </xsl:template>
 <xsl:template match="*[name(..)=name()]">
  <xsl:apply-templates/>
 </xsl:template>
</xsl:stylesheet>

Thank you very much for your willingness to help!