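How to render richtext XML lists as well-formed HTML using XSLT

Time: 2016-10-17 22:06:46

Tags: xml xslt richtext

I have XML data extracted from a legacy Lotus Notes application, with rich text formatting embedded in it. I am having trouble rendering the richtext lists as well-formed HTML.

The problem is that a list has no closing tag to indicate where it ends. However, each list has an opening tag containing a unique ID that marks the start of the list, and each list item has an attribute that matches the list ID. The richtext also contains a lot of noise (junk paragraphs), often interspersed between legitimate list items, which needs to be ignored.

My XSLT is inspired by this solution from @Tim-C, but it does not work.

Here is the XML: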

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="NoBullet6.xslt"?>
<document>
    <item name="Unordered list">
        <richtext>
            <pardef/>
            <par def="20">
                <run>This is the first </run>
                <run>paragraph of the preamble.</run>
            </par>
            <par>
                <run>This is the second paragraph of the </run>
                <run>preamble.</run>
            </par>
            <pardef id="21" list="unordered"/>
            <par def="21">
                <run>This is the </run>
                <run>first bullet.</run>
            </par>
            <par def="20">
                <run/>
                <!-- This is an empty paragraph/garbage data -->
            </par>
            <par>
                <run>This is the second </run>
                <run>bullet.</run>
            </par>
            <par def="20">
                <run>This is the first </run>
                <run>paragraph of the conclusion.</run>
            </par>
            <par>
                <run>This is the second paragraph of the </run>
                <run>conclusion.</run>
            </par>
        </richtext>
    </item>
    <item name="Ordered list">
        <richtext>
            <pardef/>
            <par def="20">
                <run>This is the first </run>
                <run>paragraph of the preamble.</run>
            </par>
            <par>
                <run>This is the second paragraph of the </run>
                <run>preamble.</run>
            </par>
            <pardef id="46" list="ordered"/>
            <par def="46">
                <run>This is the </run>
                <run>first numbered item.</run>
            </par>
            <par def="47">
                <run/>
                <!-- This is an empty paragraph/garbage data -->
            </par>
            <par def="46">
                <run>This is the another </run>
                <run>numbered item.</run>
            </par>
            <par def="20">
                <run>This is the first </run>
                <run>paragraph of the conclusion.</run>
            </par>
            <par>
                <run>This is the second paragraph of the </run>
                <run>conclusion.</run>
            </par>
        </richtext>
    </item>
</document>

Here is the desired output:

<html>
  <body>
     <table border="1">
        <tr>
           <td>Unordered list</td>
           <td>
              <p>This is the first paragraph of the preamble.</p>
              <p>This is the second paragraph of the preamble.</p>
              <ul>
                 <li>This is the first bullet.</li>
                 <li>This is the second bullet.</li>
              </ul>
              <p>This is the first paragraph of the conclusion.</p>
              <p>This is the second paragraph of the conclusion.</p>
           </td>
        </tr>
        <tr>
           <td>Ordered list</td>
           <td>
              <p>This is the first paragraph of the preamble.</p>
              <p>This is the second paragraph of the preamble.</p>
              <ol>
                 <li>This is the first numbered item.</li>
                 <li>This is the another numbered item.</li>
              </ol>
              <p>This is the first paragraph of the conclusion.</p>
              <p>This is the second paragraph of the conclusion.</p>
           </td>
        </tr>
     </table>
  </body>
</html>

Here is the XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>


    <xsl:key name="pars" match="par[not(@def)]" use="generate-id(preceding-sibling::par[@def][1])" />


    <xsl:template match="/*">
        <html>
            <body>
                <table border="1">
                    <xsl:apply-templates />
                </table>
            </body>
        </html>
    </xsl:template>

    <xsl:template match="item">
        <tr>
            <td><xsl:value-of select="@name"/></td>
            <td>
                <xsl:apply-templates select="richtext/par[@def]" />
            </td>
        </tr>
    </xsl:template>

    <xsl:template match="par[@def]">
        <xsl:variable name="listType" select="preceding-sibling::*[1][self::pardef]/@list" />
        <xsl:variable name="group" select="self::* | key('pars', generate-id())" />
        <xsl:choose>
            <xsl:when test="$listType = 'unordered'">    
                <ul>
                    <xsl:apply-templates select="$group" mode="list"/>
                </ul>
            </xsl:when>
            <xsl:when test="$listType = 'ordered'">    
                <ol>
                    <xsl:apply-templates select="$group"  mode="list"/>
                </ol>
            </xsl:when>
            <xsl:otherwise>
                <xsl:apply-templates select="$group" mode="para" />   
            </xsl:otherwise>     
        </xsl:choose>   
    </xsl:template>

    <xsl:template match="par" mode="list">
        <li>
            <xsl:value-of select="run" separator=""/>
        </li>  
    </xsl:template>

    <xsl:template match="par" mode="para">
        <p>
            <xsl:value-of select="run" separator=""/>
        </p>  
    </xsl:template>
</xsl:stylesheet>

1 Answer:

Answer 0 (score: 1)

As you are using XSLT 2.0, you can make use of xsl:for-each-group here, which may simplify things.

You can group the par elements by their def attribute (ignoring "empty" elements) or, for par elements that have no def attribute, by the def attribute of the first preceding (non-empty) sibling that has one.

<xsl:for-each-group select="par[run[normalize-space()]]" group-adjacent="if (@def) then @def else preceding-sibling::par[run[normalize-space()]][@def][1]/@def">

You can then use the function current-group() to get the current group, rather than the $group variable.

Try this XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output indent="yes"/>

    <xsl:template match="/*">
        <html>
            <body>
                <table border="1">
                    <xsl:apply-templates />
                </table>
            </body>
        </html>
    </xsl:template>

    <xsl:template match="item">
        <tr>
            <td><xsl:value-of select="@name"/></td>
            <td>
                <xsl:apply-templates select="richtext" />
            </td>
        </tr>
    </xsl:template>

    <xsl:template match="richtext">
        <!-- Skip empty paragraphs; a par without @def inherits the key of the nearest preceding non-empty par that has one -->
        <xsl:for-each-group select="par[run[normalize-space()]]" group-adjacent="if (@def) then @def else preceding-sibling::par[run[normalize-space()]][@def][1]/@def">
            <!-- The context item is the first par of the group; a pardef immediately before it marks a list -->
            <xsl:variable name="listType" select="preceding-sibling::*[1][self::pardef]/@list" />
            <xsl:choose>
                <xsl:when test="$listType = 'unordered'">
                    <ul>
                        <xsl:apply-templates select="current-group()" mode="list"/>
                    </ul>
                </xsl:when>
                <xsl:when test="$listType = 'ordered'">
                    <ol>
                        <xsl:apply-templates select="current-group()" mode="list"/>
                    </ol>
                </xsl:when>
                <xsl:otherwise>
                    <xsl:apply-templates select="current-group()" mode="para" />
                </xsl:otherwise>
            </xsl:choose>
        </xsl:for-each-group>
    </xsl:template>

    <xsl:template match="par" mode="list">
        <li>
            <xsl:value-of select="run" separator=""/>
        </li>
    </xsl:template>

    <xsl:template match="par" mode="para">
        <p>
            <xsl:value-of select="run" separator=""/>
        </p>
    </xsl:template>
</xsl:stylesheet>
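
To check how the paragraphs are grouped before generating any HTML, a small diagnostic stylesheet can help. The following is only a sketch, not part of the solution: it reuses the same select and group-adjacent expressions and prints each group's key and paragraph count as plain text. The output layout (the "def=" and "pars=" labels) is made up for illustration.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
    <xsl:output method="text"/>

    <!-- Diagnostic only: list the grouping key and size of every group, per item -->
    <xsl:template match="/">
        <xsl:for-each select="document/item/richtext">
            <xsl:value-of select="../@name"/>
            <xsl:text>&#10;</xsl:text>
            <!-- Same grouping as in the answer: empty paragraphs are skipped -->
            <xsl:for-each-group select="par[run[normalize-space()]]"
                group-adjacent="if (@def) then @def else preceding-sibling::par[run[normalize-space()]][@def][1]/@def">
                <xsl:value-of select="concat('  def=', current-grouping-key(), ' pars=', count(current-group()))"/>
                <xsl:text>&#10;</xsl:text>
            </xsl:for-each-group>
        </xsl:for-each>
    </xsl:template>
</xsl:stylesheet>
```

For the sample XML in the question, this should report three groups per item: keys 20, 21, 20 for the unordered list and 20, 46, 20 for the ordered list, each containing two paragraphs. One caveat: in XSLT 2.0 the group-adjacent expression has to yield exactly one value for every selected paragraph, so the first non-empty par of each richtext needs a def attribute, as it does in the sample. If that cannot be guaranteed, wrapping the else branch in string(...) is one way to avoid the error.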