Pulling in external HTML content and creating XHTML with XSL

Time: 2018-04-24 09:17:33

Tags: html xml xslt xslt-2.0

I need to create HTML with an XSLT transformation by pulling in content from an external HTML file.

My input XML file:

<topic xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/" xmlns:r="http://www.Corecms.com/Core/ns/metadata" class="- topic/topic " ditaarch:DITAArchVersion="1.2" domains="(topic hi-d) (topic indexing-d) (topic d4p_formatting-d) a(props d4p_renditionTarget) (topic d4p_math-d) (topic d4p_variables-d) (topic d4p_verse-d) (topic learningInteractionBase2-d learning2-d+learning-d) (topic learningBase+learningInteractionBase-d) (topic learningInteractionBase-d) (topic learningInteractionBase2-d) (topic xml-d) a(base CoreIdAtt) (topic sdClassification-d) a(base contentstore) " id="T1" outputclass="interactive" r:CoreId="4567">
       <title class="- topic/title "/>
       <body class="- topic/body ">
              <bodydiv class="- topic/bodydiv ">
                     <xref class="- topic/xref " format="html" href="1234.html" scope="external"/>
              </bodydiv>
       </body>
</topic>

I reference the HTML file through the xref element. The corresponding 1234.html file contains:

  <html>
    <head></head>
    <body>
       <h1>New HTML test with XSL transformation</h1>
        <p>this is a new test about rsuite-edit opening HTML files</p>
    </body>
  </html>

The XSL I have tried is as follows:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xd="http://www.oxygenxml.com/ns/doc/xsl"
        xmlns:r="http://www.Corecms.com/Core/ns/metadata" xmlns:exsl="http://exslt.org/common"
        xmlns:xhtml="http://www.w3.org/1999/xhtml" xmlns="http://www.w3.org/1999/xhtml"
        xmlns:ditaarch="http://dita.oasis-open.org/architecture/2005/"
        xmlns:df="http://dita2indesign.org/dita/functions"
        exclude-result-prefixes="xs xd r exsl xhtml ditaarch df" version="2.0">

        <xsl:output method="xml" indent="yes"/>

        <xsl:function name="df:class" as="xs:boolean">
            <xsl:param name="elem" as="element()"/>
            <xsl:param name="classSpec" as="xs:string"/>

            <xsl:variable name="normalizedClassSpec" as="xs:string" select="normalize-space($classSpec)"/>
            <xsl:variable name="result"
                select="matches($elem/@class, concat(' ', $normalizedClassSpec, ' | ', $normalizedClassSpec, '$'))"
                as="xs:boolean"/>

            <xsl:sequence select="$result"/>
        </xsl:function>

        <xsl:template match="/">
            <xsl:variable name="html">
                <xsl:apply-templates/>
            </xsl:variable>
            <xsl:copy-of select="$html"/>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/topic')]">
            <div>
            <xsl:attribute name="contenteditable">true</xsl:attribute>
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
                <xsl:choose>
                    <xsl:when test="not(ancestor::*[df:class(., 'topic/topic')])">

                        <xsl:apply-templates select="." mode="generate-comments"/>

                    </xsl:when>
                </xsl:choose>
            </div>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/title')][parent::*[df:class(., 'topic/topic')]]">
            <xsl:variable name="headingLevel" select="count(ancestor::*[df:class(., 'topic/topic')])"
                as="xs:integer"/>
            <xsl:element name="h{$headingLevel}">
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
            </xsl:element>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/body')]">
            <div>
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
            </div>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/bodydiv')]">
            <div>
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
            </div>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/p')]">
            <p>
                <xsl:apply-templates select="@*"/>
                <xsl:apply-templates/>
            </p>
        </xsl:template>

        <xsl:template match="*[df:class(., 'topic/xref')]">
            <xsl:choose>
                <xsl:when test=". != ''">
                    <a>
                        <xsl:apply-templates select="@*"/>
                        <xsl:apply-templates/>
                    </a>
                </xsl:when>
                <xsl:otherwise>
                    <a>
                        <xsl:apply-templates select="@*"/>
                        NO URL PROVIDED
                    </a>
                </xsl:otherwise>
            </xsl:choose>
        </xsl:template>

    </xsl:stylesheet>

I expect output like the following:

<div contenteditable="true" data-coreid="4567" html-coreid="1234"> <!-- Where 1234 is the core ID of the HTML file and 4567 is the DITA topic core ID.  -->
      <article>
           <h1>New</h1>
           <p>this is a new test</p>
      </article>
</div>

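I need to fetch the external HTML content (1234.html) and create a new XHTML file using XSLT. I am new to XSL. Your help would be much appreciated. Thanks in advance.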

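2 answers:

Answer 0: (score: 2)

Usually document() is used to retrieve external nodes. The external document must be parseable, that is, well-formed XML. Many parts of your stylesheet do not make much sense to me. For example, you use <xsl:apply-templates select="@*"/> to process attributes, but since you have no templates for attributes, the built-in rule only outputs their values as text.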
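Both points can be sketched in a few lines. The stylesheet below is a minimal illustration, not the asker's code: it assumes the linked file is well-formed XML and sits next to the input document, and the decision to copy every attribute through unchanged is made up purely for the example.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">

  <!-- Illustrative assumption: copy attributes through unchanged.
       Without a template that matches attributes, the built-in rule
       outputs each attribute's value as plain text. -->
  <xsl:template match="@*">
    <xsl:copy/>
  </xsl:template>

  <!-- Load the file named in @href. Because the argument is a node,
       a relative href is resolved against that node's base URI,
       i.e. the location of the input document. -->
  <xsl:template match="xref[@format = 'html']">
    <xsl:copy-of select="document(@href)/*"/>
  </xsl:template>

</xsl:stylesheet>
```

If 1234.html is not well-formed (for instance, it contains unclosed `<br>` or `<p>` tags), document() cannot parse it and the processor will report an error or return nothing, depending on the implementation.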

Answer 1: (score: 1)

Given that 1234.html is parseable, as mentioned in the answer by Ferestes, with content such as

<article>
   <h1>New</h1>
   <p>this is a new test</p>
</article>

you can use

<xsl:copy-of select="document(@href)"/>

in your *[df:class(., 'topic/xref')] template to copy the well-formed content of the 1234.html file (the file named by the href attribute).

Edit (in response to the edited question):

The whole template could look like this:

<xsl:template match="*[df:class(., 'topic/xref')]">
  <div contenteditable="true" data-coreid="{ancestor::topic/@r:CoreId}" html-coreid="{substring-before(@href,'.')}">
    <xsl:copy-of select="document(@href)"/>
  </div>
</xsl:template>

Output:

<div xmlns="http://www.w3.org/1999/xhtml" contenteditable="true" data-coreid="4567" html-coreid="1234">
    <html xmlns="">
        <head/>
        <body>
            <h1>New HTML test with XSL transformation</h1>
            <p>this is a new test about rsuite-edit opening HTML files</p>
        </body>
    </html>
</div>
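This output still differs from what the question asked for: the whole html element is copied, and it carries xmlns="" because the external file's elements are in no namespace. One possible way to get closer to the expected result is sketched below. It is an untested illustration under several assumptions: 1234.html is well-formed and un-namespaced, only the children of its body are wanted, and the mode name to-xhtml is invented for this example.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:r="http://www.Corecms.com/Core/ns/metadata"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="r" version="2.0">

  <xsl:output method="xml" indent="yes"/>

  <!-- One wrapper div per xref that points at an HTML file. -->
  <xsl:template match="xref[@format = 'html']">
    <div contenteditable="true"
         data-coreid="{ancestor::topic[1]/@r:CoreId}"
         html-coreid="{substring-before(@href, '.')}">
      <article>
        <!-- Process only the children of body, skipping html and head. -->
        <xsl:apply-templates select="document(@href)/html/body/*" mode="to-xhtml"/>
      </article>
    </div>
  </xsl:template>

  <!-- Rebuild each external element in the XHTML namespace,
       so that no xmlns="" declaration is needed in the result. -->
  <xsl:template match="*" mode="to-xhtml">
    <xsl:element name="{local-name()}" namespace="http://www.w3.org/1999/xhtml">
      <xsl:copy-of select="@*"/>
      <xsl:apply-templates mode="to-xhtml"/>
    </xsl:element>
  </xsl:template>

  <!-- Suppress text of the topic itself (default mode only);
       text inside the external file is still copied by the built-in rule. -->
  <xsl:template match="text()"/>

</xsl:stylesheet>
```

Run against the input topic from the question, this should yield a single div holding an article with the h1 and p from 1234.html. The sketch replaces the question's df:class-based matching with plain element names to stay short; in the original stylesheet the same body would go into the *[df:class(., 'topic/xref')] template.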