Parsing URL-encoded XML into UTF-8 with XSL 1.0

Date: 2017-01-05 18:10:53

Tags: xml xslt encoding utf-8

I am having trouble parsing the following XML file:

<DataContent>

+金砖+++                         %C3%9Clkeleri + Esnek + EYF%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C016178%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C52 %26lt%3B%2Ftd%26gt%3B ++                         %26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BEmeklilik + Fonlar%C4%B1 +%28BES%29 + - +++噶+%26安培%3B + Esnek + Fonlar %26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B +%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B +%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr %26gt%3B +++%26lt%3Btd%26gt%3BAllianz +雅%C5%9Fam雾化+ ve + EM。                         +%C4%B0kinci +++ Esnek + EYF%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C050458%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B- 0%2C61%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++                         %26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BAnadolu +哈亚特+ EM。+ B%C3%BCY%C3%BC。+ AMA。+++%C4%B0ki。+居。+ EYF%26lt %3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C043109%26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B-0%2C20%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BFiba + EM 。+ VE +干草。+%C4%B0kinci +++斯坦达特+ EYF%26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B0%2C011639%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C16%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt %3B ++%26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BVak%C4%B1F + EM。                         + Gelir + PM。+ 2 + Esnek +++ EYF%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C020458%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt %3B-0%2C15%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B                         +++%26lt%3Btd%26gt%3BNN +干草。+ EM。+卡姆+博尔%C3%A7lanma +++阿糖胞苷%C3%A7lar%C4%B1 +斯坦达特+ EYF%26lt%3B%2Ftd%26gt%图3B +++%26lt%3Btd%26gt%3B0%2C033045%26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B-0%2C02%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BEmeklilik + Fonlar %C4%B1 +%28BES%29 + - 
+++                         吉%C4%B1L%C4%B1M + Fonlar%C4%B1%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B +%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt% 3B +%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++                         %26lt%3Btd%26gt%3BKat%C4%B1L%C4%B1M + EM。+ VE +干草。+++阿尔特%C4%B0kinci + Esnek%28D%C3%B6viz%29 + EYF%26lt%3B% 2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C011757%26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B-0%2C31%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++%26lt%3Btd%26gt%3BAsya + EM 。+ VE +干草。+ B%C3%BCY + PM。+吉。                         +++ Esnek + EYF%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C013884%26lt%3B%2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0 2C14%%26lt %3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B ++%26lt%3Btr%26gt%3B +++                         %26lt%3Btd%26gt%3BAsya + EM。+ VE +干草。+ B%C3%BCY +阿玛%C3%A7l%C4%B1 +++的Gr。+吉。+居。+ EYF%26lt%3B %2Ftd%26gt%3B +++%26lt%3Btd%26gt%3B0%2C013993%26lt%3B%2Ftd%26gt%3B +++                         %26lt%3Btd%26gt%3B-0%2C07%26lt%3B%2Ftd%26gt%3B ++%26lt%3B%2Ftr%26gt%3B%26lt%3B%2Ftable%26gt%3B ++%28Not%3A + Analiz + I %C3%A7erikleri + fonbul。                         COM%E2%80%98dan +人%C4%B1nm%C4%B1%C5%9FT%C4%B1R。

  </DataContent>

The output should be:

BricÜlkeleriEsnekEYF 0,016178 -0,52
EmeklilikFonları(BES)-Karma&amp; Esnek Fonlar
AllianzYaşamveEm. İkinciEsnekEYF 0,050458 -0,61
. Hay. İkinciStandartEYF
0,011639-0,16VakıfEm. Gelir Am. 2. Esnek EYF 0,020458 -0,15
NN Hay. EM. KamuBorçlanmaAraçlarıStandartEYF 0,033045

My XSL stylesheet is:

<msxsl:script language="JScript" implements-prefix="user">

        function decode(s)
        {  
        var encodedHTML;

        var decodedString = decodeURIComponent(s.replace(/\+/g, ' '));
        decodedString = decodedString.replace(/&amp;lt;/g,'&lt;');
        decodedString = decodedString.replace(/&amp;gt;/g,'&gt;');
        decodedString = decodedString.replace(/&amp;amp;/g,'&amp;');
        decodedString = decodedString.replace(/&amp;apos;/g,'YY');
        decodedString = decodedString.replace(/&amp;quot;/g,'"');
        decodedString = decodedString.replace(/&amp;#39;/g,String.fromCharCode(39));
        decodedString = decodedString.replace(/&lt;br&gt;/g,'&lt;br&#47;&gt;');
        decodedString = '&lt;html&gt;' + decodedString + '&lt;&#47;html&gt;';

        decodedString = unescape(decodedString);

        decodedString = decodedString.replace(/&lt;style([\s\S]*?)&lt;\/style&gt;/gi, '');
        decodedString = decodedString.replace(/&lt;script([\s\S]*?)&lt;\/script&gt;/gi, '');
        decodedString = decodedString.replace(/&lt;\/div&gt;/ig, '\n');
        decodedString = decodedString.replace(/&lt;\/li&gt;/ig, '\n');
        decodedString = decodedString.replace(/&lt;li&gt;/ig, '  *  ');
        decodedString = decodedString.replace(/&lt;\/ul&gt;/ig, '\n');
        decodedString = decodedString.replace(/&lt;\/p&gt;/ig, '\n\n');
        decodedString = decodedString.replace(/&lt;br\s*[\/]?&gt;/gi, "\n");
        decodedString = decodedString.replace(/&lt;[^&gt;]+&gt;/ig, '');

        return decodedString;         
        }

</msxsl:script>

<xsl:template match="/">

    <xsl:value-of select="user:decode(string(//DataContent))" />  

</xsl:template>

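When I transform the XML document, the output still contains "&amp;amp;" ... I need "&amp;" instead. What am I missing? I am using XSL version 1.0, and the transformation engine is MSXML or .NET 1.0.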

1 Answer:

Answer 0: (score: 0)

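Try changing the output method to text. If you are outputting XML, the processor must escape ampersands (unless you instruct it to disable output escaping).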
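
As a rough illustration of this suggestion, here is a minimal sketch of a stylesheet that switches the output method to text. It rests on a few assumptions: the `user` namespace URI is a placeholder (the question does not show the one actually used), and the `decode` function is a trimmed stand-in for the question's longer function, wrapped in CDATA so the JScript can be written without XML escaping.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:msxsl="urn:schemas-microsoft-com:xslt"
    xmlns:user="urn:example:user-scripts"
    exclude-result-prefixes="msxsl user">

  <!-- Text output: characters are written as-is, so "&" is not re-escaped as "&amp;" -->
  <xsl:output method="text" encoding="UTF-8"/>

  <!-- Simplified stand-in for the question's decode(); the namespace URI above is hypothetical -->
  <msxsl:script language="JScript" implements-prefix="user"><![CDATA[
    function decode(s) {
      // Turn "+" into spaces, then undo the percent-encoding (UTF-8 aware)
      var text = decodeURIComponent(String(s).replace(/\+/g, ' '));
      // Undo the entity escaping; "&amp;" goes last to avoid double-unescaping
      return text.replace(/&lt;/g, '<')
                 .replace(/&gt;/g, '>')
                 .replace(/&amp;/g, '&');
    }
  ]]></msxsl:script>

  <xsl:template match="/">
    <xsl:value-of select="user:decode(string(//DataContent))"/>
  </xsl:template>

</xsl:stylesheet>
```

If the result has to remain XML, the alternative hinted at above is `disable-output-escaping="yes"` on `xsl:value-of`. XSLT 1.0 processors are not required to support it, and both settings only take effect when the processor itself serializes the result rather than handing back a node tree.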