How to fix the error: not well-formed (invalid token) at line 6

Time: 2015-07-01 15:35:34

Tags: xml perl utf-8

I am trying to convert an XML file to an HTML file with Perl code, but I get an error saying the XML is not well-formed (invalid token). The code:

#!/usr/bin/perl

use strict;
use warnings;
use XML::Parser;
use XML::LibXML;

my $parser = XML::Parser->new();
$parser->setHandlers(   Start   => \&start,
                        End     => \&end,
                        Char    => \&char,
                        Proc    => \&proc,
                );
my $header = &getXHTMLHeader();
print $header;
$parser->parsefile('output.xml');

my $currentTag = "";

sub start() {
        my ($parser, $name, %attr) = @_;
        $currentTag = lc($name);
        if ($currentTag eq 'doc') {
                print "<head><title>". "Output of snmpwalk for cpeIP4" . "</title></head>";
                print "<body><h2>" . "Output of snmpwalk for cpeIP4" . "</h2>";
                print '<table summary="' . "Output of snmpwalk for cpeIP4" . '"><tr><th>Tag Name</th><th>0</th><th>1</th><th>2</th><th>3</th><th>4</th><th>5</th><th>6</th><th>7</th><th>8</th><th>9</th><th>10</th><th>11</th><th>12</th><th>13</th><th>14</th><th>15</th><th>16</th></tr>';
        }
        elsif ($currentTag eq 'gi-estb-mib-nph') {    # $currentTag was lower-cased above
                print "<tr>";
        }
        # Any tag other than the numeric index tags 0..16 starts a new row.
        # (A chain of "ne ... || ne ..." tests is always true, so a regex is used instead.)
        elsif ($currentTag !~ /^(?:[0-9]|1[0-6])$/) {
                print "<tr>";
        }
        else {
                print "<td>";
        }
}
sub end() {
        my ($parser, $name, %attr) = @_;
        $currentTag = lc($name);
        if ($currentTag eq 'doc') {
                print "</table></body></html>";
        }
        elsif ($currentTag eq 'gi-estb-mib-nph') {    # $currentTag was lower-cased above
                print "</tr>";
        }
        # Any tag other than the numeric index tags 0..16 ends a row.
        # (A chain of "ne ... || ne ..." tests is always true, so a regex is used instead.)
        elsif ($currentTag !~ /^(?:[0-9]|1[0-6])$/) {
                print "</tr>";
        }
        else {
                print "</td>";
        }
}
sub char() {
        my ($parser, $data) = @_;
        print $data;
}
sub proc() {
        my ($parser, $target, $data) = @_;
        if (lc($target) eq 'perl') {
                $data = eval($data);
                print $data;
        }
}
sub getXHTMLHeader() {
        my $header = '<?xml version="1.0" encoding="UTF-8"?>
        <!DOCTYPE html
          PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
          "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
          <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">';
        return $header;
}

The content of output.xml is as follows (an example; the actual file contains more details in the same format):

<?xml version="1.0" encoding="UTF-8"?>

<doc>
    <GI-eSTB-MIB-NPH>
        <eSTBGeneralErrorCode>
            <0>INTEGER: 0</0>
        </eSTBGeneralErrorCode>
        <eSTBGeneralConnectedState>
            <0> INTEGER: true(1) </0>
        </eSTBGeneralConnectedState>
        <eSTBGeneralPlatformID>
            <0> INTEGER: 2076 </0>
        </eSTBGeneralPlatformID>
    </GI-eSTB-MIB-NPH>
</doc>

The actual error is:

not well-formed (invalid token) at line 6, column 13, byte 112 at /usr/lib64/perl5/XML/Parser.pm line 187
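
One way to see what the parser is objecting to is to look up the reported position in the input. The following is a minimal sketch, assuming the file begins exactly like the sample above and that the reported line number is 1-based while the column and byte offset are 0-based (as expat reports them). The helper name char_at is hypothetical.

```perl
use strict;
use warnings;

# First six lines of the sample output.xml from the question.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>

<doc>
    <GI-eSTB-MIB-NPH>
        <eSTBGeneralErrorCode>
            <0>INTEGER: 0</0>
XML

# Hypothetical helper: character at a 1-based line and 0-based column.
sub char_at {
    my ($text, $line, $column) = @_;
    my @lines = split /\n/, $text, -1;
    return substr($lines[$line - 1], $column, 1);
}

# Position values taken from the error message.
my ($line, $column, $byte) = (6, 13, 112);

print "at line/column: '", char_at($xml, $line, $column), "'\n";
print "at byte offset: '", substr($xml, $byte, 1), "'\n";
```

For this sample, both lookups land on the 0 that immediately follows the opening < on line 6, that is, on the first character of the element name.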

I have searched for a solution and found several posts describing the same problem, but none of the suggested fixes helped. Here is what I tried, based on various online forums:

  1. Checked the format of the XML and compared it at http://www.w3schools.com/xml/xml_server.asp
  2. Added spaces: <0> INTEGER: 0
  3. Removed encoding="UTF-8"
  4. Changed "UTF-8" to "UTF-16" and to "ISO-8859-1"
  5. perl -ne 'print $_ if m/^… output_1.xml, and used output_1.xml
  6. perl -ne 'print $_ if !/^…
  7. strings output.xml > output_1.xml, and used output_1.xml

Please suggest anything else I could try. I am not sure whether the error is in the XML or in my code.

1 Answer:

Answer 0 (score: 0)

The solution was to follow choroba's suggestion and avoid element names that start with a digit.
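
The rule behind that suggestion can be checked with a short sketch. XML element names may contain digits, hyphens and periods, but may not begin with them. The helper is_valid_xml_name below is hypothetical and only an ASCII approximation of the XML 1.0 Name rule (the real rule also admits many non-ASCII characters).

```perl
use strict;
use warnings;

# Hypothetical helper: ASCII-only approximation of the XML 1.0 Name rule.
# First character: letter, underscore or colon.
# Later characters: additionally digits, hyphen and period.
sub is_valid_xml_name {
    my ($name) = @_;
    return $name =~ /\A[A-Za-z_:][A-Za-z0-9_:.\-]*\z/ ? 1 : 0;
}

# Names taken from the question, plus the merged form used in the answer.
for my $name (qw(0 16 doc GI-eSTB-MIB-NPH eSTBGeneralErrorCode eSTBGeneralErrorCode.0)) {
    printf "%-24s %s\n", $name, is_valid_xml_name($name) ? 'ok' : 'invalid';
}
```

Under this check, 0 and 16 are rejected, while eSTBGeneralErrorCode.0 is accepted because the digit is no longer the first character.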

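I decided to use a single XML element such as eSTBGeneralErrorCode.0, eSTBGeneralConnectedState.0, etc., instead of creating two separate elements. I apologize for the delay in accepting the answer.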
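
The merged naming could be sketched as follows. This is an illustration only: the answer does not say how the new XML was produced, so the sketch assumes the existing file is repaired as text before parsing (an XML parser cannot read it in its current form) and that each parent element holds exactly one digit-named child, as in the sample. The function name merge_index_elements is hypothetical.

```perl
use strict;
use warnings;

# Sample input from the question, with digit-named child elements.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>

<doc>
    <GI-eSTB-MIB-NPH>
        <eSTBGeneralErrorCode>
            <0>INTEGER: 0</0>
        </eSTBGeneralErrorCode>
        <eSTBGeneralConnectedState>
            <0> INTEGER: true(1) </0>
        </eSTBGeneralConnectedState>
    </GI-eSTB-MIB-NPH>
</doc>
XML

# Hypothetical helper: fold <name><N>value</N></name> into <name.N>value</name.N>.
# Assumes exactly one digit-named child per parent element.
sub merge_index_elements {
    my ($text) = @_;
    $text =~ s{
        <([A-Za-z_][\w.\-]*)>\s*     # parent open tag, captured as $1
        <(\d+)>(.*?)</\2>\s*         # digit-named child: index in $2, value in $3
        </\1>                        # matching parent close tag
    }{<$1.$2>$3</$1.$2>}gxs;
    return $text;
}

my $fixed = merge_index_elements($xml);
print $fixed;
```

The resulting text could then be passed to the parser with $parser->parse($fixed) instead of parsefile; the start and end handlers would need to match on the merged names.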

I finally figured out how to do it.