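Perl and HTML: UTF-8 does not work in forms

Time: 2014-07-12 15:02:50

Tags: html perl utf-8 character-encoding

I am trying to convert my Perl/HTML files to UTF-8. Unfortunately, I have a problem with my forms. I wrote a small test script to illustrate it. All it does is reload itself so that the text that was entered is displayed again. It works for ASCII characters. As soon as I enter German umlauts (ÄÖÜ), the characters get garbled. It cannot handle Russian characters (ЭЯЮ) either. Here is the script: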

#!/usr/bin/perl

use utf8;
use Encode;
use open ':std', ':encoding(UTF-8)';

# Save query string in hash:
$querystring = $ENV{ 'QUERY_STRING' };
read(STDIN, $poststring, $ENV{CONTENT_LENGTH});
if (($querystring ne "") && ($poststring ne "")) { $querystring .= "&$poststring"; } 
    else { $querystring .= $poststring; }

$querystring =~ s/&/=/gi;
%query = split( /=/, $querystring );
foreach $key ( keys( %query ) ) {
    $query{$key} =~ tr/+/ /;
    $query{$key} =~ s/%([\da-f][\da-f])/chr( hex($1) )/egi;
    $uquer{$key} = decode_utf8( $query{$key} );
}

print "Content-Type: text/html; charset=\"UTF-8\"\n\n";
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST" ACTION="test0.cgi">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END

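You can also test this script yourself. It is online at this address: http://project-website.org/test/test0.cgi Does anyone know what the problem might be? Thanks in advance for your help!

1 Answer:

Answer 0 (score: 5)

It is due to a bug in your version of decode_utf8.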

$ perl -Mutf8 -MEncode -E'
   $u = $d = encode_utf8("é");
   utf8::upgrade($u);   # Changes how the string is stored internally
   say $u eq $d ?1:0;
   say decode_utf8($d) eq decode_utf8($u) ?1:0;
'
1
0

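As you can see, $u and $d are equal (the first line of output is 1) and differ only in how they are stored internally, but your version of decode_utf8 decodes them differently (the second line is 0). Specifically, it leaves $u unchanged.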

This has been fixed in newer versions of Encode. (2.53, I think.)

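A simpler way to solve the problem is to fix the bug in your own script. With use open ':std', you tell your program to decode STDIN from UTF-8, so the POST data has already passed through a decoding layer before you URL-decode it and decode it from UTF-8 again. The fix is to leave STDIN undecoded and to set the encoding only on the output handles.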
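
To make the intended order of operations concrete, here is a minimal sketch. It is not part of the original script: the helper name decode_form_value is hypothetical, and it assumes the raw value is still percent-encoded ASCII, as read from a handle with no decoding layer. It percent-decodes to bytes first and only then decodes those bytes from UTF-8, exactly once.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper (not in the original script): turn one raw,
# percent-encoded form value into a Perl character string.
sub decode_form_value {
    my ($raw) = @_;
    $raw =~ tr/+/ /;                               # '+' stands for a space
    $raw =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %XX -> one byte each
    return decode('UTF-8', $raw);                  # bytes -> characters, once
}

# "%C3%84" is the UTF-8 percent-encoding of U+00C4 (Ä).
printf "U+%04X\n", ord(decode_form_value('%C3%84'));   # prints U+00C4
```

Using decode('UTF-8', ...) rather than decode_utf8 is a choice made for this sketch only; on an Encode version with the bug described above, how the input string is stored internally may still matter.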

Fixed:

#!/usr/bin/perl

use utf8;                      # Source code is encoded using UTF-8.
use open ':encoding(UTF-8)';   # Set default encoding for file handles.
BEGIN { binmode(STDOUT, ':encoding(UTF-8)'); }  # HTML
BEGIN { binmode(STDERR, ':encoding(UTF-8)'); }  # Error log

use Encode;

# Save query string in hash:
$querystring = $ENV{ 'QUERY_STRING' };
read(STDIN, my $poststring, $ENV{CONTENT_LENGTH});
if (($querystring ne "") && ($poststring ne "")) { $querystring .= "&$poststring"; } 
    else { $querystring .= $poststring; }

$querystring =~ s/&/=/gi;
%query = split( /=/, $querystring );
foreach $key ( keys( %query ) ) {
    $query{$key} =~ tr/+/ /;
    $query{$key} =~ s/%([\da-f][\da-f])/chr( hex($1) )/egi;
    $uquer{$key} = decode_utf8( $query{$key} );
}

print "Content-Type: text/html; charset=\"UTF-8\"\n\n";
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END
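
A side note on the hand-rolled parsing, which the script above keeps exactly as in the question: replacing every & with = and then splitting on = can pair keys with the wrong values, for example when a field has no = at all. A more defensive sketch, assuming the same percent-encoded input, splits into pairs first. The name parse_query is hypothetical; it does not come from the question or from any module.

```perl
use strict;
use warnings;
use Encode qw( decode );

# Hypothetical helper: parse "a=1&b=%C3%84" into a hash of character strings.
sub parse_query {
    my ($raw) = @_;
    my %param;
    for my $pair (split /&/, $raw) {
        # A limit of 2 keeps any further '=' inside the value.
        my ($key, $value) = split /=/, $pair, 2;
        next unless defined $key && length $key;
        $value = '' unless defined $value;         # "flag" with no '='
        for ($key, $value) {
            tr/+/ /;                               # '+' stands for a space
            s/%([0-9A-Fa-f]{2})/chr(hex($1))/eg;   # %XX -> one byte each
            $_ = decode('UTF-8', $_);              # bytes -> characters
        }
        $param{$key} = $value;   # the last value wins for repeated keys
    }
    return %param;
}

my %q = parse_query('df_kurs=%C3%84%C3%96&empty=&flag');
printf "%d characters\n", length $q{df_kurs};   # prints "2 characters"
```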

But you really should use CGI.pm.

#!/usr/bin/perl

use strict;    # Always!
use warnings;  # Always!

use utf8;                      # Source code is encoded using UTF-8.
use open ':encoding(UTF-8)';   # Set default encoding for file handles.
BEGIN { binmode(STDOUT, ':encoding(UTF-8)'); }  # HTML
BEGIN { binmode(STDERR, ':encoding(UTF-8)'); }  # Error log

use CGI qw( -utf8 );
use Encode;

my $cgi = CGI->new();
my %uquer = $cgi->Vars();

print $cgi->header('text/html; charset=UTF-8');
print <<END;
    <HTML>
        <HEAD>
            <META HTTP-EQUIV="Content-Type" content="text/html; charset=utf-8">
        </HEAD>
        <BODY>
            <FORM NAME="frmeing" METHOD="POST">
                <INPUT NAME="df_kurs" TYPE="TEXT" VALUE="$uquer{'df_kurs'}">
                <INPUT TYPE="SUBMIT">
            </FORM>
        </BODY>
    </HTML>
END