Following the advice in this post, I have used:
use utf8;
use open ':encoding(utf8)';
binmode(STDOUT, ":utf8");
use open IN => ":encoding(utf8)", OUT => ':utf8';
use Encode;
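A search works on the French page, http://french.godsplanforlife.org/cgi_use/search.html, but fails on the Romanian page, http://romanian.godsplanforlife.org/cgi_use/search.html. When I run a search there, the special Romanian characters switch from correct to incorrect.
Below is the Perl code of search.pl, which runs the search and prints the results at the bottom of the search page: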
#!/usr/bin/perl
#search.pl
use utf8;
use open ':encoding(utf8)';
binmode(STDOUT, ":utf8");
use open IN => ":encoding(utf8)", OUT => ':utf8';
use Encode;
# The next three lines import special modules.
use CGI;
use CGI::Carp qw(fatalsToBrowser);
use File::Find;
$cgi=new CGI();
print $cgi->header();
$search_term = $cgi->param('search_term');
$page = $cgi->param('page');
#Decode the search term from UTF-8 bytes into Perl characters.
$search_term = decode_utf8( $search_term );
#The root directory is defined by the web hosting company.
# In this case it is Bluehost using Linux servers.
$root_dir = "/home2/godspla1/public_html/romanian";
$root_dir =~ s|/$||; #get rid of trailing slash
$html_lines= "";
#Specify directories to avoid searching.
$excluded = "cgi-bin|cgi_use|derived|images|_notes|_overlay|vti|_vti_cnf";
#Walk the directory tree;
#open the file and look for the term.
#See http://perldoc.perl.org/File/Find.html for the "find" function.
#\&search refers to the subroutine search() that will do the searching.
find( \&search, $root_dir ) if $search_term;
$html_lines ||= "<tr><td>No results found</td></tr>";
$search_results = qq{<table border="0" width="100%" align="center">}
.$html_lines.qq{</table>};
#Open the requested page to put in the results.
open (RESULTS, "$root_dir/$page")
or die "Can't open results page ($root_dir/$page): $!";
#Substitute the search results and replace the search term too.
# see http://www.gossland.com/perlcourse/intro/flow for while loops.
while ( <RESULTS> ) {
#Move the point of printing insertion down to the results area.
s{<!-- search_results -->}{$search_results};
s{name="search_term"\s*?value=""}
{name="search_term" value="$search_term"};
print;
}
close RESULTS;
#--This subroutine is called by find() above for each file, to look for the search term.
sub search() {
$seen = 0;
$URL = $File::Find::name;
# !~ means "does not match"
# -f means the file is a normal file
#Exclude the excluded directories from the search. Files must be .htm or .html.
if ( $URL !~ m/$excluded|sidebar|footer|vti/ and -f and /\.html?$/ ) {
$file = $_;
open FILE, $file;
@lines = <FILE>;
close FILE;
#Grab the title, and the file name.
#Each element ($_) of the @lines array is one paragraph from file.
for ( @lines ) {
$title = $1 if m|<title>(.*?)</title>|;
#\Q and \E quote any regex metacharacters in the search term.
#Increment $seen by one, which makes it true, if the match is seen.
$seen++ if /\Q$search_term\E/i;
$seen-- if m/\Q$search_term<\/a>\E/i;
}
if ( $seen ) {
$URL =~ s|$root_dir||;
#Format the found results into URL, title.
$html_lines .= qq{<tr><td><a href="$URL">$URL</a>};
$html_lines .= qq{</td><td>$title</td></tr>\n};
}
}
}
Answer 0 (score: 1)
To correctly read UTF-8 data sent by the browser in an HTTP POST, you can either load plain CGI (use CGI;) and decode the value afterwards:
use CGI;
binmode STDIN;
use Encode;
$search_term = $cgi->param('search_term');
$search_term = decode_utf8( $search_term );
Or load CGI with the -utf8 pragma (use CGI qw ( -utf8 );), which decodes the parameters for you:
use CGI qw ( -utf8 );
binmode STDIN;
$search_term = $cgi->param('search_term');
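To illustrate why the decoding step matters for the search itself, here is a minimal sketch that uses only the core Encode module. The sample word and the variable names ($raw_term, $term, $line) are assumptions made for the illustration; they do not come from the original script.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8);

# UTF-8 bytes for the Romanian word "Știință", written as escapes so the
# sketch does not depend on the encoding of this source file. This is how
# an undecoded form parameter would look to Perl.
my $raw_term = "\xC8\x98tiin\xC8\x9B\xC4\x83";

# Decoding turns the 10 bytes into a string of 7 characters.
my $term = decode_utf8($raw_term);

printf "bytes: %d, characters: %d\n", length($raw_term), length($term);

# A line of already-decoded text containing the lowercase form "știință".
my $line = "\x{219}tiin\x{21B}\x{103}";

# Case-insensitive matching works on the decoded characters:
# U+0219 is the lowercase of U+0218.
my $matches_decoded = ( $line =~ /\Q$term\E/i ) ? 1 : 0;

# The raw bytes do not match the same decoded line.
my $matches_raw = ( $line =~ /\Q$raw_term\E/i ) ? 1 : 0;

printf "decoded term matches: %d, raw term matches: %d\n",
    $matches_decoded, $matches_raw;
```

The comparison only behaves as expected when both sides (the search term and the file contents) are decoded character strings, which is why the input layer on file reads and the decoding of the parameter go together.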
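To correctly read, modify, and print (to STDOUT) the UTF-8 encoded template file that the CGI script uses to generate its output, enable UTF-8 both when reading files and when writing to STDOUT:
use open IN => ":encoding(utf8)";
binmode STDOUT, ":utf8";
Finally, you need to tell the browser that the data it receives is UTF-8:
$cgi->header(-type => 'text/html', -charset => 'utf-8');
Judging from your script, the problem seems to be mainly related to the last point (you left out the charset in the header).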
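The output side can be sketched in a self-contained way as follows. So that it runs without CGI.pm installed, this sketch writes the header by hand through a hypothetical helper, utf8_html_header(); in the script from the question, $cgi->header with the -charset argument would play that role. The template line and the sample term are also assumptions made for the illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: a hand-written response header that declares UTF-8.
# It stands in for $cgi->header(-type => 'text/html', -charset => 'utf-8').
sub utf8_html_header {
    return "Content-Type: text/html; charset=utf-8\r\n\r\n";
}

# Encode character strings as UTF-8 when they are printed.
binmode STDOUT, ':encoding(UTF-8)';

# An already-decoded search term ("Știință"), written with escapes.
my $search_term = "\x{218}tiin\x{21B}\x{103}";

# Hypothetical template line, modelled on the substitution in search.pl.
my $template_line = qq{<input name="search_term" value="">\n};

# Put the term back into the form field, as the script does for each line.
( my $filled = $template_line ) =~ s{value=""}{value="$search_term"};

print utf8_html_header();
print $filled;
```

With the charset declared in the header and the output layer set, the bytes sent and the encoding the browser assumes agree, which is the mismatch the answer points to.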